It only generates a hash of the first, what, 128 bytes of the page, so any
pages with the same leading 128 bytes will 'collide' (in the sense that
the first 128 bytes are identical, but the rest are not).
And it does "for sure" make sense: the hash is just a way to speed up
matching, and hashing only 128 bytes means much less loading from memory.
A prefix collision only costs an extra full compare. Tuning the 128 might
be worthwhile, potentially.
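
To make the idea concrete, here is a rough sketch of a prefix hash used as a cheap pre-filter, with a full compare to settle collisions. It is not the actual code under discussion: the 4096-byte page size, the use of zlib.crc32 as the hash, and the names prefix_hash and find_duplicates are all assumptions made up for illustration.

```python
import zlib

PAGE_SIZE = 4096   # assumed page size, for illustration only
PREFIX_LEN = 128   # number of leading bytes hashed; the tunable in question


def prefix_hash(page: bytes, prefix_len: int = PREFIX_LEN) -> int:
    """Hash only the leading prefix_len bytes of a page (crc32 is a stand-in)."""
    return zlib.crc32(page[:prefix_len])


def find_duplicates(pages, prefix_len: int = PREFIX_LEN):
    """Return (earlier, later) index pairs of pages with identical contents.

    The prefix hash only narrows down the candidates; a full byte compare
    decides whether two pages really match.
    """
    buckets = {}   # prefix hash -> indices of pages seen with that hash
    matches = []
    for i, page in enumerate(pages):
        candidates = buckets.setdefault(prefix_hash(page, prefix_len), [])
        for j in candidates:
            if pages[j] == page:   # full compare resolves prefix collisions
                matches.append((j, i))
                break
        candidates.append(i)
    return matches


if __name__ == "__main__":
    zero = bytes(PAGE_SIZE)
    # Same leading 128 bytes as the zero page, different tail.
    same_prefix = bytes(PREFIX_LEN) + b"\x01" * (PAGE_SIZE - PREFIX_LEN)
    print(find_duplicates([zero, same_prefix, bytes(PAGE_SIZE)]))
```

In this sketch a larger prefix_len means fewer false candidates but more bytes read per page, which is the trade-off behind tuning the 128.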