
Take the first 8 bytes of the hash, treat them as an unsigned 64-bit integer, convert that to a 64-bit floating-point number, and divide by 2^64 to get a number in the range 0.0-1.0.

Multiply this by 'n', the number of hashes in your file, and the result will very closely approximate the hash's ordinal in the sorted list, because, as you said, hashes are very evenly distributed.
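To make the estimate concrete, here's a minimal Python sketch. The function name `estimate_ordinal` and the test data (SHA-1 digests of the strings "0" through "9999") are my own stand-ins, not anything from the actual file, and I'm assuming the 8 bytes are read big-endian so that numeric order matches the byte-wise sort order.

```python
import hashlib


def estimate_ordinal(digest: bytes, n: int) -> int:
    """Estimate where `digest` sits in a sorted list of n evenly distributed hashes."""
    # First 8 bytes as an unsigned 64-bit integer (big-endian, so numeric
    # order agrees with the byte-wise sort order of the digests).
    prefix = int.from_bytes(digest[:8], "big")
    fraction = prefix / 2**64  # float in the range 0.0-1.0
    # The float can round up to exactly 1.0, so clamp to the last valid index.
    return min(int(fraction * n), n - 1)


# Stand-in data: a sorted list of SHA-1 digests (hypothetical, for illustration).
digests = sorted(hashlib.sha1(str(i).encode()).digest() for i in range(10_000))
worst = max(abs(estimate_ordinal(d, len(digests)) - i) for i, d in enumerate(digests))
print(f"worst estimate is off by {worst} positions out of {len(digests)}")
```

The worst-case error should grow roughly with the square root of 'n', which is why a window of a few KB is usually enough.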

Now multiply this ordinal by 20 (the size of one hash record in bytes) to get a file offset, and load a small slice of the file, a few KB to either side of that offset, so that there's essentially a 100% chance of finding the hash in one "random read" operation.

If the hash isn't in that slice, you now have a block of hashes that is close, but not quite what you want. You can use the same floating-point conversion trick on the nearest hash in the block to see "how far away" you are, calculate a more accurate estimate, and do a second read. It's very unlikely you'd need a third read.
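Putting the steps together, a rough Python sketch of the whole lookup might look like this. It assumes the file holds raw 20-byte hashes, sorted ascending, with no separators or extra fields; `RECORD`, `build_index`, `find_hash`, and the `window` parameter are all names I made up for illustration. I've also added lower and upper bounds that tighten after each miss, which isn't part of the description above but guarantees the loop terminates.

```python
import bisect
import hashlib
import io

RECORD = 20  # assumed record size: one raw 20-byte hash per record


def fraction(digest: bytes) -> float:
    """First 8 bytes as an unsigned 64-bit integer, scaled into 0.0-1.0."""
    return int.from_bytes(digest[:8], "big") / 2**64


def build_index(items):
    """Stand-in for the real file: sorted SHA-1 digests packed into a BytesIO."""
    digests = sorted(hashlib.sha1(item).digest() for item in items)
    return io.BytesIO(b"".join(digests)), digests


def find_hash(f, n, target, window=128):
    """Return (index or None, number of reads) for `target` among n sorted records."""
    lo_bound, hi_bound = 0, n          # target, if present, lies in [lo_bound, hi_bound)
    guess = int(fraction(target) * n)  # first estimate of the ordinal
    reads = 0
    while lo_bound < hi_bound:
        guess = max(lo_bound, min(guess, hi_bound - 1))
        start = max(lo_bound, guess - window)
        stop = min(hi_bound, guess + window + 1)
        f.seek(start * RECORD)         # ordinal times record size gives the byte offset
        block = f.read((stop - start) * RECORD)
        reads += 1
        records = [block[i:i + RECORD] for i in range(0, len(block), RECORD)]
        i = bisect.bisect_left(records, target)
        if i < len(records) and records[i] == target:
            return start + i, reads
        if 0 < i < len(records):
            return None, reads         # falls between two neighbours, so it is absent
        # Missed: measure "how far away" we are from the nearest record we read.
        if i == 0:
            hi_bound, ref_index, ref = start, start, records[0]
        else:
            lo_bound, ref_index, ref = stop, stop - 1, records[-1]
        guess = ref_index + int((fraction(target) - fraction(ref)) * n)
    return None, reads


f, digests = build_index(str(i).encode() for i in range(5000))
print(find_hash(f, len(digests), digests[1234]))
```

On a real file you'd pass an open binary file handle instead of the `BytesIO`, and pick `window` so that each read is a few KB.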


