Hacker News new | past | comments | ask | show | jobs | submit login

It still is pretty clever and non trivial to come up with.

I bet most people who haven't thought about it beforehand or heard the solution, would fail an interview question as "So, we have a database of 100s Million passwords and we want the user to check if his password is inside; without the user actually sending his password.. or his hash; without us outright revealing all our passwords/hashes to the user."

Their solution is a pretty good tradeoff imho.




This is pretty much the case for probabilistic membership functions. It's kind of a broken bloom filter where you take only the front few bits without hashing. I wouldn't expect someone to come up with that under the interview stress, but if you know a few basic algorithms, this should be pretty simple/familiar.

It's also similar to how we spread mail folders in the past to avoid large directories. Or how Debian splits repo by package prefix: http://ftp.us.debian.org/debian/pool/main/


Exactly and good examples. I believe the general concept is just sharding, for anyone that wants to google it further. Typically naive sharding on the first few bytes is useless because it results in uneven distributions but (and this is what I mean when I said SHA was designed for this) a cryptographic hash should result in a uniform random-appearing distribution across all its bits, meaning the resulting hash is uniquely suitable for basic sharding by prefix.

But then they wouldn’t have been able to give it a fancy name if they called it what it is.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: