Hacker News new | past | comments | ask | show | jobs | submit login
Why Bloom filters work the way they do (2012) (michaelnielsen.org)
99 points by BIackSwan on Feb 1, 2014 | hide | past | favorite | 6 comments



He missed the best way to generate hashes. Most people do: http://spyced.blogspot.com/2009/01/all-you-ever-wanted-to-kn...


Great to see people getting into this useful data structure.

But this blog post makes a simple topic look super complicated. When all the math formulas are not needed to make the explanation work, they come off as just a gratuitous attempt to impress the reader.

By contrast, Jon Bently, in his Programming Pearls 2nd Edition, page 145-146, has a much more concise and understandable description. The other nuances not mentioned fall out intuitively from what he describes there.

I think needlessly complex descriptions of bloom filters are one of the reasons the approach seems less known than it should be.


Reminds me of "Smaller than Bloom filters"[1] by Adam Langley (who's working on Chrome).

[1] https://www.imperialviolet.org/2011/04/29/filters.html


Every time I speak to a group, I take a quick survey as to how many people know what a bloom filter is. Even at the most nerdiest of tech talks, it's rarely above 50%.

It amazes me that this is not something that is taught in school these days (or maybe it is now?), considering it is such a powerful tool in a distributed environment.


Really?

Maybe it's environment specific, but bloom filters seem to be the "an aglet is the plastic thing at the end of a shoelace" of CS, ie., the "little know fact" that everyone knows.


People forget things. I probably remember about 30% of CS class.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: