Why Bloom filters work the way they do (2012)

jbellis · on Feb 1, 2014

He missed the best way to generate hashes. Most people do: http://spyced.blogspot.com/2009/01/all-you-ever-wanted-to-kn...

natch · on Feb 1, 2014

Great to see people getting into this useful data structure.

But this blog post makes a simple topic look super complicated. When all the math formulas are not needed to make the explanation work, they come off as just a gratuitous attempt to impress the reader.

By contrast, Jon Bently, in his Programming Pearls 2nd Edition, page 145-146, has a much more concise and understandable description. The other nuances not mentioned fall out intuitively from what he describes there.

I think needlessly complex descriptions of bloom filters are one of the reasons the approach seems less known than it should be.

pflanze · on Feb 1, 2014

Reminds me of "Smaller than Bloom filters"[1] by Adam Langley (who's working on Chrome).

[1] https://www.imperialviolet.org/2011/04/29/filters.html

jedberg · on Feb 1, 2014

Every time I speak to a group, I take a quick survey as to how many people know what a bloom filter is. Even at the most nerdiest of tech talks, it's rarely above 50%.

It amazes me that this is not something that is taught in school these days (or maybe it is now?), considering it is such a powerful tool in a distributed environment.

akavi · on Feb 1, 2014

Really?

Maybe it's environment specific, but bloom filters seem to be the "an aglet is the plastic thing at the end of a shoelace" of CS, ie., the "little know fact" that everyone knows.

jiggy2011 · on Feb 1, 2014

People forget things. I probably remember about 30% of CS class.