Hacker News new | past | comments | ask | show | jobs | submit login
Why Bloom filters work the way they do (michaelnielsen.org)
95 points by michael_nielsen on Sept 26, 2012 | hide | past | favorite | 7 comments



If you like blooms filters, you may like many variations of it, and some of its applications:

A Garden Variety of Bloom Filters[1]

[1] http://matthias.vallentin.net/blog/2011/06/a-garden-variety-...


A great post. I wish we saw more like it on Hacker News.


Very well written. It's the first technical CS article I've read in a while and really enjoyed it.


Interesting, I read the introduction and could only think of three steps myself. Apparently you and I think of hashes in a different way. I went directly from 'list of urls' to 'hash table of urls' (with a very minor note of 'we only need to store yes/no so make the bucket size one bit'). This skips a large part of your train of logic entirely. From there it's the same 'use a couple hashes to avoid collisions', 'use overlapping storage because it's more efficient'.


Thanks! I enjoyed the way you approached the explanation by exploring the problem instead of the solution.


Thanks! Yeah, sometimes solutions alone are opaque -- it's often illuminating to explore other approaches, and to understand why they don't work so well, in order to understand why a solution is the way it is.


I believe LevelDb is getting bloom filter to alleviate the key not-found performance issue soon.

Very cool article.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: