
What are Bloom filters, and why are they useful? - nikcorg
https://sc5.io/posts/what-are-bloom-filters-and-why-are-they-useful/
======
TTilus
Bloom filters are awesomely nerdy stuff. But seriously, blog with no comments
section? D'oh.

I would have liked to ask: how do you decide how long to "teach" the filter
so that you don't degrade it? Also, when speaking about efficiency, the blog
post totally omits the overhead cost of "teaching" (or "warming up", if you
have a cache mindset).

~~~
maxpagels
Ideally, you'd initialise the filter with all the elements at startup. The
cost of this depends on the number of elements to be added, multiplied by the
work of computing the k hash functions for each one. So, for n elements, it
would be O(kn), which is O(n) when k is treated as a constant.
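
To make the O(kn) cost concrete, here is a minimal sketch of bulk-loading a filter at startup. The class name BloomFilter, the helper build_filter, and the SHA-256 double-hashing scheme are my own assumptions for illustration, not something taken from the blog post:

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter with m bits and k derived hash positions per item."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # m bits, rounded up to whole bytes

    def _positions(self, item):
        # Assumption: derive k positions from one SHA-256 digest using
        # double hashing (h1 + i * h2). Real implementations often use
        # faster non-cryptographic hashes.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        # k bit writes per element
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present", False means "definitely absent"
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def build_filter(items, m, k):
    """Bulk-load the filter: n elements times k bit writes, so O(kn)."""
    bf = BloomFilter(m, k)
    for item in items:
        bf.add(item)
    return bf
```

Something like build_filter(["apple", "banana"], m=1024, k=3) would then answer "apple" in bf with True. Membership tests for items that were never added can still come back True, which is the false positive case discussed next.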

Adding elements degrades the filter, and deciding when to stop adding
elements is purely down to what worst-case false positive probability you are
willing to accept. The equation for this can be readily found online, but put
simply, the probability p is a function not only of the number of bits in the
filter's array and the number of hash functions k, but also of the number of
elements already added to the set.
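
For a rough idea of when to stop, the commonly quoted approximation is p ≈ (1 - e^(-kn/m))^k for m bits, k hash functions and n added elements, assuming the hash functions behave independently and uniformly. A small sketch of using it to find a cut-off; the function names are hypothetical and the numbers in the usage note are only an example:

```python
import math


def false_positive_rate(m, k, n):
    """Approximate false positive probability after n insertions.

    Uses the standard estimate p ~= (1 - e^(-k*n/m))^k, which assumes
    independent, uniformly distributed hash functions.
    """
    return (1.0 - math.exp(-k * n / m)) ** k


def max_elements(m, k, p_max):
    """Largest n for which the approximation stays at or below p_max.

    Obtained by solving the estimate above for n:
        n = -(m / k) * ln(1 - p_max^(1/k))
    """
    return math.floor(-(m / k) * math.log(1.0 - p_max ** (1.0 / k)))
```

So with, say, m = 10000 bits and k = 7, calling max_elements(10000, 7, 0.01) gives the number of elements you can add before the estimated false positive rate passes 1%. Beyond that point you would have to size a larger filter and rebuild it.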

