I also highly recommend this blog:
Mooers was annoyed that computer scientists decades after him rediscovered the principles he had first worked on without citing him. Knuth points out the connection.
Basically, I'm curious: in plain English, what kind of application would use this, and for what? Also, am I understanding correctly that this is simply a sort of low-fidelity hash table?
I was not familiar with cuckoo filters (I am familiar with cuckoo hashing, though); using them as a 'better Bloom' seems to be a relatively new development. A quick search turned up 'Cuckoo Filter: Practically Better Than Bloom' [https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf].
Shipping a full list of malicious URLs would be too expensive, but a Bloom filter fits in a fraction of the space and can still eliminate the vast majority of the API calls.
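For anyone who wants to see the idea in code, here is a minimal sketch in Go of a Bloom filter used as a local pre-check before a remote lookup. It is not taken from any particular browser or library: the `Bloom` type, the sizes (65536 bits, 4 probes), the FNV-based double hashing, and the example URLs are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a minimal Bloom filter: m bits and k probe positions per item.
// (Hypothetical type for illustration, not a real library API.)
type Bloom struct {
	bits []uint64
	m, k uint64
}

func NewBloom(m, k uint64) *Bloom {
	return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two 64-bit values from FNV-1a; the k probe positions
// are then h1 + i*h2 (double hashing).
func hashes(s string) (uint64, uint64) {
	h := fnv.New64a()
	h.Write([]byte(s))
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the input to get a second value
	h2 := h.Sum64() | 1   // force an odd stride
	return h1, h2
}

// Add sets the k bits that correspond to s.
func (b *Bloom) Add(s string) {
	h1, h2 := hashes(s)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= uint64(1) << (pos % 64)
	}
}

// MayContain returns false only if s was definitely never added.
// A true result can be a false positive.
func (b *Bloom) MayContain(s string) bool {
	h1, h2 := hashes(s)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(uint64(1)<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	// Assumed sizes: 65536 bits (8 KiB) and 4 probes per URL.
	blocklist := NewBloom(1<<16, 4)
	for _, u := range []string{"http://evil.example/login", "http://phish.example/"} {
		blocklist.Add(u)
	}
	for _, u := range []string{"http://evil.example/login", "https://golang.org/"} {
		if blocklist.MayContain(u) {
			fmt.Println(u, "-> possible match, confirm with the remote API")
		} else {
			fmt.Println(u, "-> definitely not listed, skip the API call")
		}
	}
}
```

The point is the asymmetry: a "no" from the filter is always correct, so only the "maybe" answers have to pay for the expensive check.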
Everything related to caching when working with a huge dataset is a good use-case. Another case would be when the dataset does not fit into memory, but you need very fast lookups.
Edit: To clarify, you are obviously not caching the data itself, but a flag indicating whether to use the cache (or do nothing) or to compute something.
from the original research paper and the repository itself, thank you for asking!
Suppose I have a secret list of websites from which you shouldn't load ads. I can then just publish a Bloom filter that you can use in your browser to see whether a web request should be allowed, without your having access to the original list. I know this is contrived, but hopefully you get the idea.
Is the entire useful concept patented at this point?
// Tries to delete 'bonjour' from the filter; this may delete another element instead.
// That can happen when a different byte slice shares the fingerprint of
// the one being deleted.
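
To make that caveat concrete, here is a rough sketch in Go of a stripped-down cuckoo-style filter. It is not the repository's implementation: the `Cuckoo` type, the 8-bit fingerprints, the 16 buckets of 4 slots, and the FNV hash are assumptions, and eviction (the "kicking" step of a real cuckoo filter) is left out to keep it short. `findCollider` is a hypothetical helper that brute-forces a second key with the same fingerprint and bucket as 'bonjour'.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const (
	numBuckets = 16 // power of two, so the XOR trick stays in range
	bucketSize = 4
)

type fingerprint uint8 // 0 marks an empty slot

// Cuckoo is a simplified filter without eviction (illustration only).
type Cuckoo struct {
	buckets [numBuckets][bucketSize]fingerprint
}

func hash64(data []byte) uint64 {
	h := fnv.New64a()
	h.Write(data)
	return h.Sum64()
}

// derive returns the fingerprint and both candidate buckets for data.
func derive(data []byte) (fingerprint, uint, uint) {
	h := hash64(data)
	fp := fingerprint(h >> 56) // top 8 bits
	if fp == 0 {
		fp = 1 // keep 0 reserved for "empty"
	}
	i1 := uint(h) % numBuckets
	i2 := (i1 ^ uint(hash64([]byte{byte(fp)}))) % numBuckets
	return fp, i1, i2
}

// Insert stores the fingerprint in the first free slot of either bucket.
func (c *Cuckoo) Insert(data []byte) bool {
	fp, i1, i2 := derive(data)
	for _, i := range []uint{i1, i2} {
		for s := range c.buckets[i] {
			if c.buckets[i][s] == 0 {
				c.buckets[i][s] = fp
				return true
			}
		}
	}
	return false // both buckets full; a real filter would evict here
}

// find locates a slot holding data's fingerprint, if any.
func (c *Cuckoo) find(data []byte) (uint, int, bool) {
	fp, i1, i2 := derive(data)
	for _, i := range []uint{i1, i2} {
		for s := range c.buckets[i] {
			if c.buckets[i][s] == fp {
				return i, s, true
			}
		}
	}
	return 0, 0, false
}

func (c *Cuckoo) Lookup(data []byte) bool {
	_, _, ok := c.find(data)
	return ok
}

// Delete removes one matching fingerprint. It cannot tell which key
// originally stored that fingerprint.
func (c *Cuckoo) Delete(data []byte) bool {
	i, s, ok := c.find(data)
	if ok {
		c.buckets[i][s] = 0
	}
	return ok
}

// findCollider searches for a different key with the same fingerprint
// and first bucket as target (hypothetical helper for this demo).
func findCollider(target []byte) []byte {
	fp, i1, _ := derive(target)
	for n := 0; n < 1000000; n++ {
		cand := []byte(fmt.Sprintf("item-%d", n))
		if cfp, ci1, _ := derive(cand); cfp == fp && ci1 == i1 {
			return cand
		}
	}
	return nil
}

func main() {
	var f Cuckoo
	f.Insert([]byte("bonjour"))
	other := findCollider([]byte("bonjour"))
	if other == nil {
		fmt.Println("no colliding key found in the search range")
		return
	}
	fmt.Printf("deleting %q, which was never inserted: %v\n", other, f.Delete(other))
	fmt.Println("bonjour still present:", f.Lookup([]byte("bonjour")))
}
```

Deleting the colliding key reports success and 'bonjour' then disappears from the filter, which is why deletion is only safe for items you know were inserted.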