

How Chrome uses a Bloom filter as a quick malicious-site check - derwiki
http://blog.alexyakunin.com/2010/03/nice-bloom-filter-application.html

======
zck
>Bloom filter allows Chrome to use precise verification service practically
only when the user actually goes to a malicious web site.

Not quite true. This ignores priors
(<http://en.wikipedia.org/wiki/Prior_probability>). How many websites does an
average Chrome user visit in a day? Let's pull a number out of the air --
1000? If, as the article suggests, 1% of them are false positives, the Bloom
filter will have 10 false positives per day.

How often does a user hit an actual malicious site? Once a day? Once a week?
Let's say once a day. So 1 true positive (a malicious site), 10 false positives. Over
90% of positives are false! So most of the time when you need to precisely
verify the maliciousness of a site, the site is safe.

Of course, given that you only need to do this check a handful of times per
day, this seems like a valid tradeoff, but Bloom filters here are no panacea.
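
To make the arithmetic concrete, here is a rough sketch using the made-up numbers above (1000 safe visits a day, a 1% false positive rate, one truly malicious visit a day). The function and parameter names are only for illustration, and it assumes the filter never misses a listed site, which holds for a standard Bloom filter.

```python
def share_of_false_positives(safe_visits_per_day, false_positive_rate, malicious_per_day):
    """Fraction of Bloom filter hits that turn out to be safe sites."""
    # Safe sites the filter flags anyway.
    false_positives = safe_visits_per_day * false_positive_rate
    # Assumption: every malicious visit is flagged (no false negatives).
    true_positives = malicious_per_day
    return false_positives / (false_positives + true_positives)


# Numbers pulled out of the air in the comment above.
share = share_of_false_positives(1000, 0.01, 1)
print(f"{share:.1%} of positives are false")
```

If the user hits a malicious site only once a week (1/7 per day), the same function puts the share of false positives even higher.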

~~~
pudquick
Your math is a bit topsy-turvy. You're ignoring the 990 sites it _didn't_ have
to check.

That's 10 sites to check out of 1000 - 1% of the load of checking (remotely)
all 1000 sites (or 0.9% if you only count the false positives) wasted.

~~~
zck
I'm ignoring the 990 sites it didn't have to check because it wasn't relevant
to the quote I was discussing:

>Bloom filter allows Chrome to use precise verification service practically
only when the user actually goes to a malicious web site.

This is saying that most of the time the Bloom filter returns a positive (and
therefore Chrome needs to use precise verification), it's a true positive.
That's clearly not true.

~~~
corin_
But you've missed the entire point, which is that normally the browser would
have to run the slow check on the 1000 sites, now it only has to run the slow
check on 10 of them. That's a huge improvement, even if 9 of them turn out to
be false positives.

~~~
vecter
You're both right, and I'm pretty sure zck understands and appreciates the time
savings that you're talking about. He's just pointing out that the author's
statement that "only malicious sites needed to be tested" is blatantly wrong.

~~~
boucher
Except, that's not what the author said. He said "practically" that.

------
vilda
Anyone who's interested in Bloom filters should check out a great blog post by
Adam Langley about their variants, with links to papers:
<http://www.imperialviolet.org/2011/04/29/filters.html>

~~~
paulirish
context: Adam Langley is a Chrome developer.

------
bdb
Here's a link to the relevant part of the Chromium source tree:
<http://src.chromium.org/viewvc/chrome/trunk/src/chrome/browser/safe_browsing/>

