

Anonymous Browser Fingerprinting in Practice - SanderMak
http://valve.github.io/blog/2013/07/14/anonymous-browser-fingerprinting/

======
fingerprinter
When these things come up on HN I pipe up a bit.

I wrote several implementations of passive device fingerprinting for
fraud/security firms and they are implemented across a broad range of
industries and companies. I know quite a bit about the shortcomings of these
and the challenges faced, in case someone is interested.

Also, just FYI, the big use these days is actually NOT in security or fraud;
it is in advertising. Most of the companies selling these still call themselves
"security" or "fraud" companies, but the revenues from advertising are far
outpacing the former categories.

~~~
charonn0
Have the fraud and security markets shrunk, or has the advertising market
expanded?

~~~
fingerprinter
I think it has more to do with the role of security/fraud in companies vs
advertising.

For instance, security doesn't get much attention and people view it as a
necessary evil. Fraud is about loss prevention in most cases, so the way you
show your value is how much you save the company in losses. Your revenue is
typically a fraction of those savings.

Most companies using tracking in advertising are still operating at a loss or
have fuzzier ideas of "revenue". TBH, advertising is weird. People spend money
like water without really knowing what they are getting in return. This
obviously won't last, but there is money to be made in the meantime. At least
that is my experience.

------
gnosis
And for anyone who hasn't tried it yet, here's EFF's site for testing how
unique your browser's fingerprint is:

[https://panopticlick.eff.org](https://panopticlick.eff.org)

------
ericcholis
Just a side note, I was momentarily confused regarding the author of this
article. I had thought that it was Valve Software's github account. It took
visiting [https://github.com/Valve/](https://github.com/Valve/) to realize
that it was a private account.

------
Geee
> Very few users have a staggering amount of fingerprints, for example 20-25.
> I don’t know if they have a lot of devices, use different browsers or
> something else.

They probably use mostly public / school computers to access the website.

~~~
phaer
Or maybe they are just changing their browser window size by hand, which
would lead to a lot of different resolutions.

~~~
CiaranMcNulty
He's hopefully using screen width (window.screen.width) rather than
browser window width (window.innerWidth)

------
Amadou
I've been thinking about developing a browser plugin that would spoof
fingerprints. Based on the URL in the address bar it would keep a discrete
cookie store and a fuzzed version of the variables that fingerprinting looks at.

For example - if you are on espn.com then espn.com and all the trackers
referenced from their pages would see a set of cookies and variables specific
to espn.com. Browse over to icanhascheezeburger.com and even if the lolcats
use the same trackers as espn.com does they would see a different set of
cookies and a slightly different set of values - like browserid incremented by
a .1 or a couple of non-essential plugins missing from the list of installed
plugins.

I welcome any _technical_ feedback on the idea.
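
A minimal Python sketch of the per-site idea described above, not a working browser plugin. The baseline attributes, the `local-secret` seed and both function names are assumptions made up for illustration. Each top-level site gets its own cookie jar and a stable, slightly altered copy of the baseline values.

```python
import hashlib
import random

# Hypothetical baseline attributes; a real plugin would read these from the browser.
BASE_PROFILE = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/22.0",
    "plugins": ["Flash", "Java", "QuickTime", "PDF Viewer"],
    "timezone_offset": -60,
}

# One separate cookie jar per top-level site (keyed by the address-bar host).
COOKIE_JARS = {}


def cookies_for_site(site):
    """Return the cookie jar that belongs to this top-level site only."""
    return COOKIE_JARS.setdefault(site, {})


def profile_for_site(site, secret="local-secret"):
    """Return a stable, slightly fuzzed copy of BASE_PROFILE for one site."""
    # Seed a private RNG from the site name so repeat visits see the same values.
    digest = hashlib.sha256(f"{secret}:{site}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))

    profile = dict(BASE_PROFILE)
    # Nudge the reported minor version by a small per-site amount (1..9).
    minor = rng.randint(1, 9)
    profile["user_agent"] = BASE_PROFILE["user_agent"].replace("22.0", f"22.{minor}")
    # Drop one plugin from the reported list.
    plugins = list(BASE_PROFILE["plugins"])
    plugins.remove(rng.choice(plugins))
    profile["plugins"] = plugins
    return profile
```

With so few fuzzed values, two sites can still end up with the same profile by chance; a real implementation would need to vary more attributes.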

------
ojilles
The operational problem with these things is: how do you deal with time? Over
time folks upgrade plugins, browsers, etc., so the new fingerprint no longer
matches the previous one. Therefore you won't be able to stitch two sessions
together as belonging to one person (who did the upgrade in between). That
seems to me to directly oppose the core idea behind these techniques (i.e.
going from a unique identifier to a persistent unique identifier).

~~~
Homunculiheaded
If you read the actual paper referenced [0] you'll see that they explicitly
mention this. Since they assumed that users were interactively trying to
change their footprint, they looked at users who came back with a changed
fingerprint 1-2 hours or more after their first visit:

"We ran our algorithm over the set of users whose cookies indicated that they
were returning to the site 1-2 hours or more after their first visit, and who
now had a different fingerprint. Excluding users whose fingerprints changed
because they disabled javascript (a common case in response to visiting
panopticlick.eff.org, but perhaps not so common in the real world), our
heuristic made a correct guess in 65% of cases, an incorrect guess in 0.56% of
cases, and no guess in 35% of cases. 99.1% of guesses were correct, while the
false positive rate was 0.86%. Our algorithm was clearly very crude, and no
doubt could be significantly improved with effort." (any typos here are likely
the result of copy/pasting from the pdf)

[0] [https://panopticlick.eff.org/browser-uniqueness.pdf](https://panopticlick.eff.org/browser-uniqueness.pdf)
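
A rough Python sketch of what a heuristic of this kind could look like. It is not the paper's code: the field names, the `SIMILARITY_THRESHOLD` value and the function names are all assumptions. It makes a guess only when exactly one known fingerprint differs from the new one in a single attribute, and for string-like attributes only when the old and new values are similar enough to look like an upgrade.

```python
from difflib import SequenceMatcher

# A change in one of these fields is accepted without a similarity check.
SIMPLE_FIELDS = {"timezone", "resolution"}
# Assumed cutoff for "the new value looks like an upgraded version of the old one".
SIMILARITY_THRESHOLD = 0.85


def differing_fields(old, new):
    """List the attribute names whose values differ between two fingerprints."""
    return [key for key in old if old[key] != new.get(key)]


def guess_previous(new, known):
    """Return the one known fingerprint `new` plausibly evolved from, else None."""
    candidates = []
    for old in known:
        diff = differing_fields(old, new)
        if len(diff) != 1:
            continue  # identical, or changed in too many places
        field = diff[0]
        if field in SIMPLE_FIELDS:
            candidates.append(old)
            continue
        ratio = SequenceMatcher(None, str(old[field]), str(new[field])).ratio()
        if ratio >= SIMILARITY_THRESHOLD:
            candidates.append(old)
    # Guess only when the match is unambiguous; otherwise make no guess.
    return candidates[0] if len(candidates) == 1 else None
```

Refusing to guess on ambiguous matches trades coverage for a low false positive rate, which is the same trade-off the quoted numbers show.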

~~~
ojilles
Missed that, but basically confirms the point (not being able to track ~35% of
the cases is quite a big gap)

------
Renaud
One way to counter browser fingerprinting could be to add a plugin that simply
reports random data when queried. This would invalidate the hashing.

Of course, if the plugin is well known, it could be removed by the
fingerprinting code, unless it can disguise itself as something else (more or
less random) when it is installed.
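
To illustrate why a small change is enough to invalidate a hashed fingerprint, here is a short Python sketch. SHA-256 is only a stand-in, since the thread does not say which hash is used, and the attribute names and values are invented.

```python
import hashlib


def fingerprint(attrs):
    """Hash a dict of browser attributes into one hex string."""
    # Serialize in a fixed key order so equal attributes always give equal hashes.
    canonical = "|".join(f"{key}={attrs[key]}" for key in sorted(attrs))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Invented example values.
real = {
    "user_agent": "Mozilla/5.0 ExampleBrowser/22.0",
    "fonts": "Arial,Courier,Times",
    "timezone_offset": "-60",
}
# Same browser, with one extra font reported.
spoofed = dict(real, fonts=real["fonts"] + ",Comic Neue")

print(fingerprint(real) == fingerprint(spoofed))  # False
```

Any exact-match lookup on the hash fails after such a change. Matching that compares the individual attributes rather than the hash is harder to fool this way.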

~~~
fingerprinter
I just have a couple of personal plugins that will change my reported UA
slightly, my fonts collection slightly and some other random bits. Just screws
it up enough so I know I'm not tracked.

~~~
ilikepi
Would you consider releasing these, or are you concerned if they were more
common they would become less effective?

------
ihsw
I'm curious as to how many fingerprints match multiple users, which could be
used to detect users logging into multiple accounts. This could be useful for
detecting compromised accounts, or for banning all of the accounts belonging
to a user trying to evade bans.
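
A minimal Python sketch of that lookup, assuming login events are available as (account, fingerprint) pairs. The function name and the data shape are hypothetical.

```python
from collections import defaultdict


def shared_fingerprints(logins):
    """Map each fingerprint seen with two or more accounts to those accounts.

    `logins` is an iterable of (account, fingerprint) pairs.
    """
    accounts_by_fp = defaultdict(set)
    for account, fp in logins:
        accounts_by_fp[fp].add(account)
    # Keep only fingerprints that more than one account has used.
    return {fp: accts for fp, accts in accounts_by_fp.items() if len(accts) > 1}
```

Identical stock configurations also produce identical fingerprints, so a hit here is a hint for review rather than proof that the accounts belong to one person.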

~~~
morpher
Wouldn't all default browser installations on the same model of
fresh-from-the-box laptops be identical? They quoted a 10-15% false match rate
(excluding mobile phones, which have a much higher false match rate due to
lower configurability).

~~~
D9u
The first thing I do when I get a new laptop is wipe the hdd and install my
preferred OS, which entails also installing my favorite browsers. (My OS
choice doesn't come with a browser, or much else; it's all up to the end user.)

As for mobile devices, here too I always install my preferred software, so I
don't see how mobile devices have lower configurability. That said, there
exist databases with which one may at least identify the mobile device in
order to serve up content appropriate for that device.

Alas, my tendencies make my devices uniquely identifiable!

(From Panopticlick) Your browser fingerprint appears to be unique among the
3,140,429 tested so far.

Currently, we estimate that your browser has a fingerprint that conveys at
least 21.58 bits of identifying information.

That's for FreeBSD running Xombrero.
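
The two figures quoted above are consistent with each other: a fingerprint seen once among N browsers carries log2(N) bits. A quick Python check follows; the function name is made up for illustration.

```python
import math


def surprisal_bits(population, matching=1):
    """Bits of identifying information when `matching` of `population` share a fingerprint."""
    return -math.log2(matching / population)


# A fingerprint that is unique among 3,140,429 tested browsers.
print(f"{surprisal_bits(3_140_429):.2f}")  # 21.58
```

The "at least" in the quoted message fits this reading: a sample of that size cannot demonstrate more bits than log2 of the number of browsers tested.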

~~~
morpher
Sure, but average computer users probably have a configuration much closer to
the default, and thus the probability of a unique match goes down.

That being said, people who tend to be worried about security also tend to
have more unique computing configurations and thus would be more identifiable.
Interesting.

------
jamesbrennan
I don't feel like this could accurately identify a returning user, as browsers
such as Google Chrome have an auto-updater that runs so often. The version
number could well be different when returning a few days later. Does anyone
have thoughts about this?

------
excitom
Security firms that sell this kind of service tout its accuracy, like it's
going to prevent fraud by uniquely identifying everyone. Interesting that the
author notes they stopped using it because of poor results.

------
gregod
Related project matching against a database of >3 million fingerprints:
[https://panopticlick.eff.org/](https://panopticlick.eff.org/)

~~~
alcuadrado
Did you even read the article? They link that in the first sentence.

------
stesch
NoScript to the rescue, again.

~~~
fingerprinter
In practice this is not effective because, as someone else noted, tracking can
be and is done at the server level as well.

Most implementations, though, make the assumption that if they get JS info,
that info is good. Better to give them wrong/inaccurate/changing JS info than
no info.

So, I leave JS on, use some personal plugins to change variables and give them
bad data on each request. They won't know and will just get bad fingerprints
that are not seen again.

See this comment:
[https://news.ycombinator.com/item?id=6046810](https://news.ycombinator.com/item?id=6046810)

~~~
bincat
Would it be possible to have the source of those plugins open?

