It's nothing new (someone correctly pointed to the EFF project) but I wanted to make a real-world demo out of it.
The demo doesn't store each bit of info separately; it simply creates a hash out of them. If I stored the data separately I could, for example, identify small user agent updates, screen resolution changes, newly installed plugins and so on.
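To make the difference concrete, here is a minimal Python sketch of the two approaches. The attribute names and values are made up for illustration and are not taken from the demo's source, and SHA-1 is only an assumption about the hash function.

```python
import hashlib


def combined_fingerprint(attrs: dict[str, str]) -> str:
    """Hash all attributes into one opaque digest (assumed scheme: SHA-1)."""
    # Sort the keys so the same attributes always produce the same digest.
    canonical = "\n".join(f"{key}={attrs[key]}" for key in sorted(attrs))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def changed_fields(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """With attributes stored separately, report which ones differ."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


# Hypothetical attribute values, before and after a minor browser update.
before = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1280x800",
    "plugins": "Flash,QuickTime",
}
after = dict(before, user_agent="ExampleBrowser/1.1")

# The single hash only says "different"; it cannot say how different.
print(combined_fingerprint(before) == combined_fingerprint(after))  # False
# Separate fields show that only one attribute moved.
print(changed_fields(before, after))  # ['user_agent']
```

A tracker that keeps the fields apart can treat a one-field change as the same visitor, which a single hash cannot do.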
Also, much of the info can be gathered without JS, and actually browsing with NoJS puts you in a very restricted niche, making you even more trackable ;)
The demo is far from perfect but I believe that even a 90% reliability is alarming. Anyway, I'll put all the source code on GitHub for you to review. I hope to be able to add the NoScript code as well soon.
Commercial implementations of this sort of tracking include the user's IP in their dataset, but they also track a number of other datapoints. That way, if everything but a user's IP matches an entry in their database, they can tell it's probably the same person connecting from a different location.
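A rough Python sketch of that kind of matching, under stated assumptions: the field names, the 0.9 threshold and the record IDs are all hypothetical, and real products presumably weight attributes rather than counting them equally.

```python
from typing import Optional


def match_score(a: dict[str, str], b: dict[str, str],
                ignore: frozenset[str] = frozenset({"ip"})) -> float:
    """Fraction of non-ignored attributes on which two records agree."""
    keys = (a.keys() | b.keys()) - ignore
    if not keys:
        return 0.0
    return sum(a.get(k) == b.get(k) for k in keys) / len(keys)


def find_probable_match(visitor: dict[str, str],
                        known: dict[str, dict[str, str]],
                        threshold: float = 0.9) -> Optional[str]:
    """Return the id of the best stored record at or above the threshold."""
    best_id, best_score = None, 0.0
    for record_id, record in known.items():
        score = match_score(visitor, record)
        if score >= threshold and score > best_score:
            best_id, best_score = record_id, score
    return best_id


# Hypothetical database entry and a visitor who differs only by IP.
known = {
    "user-17": {
        "ip": "203.0.113.5",
        "user_agent": "ExampleBrowser/1.0",
        "screen": "1280x800",
        "fonts": "Arial,Helvetica,Futura",
    }
}
visitor = dict(known["user-17"], ip="198.51.100.9")

print(find_probable_match(visitor, known))  # user-17
```

The IP is ignored when scoring here, so a change of location alone never breaks the match.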
But really, browsers should stop allowing scripts to access the full list of available fonts. What use does a website have for that data, anyway? Any site that doesn't want to use one of the standard fonts should be using webfonts nowadays.
Regardless, it's a very interesting idea, and it also illustrates how difficult and counter-intuitive security can be if you do not study such issues. As an API designer I would surely have a hard time foreseeing that exposing the screen size or font list could turn out to be a security issue for users.
It may not even track from one browser to the next. Camino and Safari generate different fingerprints on my machine.
So it's another demonstration of Flash being ridiculously insecure. These guys did it better, even defeating Tor to reveal the origin IP. http://dl.packetstormsecurity.net/0610-advisories/Practical_...
*The following is your unique fingerprint on the web:*
You'd be surprised.
Currently, we estimate that your browser has a fingerprint that conveys 11.9 bits of identifying information.
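For a sense of scale, the usual reading is that a fingerprint carrying b bits of identifying information is shared by about one browser in 2^b. A quick Python check of that arithmetic (the function names are just for illustration):

```python
import math


def one_in_n(bits: float) -> float:
    """How many browsers you would expect to sample before seeing a duplicate."""
    return 2.0 ** bits


def bits_from_fraction(fraction: float) -> float:
    """Inverse: bits of identifying information for a given share of browsers."""
    return -math.log2(fraction)


print(round(one_in_n(11.9)))         # about 3,800
print(bits_from_fraction(1 / 4096))  # 12.0
```

So 11.9 bits means roughly one browser in 3,800 looks the same, which is identifying but far from unique.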
The amount of information conveyed by the HTTP_ACCEPT headers is especially worrying. On a modern browser there is nothing in there, apart from maybe the language, that should leak any info. And certainly not 10 or 16 bits of info.
Huh, that's awesome ... in a bad way. According to that page, both my system fonts and browser plugin details are unique among the browsers they've tested thus far.
TL;DR: server-side code can fingerprint you
Would it not make sense for my OS to sandbox which fonts can be accessed by my browser? If a webpage wants to use a special font-family, I could be prompted to allow/block access to my greater font library.
3rd step profit
That's just my point of view, of course, but how many others will share it? I'd wager a good number would.
I built a demo a year ago that let you store personal data and exposed a postMessage API for storing and sharing permissions and personal data with sites, as kind of the beginnings of a poor man's client-side-only OAuth.
After typing "meow" and hitting enter, the copy-pasted URL had this to say about me: "It seems you didn't save the word. Go to lab.cubiq.org/underpants first."
The unique fingerprint is also different. "93615388f7f54cd79d2f806ac3795c182217aa9b" somehow became "f37ec3fdd05c27c13cbb7fcdef95cc004297f62d" after copy-pasting.
Other than that technical glitch for me (Linux, Chrome latest unstable version), I still think this is actually a pretty good idea. But will websites use it, now that the ones we actually want to worry about are injected into every website via Tweet and Like and + and whatever buttons?
Google in particular is everywhere with their Google Analytics tracking code.
edit: now that I think about it, I may have misunderstood the point. Was it a proof of concept of providing cross-site tracking without tying it to a personal identity?
If not, insecurities in cross-site whatever hardly matter when I am logged into every little tidbit that is loaded via iframe and appears on almost every website. Even porn sites have like buttons these days.
Getting every page on the web to drop the old image-based cookie tracking in favor of JS etc. will also never happen, barring extreme circumstances. Most of the people running ad sites have no idea what you are talking about anyway once you get outside the realm of cookies and tracking.
Sorry, I have CS3 installed and fully licensed. But still a nice demo.
I'm guessing you're on Mac OS X? Can you have fonts available that aren't apparent to the browser somehow?
It is not the technology that is evil, but the evil application of said technology.
When I went, I saved the word "what" against the fingerprint 0e24f67890fb99dfd6fa147adc5634224e6cf509.
Then I opened a new tab, copied and pasted the URL to ghosttouch, and was given this fingerprint: 0373164e6053f6d4d1e0cea156be83e5a45e13d4
So, out of curiosity, I copied and pasted the URL in the same tab where I had "saved" my word:
It generated the same code.
So then I went back and re-saved my word, went to the ghosttouch site again and this time it loaded my word.
Something wonky in there but I'm not sure what.
This is scary not because it enables them to provide more relevant ads, but because it enables them to sell personal information to organizations like corporations and the government. Imagine your boss being able to buy a package that tells him what type of porn you like, how often you view porn, your most visited subreddits, etc.
And often from this information, these companies (based on aggregate data) can make more sweeping generalizations (that are often incorrect, but also often right on the mark) like income bracket, ethnicity, drug use habits, sexual orientation, etc. These approximations can also be bought.
Imagine that your 'package' has some information in it deducing that you regularly use cocaine. Even if this is not true and the person observing this information knows it might not be true, the fact that it has been stated might be enough to lose you some important opportunity.
I don't know how common it is for someone to buy this information, but I know that the information is already out there and the potential for things like this is very large.
Or is the point that disabling cookies is not sufficient to avoid being tracked? Everyone has cookies enabled (or many sites don't work), so if that's all it is, nbd.
Just to emphasize that this is meant to demonstrate privacy risks, not to be taken as a feature suggestion...
iPad 2, jailbroken with AdBlocket running, no idea if that was a factor or not.
"Going incognito doesn't affect the behavior of other people, servers, or software. Be wary of:
- Websites that collect or share information about you"
Incognito only prevents information from being stored on your computer, not anyone else's.
- Enter word in normal browser window
- In an incognito window, go to the URL and retrieve your word.
Or should we regulate how analytics software handles data instead?
No Flash FTW!
Followed by, "It seems you didn't save the word." on the two connecting websites.
Safari 5.1.5, OS X 10.6.8, no add-ons installed. I do, however, have Flash configured to not allow any websites access to local storage unless I specifically say so.
Step 1: Collect underpants
Step 2: ???
Step 3: Profit!!!