The math behind it is quite simple and very reliable for many datasets, which makes it very easy to build robust fingerprints based on browsing / location / behavior data. In my opinion, this is what most big companies rely on today for identifying users, as it is more robust than cookie-based mechanisms, which become less effective as the use of multiple devices and blockers increases.
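To give a feel for how simple the math can be, here is a minimal sketch that matches a handful of known visits against pseudonymous histories using plain set overlap (Jaccard similarity). This is only one possible approach and an assumption on my part, not necessarily the method used in the talk; the function names, user IDs and domains are all made up for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two sets of visited domains, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(known: set[str], histories: dict[str, set[str]]) -> tuple[str, float]:
    """Return the pseudonymous ID whose history overlaps most with `known`."""
    return max(
        ((uid, jaccard(known, visited)) for uid, visited in histories.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical "anonymized" browsing histories keyed by pseudonymous ID.
histories = {
    "user-a": {"news.example", "bank.example", "forum.example"},
    "user-b": {"news.example", "shop.example"},
    "user-c": {"video.example", "mail.example"},
}

# A few visits that are publicly attributable to one person (also hypothetical).
known = {"bank.example", "forum.example"}

match_id, score = best_match(known, histories)
```

With real data the histories are much larger, but the idea stays the same: a few known visits are often enough to single out one record.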
Here's the link to the video (you can choose the language in the menu, by default it's German but the talk is also available in English and French):
Correct. Another thing this is used for is identifying mobile/PC pairs belonging to the same person.
It might be similar to what happened to passwords. Passwords are still important, but they are just one signal among multiple others.
So I opened an incognito window and went through the process, no problems. Closed all incognito windows. Opened another, went to the site, and my quote was up, in a new incognito tab.
It was jarring because it meant that they were tracking me somehow, obviously not through the standard mechanisms as I was incognito.
Most likely, it is only your IP.
What I didn't understand: how was this data captured? Third-party tracking through ad networks and the like? (Would I be safe with NoScript or even some privacy-aware adblocker?)
The tl;dw would probably amount to: 1. extensions like Web of Trust, cross-site requests (JS) and lesser sources like cookies if available.
Isn't this fully blocked by a Pi-hole (DNS server with tracker/advertising rules)? Even the extension tracking should be completely killed by that...
> Individuals behave differently in the world when they are in different contexts. The way they act at work may differ from how they act with their family. Similarly, users have different contexts when they browse the web. They may not want to mix their social network context with their work context. The goal of this project is to allow users to separate these different contexts while browsing the web on Firefox. Each context will have its own local state which is separated from the state of other contexts
Google is one Firefox profile, Facebook / WhatsApp is another, and my main browser is another profile.
I looked here: https://developer.mozilla.org/en-US/docs/Mozilla/Command_Lin...
Do not accept or send remote commands; implies -new-instance.
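For anyone wanting to script the one-profile-per-context setup, here is a rough sketch that builds the command line with `-P` and `-no-remote`. It assumes a `firefox` binary on your PATH and that the profiles already exist; the helper names are hypothetical, and nothing is launched unless you call `launch` yourself.

```python
import subprocess


def firefox_command(profile: str, binary: str = "firefox") -> list[str]:
    """Build the argv for a separate Firefox instance using `profile`.

    -P selects the named profile; -no-remote keeps this instance from
    accepting or sending remote commands (and implies -new-instance).
    """
    return [binary, "-P", profile, "-no-remote"]


def launch(profile: str) -> subprocess.Popen:
    """Start Firefox with the given profile (assumes the profile exists)."""
    return subprocess.Popen(firefox_command(profile))


# Example: the command for a hypothetical profile named "google".
google_cmd = firefox_command("google")
```

Calling `launch("google")` and `launch("facebook")` would then give you two independent instances, each with its own cookies and local state.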
Differential Privacy adds 'noise' to a dataset so that it becomes statistically provable that, if queries are restricted to A, B, C..., the adversary can only tell the target from noise with some probability. Restricting access to a subset of queries isn't really the main way DP improves privacy; rather, it is what makes the formal proofs doable.
I agree with you on the first part of your post, but this part is a little off the mark. In the original paper, Cynthia Dwork confronts the issue you point out head-on; the paper actually starts with an impossibility proof showing that no treatment of the data will get you the property "access to a statistical database should not enable one to learn anything about an individual that could not be learned without access". The impossibility result relies on the existence of outside datasets.
DP instead tries to quantify the probability of identification, and adds differing amounts of Laplace noise to get this. The idea is that the dataset shouldn't look "too different" with or without your information in it. If your participation doesn't change the dataset much, how could someone tell if you are in it or not, or moreover link you to a data point in it?
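A minimal sketch of what that looks like for a simple counting query, assuming the standard Laplace mechanism: adding or removing one person changes a count by at most 1 (sensitivity 1), so noise with scale 1/epsilon gives epsilon-differential privacy. The function names are mine, and this is an illustration only; floating-point implementations like this one are not hardened for production use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count under epsilon-DP (a count has sensitivity 1)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # seeded only to make the example repeatable

# Hypothetical query: how many users visited page X?
with_you = private_count(100, epsilon=0.5, rng=rng)
without_you = private_count(99, epsilon=0.5, rng=rng)
```

With a noise scale of 2, the two released values are hard to tell apart from a single draw, which is the "not too different with or without your information" idea. A smaller epsilon means more noise and stronger privacy, at the cost of accuracy.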
One can provide differentially private access to web histories that provably mask the presence/absence of individuals in that dataset, even when combined with arbitrary exciting side information, like social networks.
Even with something as simple as differentially private counting, you can pull out correlations between visits, finding statistically interesting "people who visit page X then visit page Y surprisingly often" nuggets, which are exciting to people who don't have their own search logs.
It is extremely hard not to leak personally identifying information, and by combining datasets it is fairly easy to identify a unique individual.
Think Google must have been aware of the issue of intermingling Search with Social, as a pretty short time later they scrapped the search query referrers from (organic) Google Search.
Note: It was just a proof of concept to see if I could. Also: Nobody used G+ anyway.
It's still possible to get "aggregate" information about keywords via Google Webmaster Tools.
Google Search Console data doesn't let you connect Google search queries to sessions.
Compartmentalization mitigates that threat. I have multiple online personas, and Mirimir is the only one who goes on about privacy, anonymity, etc. The only one who visits HN, Wilders, etc. Mirimir and other online personas also share no contacts. However, Mirimir does use pseudonyms ;)
Makes me think someone should make a Chrome extension that will re-write everything someone posts based on the concepts they are trying to get across. To avoid being identified via stylometry.
And then there's the general 'vibe' of the forum which shapes how a pseudonym writes. If I were to write the same thing on HN versus IRC I would hope that the styles would be very different.
What's really difficult is compartmentalising information, so that even on the same website two pseudonyms don't demonstrate that they have the same knowledge.
Is it worth the bother? Possibly not, but I find it also makes me concentrate on what I'm writing.
That said, deanonymization methods stack up; add geotemporal correlation of activities and one could presumably connect your various identities together ;).
There's no geo, for me. And I'm not at all organized, so not much temporal, either.
Bot or astronaut?
Of course if a couple of grad students could do this in a few months with only publicly available data (plus donated browsing history), then I'm sure ad networks could easily do this too, and given how many ad networks have parts of your browsing history I would say that it's scary that it's so "easy."