There are a bunch of these exploits — I remember one a few weeks ago that posed as a mind-reading survey — and I think they can only be well and truly solved by a same-domain policy for :visited links. In short, don't apply :visited styling to a link unless that link is the same domain as the host page. This is the general security model on the rest of the web and it'd work here.
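A rough sketch of that rule, assuming "same domain" means an exact hostname match (a real browser would more likely compare registrable domains or full origins). The function name `should_style_as_visited` and the example URLs are made up for illustration:

```python
from urllib.parse import urlsplit


def should_style_as_visited(link_url: str, page_url: str, visited: set[str]) -> bool:
    """Apply :visited styling only to visited links on the page's own host."""
    link_host = urlsplit(link_url).hostname
    page_host = urlsplit(page_url).hostname
    # Assumption: "same domain" is an exact hostname comparison.
    same_host = link_host is not None and link_host == page_host
    return same_host and link_url in visited


# Hypothetical browsing history and host page.
history = {"https://news.ycombinator.com/newest", "https://bank.example/login"}
page = "https://news.ycombinator.com/item?id=1"

# Same host and visited: eligible for :visited styling.
print(should_style_as_visited("https://news.ycombinator.com/newest", page, history))
# Visited, but on another host: never styled, so nothing leaks.
print(should_style_as_visited("https://bank.example/login", page, history))
```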
Or better, keep a separate visited database per origin (i.e., rather than tracking the state of `url`, track the state of `(url, origin)` tuples). That way, e.g., links visited from HN would be marked as visited on HN and on HN only.
It wouldn't disclose any information the site can't already get by tracking clicks.
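Something like this, as a minimal sketch of the `(url, origin)` idea. The class `PerOriginHistory`, its method names, and the simplified `origin_of` helper are all hypothetical, not any browser's actual implementation:

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return scheme://host[:port] for a URL (a simplified notion of origin)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


class PerOriginHistory:
    """Hypothetical visited-link store keyed by (url, origin) instead of url."""

    def __init__(self) -> None:
        self._visited: set[tuple[str, str]] = set()

    def record_click(self, url: str, page_url: str) -> None:
        # Remember that `url` was followed from a page on this origin.
        self._visited.add((url, origin_of(page_url)))

    def is_visited(self, url: str, page_url: str) -> bool:
        # True only if the link was previously followed from the same origin.
        return (url, origin_of(page_url)) in self._visited


store = PerOriginHistory()
store.record_click("https://example.org/post", "https://news.ycombinator.com/")
```

With this, `store.is_visited("https://example.org/post", ...)` is true for any page on news.ycombinator.com and false for a page on any other origin, which is exactly what that origin could have learned by logging the click itself.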
It still removes a large part of the utility of the feature. I want to be able to see if I've visited a link, even if that link is on some random blog that I've never visited before.
Isn't the point of this that the script code doesn't need access to anything via getComputedStyle? It's the user telling the script what color something is.
Alternatively, we could keep :visited and remove most of the expressive power of CSS/JavaScript that allows these exploits to exist. It would be like time-travelling back to the mid-90s, when visited-link coloration was invented!
The minimal "expressive power" required to pull off the simplest exploit is just a way to capture click events. That would be quite restrictive indeed.
Considering the work needed by the website to convince the user to give away the data, even with approaches like those described in the article, we may be overestimating what websites could learn about us by checking if we've visited some random 2, or 4, or 15 sites.
Yes, it's an invasion of privacy and has to be sanitized, but it's not as if websites can see all of your history, view it in chronological order, or know if you've visited this link 6 months ago or today. Plus, you need to make the user somehow disclose what he sees on the screen, which may often look suspicious.
And what would an adversarial website do with these {visitedlinks, IP} tuples? Hit me with personalized ads or sell that modicum of my history to some ad company? Big shit, I hit the reset button on my router, and I get a new dynamic IP address from the ISP. The site now knows nothing.
These work more as proofs of concept. The effort needed to collect the data, paired with the limited utility of the results, makes for an unattractive attack vector.
I agree that if someone wants to target specifically you and knows something about you, they can put this class of exploits to a more threatening use, such as (if you're at work) seeing if you've visited some company LAN URL. Or perhaps they can see if you've accessed the admin pages on some website they're targeting, so they can determine if you have admin rights there.
> Considering the work needed by the website to convince the user to give away the data, even with approaches like those described in the article, we may be overestimating what websites could learn about us by checking if we've visited some random 2, or 4, or 15 sites.
It's true, history hacks were a lot more "fun" when you could still check tens to hundreds of URLs per second without any user interaction.
You could pull off some very cool and unexpected tricks by applying such knowledge in a clever manner. The important thing to realize is that what you can use history information for goes way beyond "I know where you surfed last summer". That's just a mild privacy problem.
Targeting specific end-points, profile pages, stuff like that, you can leak a lot more interesting information than just someone's browsing habits. Right up to the point where you could (for certain services, under certain circumstances) cook up someone's session key and it suddenly became an actual security exploit.
Still, a couple of those tricks (or perhaps new ones) might still be worth it even if you can only query 10-20 sites. I don't know. It's not a lot of information. Though 33 bits is enough to identify any human on Earth.
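To put rough numbers on that: each visited/unvisited answer is at most one bit, so n probed links distinguish at most 2^n histories. A quick back-of-the-envelope check, where the population figure is a round assumption and the per-link bits are an upper bound (real browsing habits are correlated):

```python
import math

WORLD_POPULATION = 8_000_000_000  # rough figure, assumed for illustration

# Bits needed to give every person a unique label.
bits_needed = math.ceil(math.log2(WORLD_POPULATION))
print("bits to single out one person:", bits_needed)

# Upper bound on what a handful of yes/no history probes can distinguish.
for n_links in (2, 4, 15, 20):
    print(n_links, "links ->", 2 ** n_links, "distinguishable histories at most")
```

So 10-20 links is well short of a unique fingerprint on its own, but it narrows things down a lot when combined with other signals.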
> you need to make the user somehow disclose what he sees on the screen, which may often look suspicious.
Well, that's what this article is about, right? Here's how it works: there is a 32x32 grid of "close" buttons, but you only see one of them; all the others are hidden with CSS. You click that one. By virtue of which of the 1024 buttons it was, the site now knows 10 visited states (32 x 32 = 2^10).
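A toy model of that encoding, in case the arithmetic isn't obvious. This assumes the visible button's position is simply the 10 visited bits read as a binary number; how the real page arranges this in CSS isn't covered here, and the function names are made up:

```python
GRID_SIDE = 32   # 32 x 32 = 1024 = 2**10 candidate button positions
N_LINKS = 10     # one bit of visited state per probed link


def visible_cell(visited: list[bool]) -> tuple[int, int]:
    """Model the CSS side: the one cell left visible for these visited states."""
    index = sum(1 << i for i, seen in enumerate(visited) if seen)
    return divmod(index, GRID_SIDE)  # (row, column)


def decode_click(row: int, col: int) -> list[bool]:
    """Model the site's side: recover the 10 visited states from the clicked cell."""
    index = row * GRID_SIDE + col
    return [bool((index >> i) & 1) for i in range(N_LINKS)]


# Hypothetical user who has visited probed links 0, 3 and 9.
states = [True, False, False, True, False, False, False, False, False, True]
row, col = visible_cell(states)
print((row, col), decode_click(row, col) == states)
```

The point is that the script never reads any styles: the user's single click on the only button they can see carries all 10 bits.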
As the article says, if that looked like a "close"-button on a pop-up ad, where's the suspicion?
I use an adblocker. I get a popup, it has nothing but a "close"-button in it. I figure, "guess it only blocked part of the ad or something" and click the "close"-button without even thinking. The pop-up closes, and I've already forgotten that it happened.
The only suspicion I see could be that the "close"-button might appear in a weird, off-centre place. But that matches perfectly with the wonky-adblocker-behaviour theory in my head, so I wouldn't think twice about it.
The canonical use of history stealing is to find exploits in a bunch of websites (e.g. online banks), then use history stealing to find customers of that site.
A number of attacks against, e.g. banks seem implausible ("but how would the attacker know that I bank at Wells Fargo?") until you learn about history stealing.
Why do you need history stealing for that? If you have an attack against Wells Fargo, why not launch it on everyone you can? You will get at least as many Wells Fargo users as you would if you pre-filtered (probably more, because of false negatives).
But a phishing site such as "HTTPS-SECURE.required-bank-security-check.gmkla4lwgi.com" [hypothetical] could easily check for the most popular banking institutions in a user's history using this exploit and present them with the appropriate, familiar login dialog they'd expect.
You'd easily increase the number of victims (conversions?) several times over.
You could also use it to identify high-value targets. Make a popular webpage, and make a record of all users who have visited internal.nsa.gov and the like.
> Big shit, I hit the reset button on my router, and I get a new dynamic IP address from the ISP. The site now knows nothing.
You don't do that often, though. The dynamic IP address still comes from a pool, which greatly reduces the randomness, and there are other ways of identifying users.
Oh, now YouPorn knows I've been visiting PornHub too. So much valuable info! Too bad I use them both just to look at chicks being banged.
Seriously though, I'd like to see what the other 45 websites are and what they do, but the linked PDF 404s. And I reckon this is the getComputedStyle approach, and that was, AFAIK, fixed. The new ways of exploiting the styled :visited links are going to require more work.
Oh neat, I thought this was broken at first since it said that I hadn't visited news.ycombinator.com. Then I remembered that after seeing a similar (though less clever) exploit a few weeks ago, I'd changed Firefox to not show visited styles. I'd call that a success.