As an update to that post from 2011, I never did get the browsers to update their parsers.
This E-Tag bug has been known for over a decade. As part of my post I tried to find the earliest written record of it and found a post in 2003:
It seems to get 'rediscovered' at least a couple of times a year.
Another criticism "It has already being used by numerous websites but few people know of it" without any more information. If you know which websites, you should say so; if you don't, you should at least say why you know "numerous websites" use it. Also the "few people know of it" bit, a bit presumptive if you were referring to the engineering community, and flat-out meaningless if you were referring to everyday people, many of whom don't even know what a browser is.
I did like the working example, but maybe tone down the hyperbole and make the claims more informative when those claims are also accusations.
> Your tone is definitely one pushing it as a new and ominous discovery.
I'm sorry if the tone gave a wrong impression, that wasn't my intention (English isn't my native language). It was quite a discovery to me though, perhaps that's why I made it sound like that without really meaning to.
> Another criticism "It has already being used by numerous websites but few people know of it" without any more information
Oh that's right! I had a source link elsewhere but forgot to move it when rearranging text. I'll add this again now.
> Also the "few people know of it" bit, a bit presumptive if you were referring to the engineering community
This was mostly my personal experience. I don't know anybody who knows about ETags being used for tracking, while many friends and classmates could give me details on how cookies or localStorage work.
Edit: Why are people downvoting the person I replied to?
* Someone's grumpy and/or a jerk and/or an idiot
* Someone's reading HN on a touch device, wanted to upvote, touched the wrong icon.
Can't find the HN story now, but it was a recurring story for a few days as people argued back and forth whether it was ethical.
Zalewski's "I know which sites you visit" asteroids clone is probably from the funkier end: http://lcamtuf.blogspot.fi/2013/05/some-harmless-old-fashion...
That is a simple summary; you can find more in the complaint.
I just uploaded a copy of it here:
KissMetrics settled and paid $500,000. They also let users opt out and were better about disclosing the practice on their website.
Edit: I just looked up the Belgian "Telecommunication law" and article 129 talks about "The storage of information or accessing information that's already stored in the device of a user"  (Loose translation). So I guess it's very broad.
which is more general than just cookies and would cover caching too.
"When considering alternatives to cookies it is important to look at the broader privacy context. Focusing solely on cookies is missing the point."
Source (page 25): http://www.ico.org.uk/for_organisations/privacy_and_electron...
Personally I think the world would have been better served if they had just put the obligation on browser makers and gotten them to provide a clear and consistent interface showing whether a web site is tracking the user.
Edit: not really, the article explains it but I hadn't finished reading it.
I believe incognito mode has more potential than is currently being used. For example, multiple parallel sessions would not only be nice and handy; if people learned to use them, the isolation would also enhance browser security. As I mentioned on the page, it single-handedly eliminates a number of https attacks and tracking methods.
Edit: Oh you mean that bug in this demonstration? Yes that's a local issue here. In practice the two sessions could not be linked, as you can see when you press f5 in the incognito window. Sorry about that, it's mostly meant as a tech demo where people can understand and learn from the code, not a finished product!
Chrome actually got this feature recently. Check Settings > Users.
> When you visit a page where you don't have an ETag (like incognito mode), your session will be emptied.
This unfortunately also throws away your other browser session at the same time. In practice a website would probably issue a new etag for the incognito session, and your existing browser session will simply continue.
You could probably have submitted a bug to the Chrome team and gotten a bounty for this thing. It more or less allows tracking an incognito session by linking it to a normal session.
1) in plain non-incognito window, enter text, hit "Store"
2) open incognito window, go to same URL, the text is there!
that means the sessions can be linked now. where does this info come from? does Chrome (and Opera, which I also tested) share ETags between non-incognito and incognito windows?
For example, could there be less bandwidth consumption using this method vs cookies?
Also this is more of a hack and sneaky tracking method than a legitimate way of identifying users. Whenever someone's cache is full or gets cleaned, the "cookie" (etag) will be lost.
While we're on the topic of browser caches, do any browsers let you easily store your cache in RAM instead of disk? Without resorting to other 'hacks' like setting the cache path to a ramdisk, that is.
HTTPOnly should take care of this.
As far as I'm aware, it would not mitigate BREACH. Can anyone shed any light on why it would?
The demo is actually just identifying users by hashing REMOTE_ADDR and USER_AGENT (the client IP address and the User-Agent HTTP header).
So it appears to work, when it doesn't really. Users with dynamic IPs, or behind proxies etc., will often fail to be recognized.
This is why it appears to work across incognito windows. Chrome sends the same User-Agent, incognito or not.
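To make that concrete, here is a rough sketch of what IP plus User-Agent fingerprinting looks like. It is not the demo's actual code: the SHA-256 hash, the `fingerprint` helper, and the example addresses and browser string are all assumptions for illustration.

```python
import hashlib

def fingerprint(remote_addr: str, user_agent: str) -> str:
    """Derive a pseudo session id from the client IP and User-Agent string."""
    # Assumption: the real demo may hash differently; SHA-256 is used here only as an example.
    return hashlib.sha256(f"{remote_addr}|{user_agent}".encode("utf-8")).hexdigest()

ua = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"  # hypothetical browser string

normal = fingerprint("203.0.113.7", ua)
incognito = fingerprint("203.0.113.7", ua)           # same IP, same User-Agent
after_ip_change = fingerprint("198.51.100.23", ua)   # same browser, new IP

print(normal == incognito)        # True: the two windows look like one visitor
print(normal == after_ip_change)  # False: the "session" is lost when the IP changes
```

No cache state is involved at all here, which is why an incognito window would seem to be "tracked" and why a dynamic IP would break it.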
... or could set the ETag on the page itself, and use the fact that the browser will send an If-None-Match on the next request. But that only works for the one single URI, not all pages on the domain. The code looks like it COULD be used to do that, but it never sets the ETag HTTP header on the page itself.
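For reference, a minimal sketch of the ETag round trip being described, written as a plain function rather than a real web server. The `handle_request` name, the in-memory `visits` store, and the token format are hypothetical; only the `ETag`, `If-None-Match`, and `Cache-Control` header names come from HTTP itself.

```python
import secrets

visits = {}  # hypothetical in-memory store: ETag value -> number of visits seen

def handle_request(request_headers: dict) -> tuple:
    """Return (status, response_headers) for a single tracked URI."""
    etag = request_headers.get("If-None-Match")
    if etag is None or etag not in visits:
        # First visit, or the cache was cleared: mint a fresh identifier.
        etag = '"' + secrets.token_hex(8) + '"'
        visits[etag] = 1
        # "no-cache" asks the browser to revalidate before reusing its copy,
        # so the stored ETag comes back on every later request for this URI.
        return 200, {"ETag": etag, "Cache-Control": "no-cache"}
    # The browser echoed our identifier back: count the visit, send no body.
    visits[etag] += 1
    return 304, {"ETag": etag}

# Simulated browser: first request, a revalidation, then a request after a cache wipe.
status1, headers1 = handle_request({})
tag = headers1["ETag"]
status2, headers2 = handle_request({"If-None-Match": tag})
status3, headers3 = handle_request({})

print(status1, status2, status3)  # 200 304 200
print(visits[tag])                # 2
```

The identifier only travels with requests for the URI it was issued on, which matches the limitation mentioned above, and a cleared cache makes the visitor look new again.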
/me wanders off to wipe egg off my face.
This seems like a pretty big issue to me. It defeats private mode.
EDIT 1: I also checked "Clear history when Firefox closes" and included "Cache" in the definition of "history". And the tracking is still happening. So either the site uses another tracking method in addition to the etag method or there is a big f#ck-up in FF.
EDIT 2: The tracking even continues when I check "Always use private browsing mode" and then close the browser and open it again.
EDIT 3: Even a complete removal and a clean install of FF (without any add-ons which may interfere) lets the tracking happen, for both the case of "EDIT 1" and "EDIT 2".
So this pretty certainly seems to be a bug.
Private browsing keeps it for at least one restart (with clear everything set in options), even though I can't see tracker.jpg in about:cache anywhere. The other weird thing is that tracker.jpg isn't showing up in a filesystem search anywhere. Wondering if there's something stored in the SQLite files.
If it is true, then that's hugely non-obvious, given the behavior (being logged out). That's bad design, plain and simple.
So he would need to store a mapping from something he already knows (from the headers of my request for his html page, or my IP) to my ETag "cookie" to know what my previous ETag was.
Wouldn't that require using some of the features he wasn't going to use (like user agent) to work?
What am I missing?
No E-Tag tracking is taking place, since the E-Tag is never sent to the server for the index.php request (only for the image request). In theory he could update the session after your IP changed, but he does not seem to do that (the image requests hold on to the old E-Tag).
So, basically, to me it seems like the whole point/post is invalid. Please correct me if I'm wrong.
And if you take a look in the .htaccess file, you'll notice that the image request also goes to index.php.
I just tested it by changing my IP while staying in the same browser session. After an IP change the page only displays '1' for the number of visits, exactly as you would expect when reading the code (since the E-Tag for the image is kept (and the image request updates the counter), but the index.php uses your IP+UA combo to determine the session). This code is flawed and doesn't do a thing.
I tried private browsing, as well as connecting through Tor, and the tracking didn't work.
My text is lost on refresh.
Thanks for reminding me.
edit: shortened and clarified.
Three ways to easily defeat this "cookieless tracking" come to mind:
1. Turn off automatic image loading.
2. Use your HOSTS file to block/redirect the domain name to which the tracking info is sent.
3. On devices that hide the HOSTS file, use your own localhost DNS server to block/redirect the domain name to which tracking info is sent.
The common theme here is the user takes more control over what connections her computer may initiate.
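As a small illustration of option 2, here is a sketch that builds HOSTS-file lines for a list of tracking domains. The `hosts_entries` helper and the domain names are made up for the example; `0.0.0.0` is one common choice of sink address, and `127.0.0.1` is another.

```python
def hosts_entries(domains, sink="0.0.0.0"):
    """Build HOSTS-file lines that point each domain at a sink address."""
    # Blank entries are skipped; names are normalized to lowercase.
    return [f"{sink} {d.strip().lower()}" for d in domains if d.strip()]

# Hypothetical tracker hostnames, for illustration only.
for line in hosts_entries(["tracker.example", "Ads.Example", " "]):
    print(line)
# 0.0.0.0 tracker.example
# 0.0.0.0 ads.example
```

Keep in mind that a HOSTS entry matches one exact hostname, so each subdomain a tracker uses needs its own line.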
Under current usage patterns a user types a domain name in an address bar of a browser (usually a browser written by some entity that pays its developers through revenues from the sale of advertising) or she types something into a search bar/box and then selects a search result. The user thereby initiates a connection to some other computer addressed by a. the domain name she types (assuming she types the name correctly; otherwise she may end up at a page of sponsored search results) or b. the result she selects.
This level of navigation is within the user's control. She intends to connect to a computer addressed by a domain name that she can type or select. Does she also intend to connect to other unspecified computers at the same time?
Due to the way these browsers are configured, many more connections to other unspecified computers may be initiated without any input from the user. Increasingly, these are computers that serve the user no useful content. They are devoted to tracking. Go figure.
Does the user want her computer to connect to other unspecified computers whose sole purpose is to track her? Under current assumptions, this is to be decided outside of the user's control (and awareness).
By exercising more control over what browsers do and over domain name lookups, the user can retain more power to specifically choose the other computers to which her computer connects.
You are right, of course -- ETag tracking is neither that exciting nor new, but I think that some here wouldn't have come across it.
Also, I know that older versions of Firefox and Chrome _did_ cache things, even in Private Browsing mode, but I think that that cache only existed for that session.
Btw, even if the browser keeps a cache, you can always clear its storage paths.
One thing that many users don't seem to know is that many databases keep deleted data (marked as free) for a long time. They just don't think about it. Go through all the files stored by the browser and you'll end up finding stuff you wouldn't expect to be there, if you're naive. The right attitude is to expect everything to be stored, always, and to take proper care to destroy data when that's required. This is just like the issue with SSDs. Say you write something to the drive that you later want to destroy. There's no sure way to destroy that data short of physically destroying the drive. You simply don't know whether the controller has written the data to cell XYZ and then remapped XYZ to somewhere else. The "just overwrite it" approach does not work in this case, and you can't even guarantee that the manufacturer's tool would properly erase that cell.
Just some final words. ETag doesn't have anything to do with "images"; it's not tied to content type at all. Next week I could release a "CSS" tracking exploit that uses ETags, which are a kind of CSS checksum. Uh...
These days, privacy and security are hard, very hard. Even if you think you're doing things right, there might still be several things that you're not doing right, even if you have spent several years learning how to do things right. And even after that, there's still the possibility of bad luck.
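To illustrate the "checksum" point: an honest ETag is often derived from the content, so every visitor gets the same value, while a tracking ETag is minted per visitor and has nothing to do with the bytes served. This is only a sketch; the function names, the truncated SHA-256, and the sample bodies are assumptions.

```python
import hashlib
import secrets

def content_etag(body: bytes) -> str:
    """Validator derived from the content: identical for every visitor."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def tracking_etag() -> str:
    """Validator minted per visitor: unrelated to the content served."""
    return '"' + secrets.token_hex(8) + '"'

css = b"body { margin: 0; }"        # works the same for a stylesheet...
jpg = b"\xff\xd8\xff\xe0 not a real image"  # ...as for image bytes

print(content_etag(css) == content_etag(css))  # True: same bytes, same validator
print(content_etag(css) == content_etag(jpg))  # False: different content
print(tracking_etag() == tracking_etag())      # False: a new value for each visitor
```

Neither function looks at a content type, which is the point: any cacheable resource can carry the identifier.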
But all this stuff is generally known and properly documented, so there's nothing new.
That suggests you've not tried it. I have. I've been running with disk and memory cache disabled in Firefox for a couple of years now on my laptop, and it's not anywhere near as slow as you think it would be. It's barely noticeable at all.
On my work machine I have the cache enabled, so I even have something to compare against. My laptop is usually sat on an 8Mbit Internet connection. If you're on a much slower Internet connection it would make a bigger difference, but I don't think 8Mbit is particularly fast nowadays.