
Cookieless cookies - lucb1e
http://lucb1e.com/rp/cookielesscookies/
======
0x0
So this is the old ETag trick? It's not new at all, but perhaps it hasn't
been as widely published. This can also be done with the Last-Modified /
If-Modified-Since headers, manufacturing a cookie "date".

~~~
nikcub
Last-Modified doesn't even have to be a valid date string:

[http://www.nikcub.com/posts/persistant-and-unblockable-cooki...](http://www.nikcub.com/posts/persistant-and-unblockable-cookies-using-http-headers)

As an update to that post from 2011, I never did get the browsers to update
their parsers.

This E-Tag bug has been known for over a decade. As part of my post I tried to
find the earliest written record of it and found a post in 2003:

[http://www.arctic.org/~dean/tracking-without-cookies.html](http://www.arctic.org/~dean/tracking-without-cookies.html)

It seems to get 'rediscovered' at least a couple of times a year.

~~~
gsnedders
[https://bugzilla.mozilla.org/show_bug.cgi?id=114508](https://bugzilla.mozilla.org/show_bug.cgi?id=114508)
is the Gecko bug for this; I can't find one for Chromium, and there's
CORE-40723 for the privileged few with Zombie-Presto bug-tracker access.

------
chancancode
It's also known as supercookies. KISSmetrics used to use this technique for
tracking/analytics and got a bunch of their clients sued[1].

[1]
[http://en.wikipedia.org/wiki/HTTP_ETag#Tracking_using_ETags](http://en.wikipedia.org/wiki/HTTP_ETag#Tracking_using_ETags)

~~~
TheHydroImpulse
Does anybody know why they got sued? I'm curious as to why they'd get sued for
Etags but not cookies. Did they just not have any opt-out measure?

~~~
nikcub
Because they operate outside of the user's privacy controls in the browser. You
are accessing the user's data without permission, a violation of the Electronic
Communications Privacy Act.

That is a simple summary; you can find more in the complaint.

I just uploaded a copy of it here:

[https://docs.google.com/file/d/0Bx25s45t4-d_bDBDRjFDOUlFOUU/...](https://docs.google.com/file/d/0Bx25s45t4-d_bDBDRjFDOUlFOUU/edit?usp=sharing)

KissMetrics settled and paid $500,000. They also let users opt out and were
better about disclosing the practice on their website.

------
tzury
A much more extensive demonstration can be seen at evercookie, where ETag is
just one method among others.

[http://samy.pl/evercookie/](http://samy.pl/evercookie/)

------
AlexanderDhoore
In Europe there are laws that restrict the usage of cookies (opt-out in some
countries, opt-in in others (like the Netherlands)). This website proves how
stupid that is. Letting politicians mess with technology is dangerous, at
least they should be advised better. I understand everyone is worried about
their privacy, but just outlawing cookies is NOT going to fix anything.

Edit: I just looked up the Belgian "Telecommunication law" and article 129
talks about "The storage of information or accessing information that's
already stored in the device of a user" [1] (Loose translation). So I guess
it's very broad.

[1] [http://bit.ly/18DLvov](http://bit.ly/18DLvov)

~~~
andrewcooke
i found a uk govt page describing this: " _Regulation 6 covers the use of
electronic communications networks to store information, eg using cookies, or
gain access to information stored in the terminal equipment of a subscriber or
user._ "

which is more general than just cookies and would cover caching too.

[http://www.ico.org.uk/for_organisations/privacy_and_electron...](http://www.ico.org.uk/for_organisations/privacy_and_electronic_communications/the_guide/cookies)

~~~
lucb1e
Indeed, this caching technique would be covered by "the cookie law". However,
cookies can be audited, whereas many images contain legitimate ETags. It's more
difficult to detect when websites use this to track users, and I don't think
any government agency is capable enough to check whether companies comply with
the law.

------
sker
It works even if you use incognito mode in Chrome. So incognito is not as
incognito as we thought.

Edit: not really, the article explains it but I hadn't finished reading it.

~~~
lucb1e
The incognito mode is designed to be incognito from the moment you open the
window. When you close it and re-open the window, the cache will be thrown
away (along with any cookies, localstorage, etc.). During your incognito
session, cache and cookies and everything is stored. If they didn't do this,
websites would not function properly.

I believe incognito mode has more potential than it's currently being used
for. For example, multiple parallel sessions would not only be nice and handy;
if people learned to use them, the isolation would also enhance browser
security. As I mentioned on the page, it single-handedly eliminates a number
of https attacks and tracking methods.

Edit: Oh you mean that bug in this demonstration? Yes that's a local issue
here. In practice the two sessions could not be linked, as you can see when
you press f5 in the incognito window. Sorry about that, it's mostly meant as a
tech demo where people can understand and learn from the code, not a finished
product!

~~~
sker
The two sessions could not be linked, but after I close incognito, the counter
and the stored text in normal Chrome are reset. Why does that happen?

~~~
lucb1e
This is because I use a static etag instead of generating a new one for each
session. From the page:

> When you visit a page where you don't have an ETag (like incognito mode),
> your session will be emptied.

This unfortunately also throws away your other browser session at the same
time. In practice a website would probably issue a new etag for the incognito
session, and your existing browser session will simply continue.
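
A minimal sketch of that per-session behaviour, assuming an in-memory store and a hypothetical `handle_tracker_request` helper (neither is taken from the demo's source): a request without a known If-None-Match gets a fresh ETag, while a request that echoes one back is answered with 304 and counted against the existing session.

```python
import secrets

# Hypothetical in-memory store mapping ETag values to visit counts.
visits = {}


def handle_tracker_request(request_headers):
    """Return (status, response_headers) for a request to the tracking image."""
    etag = request_headers.get("If-None-Match")
    if etag is None or etag not in visits:
        # No cached copy (first visit, cleared cache, or a fresh incognito
        # window): issue a brand-new identifier instead of a static one.
        etag = '"' + secrets.token_hex(8) + '"'
        visits[etag] = 1
        # "no-cache" lets the browser store the image but makes it
        # revalidate (and so resend the ETag) before every reuse.
        return 200, {"ETag": etag, "Cache-Control": "private, no-cache"}
    # The browser echoed a known ETag back, so this is a returning cache.
    visits[etag] += 1
    return 304, {"ETag": etag}
```

Because the incognito window receives its own ETag here, the entry belonging to the normal window is left untouched.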

~~~
sker
I see, the normal Chrome session is not reset, but rather linked to the new
ETag in incognito mode, at least in this example.

You could probably have submitted a bug to the Chrome team and gotten a bounty
for this thing. It kind of allows tracking the incognito session by linking it
to a normal session.

------
dewiz
I already knew about this but completely forgot, thank you for refreshing my
memory :) I have this problem with people registering in one forum with
multiple identities and I have been fighting it with cookies to track them,
but they got smarter and delete the cookies now.. so it's useful to know I can
try something new.

------
hayksaakian
I understand the implications and the underhanded nature of this sort of
tracking. However, could it prove more efficient than traditional, legitimate
analytics? (e.g. Google Analytics via cookies)

For example, could there be less bandwidth consumption using this method vs
cookies?

~~~
lucb1e
Cookies are absurdly long nowadays; if they were shortened, lots of bandwidth
could be saved. Google's analytics cookies especially are long and sometimes
even contain referring domains or something. I don't really understand why
they do this, but it probably has some use that lessens the burden on their
database infrastructure at the cost of user bandwidth.

Also this is more of a hack and sneaky tracking method than a legitimate way
of identifying users. Whenever someone's cache is full or gets cleaned, the
"cookie" (etag) will be lost.

~~~
gizmo686
How do you define absurdly long? How does the size of the cookies compare to
the size of the image in an ad banner, or even just the *.html file?

~~~
lucb1e
Well, considering that you can easily identify every person on the planet
uniquely with 5 bytes (256^5 = 1 099 511 627 776), I'd say these 200+ byte
cookies are much longer than needed. It suddenly starts making good sense that
Google's SPDY protocol compresses headers...
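
The arithmetic can be checked with a short sketch. The population figure below is a rough assumption for illustration, and `bytes_needed` is a hypothetical helper name.

```python
def bytes_needed(population):
    """Smallest whole number of bytes that gives each person a distinct ID."""
    # Number of bits needed to represent the values 0 .. population - 1.
    bits = max(1, (population - 1).bit_length())
    # Round up to whole bytes.
    return (bits + 7) // 8


world = 7_000_000_000  # rough world population, an assumption

print(256 ** 5)             # number of distinct 5-byte values
print(bytes_needed(world))  # bytes required for one ID per person
```

Four bytes give only about 4.3 billion values, which is why a fifth byte is needed.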

------
pktgen
Can you disable ETag-based caching but allow If-Modified-Since? A quick check
of a few popular sites shows that they're using If-Modified-Since instead of
ETags, probably for this reason?

While we're on the topic of browser caches, do any browsers let you easily
store your cache in RAM instead of disk? Without resorting to other 'hacks'
like setting the cache path to a ramdisk, that is.

~~~
eli
It's very unlikely that popular sites are choosing not to use ETags in order
to protect privacy. If they want to track you, they just use cookies. There is
no privacy risk in a site using ETags as intended; the risk arises only when
the server abuses them to act as a unique identifier. And you can actually do
the same thing with If-Modified-Since and the date, you just get fewer bits of
data to work with.
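
As a rough sketch of the date variant, assuming the server encodes a visitor number as seconds since the Unix epoch (one possible scheme, not one described in this thread), the ID can be hidden in a valid Last-Modified value and recovered from the If-Modified-Since header the browser sends back. The function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def id_to_last_modified(visitor_id):
    """Encode an integer ID as an HTTP-date, one second per ID value."""
    moment = datetime.fromtimestamp(visitor_id, tz=timezone.utc)
    return format_datetime(moment, usegmt=True)


def last_modified_to_id(header_value):
    """Recover the ID from the date echoed back in If-Modified-Since."""
    return int(parsedate_to_datetime(header_value).timestamp())
```

With one-second resolution and plausible-looking dates, this leaves far fewer usable values than an arbitrary ETag string, which is the "fewer bits" trade-off.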

------
lingben
why do we assume automatically that this is a bad thing? it all depends how it
is used. I for one see many positive uses for something like this.

~~~
lucb1e
Can you give an example of a positive use for this, that is better than a
cookie?

~~~
leokun
It saves bandwidth and leads to faster website loading. A cookie is
transported with every request. This is why sites use cookie free domains for
static content. If etags were used instead a separate domain wouldn't be
necessary.

It's more secure. If identifying session data is not accessible to JavaScript,
it makes a site more secure from XSS attacks.

~~~
nsmartt
> _It's more secure. If identifying session data is not accessible to
> JavaScript, it makes a site more secure from XSS attacks._

HTTPOnly should take care of this.
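
For illustration, a sketch of setting that flag with Python's standard `http.cookies` module; the cookie name and value are placeholders.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"          # placeholder session identifier
cookie["session"]["httponly"] = True  # keep the cookie out of document.cookie
cookie["session"]["secure"] = True    # only send it over HTTPS

# Render the Set-Cookie header line the server would send.
header = cookie.output()
print(header)
```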

~~~
leokun
It depends, HTTPOnly cookies are still accessible to JavaScript in some
conditions, like those using an Android browser:

[https://www.owasp.org/index.php/HttpOnly#Browsers_Supporting...](https://www.owasp.org/index.php/HttpOnly#Browsers_Supporting_HttpOnly)

------
robryk
Instead of clearing the cache, one could stop using If-None-Match and instead
do a HEAD request and see if the ETag is the same. This increases latency by
one RTT, if the resource has in fact changed. Also, it could be implemented
outside the browser, in a proxy (albeit the proxy won't be standards-compliant
and obviously won't work on HTTPS sites).
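
A sketch of that client-side comparison, under the assumption that the client is free to skip conditional requests. `head_headers` and `cached_copy_is_current` are hypothetical names; the point is that the cached ETag is compared locally and never sent to the server.

```python
import urllib.request


def head_headers(url):
    """Perform a HEAD request and return response headers, lower-cased."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return {name.lower(): value for name, value in response.headers.items()}


def cached_copy_is_current(cached_etag, fetch_headers):
    """Compare the server's current ETag with the cached one, locally.

    fetch_headers is any callable returning lower-cased response headers,
    for example: lambda: head_headers("http://example.com/image.jpg")
    """
    server_etag = fetch_headers().get("etag")
    return server_etag is not None and server_etag == cached_etag
```

If the tags match, the cached copy is used as-is; if not, a plain GET follows, which is where the extra round trip comes from.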

------
djm_
>One thing I would strongly recommend you to do anytime you visit a page where
you want a little more security, is opening a private navigation window and
using https exclusively. Doing this single-handedly eliminates attacks like
BREACH (the latest https hack).

As far as I'm aware, it would not mitigate BREACH. Can anyone shed any light
on why it would?

~~~
lucb1e
You can't do arbitrary requests, which is required for that attack (and CRIME)
to work. This is normally done by injecting traffic in http pages, but if you
are using incognito mode for https exclusively then this can't be done. In
fact the same goes for normal browsing mode, but it's so inconvenient to have
to close all other tabs just to do a wire transfer or something. And incognito
mode has the additional advantage of also disabling tracking cookies and the
like. My bank actually uses Google analytics on their website...

------
barryhunter
Look carefully at the source code. It's bogus. Not the ETag trick, but the
demo itself.

The demo is actually just identifying users by hashing the client's
REMOTE_ADDR and its User-Agent HTTP header.

So it appears to work when it doesn't really. Users with dynamic IPs or behind
proxies etc. will often fail.

This is why it appears to work across incognito windows. Chrome sends the same
user agent, incognito or not.
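
To make the difference concrete, here is a sketch of that kind of identifier. The hash function and the `fingerprint` name are assumptions for illustration, not copied from the demo's source.

```python
import hashlib


def fingerprint(remote_addr, user_agent):
    """Derive a session key from values the server sees on every request."""
    material = (remote_addr + "|" + user_agent).encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:16]


# Placeholder values: the same IP and user agent always give the same key,
# whether or not the browser holds any cached ETag.
normal = fingerprint("203.0.113.7", "ExampleBrowser/1.0")
incognito = fingerprint("203.0.113.7", "ExampleBrowser/1.0")
after_ip_change = fingerprint("198.51.100.9", "ExampleBrowser/1.0")
```

Here `normal` and `incognito` are identical while `after_ip_change` differs, so nothing stored in the browser is involved in the match.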

\----

The ETag trick is real. But you DO need to use JavaScript in the browser to
extract the ETag from the headers of the cached image. It doesn't really have
to be an image, just a request that can be made via XMLHttpRequest.

... or you could set the ETag on the page itself, and use the fact that the
browser will send an If-None-Match on the next request. But that only works
for the one single URI, not all pages on the domain. It appears the code COULD
be used to do that, but it never sets the ETag HTTP header on itself.

~~~
barryhunter
Oh, dear. reading the article again it does note the demo is not real.

/me wanders off to wipe egg off my face.

------
tmister
Also see [http://samy.pl/evercookie/](http://samy.pl/evercookie/)

------
Groxx
Interestingly, popping open a new Firefox private window showed my tracked
data, but after _closing_ the window it was all reset (even though I had the
tab open in a normal window the whole time). I'm guessing closing the private
window erases any 'dirtied' cached files?

~~~
driverdan
I figured this out yesterday while working on a client's site. Apparently FF
private mode still uses existing cache. If you had files from the site cached
before opening the site in private mode it will still use them.

This seems like a pretty big issue to me. It defeats private mode.

~~~
northwest
Same here. This is definitely not ok.

EDIT 1: I also checked "Clear history when Firefox closes" and included
"Cache" in the definition of "history". And the tracking is still happening.
So either the site uses another tracking method in addition to the etag method
or there is a big f#ck-up in FF.

EDIT 2: The tracking even continues when I check "Always use private browsing
mode" and then close the browser and open it again.

EDIT 3: Even a complete removal and a clean install of FF (without any add-ons
which may interfere) lets the tracking happen, for both the case of "EDIT 1"
and "EDIT 2".

So this pretty certainly seems to be a bug.

~~~
kfcm
The only way I've found to ensure this doesn't happen is to set about:config
-> network.http.use_cache = false.

Private browsing keeps it for at least one restart (with clear everything set
in options), even though I can't see tracker.jpg in about:cache anywhere.
The other weird thing is that tracker.jpg isn't showing up in a filesystem
search anywhere. Wondering if there's something stored in the SQLite files.

~~~
northwest
That works, thanks!

------
sspiff
If he can only associate my previous page view with my current requests, then
how does he know it's me again? I mean, the image request is separate from the
page request, and in any case only comes later.

So he would need to store a mapping from something he already knows (from the
headers of my request for his html page, or my IP) to my ETag "cookie" to know
what my previous ETag was.

Wouldn't that require using some of the features he wasn't going to use (like
user agent) to work?

What am I missing?

~~~
miken123
As it turns out, he simply uses your IP and User Agent string. See
[https://github.com/lucb1e/cookielesscookies/blob/master/inde...](https://github.com/lucb1e/cookielesscookies/blob/master/index.php#L15)

No E-Tag tracking is taking place, since the E-Tag is never sent to the server
for the index.php request (only for the image request). In theory he could
update the session after your IP changed, but he does not seem to do that (the
image requests hold on to the old E-Tag).

So, basically, to me it seems like the whole point/post is invalid. Please
correct me if I'm wrong.

~~~
Gurrewe
You're wrong. That line of code is just to create something random, he could
have used rand() if he wanted to, but it's not as "random".

And if you take a look in the .htaccess file, you'll notice that the images
request also goes to index.php.

~~~
miken123
I saw the .htaccess, but that still doesn't link the request of index.php to
the request of index.php?tracker.

I just tested it by changing my IP while staying in the same browser session.
After an IP change the page only displays '1' for the number of visits,
exactly as you would expect when reading the code (since the E-Tag for the
image is kept (and the image request updates the counter), but the index.php
uses your IP+UA combo to determine the session). This code is flawed and
doesn't do a thing.

------
sidcool
This is new knowledge to me. The article is wonderful and informative. And as
mentioned in the comments, incognito mode doesn't prevent it.

------
D9u
As the OP suggested...

I used private browsing, as well as connecting through Tor, and the tracking
didn't work.

My text is lost on refresh.

Thanks for reminding me.

------
amenod
While you are at it, you can check to see if your browser properly shields you
from this and other similar techniques:
[http://www.canyoutrackme.com/](http://www.canyoutrackme.com/)

 _edit_ : shortened and clarified.

------
gwu78
As usual, advertisers rely on assumptions about what they think users will or
will not do. When users deviate from the assumed patterns, tracking fails.

Three ways to easily defeat this "cookieless tracking" come to mind:

1. Turn off automatic image loading.

2. Use your HOSTS file to block/redirect the domain name to which the
tracking info is sent.

3. On devices that hide the HOSTS file, use your own localhost DNS server to
block/redirect the domain name to which tracking info is sent.

The common theme here is the user takes more control over what connections her
computer may initiate.

Under current usage patterns a user types a domain name in an address bar of a
browser (usually a browser written by some entity that pays its developers
through revenues from the sale of advertising) or she types something into a
search bar/box and then selects a search result. The user thereby initiates a
connection to some other computer addressed by a. the domain name she types
(assuming she types the name correctly; otherwise she may end up at a page of
sponsored search results) or b. the result she selects.

This level of navigation is within the user's control. She intends to connect
to a computer addressed by a domain name that she can type or select. Does she
also intend to connect to other unspecified computers at the same time?

Due to the way these browsers are configured, many more connections to other
unspecified computers may be initiated without any input from the user.
Increasingly, these are computers that serve the user no useful content. They
are devoted to tracking. Go figure.

Does the user want her computer to connect to other unspecified computers
whose sole purpose is to track her? Under current assumptions, this is to be
decided outside of the user's control (and awareness).

By exercising more control over what browsers do and over domain name lookups,
the user can retain more power to specifically choose the other computers to
which her computer connects.

------
zongitsrinzler
Node.js example code for ETag cookies:
[https://github.com/RobFox/nodejs-etag-cookie](https://github.com/RobFox/nodejs-etag-cookie)

------
mwenge
See also
[http://trac.webkit.org/wiki/Fingerprinting](http://trac.webkit.org/wiki/Fingerprinting)

------
Sami_Lehtinen
It's really lame to claim that an ETag is a checksum. That clearly tells that
the author doesn't have a clue what it's all about. Btw, who said that there
would be any cache stored when you browse in private mode? The ETag didn't
even work. Panopticlick is much more advanced than this lameness. It usually
works, unless you use something like an anonymity-hardened virtual machine,
which of course isn't unique.

~~~
girvo
You know you can point all that out without having to come across as rude,
right?

You are right, of course -- ETag tracking is both not that exciting, and not
new, but I think that some here wouldn't have come across it.

Also, I know that older versions of Firefox and Chrome _did_ cache things,
even in Private Browsing mode, but I think that that cache only existed for
that session.

~~~
Sami_Lehtinen
Yes, not using a cache at all would make everything very slow. I'm now of
course talking about using an in-session memory cache. If it's too small you
can reconfigure it using the browser.cache.memory.capacity parameter in
Firefox. With fiber I never use caching. But yes, with a 512 kbit/s connection
I unfortunately had to use disk caching too, to avoid re-downloading anything
I possibly could. But of course in that kind of situation and configuration
you're well aware that you're not destroying all data between sessions.

For privacy, a virtual machine with a hardened configuration plus Tor is a
good idea. Otherwise there's no reasonable expectation of privacy anyway, as
they say. In technical terms, there are so many ways to track users who do not
harden their setups that there's no reason to expect any privacy. As we have
seen with all these NSA discussions, all the technical options were already
known. You don't know whether sites use a given technique or not, but it's
reasonable to expect that they use at least all publicly known techniques, and
possibly some unknown ones. So making the attack (or tracking) surface as
small as possible is reasonable when looking for privacy. Maintaining any data
between sessions is just stupid if you're looking for privacy. Always booting
a clean virtual machine, one that looks like other virtual machines, is the
best approach. Otherwise there are tons of things they can do to track you.

Btw. even if browser keeps cache, you can always clear storage paths.

One of the things that doesn't seem to be known to many users is that many
databases retain deleted data (marked as free) for a long time. They just
don't think about it. Just go through all the files stored by the browser and
you'll end up finding stuff that you wouldn't expect to be there, if you're
naive. The right attitude is to expect everything to be stored always, and to
take proper care to destroy data when required. This is just like the issue
with SSD drives: if you write something to the drive that you later want to
destroy, there's no sure way to destroy the data without physically destroying
the drive. You simply don't know whether the controller has written data to
cell XYZ and then remapped XYZ to somewhere else. The just-overwrite-it
approach does not work in this case. And you can't even guarantee that the
manufacturer's tool could properly erase that cell.

Just some final words. ETag doesn't have anything to do with "images"; it's
not tied to content type at all. Next week I could release a "css" tracking
exploit which uses ETags, which are a kind of css checksum. Uh...

These days, privacy and security are hard, very hard. Even if you think
you're doing things right, there still might be several things that you're not
doing right, even if you have spent several years learning how to do things
right. And even after that, there's still the possibility of bad luck.

But all this stuff is generally known and properly documented, so there's
nothing new.

~~~
mike-cardwell
"Yes, not using cache at all would make everything very slow"

That suggests you've not tried it. I have. I've been running with disk and
memory cache disabled in Firefox for a couple of years now on my laptop, and
it's not anywhere near as slow as you think it would be. It's barely
noticeable at all.

On my work machine I have the cache enabled, so I even have something to
compare against. My laptop is usually sat on an 8Mbit Internet connection. If
you're on a much slower Internet connection it would make a bigger difference,
but I don't think 8Mbit is particularly fast nowadays.

------
ivank
about:config -> browser.cache.disk.enable -> false

~~~
abritishguy
that is the worst way to solve the problem - just use incognito mode.

~~~
mike-cardwell
I disabled the disk _and_ memory cache in Firefox a couple of years ago. Don't
knock it until you've tried it. It has much less of a performance impact than
you would imagine. I have an 8Mbit internet connection, YMMV.

------
cgtyoder
Dammit now I'm hungry.

------
slashdotaccount
Apparently clearing your cache on Android doesn't defeat this cache. So,
Google!

