
This has been known about for years, and was raised as a concern on various mailing lists years ago. The answer at the time was that browser vendors would build in tools for cache control in the same way they have for cookie controls.

The first sites to exploit this were, as always, porn sites. They used ETags in referral tracking to prevent webmaster fraud (the webmaster would have to include a script from the affiliate company, which would set an ETag).

You know what is more interesting? The Last-Modified header. The HTTP spec says that you are supposed to put a date in there, but it also advises clients to send back the exact string they received rather than parse it, since date parsing is such a pain in the ass. So clients just copy the date string, store it, and replay it in subsequent requests.

You can put whatever the hell you want in a Last-Modified field and all browsers will just store it and replay it in later requests to the same resource. For example:

initial request:

  GET /_modified_test HTTP/1.1
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
  Accept-Encoding: gzip,deflate,sdch
  Accept-Language: en-US,en;q=0.8
  Cache-Control: max-age=0
  Connection: keep-alive
  Host: localhost:8888
  User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_6) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.830.0 Safari/535.1

initial server response from my dev server (note the Last-Modified header used):

  HTTP/1.0 200 OK
  Server: Dev/1.0
  Date: Sat, 30 Jul 2011 11:48:25 GMT
  content-type: text/html; charset=utf8
  Last-Modified: random_token_i_set
  Cache-Control: no-cache
  Expires: Fri, 01 Jan 1990 00:00:00 GMT
  Content-Length: 1634

subsequent browser request to the same resource:

  GET /_modified_test HTTP/1.1
  Host: localhost:8888
  Connection: keep-alive
  Cache-Control: max-age=0
  User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_6) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.830.0 Safari/535.1
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  Accept-Encoding: gzip,deflate,sdch
  Accept-Language: en-US,en;q=0.8
  Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
  If-Modified-Since: random_token_i_set

With new webapps now being single-page, with either hashchange or pushState support, almost all backend requests go to the same resource, so you can track the user across all pages on the entire site, and across other sites that embed the same resource.
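
The exchange above can be sketched as a tiny server. This is only an illustration, not the dev server used for the dump: the `identify` helper, the `v-` token prefix and the port are made up for the example, and whether a given browser replays a non-date value is something to test rather than assume.

```python
import secrets
from http.server import BaseHTTPRequestHandler, HTTPServer


def identify(request_headers):
    """Return (token, is_returning) for a request.

    A returning client replays whatever we last sent in Last-Modified
    as If-Modified-Since; a new client gets a fresh random token.
    """
    token = request_headers.get("If-Modified-Since")
    if token:
        return token, True
    # Hypothetical token format: "v-" plus 16 random hex characters.
    return "v-" + secrets.token_hex(8), False


class TrackingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        token, returning = identify(self.headers)
        body = f"returning={returning} token={token}\n".encode()
        # Always answer 200 and re-send the token, so the cached copy
        # (and its validator) is refreshed on every visit.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Last-Modified", token)
        self.send_header("Cache-Control", "no-cache")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8888):
    """Serve until interrupted; the port matches the dump above."""
    HTTPServer(("localhost", port), TrackingHandler).serve_forever()
```

Call `run()` and load the page twice: the second response should report `returning=True` with the same token if the browser replays the header verbatim.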

Concerning, but a known problem. Even with these headers patched there is still a lot of information that can be used to fingerprint clients (i.e. having everything switched off is itself a fingerprint that makes you unique). I don't think Chrome, Safari, IE or Firefox will ever implement these advanced features; it will be up to somebody else to release a browser that is more privacy aware, or to maintain a plugin that is.

I wrote a plugin that does this, but a lot of information still leaks through (it is on my GitHub but I haven't released or announced it in any way). I am contemplating just forking WebKit and doing a whole separate 'privacy aware' browser, but haven't found the time. In short, the browser makers know about this, and have known about it for years. There is just no real interest in providing tools that let users fully anonymize themselves.

Edit: if anybody is interested in the plugin it is here: https://github.com/nikcub/Parley

It blocks all third-party requests and provides other features. It works, it just needs a bit of a cleanup and a release.




Interesting. I looked at the RFC and it says it has to be in the format "Sun, 06 Nov 1994 08:49:37 GMT" (RFC2616/RFC1123).

But even if browser makers checked the "If-Modified-Since" value against that format, it would still be doable to give each visitor a slightly different date and track them that way.

Combining the date stamps on 2 or 3 JPG or CSS files present on every page of the site should give you enough entropy for even the highest-traffic sites, and would make it very hard to detect.
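
A rough sketch of that idea, to show the arithmetic: split a visitor ID into one "digit" per file and store each digit as a seconds offset in a perfectly valid HTTP date. `BASE`, `WINDOW`, `encode` and `decode` are names and values picked for the example, not anything from a real tracker. With a 180-day window each file carries just under 24 bits, so three files give roughly 71 bits.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Assumed values: an arbitrary base date and a 180-day spread, so every
# generated timestamp looks like an ordinary file modification time.
BASE = datetime(2011, 1, 1, tzinfo=timezone.utc)
WINDOW = 180 * 24 * 3600  # seconds available per file


def encode(visitor_id, n_files=3):
    """Split visitor_id into n_files base-WINDOW digits, one date each."""
    if not 0 <= visitor_id < WINDOW ** n_files:
        raise ValueError("visitor_id does not fit in n_files dates")
    dates = []
    for _ in range(n_files):
        visitor_id, offset = divmod(visitor_id, WINDOW)
        stamp = BASE + timedelta(seconds=offset)
        dates.append(format_datetime(stamp, usegmt=True))
    return dates


def decode(dates):
    """Rebuild the visitor ID from the replayed If-Modified-Since values."""
    visitor_id = 0
    for date in reversed(dates):
        offset = int((parsedate_to_datetime(date) - BASE).total_seconds())
        visitor_id = visitor_id * WINDOW + offset
    return visitor_id
```

The server would send `encode(id)[i]` as Last-Modified on the i-th file and call `decode` on the values that come back. Each date on its own passes a format check, which is what makes this hard to spot.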



