Where did all the HTTP referrers go? (2013) (smerity.com)
97 points by bemmu on Feb 20, 2016 | 36 comments



As the author, this was a pleasant surprise! :) It was written in 2013 (potentially note that in the title?) though people still frequently say they find it useful.

As @jstayton noted, browser support has come a long way[1], and @davecardwell notes that some of the keywords are deprecated[2]. I really should update both browser support and the "which websites use it" part. Hacker News now supports it for example, which I'm hugely glad to see! ^_^

Previous HN discussion: https://news.ycombinator.com/item?id=5778444

[1]: http://caniuse.com/#feat=referrer-policy

[2]: https://www.w3.org/TR/referrer-policy/#referrer-policy-state...


I posted it because it was super useful while I was trying to debug some Referer header "error", which actually just turned out to be a gap in my knowledge of these various situations where the header is suppressed on purpose.

Main takeaways for me:

- Go from HTTPS -> HTTP and the header is lost.

- The <a> tag has a rel="noreferrer" feature. Notably, imgur uses this (I was trying to understand why imgur wasn't showing up properly in my stats).

- There's now a meta tag which lets site owners decide if they want to give out referrer info or not.
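
To make those three takeaways concrete, here is a minimal sketch of the decision as a Python function. It is a simplified model and not how any particular browser implements it; the function name, its parameters, and the example.com-style URLs are made up for illustration, and the policy values are the legacy meta keywords discussed in this thread plus "origin".

```python
from urllib.parse import urlsplit


def sends_referrer(source_url: str, dest_url: str,
                   rel_noreferrer: bool = False,
                   meta_policy: str = "default") -> bool:
    """Return True if some referrer information would be sent (simplified model).

    meta_policy is one of "never", "default", "always" or "origin".
    With "origin" only the origin is sent, not the full URL.
    """
    if rel_noreferrer:
        # <a rel="noreferrer"> suppresses the header for that link
        return False
    if meta_policy == "never":
        return False
    if meta_policy in ("always", "origin"):
        # These policies send something even on HTTPS -> HTTP
        return True
    # "default": drop the header only on an HTTPS -> HTTP downgrade
    downgrade = (urlsplit(source_url).scheme == "https"
                 and urlsplit(dest_url).scheme == "http")
    return not downgrade
```

For example, `sends_referrer("https://a.example/x", "http://b.example/")` is False under the default policy, but becomes True once the source page sets `meta_policy="origin"`.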


Why are you glad to see that HN uses the header? HN URLs contain no private information, no search terms, anything, and seeing where your traffic is coming from is interesting. I don't see the privacy or security issue here.


I agree with you. The addition of the meta tag actually allows you to see that the traffic is coming from HN, even if you are an HTTP site, which was not the case when the story was written. HN also specifies that only the "origin" be sent. This could be considered a small privacy optimization, though it is likely of little impact.
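
As a rough illustration of what sending only the "origin" means, the sketch below reduces a full URL to its scheme and host. The helper name is hypothetical and the real browser logic has more edge cases (default ports, for instance), so treat it as an approximation.

```python
from urllib.parse import urlsplit


def origin_only(url: str) -> str:
    """Reduce a full URL to scheme://host[:port]/, dropping path, query and fragment."""
    parts = urlsplit(url)
    # Drop any "user:password@" prefix; credentials are not part of the origin
    host = parts.netloc.rpartition("@")[2]
    return f"{parts.scheme}://{host}/"
```

With this, `origin_only("https://news.ycombinator.com/item?id=5778444")` gives `"https://news.ycombinator.com/"`, so the destination learns the traffic came from HN but not from which thread.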


Ah, I didn't think of non-HTTPS pages as a benefit. I think for HN sending the whole URL would be the better choice, but they probably erred on the side of caution.


I didn't see this explicitly mentioned but if my site is HTTPS, I get referers, right? And so isn't that the easiest/best solution?


I suspect if your website is HTTPS only (and HTTP auto-redirects to HTTPS), you'll probably still lose referer headers when people use http:// links to your site.

While Stack Overflow suggests most modern browsers will maintain the referer when following a redirect (http://stackoverflow.com/questions/2158283/will-a-302-redire...), I suspect this won't work if the redirect is from HTTP->HTTPS and the origin is HTTPS. However, I haven't found any conclusive information on this, and I'm too lazy to test it from various browsers right now.


From the article:

> HTTPS websites will send referrers to any other HTTPS website even if it contains sensitive information

As such, having a website which is HTTPS will get you referers from both HTTP and HTTPS if they're willing to send them, but it's also important to know you can control them in case there are privacy implications such as leaking your customers' information to external HTTPS sites.


Thank you for this, because I think I'm about to learn from you, but I'm confused. How can you control them? Is there some way for a site to signal to a browser whether it should or shouldn't send referrer headers, and under what circumstances?


The site can signal, either via server headers or meta tags, whether to send or withhold referrers.

Adding <meta name="referrer" content="never"> will prevent referrers from being sent, whilst <meta name="referrer" content="always"> will ensure they're sent whether linking to HTTP or HTTPS.

The user can of course override these if so desired - see the extensions linked to by many other commenters.
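
For the server-header route, a minimal sketch using a plain WSGI app might look like the following. The header name is the referrer-policy header from the spec; the "origin" value, the app itself, and the example link are assumptions for illustration, and browser support for the header varies.

```python
def app(environ, start_response):
    """Tiny WSGI app that attaches a referrer policy to every response."""
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        # Ask the browser to send only the origin with outbound requests
        ("Referrer-Policy", "origin"),
    ]
    start_response("200 OK", headers)
    return [b'<a href="http://example.com/">outbound link</a>']
```

To try it locally, the app can be served with `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()` from the standard library.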


Note: Authors are encouraged to avoid the legacy keywords never, default, and always. The keywords none, none-when-downgrade, and unsafe-url respectively are preferred.

- https://www.w3.org/TR/referrer-policy/#referrer-policy-deliv...

Although the current versions of Edge only support the legacy keywords according to http://caniuse.com/#feat=referrer-policy
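
The keyword mapping from the quoted note can be written down as a small lookup, sketched below. The function names are hypothetical, and the right-hand values are only the ones named in that note; keyword names differ between spec drafts and browsers, so test before relying on any of them.

```python
# Legacy keyword -> preferred keyword, per the note quoted above
LEGACY_TO_PREFERRED = {
    "never": "none",
    "default": "none-when-downgrade",
    "always": "unsafe-url",
}


def preferred_keyword(value: str) -> str:
    """Translate a legacy keyword; anything else is returned unchanged."""
    return LEGACY_TO_PREFERRED.get(value.strip().lower(), value)


def meta_tag(value: str) -> str:
    """Build the referrer meta tag using the preferred spelling."""
    return f'<meta name="referrer" content="{preferred_keyword(value)}">'
```

For instance, `meta_tag("never")` produces `<meta name="referrer" content="none">`, while a value such as "origin" passes through untouched.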


I've been using "no-referrer" rather than "none" as per https://w3c.github.io/webappsec-referrer-policy/ which is linked to from the "caniuse.com" link you quoted above. Am I wrong?


It looks like there was a change on 19th August, 2014: https://github.com/w3c/webappsec/commit/b48b635f93a798da87c6...

However, clicking the link to “Latest Version” at the top of the document you linked (from Dec 2015) takes you to the document I posted (from Aug 2014) so…who knows? It will require some testing to see what the browsers have actually implemented I guess.


Do you happen to know what Edge defaults to if it doesn't recognize the value? One would hope to "never"?


I don’t know, sorry, although I would suspect they stick to “default”.


Just tested this myself and it looks like you're right. With meta referrer set to content=never, Edge passes the referrer. With content=none it doesn't. Chrome is the opposite, and Firefox does not send the referrer in either case. (And of course IE sends it in both cases, since it doesn't recognize the referrer meta tag.)


I use referrer block for Chrome. I have only twice had anything break from it, and it was a simple fix to whitelist that website. Disqus and Coursera both break without a referrer.

I don't know why everyone doesn't block referrer. It seems like such a massive breach of privacy to leak that information on every link you click.


Referrers are useful and can be a friendly, automatic way of letting someone know how you found their page. I've discovered many links to my pages on other websites, and whole articles discussing my articles, through referrers. Since none of the other pinging mechanisms ever gained universal traction, it's the best thing we have, and I'm sorry that it's all but disappeared. And I remember how amazed I was to discover the existence of "referrer spam" years ago.

You can block referrers in a number of ways whenever you'd like, for example with a simple bookmarklet:

http://lee-phillips.org/norefBookmarklet/

Finally, looking at search terms that led to your site can be very entertaining.
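
As a small sketch of how search terms can be pulled out of a referrer, assuming the search engine puts them in a `q` query parameter (a common convention, though not universal; the helper name and the search.example host are made up):

```python
from urllib.parse import parse_qs, urlsplit


def search_terms(referrer: str, param: str = "q"):
    """Return the search phrase from a referrer URL, or None if absent."""
    query = urlsplit(referrer).query
    values = parse_qs(query).get(param)
    return values[0] if values else None
```

Here `search_terms("http://search.example/search?q=http+referrer+policy")` returns `"http referrer policy"`, and a referrer without the parameter returns None.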


Referrer Control in Firefox works very well:

https://addons.mozilla.org/EN-US/firefox/addon/referrer-cont...


In Safari (the OS X version), you can also do this with JS Blocker:

http://jsblocker.toggleable.com/


Since you mentioned whitelisting, I assume it's an extension. Care to share which one?


Don't you mean referer?

Edit: yes, I know. This was an HTTP joke. Not a very funny one. Here is another one: what is the difference between a hippo and a zippo? One is very heavy, the other is a little lighter.


Referrer is the correct English spelling. It's so easy to accidentally create a bug by forgetting that the standard relies on a spelling mistake.


That's covered in the appendix.


We're probably better off spelling "referrer" correctly except in packet traces or HTTP protocol implementations.
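
To illustrate the kind of bug being described, here is a minimal sketch using the standard library's `email.message.Message`, which is the base class of the header objects Python's HTTP modules hand out. The helper name and the sample value are made up for the example.

```python
from email.message import Message


def get_referrer(headers: Message):
    """Read the referrer; on the wire the header name is the misspelled "Referer"."""
    return headers.get("Referer")


# Simulated request headers
headers = Message()
headers["Referer"] = "https://news.ycombinator.com/"
```

Lookups are case-insensitive, so `headers.get("referer")` also works, but the correctly spelled `headers.get("Referrer")` returns None and the bug is easy to miss.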


Current browser support: http://caniuse.com/#feat=referrer-policy


Does anyone know the preferred (current) method of sending referrer directives between the referrer-policy header[1] and the referrer directive under Content Security Policy (CSP) 1.1[2]?

Since these specs are evolving, there is a lot of contradictory documentation online, and it's tough to weed out what's the accepted solution (if there is one). Presumably using headers (of some sort) is preferable to meta tags where possible though?

Edit: As mentioned by davecardwell, the always/never/default settings, which are referenced in my CSP link there, are deprecated. Perhaps the whole concept of serving the referrer policy via CSP is as well?

[1] https://w3c.github.io/webappsec-referrer-policy/#referrer-po...

[2] https://www.w3.org/TR/2014/WD-CSP11-20140211/#referrer


I knew about what happened with the referrers when going from HTTP to HTTPS, but I had not heard about the meta referrer, and I have been wondering why I see so many bare domain referrers even though my own site is HTTPS. Now I know. Thank you to both the author and the submitter.


Actually I like the fact that some people won't add a referrer meta. Not everybody needs to know where I came from.


Google originally justified their blocking of search term data in HTTP referrer fields as being due to the security implications of HTTPS vs HTTP. The fact that they now use <meta name="referrer" content="origin"> proves it was just an excuse to obscure that data, since they have it at their disposal to transmit that information securely to the destination site.


All of this has always been about forcing everyone onto HTTPS at any cost. Denying referral headers, not allowing HTTP/2 over HTTP, treating self-signed certificates as the worst thing ever, soon putting red X'es in the URL bars, etc.

It's all disgustingly heavy-handed. But they get away with it because most people are on board with it. They seem to lack empathy for how this would look if they weren't on board with the change. The general idea is a great one: who doesn't love extra security? But all of those cons ... interfering with caching, costing money for wildcard certificates, all the security vulnerabilities in the TLS libraries, the (miniscule, but non-zero) extra computational power required, the added setup difficulty (yes, even if you trust Let's Encrypt to execute code on your box, it's still an added burden), putting trust in the CA system that has shown several major flaws already (such as rogue certificates) ... all conveniently ignored or whitewashed away.

But all the posturing in the world won't be enough to eliminate HTTP from the web. It's going to outlive all of us. Not even because people like me want it to, but because there's just so much legacy code out there that's never going to get touched. Millions of devices and applications that only speak HTTP, that nobody's ever going to update. Hell, it's trivial to find Gopher proxy layers, and that never even got 1% of the uptake that HTTP has today.


I agree the whitewashing is annoying. One correction: the 'miniscule, but non-zero' computational power for TLS connections, in particular the handshake, is actually quite significant. (like 0.4s of CPU time in my tests). So the whitewashing problem may be worse than you think!


This would have been OK if HTTPS truly did increase security.

In reality, however, the sensitive part is the 'who' and 'when', not the 'what'. HTTPS makes sense for hiding passwords or bank data; for everything else it's mostly pointless.

(If you've watched any police procedural shows you've noticed that they don't need wiretaps to make an arrest, a destination phone number and time is all that's needed.)


Have we already forgotten FireSheep?


As a user, I'm glad they are hiding keywords. First of all, it is private data, and secondly, websites were doing all kinds of annoying stuff with the information (e.g. they added a list of the search terms used to find the page at the bottom of the page, presumably to further increase Google rank for these terms).

But I really miss the time when I saw the keywords that people used to find my page. Lots of insight that is now lost.


Or you could just, you know, not include sensitive information in URLs, reducing the risk of it being leaked through sharing, referer, or other ways. HTTPS only prevents MITM. You should still think about the design of your site.



