
Where did all the HTTP referrers go? - Smerity
http://smerity.com/articles/2013/where_did_all_the_http_referrers_go.html
======
kijin
The article makes it sound like every webmaster is entitled to see Referer
headers. Really? How does enabling referers "save the world"?

A few weeks ago, I ran across a website (can't remember their domain) that
refused to show me any content unless I enabled third-party cookies, and even
contained a lengthy argument explaining why disabling third-party cookies
hurts the web. IIRC the whole argument was 24K bullshit, written by people who
feel entitled to keep making money with their outdated business models and who
were obviously alarmed because modern browsers were doing sensible things. And
now we're seeing a very similar argument, only this time it's about referers.
Why do you think it matters whether I came to your site via Google, Reddit, HN
or someone else's blog? What makes you think you have the right to know that?

Webmasters never had the right to know where their visitors were coming from,
any more than the owner of a random gas station on the Interstate has the
right to know which city his customers are driving from. If SSL is making
Referer headers disappear, good riddance. We just closed a privacy hole, 99
more to go. Next in the TODO list: get rid of referers even when the referring
website doesn't use SSL, because as the article correctly points out, we've
got a bit of inconsistency there.

~~~
Smerity
If you reread the article, I actually say you should add a meta header
regardless of whether you do or don't want to send referrers.

Why? It actually solves your "next in the TODO list". Most web sites that
shouldn't send referrers don't use <meta name="referrer" content="never"> so
will be leaking referrers to other web sites. Adding this meta tag will
eliminate referrers in both HTTP and HTTPS.

So yes, my own preference is to keep HTTP Referrers, but I also explain how to
kill HTTP referrers for webmasters who would like to as well.

~~~
untog
That's still taking the decision out of the hands of the user, though.

I'm not saying that I totally agree with kijin, but I don't think your answer
addresses his point.

~~~
Smerity
I didn't realise I missed a point to address, I'll seek to clarify.

It's always the user's choice as to whether to send referrers or not, as the
referrer is actually added by the user's web browser itself. Extensions exist
for just about every major web browser[1][2][...] to modify the behaviour of
the HTTP Referrer field. If you don't like the idea of sending referrers, it's
entirely within your control to never send a single referrer.

In almost all cases, disabling the referrer entirely won't result in any
broken behaviour, primarily as the HTTP Referrer is unreliable and can be
spoofed anyway.

[1]: <https://chrome.google.com/webstore/detail/referer-control/hnkcfpcejkafcihlgbojoidoihckciin?hl=en>

[2]: <https://addons.mozilla.org/en-US/firefox/addon/refcontrol/>

------
anonymouz
Referers are a privacy trade-off though, and personally I have never seen a
good reason why the desire of some website to track where I am coming from
should trump my desire to not broadcast this information to the world.
Therefore I like to switch off referers anyway.

~~~
claudius
I am doing the same. Fortunately, the websites requiring the referer to be set
are getting fewer and fewer – some years ago, I regularly got placeholder
images ‘THIS IMAGE WAS STOLEN FROM XYZ’ when browsing XYZ without referers.

~~~
joosters
This is one of the good uses of HTTP referrers!

I host a small, very low traffic website. One day, the bandwidth shoots
through the roof and stays high. The reason? One of the images on a page got
added to the .sig of someone in a popular forum. Suddenly thousands of people
are fetching the image.

The solution was to filter by referrer header, letting the image be seen by
visitors to the actual page, but linking from other sites gets blocked. Note
that usually it's best to allow requests that have no referrer header at all,
otherwise you'll be blocking some legitimate viewers of your site.

End result: bandwidth back down to the usual, tiny levels. It's not that I
cared about people _copying_ the images, I just didn't want to foot the bill
for the traffic!
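
A minimal sketch of the referrer filter described above, in Python. The host names and the function name `allow_image_request` are assumptions made up for illustration, not the actual server configuration that was used:

```python
from urllib.parse import urlsplit

# Hypothetical host names of the site that owns the images (an assumption).
OWN_HOSTS = {"example.com", "www.example.com"}


def allow_image_request(referer):
    """Return True if the image should be served for this Referer value."""
    # No Referer at all: allow, so visitors who strip the header still see images.
    if not referer:
        return True
    host = urlsplit(referer).hostname
    # Serve only when the referring page is on one of our own hosts.
    return host in OWN_HOSTS
```

Requests with an empty or missing header pass, requests from the site's own pages pass, and requests embedded from any other host are refused.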

~~~
graue
I think you misunderstand — this seems to have been claudius' point:

> _Note that usually it's best to allow requests that have no referrer header
> at all, otherwise you'll be blocking some legitimate viewers of your site._

~~~
joosters
It was probably a badly-placed reply; my point was to illustrate a use for
referrer headers and to state that sites blocking based on the lack of a
referrer were doing it wrong.

------
mooism2
So my understanding is that the default behaviour is:

1. Follow link from <https://example.org> to <http://example.com> ---
referrer is not sent

2. Follow link from <https://example.org> to <https://example.com> ---
referrer is sent

I don't understand how the same referrer can be too sensitive to be sent as
plaintext, but harmless enough to be passed to a not-necessarily-trusted third
party.

~~~
buro9
Let me fix that for you:

1. Follow link from <https://example.org> to <http://example.com> --- _can_
be read by a third party if referrer were added

2. Follow link from <https://example.org> to <https://example.com> ---
_cannot_ be read by a third party so referrer can be added

The assumption is that secure pages are secure for a reason, and that the
author of a secure page is linking to other secure pages and has some basis of
trust by which the link is provided.

~~~
mooism2
Example.com _is_ the third party. (Example.org and a human user being the
first two parties.)

Let me rephrase my question: why the default assumption that example.com is
trusted not to misuse referrer information merely because example.org provides
a link and the human user follows that link?

~~~
ordinary
_Example.com is the third party. (Example.org and a human user being the first
two parties.)_

I disagree. When you click a link on a page that you retrieved from
example.org, one that leads to example.com, there is no communication between
you and example.org, nor between example.com and example.org. The
communication that takes place is between you (party 1, the initiator of the
conversation) and example.com (party 2, the target). The HTTP request
_mentions_ example.org, but being a third party, it does not participate in it
directly.

The only conversation in which example.org was a party was the one in which
you requested the page that contained a link to example.com, which has already
finished.

In that light, it seems strange to me that under HTML5 (assuming I understand
the article correctly), example.org is given a mechanism to dictate how much
information you give to example.com. Should that not be your choice, as the
sender of said information?

~~~
Smerity
1) User's browser requests page from Site A

2) Page from Site A suggests what should be sent in the referrer via the meta
referrer

3) User clicks on link from Site A to Site B

4) User's browser requests page from Site B (referrer is set by either user's
overriding option or the meta referrer from Site A)

So indeed, at no point does Site A speak to Site B directly. The meta referrer
simply asks the user to either send or not send the referrer. If the meta
referrer is not present or not supported, it falls back to default HTTP
Referrer behaviour.

As the user, you can override this behaviour and force the referrer to do
whatever you'd like. This includes refusing to send it, always sending it, or
spoofing it. Firefox for example allows you to set
network.http.sendRefererHeader and there are various browser extensions for
any popular browser that will allow for finer grained referrer control.
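
The decision in step 4 could be sketched roughly like this in Python. It assumes the meta referrer values mentioned in this thread ("never", "origin", "always"); the function name and the `user_override` callable are hypothetical stand-ins for a browser setting or extension, not a real browser API:

```python
from urllib.parse import urlsplit


def referer_to_send(from_url, to_url, meta_policy=None, user_override=None):
    """Pick the Referer value for a navigation, or None to send nothing.

    user_override is a hypothetical callable standing in for a browser
    setting or extension; when given, it wins over the page's suggestion.
    """
    if user_override is not None:
        return user_override(from_url, to_url)

    src, dst = urlsplit(from_url), urlsplit(to_url)
    # Secure page linking to a non-secure page.
    downgrade = src.scheme == "https" and dst.scheme != "https"

    if meta_policy == "never":
        return None
    if meta_policy == "always":
        return from_url
    if meta_policy == "origin":
        # Only scheme and host, without path or query.
        return f"{src.scheme}://{src.netloc}/"
    # Missing or unrecognised policy: fall back to the default behaviour.
    return None if downgrade else from_url
```

The user's override is checked first, so whatever Site A suggests only applies when the user has not chosen otherwise.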

------
leephillips
You can selectively block referrer information if you are using a webkit
browser (Chrome, Safari, etc.) by using
<http://lee-phillips.org/norefBookmarklet/>

I've found the information in my referrer logs quite interesting and useful,
despite the Russian referrer spam, and am sorry to see it going away.

------
cnahr
When I recently tried to purge my server logs of referrer spam, I found that a
whopping 80% of all visitors had either no referrer or referred from my own
website. About a third of the rest were spam. So judging from my small sample,
referrer headers are largely useless except as spam machines. They should
probably be removed from HTTP anyway.

------
lucb1e
Oh I did not know this. Very good to know indeed! I too am missing referrers
from websites, and since my website also supports https I suppose others are
missing them from me as well. This way I can make the internet a very tiny bit
better ^^.

------
warcode
I already hide/change my referrer through browser extensions, though I guess
this might finally get those few sites who have functionality based on refs to
stop using it.

~~~
Smerity
Whilst I still think referrers should be used where appropriate, there are
certainly places or reasons you want to nix them. HTTPS not passing along
referrers by default is a sensible decision, as is the addition of <meta
name="referrer" content="never"> for sites stuck in HTTP that may want to go
under the radar.

I still feel that removing referrers entirely destroys many useful tools and
analytics that we've traditionally been able to use. It removes the core way
in which we understand connections across the Internet. By removing referrers,
the best we can do is use link graphs, falling back to the original PageRank
algorithm where we assume people are random bots that click on one of the
links on the page.

Edit: Can't reply to you hnriot due to comment depth limit. My reference to
PageRank is as links and backlinks could be used as a poor referrer
substitute, though they aren't currently used as it's a lot more work and less
accurate. I'm simply saying that, in the event that referrers all disappeared
tomorrow, you'd see normal websites trying to estimate where their traffic
comes from by using a PageRank inspired algorithm, or more naively by looking
at who links where.

~~~
hnriot
Referrers have nothing to do with PageRank. They're used in website analytics,
of course, and in general leaking information between websites is a very bad
idea. Search engines don't see your site's referrer data so the
association you tried to create with PageRank is misleading.

~~~
encoderer
I, for one, didn't think his argument was _remotely_ misleading.

I did, however, think that bringing PageRank into the argument was ill advised
and probably detrimental to his goal of advocacy through education. It just
confused things, introduced another rabbit-hole concept.

If the author thought that mentioning PageRank would lend credibility to the
_'link counting' / 'random link clicking bots'_ foregone conclusion he set up,
he was right. It did. So link the text to a footnote referencing PageRank and
stay on topic.

------
nathell
This is how I discovered HN went HTTPS some time ago. I never noticed until
now.

~~~
sp332
HN started using SPDY recently, and SPDY requires HTTPS.
<https://news.ycombinator.com/item?id=5660797>

~~~
Skalman
HN has been available over HTTPS for longer than that.

------
happyshadows
I wish Google would change their meta-tag from "origin" to "always". It's
getting harder and harder to see which keywords bring traffic to your site
these days.

------
zapt02
If you enable SSL for your site, will you start seeing referrers again? Mainly
interested in the case of Google searches.

~~~
j_s
No. You have to pay if you want referrer info from Google.

<http://searchengineland.com/google-puts-a-price-on-privacy-98029>

~~~
tacticus
Or run https on your server.

~~~
j_s
The point is that Google obfuscates their search result links so that they do
not include search keywords any more -- if you are interested in knowing the
keywords, you [typically] have to pay. If you are just looking to know that
the referrer was Google, then yes you can see that. However, this is not
really useful information to most people.

They implement this in two ways: (1) If you go directly to google.com and type
in your search, the results page uses a # in the url which keeps all the query
parameters out of the referrer. (2) Google has used (not sure if they still
do/randomly test whether or not to) JavaScript redirects which overwrite the
url when a search result is clicked. I'm sure there are other ways for Google
to hide the referrer -- plus Google and various browser extensions can turn
parts of this on/off however they choose.
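
To illustrate point (1): the part of a URL after the # (the fragment) is not included in the Referer header, so anything placed there never reaches the destination site. A small Python sketch, where both URLs and the search terms are made up for illustration:

```python
from urllib.parse import urldefrag

# Hypothetical results-page URLs; the query terms are made up for illustration.
hash_style = "https://www.google.com/#q=http+referrer"
query_style = "https://www.google.com/search?q=http+referrer"


def as_referer(page_url):
    """Strip the fragment, as browsers do before sending a Referer header."""
    return urldefrag(page_url).url


print(as_referer(hash_style))
print(as_referer(query_style))
```

With the hash-style URL only the bare host survives, while the query-style URL would still carry the keywords.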

It is still possible to wind up with a referrer from a Google search where you
can see the search keywords, if for example the search is done using the
browser address/search bar, and the JavaScript overwriting result urls is not
active (turned off by NoScript, etc.). However, this is not in Google's best
business interest (if they can convince people to pay for the info) so I am
counting on them trending towards making this the least likely of possible
scenarios.

------
w00kie
Thanks for the info, I was not aware of this meta tag. My blog is all HTTPS
(with SPDY) and I added content="origin" to all visitor facing pages, leaving
the admin pages dark.

------
tarkin2
I get referrers from httpS://www.google.com quite often in my logs.

And I'm running my blog on HTTP.

How is this so?

~~~
ZoFreX
Read the article. This is explained about 80% of the way through.

------
dlsym
But why is that a bad thing? Just count your page impressions to see if your
blog content is popular.

What websites your visitors consume is IMHO not your business.

~~~
josteink
> What websites your visitors consume is IMHO not your business.

But if your product is the subject of discussion on another page which links
to yours, you may want to be able to join the discussion.

As a site owner, I don't think it's unreasonable to ask my visitors how and
where they found out about my website.

------
kvi
Yeah, that's a long known thing. I use HTTPS on parsebin.com specifically for
stripping out referrers.

------
kmasters
I fail to see why referrer is part of any spec. Referrer is a 90's concoction
when we were all so naive of future uses. I'm sure it's made a lot of porn
sites some money, but affiliate marketing is not the purpose of web standards.

As far as analytics, it's all done with tracking cookies now, which is another
issue that needs to be addressed, but I'm not going to miss referer should it
ever really go away.

~~~
Angostura
I see it rather differently. Referrer _is_ a 1990s concoction based on the
idea that the Web is a collaborative effort that we all engage in. I don't
personally have any problem in a Web site knowing where I arrive from in most
cases. It's a harmless bit of information (in most cases, as I say) which makes
the Web site owner's life a little better.

~~~
mindcrime
Same here. Referer is part of the way the Web works, and I, for one, don't
give two shits if Site A knows that I arrived via a link from Site B.

That said, I can see why certain people might see it as problematic... if
you're browsing <http://www.anarchistsbombmakingforum.com> and somebody posts
a link to <http://www.fbi.gov>, then maybe you don't want the FBI knowing you
were at anarchistsbombmakingforum.com. But, still, barring _other_ privacy
problems, the FBI don't know who you are when you visit their site, just that
you came from anarchistsbombmakingforum.com.

I have a hard time getting worked up about this though... for one, if you're
surfing anarchistsbombmakingforum.com, common sense would dictate that
following a link to fbi.gov isn't such a good idea (and a forum that fosters
discussions of anything controversial should probably munge links to go
through an anonymizer anyway) AND the people for whom this really matters are
the people who have a referer blocking plugin installed in their browser.

