Best I can tell, there is zero incentive for Quora (or any other site, for that matter) to care. Their current redirect logic in no way hurts their user experience.
Right now they protect their users' privacy. What benefit do they realize by providing their users' viewing history to other sites?
I personally think that the referer header was never a good idea. I disable it in my browser, and appreciate sites that do right by their users with privacy-protecting default behaviors.
I think that it does benefit Quora for content providers to see how much traffic is being generated from their site. If I knew an article was getting a lot of traction on a site, I would spend more time there, perhaps participate and continue to improve and generate content myself, thus benefiting Quora with more data and more links for everyone.
Of course there is zero incentive for anyone to do it. And if everyone chose to link the way Quora does, you would get a Google Analytics dashboard that cannot tell you which URLs are sending traffic to your site/blog. I find it really difficult to imagine.
The long term effect would be that websites can no longer use referrer as a metric. What difference would that make? HTTP resources (webpages) shouldn't change semantic meaning depending on the referrer anyway. Doing so is arguably an unintended use (or abuse) of HTTP.
On sites where there's private information in the URLs + links to external sites, overriding the referrer is necessary in order to protect users' privacy / identity. See https://www.facebook.com/notes/facebook-engineering/protecti... for why we do this at Facebook (I work on that system).
Quora's question pages are community-built public pages. If they have private URLs in account pages, they should limit referral protection to those. Also, I wonder, doesn't the 200 OK code confuse search bots?
Indeed, referers are useful information in some cases. For bookmarking apps like http://noteplz.com one useful thing is that along with the bookmark, they also store the referer, so you can later go back to the google search result where you found that bookmark.
On the other hand, with https and URL shorteners, referers are a dying breed. The situation with URL shorteners is absurdly funny now, because Twitter double-shortens the shortened URLs, since most popular sites have their own shortener.
2. you might get holes in your data for a number of reasons: the user has JS turned off; 100ms isn't long enough for the request to go through; or the user might click off before your script can attach itself to the onclick event.
You definitely don't want to get yourself in a situation where you go down and all outbound links stop working, but if you can fail gracefully, replacing the link makes a lot more sense.
Not to mention that your analytics code won't fire if the user opens a link by any means other than a standard left-click (e.g. middle-click, right-click -> open in new window/tab, or keyboard navigation)
Why does the delay matter? They have to wait for a response regardless of whether it's an AJAX request or a full browser redirect to the record/redirect URL.
This is probably not the case, but is it possible that Quora is intentionally stripping the referer header? Duck Duck Go does just this in the interest of user privacy: why should site X know where I came from and what I was searching? https://duckduckgo.com/privacy.html Seems unlikely in this case but possible.
Incidentally, it seems that encrypted.google.com does this but not regular google. EDIT: This happens for all https->http requests, it's not a google feature (TIL).
The User-Agent generates the Referer header, not the site. Also, encrypted.google.com doesn't do it; the HTTP specification (RFC 2616, section 15.1.3) says that browsers should not send a Referer header in a non-secure request when the referring page was loaded over https.
encrypted.google.com does this because it uses https. If a website is accessed from https and a link points to anywhere except another secure location, then the referrer is not sent.
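A rough sketch of that rule in Python, for illustration only. The function name `sends_referer` is made up, and it models just the classic https-to-non-https downgrade behaviour described above; it ignores Referrer-Policy and any browser-specific settings.

```python
from urllib.parse import urlsplit


def sends_referer(from_url: str, to_url: str) -> bool:
    """Return False when navigating from an https page to a non-https target.

    Simplified model of the downgrade rule only; real browsers apply
    further policies on top of this.
    """
    source_scheme = urlsplit(from_url).scheme.lower()
    target_scheme = urlsplit(to_url).scheme.lower()
    return not (source_scheme == "https" and target_scheme != "https")


# https -> http: the referrer is dropped
print(sends_referer("https://encrypted.google.com/search", "http://example.com/"))
# https -> https and http -> http: the referrer is kept
print(sends_referer("https://encrypted.google.com/search", "https://example.com/"))
print(sends_referer("http://www.google.com/search", "http://example.com/"))
```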
Seems they're going down the route of annoying search visitors by hiding information (similar to what Experts Exchange was railed on for, although not quite as bad yet).
Sending an incorrect site referrer to a downstream website doesn't solve the identity problem! HTTP headers have existed even before all these applications came into being. One just has to abide by some of those basics.
It can be fixed through a double redirect. Basically, redirect the browser to an internal page that redirects to the original page, and have that page redirect to the outbound link.
And when the server sees that request with referrer set to /redirect?target=http%3A%2F%2Fgoogle.com, it will then parse out the target url and redirect the browser to http://google.com.
This way, the target URL can be given a meaningful referrer URL without compromising the user's identity.
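Here is a minimal Python sketch of the "parse out the target url" step, under some assumptions: the helper name `target_from_referrer` is hypothetical, the redirect page lives at the path `/redirect` as in the example above, and the browser really does report that page as the referrer on the follow-up request.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def target_from_referrer(referrer: Optional[str]) -> Optional[str]:
    """Return the outbound URL encoded in a /redirect referrer, else None."""
    if not referrer:
        return None
    parts = urlsplit(referrer)
    # Only referrers that point at the internal redirect page are honoured.
    # A real server should also check that the host is its own.
    if parts.path != "/redirect":
        return None
    values = parse_qs(parts.query).get("target")
    return values[0] if values else None


print(target_from_referrer("http://a.com/redirect?target=http%3A%2F%2Fgoogle.com"))
# prints http://google.com
```

If the function returns None, the page just serves its normal content.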
Ah yes, I misread your post. The trouble with that approach is that you have to enumerate the dangerous params, and if the actual page URL needs a private parameter to work, you can't get rid of it.
Right, but you can always pass the canonical url to the redirector. That lets you avoid maintaining a whitelist/blacklist of query params. This should be trivial for Quora as most of their pages already contain the meta tag specifying the canonical url:
I've described a solution in a different comment on this thread. For each outbound link on the page, build a link that points to a redirector that accepts two query parameters: current page's canonical URL and outbound link's URL. The redirector will redirect the browser back to the canonical URL. Upon receiving the request for the canonical URL, instead of serving normal content, the server redirects the browser to the outbound link's URL on the condition that its referrer came from the redirector. This way, the outbound link gets the correct referrer without using any javascript wizardry. In fact, you can use this technique to customize the referrer to whatever you want.
http://a.com/redirect?canonical_url=http%3A%2F%2Fa.com%2Fpages%2F3&outbound_url=http%3A%2F%2Fb.com%2F
"canonical_url" is set to "http://a.com/pages/3"
"outbound_url" is set to "http://b.com/"
4. Redirector logs the request and redirects browser to canonical_url (i.e. "http://a.com/pages/3")
5. Code behind http://a.com/pages/3 checks the referrer to see if it came from the redirector.
5a. If it did, parse the outbound_url from the referrer URL and redirect the browser to that URL.
5b. If it didn't, serve normal content.
Basically, every content page also needs to act as a redirector, redirecting only when the referrer indicates that the previous request came from the redirector.
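The steps above could be sketched roughly like this in Python. The function names and the `REDIRECTOR` constant are made up for illustration, and the sketch only computes redirect targets rather than serving HTTP. It also takes for granted, as the steps do, that the browser reports the redirector URL as the referrer when it requests the canonical page.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical redirector endpoint, matching the example URL above.
REDIRECTOR = "http://a.com/redirect"


def build_link(canonical_url: str, outbound_url: str) -> str:
    """Build the link placed on the page in place of the raw outbound URL."""
    query = urlencode({"canonical_url": canonical_url, "outbound_url": outbound_url})
    return f"{REDIRECTOR}?{query}"


def redirector_location(request_url: str) -> Optional[str]:
    """Step 4: the redirector sends the browser to canonical_url."""
    values = parse_qs(urlsplit(request_url).query).get("canonical_url")
    return values[0] if values else None


def content_page_location(referrer: Optional[str]) -> Optional[str]:
    """Step 5: return the outbound URL to redirect to (5a),
    or None to serve normal content (5b)."""
    if not referrer or not referrer.startswith(REDIRECTOR + "?"):
        return None
    values = parse_qs(urlsplit(referrer).query).get("outbound_url")
    return values[0] if values else None


link = build_link("http://a.com/pages/3", "http://b.com/")
print(link)
print(redirector_location(link))      # http://a.com/pages/3
print(content_page_location(link))    # http://b.com/
print(content_page_location("http://www.google.com/search?q=x"))  # None
```

A real deployment would also want to restrict outbound_url and canonical_url to expected values, since a redirector that accepts arbitrary URLs can be abused as an open redirect.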
This works by switching the URL to your redirect URL when a user clicks a link, then switching it back a fraction of a second after they mouse up. This means that your redirect works even if the user right-clicks and opens in a new window / tab, and when a user hovers over a link, they still see the normal URL in the status bar.
On the /redirect url just log any data you need and send a 301 or 302 redirect. The destination site will see your original page as a referrer, not your redirect url.
Are you referring to the fact that the browser will interrupt your tracking request because it already started loading the linked page? I haven't really tried, but I believe this can be dealt with if your server-side code expects it to happen.
Since you are a hosted service, you could periodically loop through all of the Quora redirect links you've received and resolve them. This might be against Quora's TOS, though.
I believe Twitter does this with URL shortener links posted in tweets.
Seems you saw a Quora survey on our site? We had to change the targeting rules to make it a generic "referring site starts with Quora.com" kinda rule instead of specific URLs :(