
301 redirects: a dangerous one way street (2012) - _Codemonkeyism
http://jacquesmattheij.com/301-redirects-a-dangerous-one-way-street
======
_Codemonkeyism
The problem with a 301 without cache headers is that some browsers cache it
forever, due to their interpretation of what 'permanent' means.

You often can't use 302 because all your external links no longer work SEO
magic for you with a 302. Google only transfers link juice with 301 [1].

If you make a mistake and misconfigure your server, you're toast.

If a disgruntled employee 301 redirects your domain, you're toast.

If a service provider misconfigures your domain, you're toast.

If a hacker (from a competitor) 301 redirects your domain, you're toast.

If you buy a domain that had a 301 on it, it's worthless.

If you buy a domain that had 301s on it that point to phishing sites, you're
in trouble.

I always add cache headers to 301 redirects I use to at least prevent me from
shooting myself with an arrow in my knee.

UPDATE: [1] Google seems to have changed this recently. It also no longer
considers http/https different pages as it did in the past with the same
content [https://www.searchenginejournal.com/google-confirms-no-
loss-...](https://www.searchenginejournal.com/google-confirms-no-loss-in-link-
authority-on-https-implementation/147933/)

~~~
rpgmaker
I think that when you clear the cache on recent versions of Chrome it also
removes 301 redirects. I'm not sure though.

~~~
dtparr
That fixes the problem for you as a web client, but not you as a domain owner.
Everyone who visited it during the 301 will continue to treat it as such until
they clear their cache (or equivalent).

------
Piskvorrr
That's what _permanent_ means. "Adjective: permanent. Without end, eternal.
Lasting for an indefinitely long time."

Also, "This response is cacheable unless indicated otherwise," says RFC 2616.

Working as designed, IMNSHO. Perhaps not working as _intended_, but alas,
that's a case of ¬RTFM.

~~~
Typhon
This seems yet another example of web professionals not understanding HTTP.

Much like webpages that say "404 not found" with a "200 OK" header.

~~~
davesque
Is the status line technically a header?

~~~
Piskvorrr
Request (section 5) and Response (section 6) messages use the generic message
format of RFC 822 [9] for transferring entities (the payload of the message).
Both types of message consist of a start-line, zero or more header fields
(also known as "headers"), an empty line (i.e., a line with nothing preceding
the CRLF) indicating the end of the header fields, and possibly a message-
body.

/quote RFC2616 (So, the status line is an entity of its own, which is
_followed_ by headers.)

~~~
davesque
That's what I thought.

------
franze
2012 - somebody should write 2012 into the title of this post (that by the way
hasn't any concrete data)

I did some testing in 2009, and I think around 2012 and 2014, in addition to
logfile grepping after some big site URL rewrites.

It's a non issue. No caching headers, the redirect gets cached only for the
current browser session. Close it, reopen it, gone, done.

Let's discuss this one based on data. (Which I can't provide right now as I'm
on a beach in Sri Lanka with a FirefoxOS device and I don't know how to see
HTTP requests on this one, but) Please prove me wrong! Based on tests and
data, not blog posts.

~~~
rebelde
It still seems to be a problem.

I just tested it in Firefox on a Mac. I restarted Firefox. I even rebooted.
Developer Tools > Network tab says "cached". I can't confirm that it is cached
forever, but it is not only "for the current browser session".

------
pluma
"There are only two hard things in Computer Science..."

I guess some people think the purpose of 301 is more like that of 410: update
references so you don't try to go there again. The difference is that with 301
you additionally instruct the client to not even attempt to go there again in
the future.

But the article does raise an interesting point: if I own somedomain.example
and set it up with a 301 redirect to myotherdomain.example and enough people
visit it that most people will have cached the redirect, doesn't that
basically mean I now own it for perpetuity (or until enough people have
cleared their cache) even if I don't renew the domain and new requests to the
domain are no longer served (by the same IP)?

Or do browsers have some kind of protections against this, at least based on
DNS? It's a bit too convoluted for a proper DOS attack (because you need to
own the domain long enough and make it popular enough to poison everyone's
caches) but a naive implementation seems like it would effectively render
domains unusable if someone set up a 301 on them at some point in the past.

~~~
mnw21cam
For an important busy web site, even serving up a redirect for a short amount
of time could be enough to cause some serious problems. It's a way of turning
"I hacked and defaced this website but they fixed it 24 hours later" into "I
hacked and defaced this website and they fixed it 24 hours later, but loads of
people still see the defaced version".

------
zephod
I took over a domain which had previously 301-redirected HTTP:// to HTTPS://.
It caused us no end of trouble getting the alpha site online -- obviously we
set up SSL but we didn't realise it was the _first thing we'd have to do_.

It also caused half a day of confusion to understand why some of our web
browsers were still failing to connect and others could see the alpha site
(because they'd never visited the previous 301 site at that address).

~~~
nly
This isn't just a problem with things like HTTP. The industry as a whole lacks
a standard uniform way of dealing with domain transfers or expiration. CAs for
example will happily issue certificates that expire after your domain.

~~~
arbitrage
It's not the CA's job to make sure your domain isn't expiring. It's yours.

~~~
wtbob
> It's not the CA's job to make sure your domain isn't expiring. It's yours.

Really? In the simplest case, their entire job is certifying that the holder
of the private key is the holder of the domain name[1]. That begs the
question, of course: how is it that we trust every single CA to certify every
single domain? Why don't we trust the issuer of each domain hierarchy to
certify only those domains it's permitted to issue?

The entire XPKI is broken, broken, _broken_.

[1] In the more complex case, of course, they certify that the keyholder is
some external entity.

------
kijin
I was recently burned by this. The client had left his server misconfigured
for a few hours, and a lot of his static content ended up in a redirect loop.
I was called in to fix the mess.

Although modern browsers are clever enough to detect a redirect loop and throw
an error, they're not clever enough to detect when the redirect loop is caused
by a cached 301 response. So they cache the redirect loop as well. Throw in
another layer of caching (CloudFlare), and now you've got a bunch of URLs that
will be stuck in a redirect loop for a very long time.

The only solution was to append some garbage to every URL, like "?cache=no".
Fortunately, the problem only occurred with static content, so nginx happily
discarded the querystring and returned fresh content.
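
A minimal sketch of that workaround, assuming the server ignores unknown query parameters the way the static content above did. The function name and the `static.example` host are made up for illustration; the `cache=no` pair is the one from the comment.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, key="cache", value="no"):
    """Return url with key=value appended to its query string.

    The changed URL is a different cache key, so a cached 301 for the
    original URL does not apply to it.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_cache_buster("https://static.example/app.css"))
```

This only helps where you control the pages that emit the links; anyone with the old URL bookmarked still hits the cached redirect.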

------
teddyh
> _You can improve a bit on this by sending along a bunch of cache control
> headers to at least limit the damage._

It would have been useful to _include those headers_ in the blog post.

~~~
Piskvorrr
Something like this says "keep this cached for 100 days":

Last-Modified: Fri, 19 Feb 2016 11:54:49 GMT

Expires: Sun, 29 May 2016 10:54:49 GMT

Cache-Control: max-age=8640000, must-revalidate

See also:
[https://www.mnot.net/cache_docs/#CONTROL](https://www.mnot.net/cache_docs/#CONTROL)
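
For anyone who wants to try this locally, here is a rough sketch using only the Python standard library. It is not taken from the blog post; the target URL, the helper names, and the port are assumptions picked for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical values for illustration; neither appears in the thread.
TARGET = "https://new.example/"
MAX_AGE_SECONDS = 100 * 24 * 60 * 60  # 100 days = 8640000 seconds


def redirect_headers(target, max_age):
    """Return the headers for a 301 whose cache lifetime is bounded."""
    return [
        ("Location", target),
        ("Cache-Control", f"max-age={max_age}, must-revalidate"),
    ]


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET with a 301 that carries explicit cache headers."""

    def do_GET(self):
        self.send_response(301)
        for name, value in redirect_headers(TARGET, MAX_AGE_SECONDS):
            self.send_header(name, value)
        self.end_headers()


def make_server(port=8080):
    """Build (but do not start) a local server; the port is arbitrary."""
    return HTTPServer(("127.0.0.1", port), RedirectHandler)
```

Calling `make_server().serve_forever()` starts it. A much smaller `max_age` while testing keeps a mistake from sticking around in your own browser.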

------
spoiler
This can be fixed with a 302 from the destination back to the start/source
of the 301, assuming that the old domain isn't redirecting any more (which
would cause a redirect loop)

------
tobltobs
Uh, I just realized that there are a lot of bullets in my foot.

~~~
borkabrak
I'll take two of this on a t-shirt, please.

------
ragmaanir
I have a draft blogpost (somehow octopress did not respect my draft: true
setting) that describes the same problem:
[http://ragmaanir.mypresident.de/blog/2015/08/03/ruby-web-
dev...](http://ragmaanir.mypresident.de/blog/2015/08/03/ruby-web-development-
tips/)

Also, 301 might poison proxy-caches, so even if you clear the cache in your
browser it might still not work.

------
honksillet
So if you were to momentarily hack a big site like twitter.com and serve out a
301 to, say, pornhub, you would permanently brick twitter?

~~~
dyladan
Likely a site with the influence of twitter would be able to get this taken
care of. (directions at another domain on how to clear your cache or even a
direct update from browser vendors possibly)

------
amichal
This is a browser design issue, not a protocol/server issue. At the time HTTP
1.1 was written, browser caches would hold at most a few weeks of browsing
history before things started falling out, so 'permanent' functioned the way
our intuition expects.

What's more surprising is "13.2.2 Heuristic Expiration"[1]

If you specify (or your framework does) a Last-Modified time WITHOUT
Cache-Control, the browser is free to make up its OWN cache expiration rules
(the item is implicitly cacheable).

[1]
[https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html](https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html)
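
To make the heuristic concrete: RFC 2616 (section 13.2.4) suggests a heuristic lifetime of no more than some fraction of the time since Last-Modified, and mentions 10% as a typical setting. The sketch below just does that arithmetic. The function name is made up, and what any given browser really does is an assumption, not something this code knows.

```python
from datetime import datetime, timedelta, timezone


def heuristic_freshness(date, last_modified, fraction=0.10):
    """Estimate a freshness lifetime as a fraction of the time that had
    passed since Last-Modified when the response was generated."""
    return (date - last_modified) * fraction


# Example dates are arbitrary: a response generated 10 days after the
# resource was last modified.
served = datetime(2016, 2, 19, tzinfo=timezone.utc)
modified = served - timedelta(days=10)
print(heuristic_freshness(served, modified))
```

The older the Last-Modified date, the longer a cache may decide to hold the response, which is why leaving out Cache-Control gives unpredictable results.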

~~~
bcoates
Heuristic Expiration is a case of documenting existing weirdness. Older
browsers would have cache expiration policy be a setting the user could
control, so you can't really leave off explicit headers with any expectation a
default will be respected.

------
miseg
It's easy to live your life to keep Google happy. 301 redirects have been an
important part of online website life.

But then I do accept this perspective (if it's within your call to take this
risk). Just don't 301-redirect it. Let the search engines figure it out for
themselves.

If a user has a bookmark to an old resource, then it's a liability for you to
try to keep your web of 301s working.

KISS!

~~~
rileymat2
I would say links are a bigger problem than bookmarks. For instance, Stack
Overflow answers that link to Apple or MSDN references are really annoying
when those links die.

------
perlgeek
When working with non-trivial redirects, test them with something like "wget
-S $url" instead of your browser. The redirect caching makes it very painful
to repeatedly test the same redirect.

Also, start with a 302, and only change the status to 301 once you're
confident they are correct.
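
If you would rather script the check than read wget output, something along these lines should work. It is a sketch using Python's `http.client`, which does not follow redirects or keep a cache; the function names are made up for illustration.

```python
import http.client
from urllib.parse import urlsplit


def request_target(url):
    """Return the path (plus query string, if any) to put on the request line."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return path


def inspect_redirect(url, timeout=10):
    """Fetch url once, without following redirects, and report the
    status code and the headers that matter for redirect caching."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", request_target(url))
        resp = conn.getresponse()
        return {
            "status": resp.status,
            "location": resp.getheader("Location"),
            "cache_control": resp.getheader("Cache-Control"),
        }
    finally:
        conn.close()
```

Calling `inspect_redirect("http://old.example/page")` against your own host returns the status and headers on every run, so a 302 that later becomes a 301 shows up immediately.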

------
dk8996
Is there any company that provides 301 or 302 as a service? -- something
cheap. I know it's not hard to set-up a small box on AWS and install Node.js
(or whatever). But I would pay a few dollars a month for some service to run
that for me.

~~~
kdeenanauth
Wouldn't a URL shortener service be good enough? (E.g.
[https://goo.gl/](https://goo.gl/) )

~~~
kdeenanauth
Oh I re-read the article. You mean in the case of old domains.

~~~
dk8996
Yes, in the case where you have old domains and you want Google page rank
juice (via 301).

------
newscracker
If the owner of the domain had done it, would it still really be an "eternal
issue" since:

a. most popular websites nowadays are so bloated that browser caches would
throw out many things (including your tiny site that someone checks a few
times a week or even longer) a lot sooner compared to how things were about 10
years ago (I presume the default disk cache sizes in browsers have not
increased by multiples in this period).

b. more people are browsing through mobile devices that are dumped in a few
years and replaced with a new one, new browser, empty cache, etc.

------
TazeTSchnitzel
HSTS is similarly one-way, but it's not indefinite, I think.

~~~
pluma
It's not indefinite because you need to specify a duration. However nothing
stops you from setting an extremely long duration and in fact most tutorials
seem to advise doing so for safety reasons.

~~~
TazeTSchnitzel
Yeah, a short duration is practically useless. I've seen one site that sets
the duration to a mere 24 hours.

------
ghostek
What part of "permanent" did the author miss? Seriously, specs should be read
literally, if interpretation is required then it's not a perfect spec.

------
mwcampbell
Now I don't feel so bad for being lazy and just using a 302 everywhere. I
never even bothered to learn how to configure nginx to send a 301.

------
colanderman
I thought best practice these days, at least for REST, was a 308
([https://tools.ietf.org/html/rfc7238](https://tools.ietf.org/html/rfc7238)),
since 301 has the bizarre behavior of being converted to a GET by some UAs
(which conflate it with 303), enshrined for hysterical raisins. Is this not
the case?
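
For reference, a simplified model of the method handling being described. It is a sketch of commonly observed user-agent behaviour, not a quote from any spec, and the function name is made up.

```python
def method_after_redirect(status, method):
    """Return the HTTP method a typical user agent uses for the follow-up
    request after a redirect with the given status code."""
    if status in (307, 308):
        # These codes require the method (and body) to be preserved.
        return method
    if status == 303:
        # "See Other": the follow-up is a retrieval request.
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        # Historical behaviour: many user agents turn a POST into a GET.
        return "GET" if method == "POST" else method
    raise ValueError(f"not a redirect status handled here: {status}")


print(method_after_redirect(301, "POST"))  # GET
print(method_after_redirect(308, "POST"))  # POST
```

Note that this only covers the method; a 308 is cacheable by default just like a 301, so the caching concerns in the article apply to it as well.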

------
buro9
Is there a matrix showing which browsers are aggressively caching the 301s?

It would be good to get an indication of the potential impact.

~~~
_Codemonkeyism
From what I've read, looks like Firefox and Chrome cache forever, IE not. But
I might be wrong.

------
itsjustjoe
I hit this recently with my personal website and it sucks. Luckily no one
visits my website.

------
jstimpfle
Why not just redirect back from the new to the old URL? A sane client should
then check if the old 301 is still there (not sure if by the spec, but it is
common sense). I believe I once tried this and it worked with Firefox. Not
sure, though.

~~~
leni536
If you are lucky enough to have control over the new URL.

edit: > A sane client should then check if the old 301 is still there Check
out kijin's comment about this.

------
chias
I saw this article at approximately the same time as I saw this tweet:
[https://twitter.com/P0TUSTrump/status/684891719985410048](https://twitter.com/P0TUSTrump/status/684891719985410048)
.

--- begin factually incorrect statement ---

    To save you a click: Jeb Bush forgot to renew his domain,
    and Trump bought it and redirected to
    www.donaldjtrump.com

    Thankfully, Trump is using a 302, not a 301.

--- end factually incorrect statement ---

EDIT: ah I see -- jebbush.com has only ever had that redirect throughout
its entire (short) existence. Leaving my comment up though, since it's more
about what one could do in that scenario than any particular current event.

~~~
mediascreen
Jeb Bush or his organisation never owned it and did not forget to renew it.

------
Namrog84
ELI5: what happens if you 301 redirect domain a to b and b to a?
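
A toy model of that situation, for illustration only. Real clients follow redirects over the network and give up with an error after some number of hops; here the redirects are just a dict, and the function name and `max_hops` value are assumptions.

```python
def follow_redirects(start, redirects, max_hops=20):
    """Follow a mapping of URL -> redirect target and report how the
    chain ends: "ok", "loop", or "too many hops"."""
    seen = [start]
    current = start
    while current in redirects:
        current = redirects[current]
        if current in seen:
            # We are back at a URL we already visited: a redirect loop.
            return "loop", seen + [current]
        seen.append(current)
        if len(seen) > max_hops:
            return "too many hops", seen
    return "ok", seen


print(follow_redirects("a", {"a": "b", "b": "a"}))
# ('loop', ['a', 'b', 'a'])
```

With cached 301s, both hops can be answered from the cache, so the loop can persist even after the servers are fixed, which is the case kijin describes above.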

------
ikeboy
Should say (2012).

------
mignev
Thanks!

