
HTTP Immutable Responses - okket
https://tools.ietf.org/html/rfc8246
======
edmorley
For more on the rationale behind this feature, see:

[https://www.ietf.org/mail-archive/web/httpbisa/current/msg25463.html](https://www.ietf.org/mail-archive/web/httpbisa/current/msg25463.html)

[https://bitsup.blogspot.co.uk/2016/05/cache-control-immutable.html](https://bitsup.blogspot.co.uk/2016/05/cache-control-immutable.html)

Rough summary:

> At Facebook, ... we've noticed that despite our nearly infinite expiration
> dates we see 10-20% of requests (depending on browser) for static resources
> being conditional revalidations. We believe this happens because UAs perform
> revalidation of requests if a user refreshes the page.

> A user who refreshes their Facebook page isn't looking for new versions of
> our _javascript_. Really they want updated content from our site. However
> UAs refresh all subresources of a page when the user refreshes a web page.
> This is designed to serve cases such as a weather site that says <img src=""
> ...

> Without an additional header, web sites are unable to control UA's behavior
> when the user uses the refresh button. UA's are rightfully hesitant in any
> solution that alters the long standing semantics of the refresh button (for
> example, not refreshing subresources).
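
For reference, the new directive rides on the existing Cache-Control header.
A response for a fingerprinted static asset might look like this (illustrative
values):

    HTTP/1.1 200 OK
    Cache-Control: max-age=31536000, immutable
    Content-Type: application/javascript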

~~~
fulafel
What's the hurry to optimize away the revalidation requests when the user
clicks reload? Is it just beancounter mindset about saving a few "304 Not
modified" responses? In that case they shouldn't count the percentage of
requests, but percentage of bandwidth or CPU seconds. Tiny responses are much
cheaper with HTTP/2, so be sure to benchmark with that.

~~~
Denvercoder9
At Facebook scale, the sum of all those "304 Not Modified" responses is
probably a significant amount of resources.

~~~
fulafel
I'm not sure it's a good argument to take up the biggest companies and then
tally up the effects of a micro-improvement. You could argue for all kinds of
complexity-increasing changes resulting in 0.01% efficiency improvements this
way.

~~~
marcusarmstrong
At my company, 304s account for 3% of our CDN requests.

~~~
fulafel
304 responses are so tiny that you probably end up on the order of 0.01% of
bandwidth.

------
DanWaterworth
I think a better way to handle this would be to use subresource integrity [1].
Then, if the browser's cached version matches the hash, it can be sure that it
doesn't need to make any requests.

[1] [https://developer.mozilla.org/en-US/docs/Web/Security/](https://developer.mozilla.org/en-US/docs/Web/Security/)
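
As a sketch of how an integrity value is produced (using only the standard
library; the attribute format is `sha384-` followed by the base64 digest):

```python
import base64
import hashlib

def sri_hash(content: bytes) -> str:
    """Compute a subresource-integrity value for a script/style body.

    The browser recomputes this hash over the fetched bytes and refuses
    to execute the resource if it doesn't match the integrity attribute.
    """
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")

# The resulting value goes into the HTML, e.g.:
#   <script src="app.js" integrity="sha384-..." crossorigin="anonymous"></script>
```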

------
manigandham
It would be far better to have user agents handle caching headers correctly
instead of creating another configuration option (which will likely suffer
from the same implementation problems).

 _cache-control: private_ with either a sliding or a concrete expiration time
already handles this.

~~~
prolurker
cache-control: private doesn't seem to imply that a resource won't ever
change, so on page refresh browsers have to check whether the resource has
been updated; immutable would avoid the cascade of 304 responses.

~~~
manigandham
That's the entire point of the expiration time. Use a 2 year range and it's
effectively immutable. No content will stay on device forever anyway and
headers can easily be set to a smaller time-frame or _must-revalidate_ if the
content owner wants it.

Browsers mistakenly continue checking for new copies when they shouldn't
within the expiration time. Fixing poor implementations with more standards
never works well.

~~~
prolurker
The problem is that servers are allowed to update their resources at any time,
without waiting for any specific expiration time. So when a user instructs
their browser to refresh the page, usually expecting to get the most up-to-date
version, the browser has to choose between serving the still-valid, but maybe
not completely updated, cached version or actually checking whether the
resource has been updated.

Immutable makes it clear that the server won't update the resource in place
and will handle updates by generating a new one so the browser can happily
avoid checking those resources on page refresh.

~~~
taeric
And many people are almost certainly going to find that they actually need to
either recall an old immutable thing, or mutate it.

Also, I will certainly want to clear out my browser's cache on a regular
basis. I do not want it keeping immutable things just because they shouldn't
ever change.

~~~
Dylan16807
You can't 'recall' something you already sent out to browsers, and if you need
to mutate then it's easy to make a new URL.

This header won't make browsers cache data any differently. It skips a step
when the cache is being read from.

~~~
taeric
But in the current world, you can serve new content on the conditional check
that caches currently do.

That said, I am ultimately for this. I think. There is plenty of data showing
that this is low-hanging fruit.

~~~
Dylan16807
The conditional check that they do _sometimes_. Now half your users see the
new version and half see the old version. Not much of a recall.

~~~
taeric
Still more of a recall than will be possible in the new world. And you can
always detect the old code and prompt users to refresh. (Typically happens on
a restart.)

Again, though, I am ultimately for this. I just remain skeptical of any
panacea.

------
meandmycode
What does immutable really mean when you only rent a domain name? I think
about this occasionally with domains and email addresses: they've become a
trusted piece of information, but over time the ownership of that thing
changes. It does make me wonder about fraud in the future, when sizeable
companies die off and their domains free up.

~~~
icebraining
In this case, if a company dies and a malicious actor gets the domain, there's
not much they can do besides tell the browser to load those assets - but they
could probably just take a copy of the original site and serve a copy of those
assets themselves.

The attack might work the other way around: the attacker buys a bunch of
domain names, serves "sleeper" malicious JS files with this on common paths
(say, the paths used by Wordpress and other common CMSs), then releases the
domain. When the new owner installs a CMS and starts serving their site, the
browser loads the malicious JS instead, which is now running under the new
site's Origin (security context).

~~~
joosters
But to make this attack work, browsers would have to visit this site before
the new owner takes it over, in order to receive and cache the malicious JS.
And if you can make people receive malicious JS, you've already got your
attack vector - immutable caching isn't needed.

~~~
icebraining
_if you can make people receive malicious JS, you've already got your attack
vector_

No, because a malicious JS file by itself can't do much. The attack vector is
the malicious JS running on the new site, with permissions to steal session
cookies and interact with the application. That's why caching without
verification is important: to make sure the browser uses the cached malicious
JS instead of fetching the new one.

------
lgierth
Nice to see this has made it through the standards process -- the `immutable`
keyword is tremendously useful for systems that store and provide actual
immutable data, e.g. content-addressed distributed systems.

------
carussell
It would be nice to have strong resource pinning. I've been in contract
situations where coordinating with in-house IT for a new server deployment
would have been a massive, go-nowhere headache, while throwing a webapp
together on my own and self-hosting would've been easy but a big problem wrt
company policy on data export. Resource pinning solves this by bringing us
into the realm of "auditable". Strong resource pinning would look something
like this RFC, except the browser would refuse to accept new resource
deployments without deliberate consent of the client. (Something that can be
bypassed with a hard refresh, as in this RFC, is not strong enough.)

Other situations I imagine would benefit from this are web crypto and HIPAA
compliance.

------
jlgaddis
I have a feeling this will end up like HSTS. It sounds really great at first,
then a bunch of folks will get burned by it (just wait until somebody
accidentally sets it for /index.html or whatever), and finally the general
recommendation will be to stop using it altogether ("more harm than good").

Forever is such a long time.

Besides, aren't there already ways to say "cache this resource for <acceptable
timeframe>"?

------
glacials
This is also known as key-based cache expiration, detailed by DHH here:
[https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works](https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works)

You never worry about when to expire your cache entries if the key changes
every time the item does. It's nice to finally see cache-busting coming out of
the woods.
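
A minimal sketch of the idea in Python (the helper name is hypothetical): the
file name embeds a digest of the content, so a changed file automatically gets
a fresh cache key and the old cached copy is simply never referenced again.

```python
import hashlib
from pathlib import Path

def fingerprinted_name(path: Path) -> str:
    """Return a cache-busting file name like 'app.3f2a9c1b.js'.

    Because the digest changes whenever the content does, no expiration
    bookkeeping is needed: new content means a new URL.
    """
    digest = hashlib.sha256(path.read_bytes()).hexdigest()[:8]
    return f"{path.stem}.{digest}{path.suffix}"
```

A build step renames assets with this function and rewrites references; the
renamed files can then be served with a far-future (or immutable) cache policy.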

~~~
jrochkind1
You can already use that kind of cache expiration with HTTP of course, and
many of us do -- but you can't _tell the user-agent_ or other client that you
are using it. The best you can do is set a far-future expires date. Some
agents/clients will still do HEAD/If-Modified-Since requests to check if it
_really_ changed before your expire date.

So this is a new thing, properly called immutable responses, to tell the
client that they really can treat this (in various ways) as a completely
immutable response.

------
cpburns2009
I like the idea of this. This would be very useful for versioned resources
(e.g., images, JS, and CSS files) because it would eliminate unnecessary
requests for them after they're cached.

The example of a 1 year cache time seems a little extreme, though. I think a
month would be better.

~~~
fooey
A 1 year cache time doesn't force the browser to hang on to it for a year, it
just lets the browser know that if it's seen that resource in the last year it
shouldn't bother getting it again. The browser is still free to dump whatever
it wants to reduce its footprint.

~~~
jrochkind1
Of course the browser/agent still can with a response tagged immutable too.
There will never be a standard that _forces_ agents to hold on to cached
content regardless of their disk space availability and needs.

But the immutable keyword gives an additional clue to the agent about
semantics, to inform caching.

------
mjs
Chrome is not going to get this:

[https://bugs.chromium.org/p/chromium/issues/detail?id=611416...](https://bugs.chromium.org/p/chromium/issues/detail?id=611416#c46)

Instead, when the user attempts a full-page reload, Chrome will revalidate the
resource in the URL bar, but not subresources (they will come from the browser
cache, if the cache-control header checks out):

[https://blog.chromium.org/2017/01/reload-reloaded-faster-
and...](https://blog.chromium.org/2017/01/reload-reloaded-faster-and-leaner-
page_26.html)

~~~
jbverschoor
Yeah I thought that was the current behaviour, and shift-cmd-r actually
invalidates all

------
IncRnd
Section 3. Security Considerations

"Clients SHOULD ignore the immutable extension from resources that are not
part of an authenticated context such as HTTPS. Authenticated resources are
less vulnerable to cache poisoning."

This must NOT read SHOULD. It must read MUST! Otherwise, your computer will be
subject to an executable planting vulnerability.

I'm surprised they don't have a much larger list of security considerations.
There are many other issues that can happen.

~~~
jlgaddis
If "an authenticated context such as HTTPS" isn't being used -- regardless of
the presence of this extension -- isn't your computer _already_ "subject to an
executable planting vulnerability"?

~~~
IncRnd
Maybe. That isn't exactly true. This adds the ability to persist a new threat.

There is a temporal difference. An attacker may wish to plant something today,
in a coffee shop, that would execute in a protected environment, tomorrow.
Immutability of caching can only help an attacker.

Yes, there are other ways for an attacker to do this, but there is no reason
to add more ways! That's why the web is in its current state.

I see how you put "executable planting vulnerability" in quotes. Sometimes (in
the current marketing) these are called APTs, but they have been around
forever. Think of DLL planting in Windows, for example, and the millions of
attacks, and the three or four new API sets from Microsoft, that resulted from
that single ability to plant a DLL in the search path.

This type of persistence can also be called incubation.

------
masterleep
It's sad that there's still no good way to do deployment-based expiration of
assets without horrible hacks like sticking the asset checksum in the URL. I
know that none of my assets will ever change unless a deployment occurs, and
even then, most of them won't change. HTTP doesn't seem to support this use
case well at all.

~~~
toomim
What do you mean by "deployment-based"? You want things to expire each time
you "git pull" on the server?

~~~
jeremiep
Wait you're actually running git on production boxes? Doesn't that mean your
entire build toolchain also lives on production?

The last company I worked for did that, and everything was much slower and
more fragile than it would've been had we deployed packaged artifacts instead.

~~~
jrochkind1
Eh, heroku seems to handle it fine.

~~~
krallja
No, Heroku has a separate build step before it deploys to your dynos.

------
Lxr
What happens when wrongly configured servers, or frameworks with default
settings, implement this too aggressively? Browser vendors want their product
to work as expected when the user hits refresh, so they will be forced to
either violate the standard or show stale content.

~~~
jlgaddis
Per _foota_ [0]:

      Clients SHOULD NOT issue a conditional request during the response's
      freshness lifetime (e.g., upon a reload) unless explicitly overridden
      by the user (e.g., a force reload).

The server is still going to serve up the resource when requested. This
behavior is for the client-side of things (browser).

[0]:
[https://news.ycombinator.com/item?id=15262108](https://news.ycombinator.com/item?id=15262108)
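
The quoted requirement can be sketched as a small client-side decision
function (a deliberate simplification; real browser cache logic has many more
inputs, and the parameter names here are illustrative):

```python
def should_revalidate(cache_control: str, *, fresh: bool,
                      reload: bool, force_reload: bool) -> bool:
    """Decide whether a cached response needs a conditional request.

    cache_control: the stored response's Cache-Control header value
    fresh:         still within the response's freshness lifetime
    reload:        the user pressed the ordinary reload button
    force_reload:  the user explicitly bypassed the cache (e.g. shift-reload)
    """
    directives = {d.strip().lower() for d in cache_control.split(",")}
    if not fresh:
        return True    # stale responses must always be revalidated
    if force_reload:
        return True    # the user override the RFC explicitly preserves
    if reload and "immutable" not in directives:
        return True    # traditional reload semantics: revalidate subresources
    return False       # fresh and immutable: skip the 304 round-trip
```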

