New HTTP standards for caching on the modern web (httptoolkit.tech)
185 points by pimterry on Oct 21, 2021 | 50 comments



I appreciate that the Cache-Status: header they describe uses RFC 8941 structured fields and thus ";" to separate items within each cache and "," between caches. It's like someone put effort into making it easy to parse.
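For illustration (cache names and values made up), a response that passed through an origin cache and a CDN might carry something like:

    Cache-Status: OriginCache; hit; ttl=1100, "ExampleCDN"; fwd=uri-miss; stored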

Rant time: I just finished writing a state machine parser for "WWW-Authenticate:" and "Proxy-Authenticate:". Those headers use comma both to separate challenges and to separate parameters within a challenge, which just seems mean-spirited. Other things about HTTP authentication that seem mean, dumb, annoying, or all of the above: both the RFC 2069 example response and the RFC 7616 SHA-512-256 example response are calculated incorrectly; RFC 7616's userhash field seems to require the server to do O(users_in_database) hashes to know what user to operate on; RFC 7235's challenge grammar describes a token68 syntax that is really only used for the credentials in Basic, never a challenge; RFC 7616 drops backwards compatibility for RFC 2069 even though I bought a product this year that still uses RFC 2069-style calculations; and it's based on old standards that followed "be conservative in what you do, be liberal in what you accept from others" so RFC 7230 section 7 has separate grammars for what lists you must send and what lists you must accept, which further complicates parsing the nested lists.
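For a concrete (made-up) example of the comma problem:

    WWW-Authenticate: Digest realm="example.org", qop="auth, auth-int", nonce="abc123", Basic realm="example.org"

The only clue that Basic starts a second challenge rather than yet another Digest parameter is that it's a bare token followed by whitespace instead of a token=value pair; and the quoted qop value sneaks a comma-separated list in at a third level.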


Digest should just die. It’s a terrible standard that makes it essentially impossible to store passwords securely on the server.


100% agree. It's like when you call into phone support and the person on the other end obviously can see your plaintext password.

I'm working with IP security cameras, and the ONVIF standards [1] actually mandate Digest. I own cameras that only support HTTP Digest Authentication for RTSP. I'd tell the ONVIF folks this is dumb, but I don't want to pay $10,000/year for that privilege. [2] The only saving grace is that they aren't "real user" passwords, and you can isolate the cameras to a network segment where only the NVR can talk to them and vice versa. (And the NVR needs to keep the plaintext password anyway.)

[1] https://www.onvif.org/profiles/specifications/

[2] https://www.onvif.org/join-us/membership-levels/


Digest authentication made sense back when HTTPS was difficult and expensive to set up. Now that Let's Encrypt (and others!) have made HTTPS hosting more accessible, it's mostly pointless.


"Digest" authentication was always bad, even back when HTTPS was difficult.

It requires the server to store passwords in plaintext, rather than as hashes.


Digest auth stores passwords on the server as plaintext; basic auth transmits passwords over the network as plaintext. Both are bad, but I feel like having the plaintext on the network is probably worse.


For sure. But "better than 'Basic'" is a much lower bar than "not bad".

I was going to say that this problem is already noted in the very first RFC, that it was already known to be broken on day one; but on a closer reading I'm not sure that's actually true.


The format that Digest uses to transmit passwords is not a lot better than plaintext. It’s a simple salted hash, which is easily brute-forced offline unless the password is strong.
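A rough sketch (all values and the candidate list are made up) of why a captured exchange is cheap to attack offline, using the RFC 2617 qop=auth formula:

    import hashlib

    def md5hex(s: str) -> str:
        return hashlib.md5(s.encode()).hexdigest()

    def digest_response(username, realm, password, method, uri, nonce, nc, cnonce, qop):
        ha1 = md5hex(f"{username}:{realm}:{password}")  # what the server stores
        ha2 = md5hex(f"{method}:{uri}")
        return md5hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")

    # Everything except the password is visible on the wire, so an eavesdropper
    # just tries candidate passwords until the computed response matches.
    observed = digest_response("mufasa", "testrealm", "hunter2", "GET", "/dir/index.html",
                               "abc123", "00000001", "0a4f113b", "auth")
    for guess in ("letmein", "password", "hunter2"):
        if digest_response("mufasa", "testrealm", guess, "GET", "/dir/index.html",
                           "abc123", "00000001", "0a4f113b", "auth") == observed:
            print("cracked:", guess)
            break

The only real cost per guess is a few MD5 calls, so anything short of a genuinely strong password falls quickly.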


HTTP/4 should send all headers in JSON format and dates in ISO format.

The old RFC822 header format looks so simple, but parsing it is the devil, especially email headers such as "to".


No. It should use an efficient and flexible binary format.


> Those headers use comma both to separate challenges and to separate parameters within a challenge, which just seems mean-spirited.

Indeed. And while the resulting semantic meaning is unambiguous, because both lists allow empty elements (well, you shouldn't emit them but as you note, section 7 says you still need to be able to parse them), it can be ambiguous for the parser whether an empty element belongs to the inner list or the outer list. So you'll likely run into trouble if trying to use a parser-generator or generic algorithms, even though the ambiguity is totally inconsequential.

In "Scheme1 realm=foo, , Scheme2 realm=foo", is the empty element an empty param for the Scheme1 challenge, or an empty challenge between the Scheme1 and Scheme2 challenges? Answer: It doesn't matter, but good luck telling your tooling that.

> Other things about HTTP authentication that seem mean, dumb, annoying, or all of the above: both the RFC 2069 example response and the RFC 7616 SHA-512-256 example response are calculated incorrectly;

RFC 2069 (rev 1): Eh, the mistake is noted in the errata.

RFC 2617 (rev 2): Gets it right.

RFC 7616 (rev 3): Indeed. This has been reported as errata (way back in 2016), but is not "verified". The report claims that the incorrect values are obtained by naively truncating SHA-512 to 256 bits, rather than making the other changes (using a different H⁽⁰⁾ value) for it to be true SHA-512/256; but I haven't verified that myself.

> RFC 7616's userhash field seems to require the server to do O(users_in_database) hashes to know what user to operate on;

I'd assume that you'd just store the H(concat(username, ":", realm)) as a column in the database, and select on that. I guess you're still doing O(users_in_database) hashes, but you'd do a hash when you create each user, not re-doing all of them for each request.
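A minimal sketch of that lookup (table and column names invented, SHA-256 assumed as the digest algorithm):

    import hashlib, sqlite3

    def h(s: str) -> str:
        return hashlib.sha256(s.encode()).hexdigest()

    realm = "api@example.org"
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (username TEXT, userhash TEXT, ha1 TEXT)")
    db.execute("CREATE INDEX idx_userhash ON users (userhash)")

    # Compute H(username ":" realm) once, at user-creation time.
    db.execute("INSERT INTO users VALUES (?, ?, ?)",
               ("alice", h(f"alice:{realm}"), "<precomputed H(user:realm:pass)>"))

    # With userhash=true the client's username field is already H(username ":" realm),
    # so each request is a plain indexed lookup, not a scan of the whole user table.
    incoming = h(f"alice:{realm}")
    print(db.execute("SELECT username, ha1 FROM users WHERE userhash = ?",
                     (incoming,)).fetchone())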

> RFC 7235's challenge grammar describes a token68 syntax that is really only used for the credentials in Basic, never a challenge;

Yeah. That's weird.

> RFC 7616 drops backwards compatibility for RFC 2069 even though I bought a product this year that still uses RFC 2069-style calculations;

You don't mention that RFC 2617 set up a migration path, which RFC 7616 finally "turned off".

1997 (RFC 2069): Here's a formula.

1999 (RFC 2617): Here's a new more secure formula, servers "SHOULD" set the "qop" parameter to opt-in to it, and clients "SHOULD" respond in kind. But if qop is unset, then fall back to the old formula.

2015 (RFC 7616): The old formula is gone, you "MUST" set the qop parameter to select the new formula.

That was a 16 year migration period where both were supported, during which everyone "SHOULD" have switched to the new way.

> and it's based on old standards that followed "be conservative in what you do, be liberal in what you accept from others" so RFC 7230 section 7 has separate grammars for what lists you must send and what lists you must accept, which further complicates parsing the nested lists.

You make it sound like they totally changed it; the new list syntax is a subset of the old syntax, not something different.

Eh, there are already lots of places in the spec where it says "don't send X... but if you receive X, you should accept+handle it anyway". This is just one more.


> In "Scheme1 realm=foo, , Scheme2 realm=foo", is the empty element an empty param for the Scheme1 challenge, or an empty challenge between the Scheme1 and Scheme2 challenges? Answer: It doesn't matter, but good luck telling your tooling that.

Yeah, I also wrote a parser generator grammar as a reference and tripped over this. I'm not real experienced with them, so it took me a bit to find the simple solution: the inner one can't consume the final comma (or the outer one won't parse due to the lack of separator, and the parser won't know to backtrack/where to), so for the inner one I just removed the portion of the rule that consumes any trailing commas. A divergence from their ABNF but oh well.

> > RFC 7616's userhash field seems to require the server to do O(users_in_database) hashes to know what user to operate on;

> I'd assume that you'd just store the H(concat(username, ":", realm)) as a column in the database, and select on that. I guess you're still doing O(users_in_database) hashes, but you'd do a hash when you create each user, not re-doing all of them for each request.

Oh, good point. I was thinking the nonces were used in there, but they aren't.

Still, it seems like such a stupid idea. The server has to maintain that extra column (per algorithm, if they offer multiple), and it's just to protect the privacy of the username when the request and response are otherwise in plaintext. Who on earth thinks the username is more private than the actual request and response bodies (which, as often as not, simply contain it)? They should all be over TLS.

> That was a 16 year migration period where both were supported, during which everyone "SHOULD" have switched to the new way.

Right. They didn't, though.

> > and it's based on old standards that followed "be conservative in what you do, be liberal in what you accept from others" so RFC 7230 section 7 has separate grammars for what lists you must send and what lists you must accept, which further complicates parsing the nested lists.

> You make it sound like they totally changed it; the new list syntax is a subset of the old syntax, not something different.

True, but if not for the more permissive receive format there wouldn't be the complications above.


> In "Scheme1 realm=foo, , Scheme2 realm=foo", is the empty element an empty param for the Scheme1 challenge, or an empty challenge between the Scheme1 and Scheme2 challenges? Answer: It doesn't matter, but good luck telling your tooling that.

If it doesn't matter, it seems like the tooling would be happy if you told it "it's an empty challenge". Or "it's an empty parameter". What's stopping you?

In "1 + 2 + 3", is the parser supposed to generate (+ 1 (+ 2 3)) or (+ (+ 1 2) 3)? It doesn't matter, so you pick whichever one you want, and tell the parser to do it that way. This is a bog-standard problem to need to handle in a parser, and your tooling will definitely provide a way to handle it.


One challenge I've experienced recently is I can't figure out how to hint to the browser that it should refresh a particular cached page. (Without appending ?time=1634851491 to the URL.)

For example, let's say I've already cached the page /new.html

Now, I click a button which triggers a change to the page, and I am redirected back to it.

Even though the page has changed, and the browser should see a new timestamp in the header if pinging the server, it just doesn't seem to happen.

Has anyone dealt with this before? I tried to ask on StackOverflow, but lately my questions don't seem to get any attention, and I've run out of reputation to spend on bounties.


This is what ETags are for. Upon a user's first visit the server should return an ETag uniquely representing the current version of the page. The browser will cache both the page and the tag. Upon subsequent page visits the browser will send an If-None-Match header containing the tag for the version of the page it has cached. The server should compare the incoming tag with the tag for the current version and return a "304 Not Modified" response if the tags match or a full response with the newer tag in the ETag header if they don't.
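A minimal sketch of that flow (standard library only; content, port, and tag scheme made up — a real app would derive the tag from whatever actually versions the page):

    import hashlib
    from http.server import BaseHTTPRequestHandler, HTTPServer

    PAGE = b"<html><body>current version of /new.html</body></html>"

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            etag = '"' + hashlib.sha256(PAGE).hexdigest()[:16] + '"'  # tag derived from content
            if self.headers.get("If-None-Match") == etag:
                self.send_response(304)            # client's copy is still current
                self.send_header("ETag", etag)
                self.end_headers()
                return
            self.send_response(200)
            self.send_header("ETag", etag)
            self.send_header("Cache-Control", "no-cache")  # cache it, but revalidate each time
            self.send_header("Content-Length", str(len(PAGE)))
            self.end_headers()
            self.wfile.write(PAGE)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8080), Handler).serve_forever()

Pairing the ETag with Cache-Control: no-cache is what makes the browser actually revalidate on every use instead of relying on a heuristic freshness lifetime.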


A drawback of relying on ETag is that if a page is visited frequently, the "If-None-Match" validation request is still sent every time and costs bandwidth, latency, computation, etc. I also suspect that if the connection is broken or the server responds 503/504, the cached page isn't shown either. My understanding is that he wants to refresh the page only if it's known to have changed, and always use the cached version otherwise.


Yeah, and it works the same way with If-Modified-Since and Last-Modified.


It's a combination of different headers that's hard to sum up in a short comment. A good article on the subject should talk about all these headers: Expires, Cache-Control, ETag, Pragma, Vary, Last-Modified.

KeyCDN has an article on it. They certainly would have experience and expertise there. I didn't read the whole thing, but it seems to have it covered: https://www.keycdn.com/blog/http-cache-headers

There are also some interesting exceptions where the rules aren't followed, like how browsers typically have a completely separate cache for favicons. I suppose because they use the icons in funny/different ways, like bookmarks.

There are also sometimes proxies (especially corporate MITM ones) that don't follow the rules. Hence the popularity of cache-busting parameters like you described.


I get the desired effect without Expires and Pragma, primarily using Cache-Control: no-store (or something I copied from MDN).


But no-store makes the resource uncacheable by both intermediate caches and browsers.


There's no standard way for one page to invalidate another. I've seen some private patches to do it in squid, but that doesn't help because you want to do it for browsers.

Your options are probably:

a) redirect to a different URL as you've done by appending stuff to it

b) require revalidation on each request, recipes shown by other posters

c) POST to the URL you want refreshed; POST isn't cacheable. Note that you can't redirect to POST somewhere else, but you can do it with JavaScript.

d) use XHR to force a request as another poster mentioned.


E) use webworkers

Not saying it's the right option, but it is an option.


For (c), doesn't HTTP 307 work?


Apparently, yes. My webfoo is a bit dated.


There's no easy way. One way is to use `max-age=0, must-revalidate` but then your origin server should be optimized for conditional GET requests.

It's a very tricky balance between origin server load and consistency. By deciding to use HTTP cache you agree to eventual consistency and this decision comes with its upsides and downsides.

There was a proposal back in 2007 for a thing called cache channels. It defined a mechanism for an origin server to expose a feed which caches would poll at an interval. The feed would list resources that had gone stale since the last query. This mechanism, in conjunction with conditional GET requests, would've solved part of the problem of hinting to browsers that they should invalidate their local resources.


Cache-Control: max-age=0, must-revalidate

Sounds like what you want (presuming your server handles 304 logic correctly)


I do want caching to happen, however -- until something changes the page.


That directive says: cache, but ask the web server every time whether the page has changed. If the server responds 304 Not Modified, it uses the cached version.

From a performance perspective, though, for people on a good connection load time may be dominated by RTT, so a 304 might be almost as expensive as a full 200.


I have this same issue and have been working around it by appending to the URL. I'd like to believe there's a better way, but I don't know what it is. Alternatively, I could just disable caching but that would defeat the point.


As I understand the original design of HTTP, each resource may state in its response headers how long it can be cached, and the client (browser, proxy, etc.) does not have to re-request the resource before the expiry. That's the standard, so there is no standard way to hint that a resource has to be revalidated. Several tricks have obviously emerged since then, like the timestamped-URL approach you mentioned. However, I'm not sure to what extent clients have standardized on understanding that "/path?query" is somehow related to "/path", because originally the request string (path and URL parameters) was opaque to the HTTP client, so they should be cached independently. Things have obviously changed since then. The method I use is to fire an Ajax (XHR) request to the URL which has to be refreshed, with a Cache-Control header (yes, it is a request header too), then display the response content or redirect to it.


> However, I'm not sure to what extent clients have standardized on understanding that "/path?query" is somehow related to "/path", because originally the request string (path and URL parameters) was opaque to the HTTP client, so they should be cached independently. Things have obviously changed since then.

It hasn't changed. Those two URLs are still cached completely independently by the user agent. The ?time=... cache-busting trick is meant to produce a cache key that's never been used before, thus requiring a fresh request. The new request doesn't clean up the cache entries for the old URLs; it just doesn't use them. That's one reason it's better to use ETag and such to make the caches work properly, rather than fight them with this trick.

On many servers, if new.html is a static file, the same entity is produced regardless of parameters. But the user agent doesn't know this.


Yes, thanks for the clarification. My impression that /path and /path?parameter were handled in relation to each other probably comes from some proxy that added an option to do so. Good to know that user agents don't.


On Macs, in Safari, cmd-opt-R is “reload page from origin”, which I believe ignores cache


Can we stop calling things "modern"?

We're talking about the web. A constantly changing, bubbling soup of protocols, software, ideas, and practices cobbled together with duct-tape and spit. The word "modern" has no meaningful definition as it doesn't belong to any identifiable point along the timeline.


Here "modern" is not a description of the RFC itself:

modern web = today's web

https://www.etymonline.com/search?q=modern

> In history, in the broadest sense, opposed to ancient and medieval, but often in more limited use. In Shakespeare, often with a sense of "every-day, ordinary, commonplace." Meaning "not antiquated or obsolete, in harmony with present ways" is by 1808.


I didn't quite understand the fundamental need behind this. The article explains that it will help determine which cache returned a response, or help specify who can cache, but what I don't get is why I would be interested in this.

My mental model of the internet has clients and servers. Clients will talk to servers, perhaps through intermediaries, but intermediaries are also clients and servers playing appropriate role depending on the direction of data. A server doesn't need to know where a request is coming from and a client doesn't need to know who is sending it. It's a very elegant and powerful model that allows for a great deal of flexibility.

This standard appears to treat intermediaries as a first-class concern with special agency, which means clients and servers will start handling intermediaries differently. It's a complicated and restrictive architecture overall, and I'm not sure it's a good idea. Was there anything fundamentally broken with the current caching mechanisms that called for this?


Client<>Proxy hop has very different characteristics than Proxy<>Server hop, so you may want to use different caching strategies for each.

For example, cache on clients for a long time, because the network is slow and expensive there, but cache on the proxy server only for a short time, so that it updates from the server relatively often.

Or you may want the reverse: cache on clients only for a few seconds, because you can't purge clients' caches. But tell the proxy to cache forever, because you will manually purge its cache when something changes.

This was previously sort-of possible with max-age vs s-maxage, but other cache directives don't have two versions. Notably, stale-while-revalidate didn't work.

Stale-while-revalidate has different implications on each hop. On the client it means it's stale until the next page load, which may be undesirable. On a proxy that keeps getting hits all the time, it's only a slight propagation delay.
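For illustration (values made up), the targeted Cache-Control fields the article describes let you express exactly that split: the generic header applies to browsers, while something like CDN-Cache-Control overrides it for CDN caches only.

    Cache-Control: max-age=60, must-revalidate
    CDN-Cache-Control: max-age=86400, stale-while-revalidate=60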


> because you will manually purge its cache when something changes.

If you're talking to the proxy separately, it's no longer just a proxy, right? It's as good as part of your infrastructure. As long as you're able to talk to the entity directly, is there a need to allow for something like this in the standard?


Since HTTPS everywhere happened there are no truly public proxies any more, but you often still work with reverse proxies, load balancers, WAF middleboxes, and CDNs.

A standard still helps unify things across vendors/implementations. A cacheable HTTP request is still an HTTP request, even if you use some other method to purge it later.


I suppose this could be useful for 'public' proxies where you are able to invalidate the cache with an API call, but can't create arbitrary logic for the cache.


The architecture of the web (REST) has always had intermediaries as first-class citizens. I can recommend reading Fielding's dissertation!

The declarative and stateless message design is partly motivated by the need for intermediaries to process and understand the messages (e.g. a POST must not be cached).

I think the introduction explained it very well: The original design generalized too much for the needs of 2021 by putting all caches into a single bucket.


The thesis: https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

On caching, see e.g.: https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arc...

> Layered system constraints allow intermediaries--proxies, gateways, and firewalls--to be introduced at various points in the communication without changing the interfaces between components, thus allowing them to assist in communication translation or improve performance via large-scale, shared caching.

With REST (and without encryption, or with trusted SSL-stripping proxies) you can have a LAN-level cache: e.g. if one of your 10,000 students accesses Fielding's thesis, it might be locally available for the potential 9,999 next requests. Typically this was useful for caching news sites' front pages (and could in theory work for video content too).


It recognized the need for intermediaries, sure, but I don't believe it treated them as first-class citizens. The idea has always been to treat them as proxies. Proxies extend/enhance the behaviour of a certain process without changing the interfaces. This particular standard changes those interfaces by treating proxy communication differently from target communication.


Build leaner websites and don't overcomplicate things. You won't need CDNs and data-kraken like Cloudflare & Co.


A lean website (they call out haveibeenpwned) can still get billions of hits. CDNs were initially conceived to help with international / intercontinental latency, which you can't fix by having a leaner website. And there's a heap of problems to solve that you can either spend your own valuable time on, or outsource to a specialist.


Leaner websites can only reduce bandwidth strain; there's no impact on latency, right?


Really good article - I hadn't heard about either of these headers and I really appreciated the clear explanation of both.


We really need a way to tell the browser NOT to access or return the cache if the server returns a 304. There are so many situations where stale data just gets discarded because it's not actually useful, but the browser still goes through the motions of accessing the disk, storing in memory and then garbage collecting immediately. The browsers' only existing capabilities are to naively assume that we still need to process that file in some way, but the only thing accomplished is using some electricity.


Are there any standards for signed but cacheable http content? Seems like that's a tool that's missing from our toolbox (unless anyone can enlighten me.)

Seems like something as simple as that could do most of the job IPFS is setting out to do, for example.


It sounds like you're describing Signed HTTP Exchanges (SXG).

https://developers.google.com/web/updates/2018/11/signed-exc...



