Cache headers could probably be more aggressive (macarthur.me)
83 points by skilled on Sept 21, 2023 | hide | past | favorite | 51 comments



Caching is largely a means of reducing bandwidth, but even more so network latency. Almost everything else is fast and cheap, and network latency can largely be simplified to the number of round trips, similar to the N+1 problem for database queries. Reducing the number of round trips is a better proxy for performance (although global CDNs can make some round trips faster than others).

I don’t know what serving stacks kids use these days but judging from my web developer tools, it appears to be a hodgepodge of dynamically loaded crap and microservices (which also adds TLS handshakes to the mix) that of course fetch things in the least efficient ways possible. Caching (a) won’t even work for many of these cases and (b) wouldn’t help much even when it does work.

I’d love to see a post about round trips exclusively and how to reduce them in the bloated stacks of today. I hope it’s not as bad as I think.


The madness began when SPAs (single page applications) became fashionable along with simple REST APIs. This made the client experience smoother but the network waterfall could get a bit long. For a shop you might have a workflow like [click product]->[get /product]->[get /product/reviews].

Then GraphQL came along to solve the N+1 issue and let you query everything in one network call... but only over HTTP POST, so it broke caching on CDNs.

Then edge workers/lambda came to solve that issue. You can also bring your data closer to the edge with replicated databases.

Most newer stacks like Nextjs, Nuxt or SvelteKit behave like a traditional server side rendered page at first and then dynamically load content like an SPA once loaded which is the best of both worlds. They'll also be using HTTP/3 to force-feed a bunch of assets to the client before it even requests them.

Ideally you'd have your data distributed around the world with GraphQL servers close to your users and use SSR to render the initial page markup.


The data that is queried by POST through GraphQL is the same that used to have cache disabled on older web applications. You just do not want to cache that, you want to reload it every time the user asks for a reload.

This is not the problem.

The problem that the GP is pointing out is that modern frontends seem to break the data all over the place, and request each piece independently. GraphQL allows solving this problem, but in practice nobody uses it that way because it breaks the abstractions your frontend framework gives you.

Anyway, this entire discussion is on a weird tangent from the article. If you (not the parent, but you reading this) don't cache your static assets forever (with guarantees that this won't break anything), then you should really read the article.


> but only over HTTP POST so it broke caching on CDNs.

That's why, when making requests related to realtime data on the server, a POST is always needed! GET is ONLY for static content.


That is... not right.

While POST is effectively never cached, GET isn't always either. Cache headers should be set properly on GET responses, indicating desired caching behavior (which is more than 'on' or 'off', as the OP gets into), on any content, static or not.

The fundamental difference between GET and POST in http is idempotency. A non-cacheable response to an idempotent request is sometimes an (intentional, desirable) thing, which is why you can make GET responses non-cacheable if you want.

Static content isn't the only thing you might want to be cached, there are all sorts of use cases for caching (which again can be configured in many ways, it's not just on or off) for non-static content.
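To make that concrete, here's a small Python sketch (my own illustration, not from the article or this thread) of how one GET endpoint can express several distinct caching policies purely via Cache-Control:

```python
def cache_headers(kind: str) -> dict:
    """Illustrative Cache-Control values for a GET response."""
    if kind == "immutable-asset":
        # Fingerprinted file: cache publicly for a year, never revalidate.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    if kind == "personalized":
        # Per-user data over GET: browser-only cache, revalidate on every use.
        return {"Cache-Control": "private, no-cache"}
    # The true "off" switch: never store this response at all.
    return {"Cache-Control": "no-store"}
```

Note that `no-cache` still allows storage (with revalidation); only `no-store` disables caching outright, which is exactly the "more than on or off" point.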


While you’re right in theory, it doesn’t always work in practice. Some systems fail to respect caching headers.

Further, in my experience, intermediate caches are mostly useless for non-binary product data. Either you need to make a round trip or you don’t. Sure, you can cache and return “not changed”, but you still get the latency. Just returning the data often isn’t much slower.

POST avoids all of those issues by pretty much saying “give us what we need every time”.


By that logic, would you write a web site where every link was actually a POST unless it was to "static content"? That would be a disaster, no?

I guess the person I was replying to, and you, were talking about Javascript API calls rather than ordinary HTML executed by a "browser". It still seems like a wrong idea to me, but if this is what you do on actual apps and have success, I guess that's a thing.


> Some systems fail to respect caching headers.

Don't we call those bugs? http has a pretty well defined spec?


Customers don't care what's defined in the spec. They care how it works.


Here's how to reduce roundtrips:

* Support TLS 1.3 + HTTP/2 or HTTP/3.

* Reduce the number of different hosts that you connect to. The optimal number is 1.

That's it.


It’s a good start but no guarantee. If you have a sequential chain of requests, for whatever reason, the number of round trips is determined by business logic. If you have anything like say a microservice/API request preceding another, you’ll see more round trips. Given the leftpad-as-a-service trend we’re seeing today, I would expect that to be quite common.


Additionally, use early hints to push resources to the client before your server responds with the rendered page.


Wasn't that feature already removed?


You’re referring to HTTP/2 push. Early hints uses two header frames in HTTP/2 or QUIC: the first one is usually pushed by your CDN and contains a list of resources used on the page, and the second one is pushed out by your application and proxied by the CDN.


Asset domains to avoid cookies being sent on every js/css/image request?


Is the idea here that saving the data from the cookies will speedup things, because requests become smaller? Or does it have some impact on caching or something else that I am not considering right now?

I would expect that unless you have exceptionally large cookies, the saved roundtrips from another TLS handshake matter more than the data transmission for the cookies.


It's a security matter more than a performance matter, although improved performance is a nice side effect.

For assets served from a third party (a CDN), you don't want to send cookies that might include secrets (a session cookie that could allow access to a user's account for example).

You can trust that a third party won't intentionally log or make use of any sensitive information in cookies but you can't guarantee it. Best not to send it at all.


I mean - if you separate your html from your assets security-wise, that naturally means they need to be on different hosts, as you cannot really reroute requests before TLS decryption based on paths or any other indicator.

But the motivation to put stuff on a CDN would be to improve performance. If you put your HTML on your own HW and your assets on a CDN for performance reasons, you might want to check if that really pans out, because those extra roundtrips may kill all performance savings you get from the CDN.


> assets served from a third party (a CDN)

Won't that naturally be on a different domain anyway?

How much performance hit do cookies really have in 2023?


Fingerprinting with content-based hashing in the filename works, but it’s not ideal. You’ve got to make it work for basically every static asset regardless of how it is generated.

Generally speaking, I’ve found it’s better to remove this from the build logic and treat it as an infra / deployment issue. Dump all of your static assets into your object store using the Git hash of the current version as a prefix. So if you are deploying foo.js, then you might serve it as https://static.example.com/<git hash>/foo.js.

This means that you can set all of your assets as permanently cacheable and still have your cache invalidated for everything every time you deploy something new. And it doesn’t matter what you use to build your assets; your build pipeline doesn’t need to know anything about this, it can just generate assets normally.
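A minimal sketch of that URL scheme (the helper name and domain are hypothetical; the Git hash would come from your deploy pipeline):

```python
def asset_url(path: str, git_hash: str,
              base: str = "https://static.example.com") -> str:
    """Build a permanently-cacheable asset URL prefixed by the deploy's Git hash."""
    return f"{base}/{git_hash}/{path.lstrip('/')}"
```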


That means that after a deployment all my users will find themselves refetching everything on the next page load regardless of whether it has changed.


If you’re using Webpack you can emit chunks by their content hash, which means the file names don’t change even when your commit changes.

This way, a new deployment of your application doesn’t cause new assets to be served to users, and maintains the speed of your website.


> If you’re using Webpack you can emit chunks by their content hash

You’re talking about this in terms of chunks of JavaScript, but static asset caching is something you should be doing for all static assets, not just JavaScript. Whenever I’ve seen somebody use Webpack content hashing for this, they’ve done it for JavaScript and CSS and forgot about basically every other type of file. And also, now you are tied to only referencing those assets via JavaScript because you have to get the URLs from Webpack, whereas with a URL prefix you can just use relative URLs which work in other contexts as well.


You aren't tied to only referencing things from JS: you can have Webpack emit a manifest file with a JSON object mapping from the original filename to the hashed one. Then oldschool web app code can reference these fingerprinted files using the manifest.

To fingerprint files you normally wouldn't import in JS, you can 1) import them in JS anyway just for the bundling side effect or 2) use a Webpack copy plugin to copy all font files, for example, to the build directory and fingerprint them.
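For example, a manifest-based lookup might look like this (the filenames and the `/assets/` prefix are made up for illustration):

```python
import json

# Hypothetical manifest.json written out by the bundler at build time,
# mapping original filenames to fingerprinted ones.
MANIFEST = json.loads('{"app.js": "app.abc123.js", "app.css": "app.def456.css"}')

def asset(name: str) -> str:
    """Resolve an original filename to its fingerprinted URL."""
    return "/assets/" + MANIFEST[name]
```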


Yeah, for this I like to append the deployment time to the asset paths from within my application; however in the case of static websites it can be a problem.


Sounds interesting. How do you handle that? How do you select the correct/latest ref of each asset to send to the user? It feels like you would have to do the same amount of work that you would have to do with fingerprinting.


With fingerprinting, each asset has a URL you need to ultimately get from your bundler individually. With a URL prefix, your assets retain their original filenames. In some contexts you’ll need to know the prefix, but it’s the same for every asset. And in other contexts you can just use relative URLs and not think about it at all.


You would need the correct git hash, no? Instead of /rndotherhash_filename.ext you are looking for /rndgithash/filename.ext

I fail to understand how resolving that is simpler.


Suppose you have the following assets:

  - /js/foo.js
  - /css/bar.css
  - /img/baz.png
With a content-based hash, you end up with something like the following:

  - /js/foo.abc123.js
  - /css/bar.def456.css
  - /img/baz.ghi789.png
This means that for each and every asset you have, you need to keep track of their unique filename for that deployment and you need to be able to access that whenever you want to refer to the asset. In practice that means most people rely on their bundler to rewrite every place they refer to those assets, which can make it awkward if they need to refer to them from non-JavaScript contexts. Also in practice, people often seem to ignore anything other than JavaScript or CSS, meaning you get stale assets if they aren’t JavaScript or CSS.

With a Git hash prefix, you would have something like:

  - /abcdef/js/foo.js
  - /abcdef/css/bar.css
  - /abcdef/img/baz.png
This means you only have one single reference to manage, which you can keep track of without relying on your JavaScript bundler. It also means that in a lot of contexts you don’t need the reference at all, for instance in your stylesheets you can just refer to url(../img/baz.png) and there’s no rewriting necessary to get the right behaviour.

Your build process and your infrastructure caching are two separate concerns; it’s easier if they aren’t tightly coupled.


It might be easier, but it's strictly worse caching since all static assets get invalidated every deployment instead of just the assets which changed.


That would be my objection as well.


For static assets also query keys can be used (/image.webp?hash=abcd). Most web servers just ignore everything after the question mark for static files and browsers use the full URL as cache key.


I don't really like these content marketing posts that subtly hide referrals.

I also don't want to jump to conclusions, maybe the author really does like PicPerf (and just so happened to include a referral tag lol), but I think most people here are very sensitive to content marketing.

I'm trying to apply for a visa, and it's almost impossible to find official government documents or 1st hand experiences buried underneath all the shitty content marketing from law firms.


> maybe the author really does like PicPerf (and just so happened to include a referral tag lol)

Worse: The author of the post is the creator of PicPerf


The author doesn’t mention this but if you want your server to respond with 304 status codes you will need to implement ETags [1].

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ET...
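A rough sketch of the mechanism, assuming the ETag is a hash of the response body (the function names are my own, not from any particular framework):

```python
import hashlib

def etag_for(body: bytes) -> str:
    # A strong ETag derived from the body; any stable fingerprint works.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body: bytes, if_none_match=None):
    """Return (status, body, headers), honouring If-None-Match."""
    tag = etag_for(body)
    if if_none_match == tag:
        # Client's copy is still current: send an empty 304, not the body.
        return 304, b"", {"ETag": tag}
    return 200, body, {"ETag": tag}
```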


You can implement Last-Modified instead.


It’s often not that easy to figure out the last modified time when serving up a web page. There might be multiple DB records for that page. Hashing the page into an ETag is often easier to implement.


Sure, I was simply pointing out that one doesn't need to implement ETags to have 304 responses.


I mean sure, technically you can send any status code from your origin application. ETag is generally the standard way to go about it with regards to web servers. Last-Modified is more of a fallback mechanism.


Acting like Last-Modified is somehow substandard is a bit disingenuous. The If-None-Match (etag) request header takes priority over the If-Modified-Since (last-modified) request header (if both are sent), but they provide identical functionality (caching and preventing modification if out of date) and are both part of the HTTP standard.


You have to be careful with this though. The static site hosts mentioned provide "atomic deploys" which means that old assets will be removed from the live site when the next deploy happens. This means that users requesting a hashed file name after the next deployment happens will be greeted with a 404. So your choice with these providers is to pick between a stale asset or a 404. In many cases a stale asset is preferable (especially for fonts and CSS which are typically at least mostly compatible between updates). And if you aren't hashing your assets then you will want to keep the revalidation behaviour (or at least keep the cache time short).

This gets much worse if you do any sort of lazy loading such as images that get loaded as the user scrolls or client-side routing which loads the next page dynamically using a hashed script file.

I wrote a blog post about these problems a while ago: https://kevincox.ca/2021/08/24/atomic-deploys/


I generally prefer to make the regex rule look for the hash signature, e.g. `/\.[0-9a-f]{6}\./`, rather than solely matching on file extension. That way, any file that didn't get fingerprinted won't be cached indefinitely.
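Something like the following sketch (using the same six-hex-digit pattern; you'd adjust the width to your bundler's hash length):

```python
import re

# Matches a fingerprint like ".abc123." embedded in a filename.
FINGERPRINT = re.compile(r"\.[0-9a-f]{6}\.")

def cache_forever(filename: str) -> bool:
    """Only fingerprinted files qualify for indefinite caching."""
    return bool(FINGERPRINT.search(filename))
```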


I've always had a great time using Fastly (varnish) for caching.

You can cache a lot more than you think if you mastermind your strategy.

A diagram of fastly/varnish state -> https://www.integralist.co.uk/images/fastly-request-flow.png


I was spec-ing out Varnish for a project recently and was really impressed with what it could do. The stale-while-revalidate approach (immediately serve data past the `ttl` for a `grace` period while asynchronously refreshing it) keeps data flowing. It can `keep` data past the `grace` period and serve that when the origin goes down or use that in conditional requests (If-Modified-Since) to the origin to prevent fetching unchanged content. Coalescing requests mitigates thundering herds that reach stale caches. VCL scripting makes the whole thing very configurable. I think Varnish needs better marketing, but I suppose Fastly does that now.

[1] https://www.varnish-software.com/developers/tutorials/http-c...

[2] https://info.varnish-software.com/blog/request-coalescing-an...

[3] https://docs.varnish-software.com/tutorials/object-lifetime/

[4] https://www.varnish-software.com/developers/tutorials/varnis...
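The ttl/grace/keep object lifetimes described above map onto VCL roughly like this (a sketch with illustrative values, not a tested config):

```
sub vcl_backend_response {
    set beresp.ttl = 10s;    # serve fresh for 10 seconds
    set beresp.grace = 1h;   # then serve stale while refreshing in the background
    set beresp.keep = 1d;    # then keep only for conditional (If-Modified-Since) revalidation
}
```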


This is literally the first thing you should check for on a new codebase, particularly because modern build tooling recognizes the need and makes it easy. When you have a.js depending on b.js depending on c.js and on and on, the latency on each network request (just to get the 304) really adds up and can drastically affect perceived speed. Indeed, if most of your users are on speedy connections, the difference between a 500 byte 304 response and a 50 kb re-download is pretty much immaterial compared to the latency of making the request in the first place.


Isn't it quite standard to use hashing for resources? It's also possible to append a query key (/style.css?hash=abcde123), or to use the last-modified timestamp instead of a hash. Computing hashes on every request is slow; just reading timestamps is usually very fast. It's important though to mark the HTML resources as no-cache or with a very short cache time, otherwise some users may see broken or old content.
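A sketch of the timestamp variant (hypothetical helper; assumes assets live on the local filesystem):

```python
import os

def versioned(path: str, webroot: str) -> str:
    """Append the file's mtime as a cache-busting query key."""
    full = os.path.join(webroot, path.lstrip("/"))
    mtime = int(os.stat(full).st_mtime)  # a single stat call, no hashing per request
    return f"{path}?v={mtime}"
```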


> It's also possible to append a query key

Unless you are actually serving different versions based on this key you can run into issues here. If the user loads version 1 of the page but some point later tries to request /script.js?v=1 but you have deployed version 2 in the meantime they will get the new version which may not be compatible. (also note that every time they request a script it will be "some point later" compared to the HTML file)

This is becoming more of an issue with lazy-loading single-page applications where they will load the script next routes as you click on them. This means that it may be many minutes between the original page load and the load of the next script.

Of course maybe that is better than a 404 like most of these providers provide with a different file name, but the 404 may be easier to catch and handle with a true reload.


I think you misunderstand the intention of those query keys. They are only meant to be used for cache invalidation, and shouldn't control what content is served.

If you want to serve multiple versions of a static asset at the same time just use different file names.

Edit: That's why SPA frameworks usually use the hash inside the file name approach, so you can keep serving the previous script files for users that didn't reload the application yet.

What we usually do for infrequently updated enterprise applications: we terminate all active sessions during an update and users are forced to do a silent SSO re-login and re-load the SPA. This also allows us to deploy non-backwards compatible API changes without running into problems. This strategy may not fit everywhere though.


An alternative is to think of the browser as a distributed managed cache.

Browsers have a cache API that can be used directly or in service/webworkers.

A managed cache comes with significant upsides, such as complete control, but costs design/developer time and comes with a risk:

Caching done wrong can lead to bugs that are very hard to reason about and fix. Especially when service workers are involved.


We heavily cache, as the article suggests, and to cache bust things we simply add a number as a query string to the url, e.g. site.js?1234.

Whenever we update something, whether an image, js, or css, we increment the number. Most of it is automated since we have millions of images.


Interesting that he uses some external service for caching the images.

Doesn't this create a security issue with external content?

Shouldn't this be easy to self-host through some nginx config?

Like having nginx do a proxy pass upstream, cache the result, and return a cache header? The only thing you need to do is either use a specific path or do the same as this service does with a URL at the end.

Of course denying random hosts.

Should be doable in a few minutes
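Something along these lines, perhaps (a rough, untested nginx sketch; zone name, paths and hostnames are placeholders, and you'd still want to restrict which upstreams can be proxied):

```nginx
proxy_cache_path /var/cache/nginx/img levels=1:2 keys_zone=img:10m
                 max_size=1g inactive=30d;

server {
    listen 443 ssl;
    server_name img.example.com;

    location /proxy/ {
        # Proxy one known upstream only -- never arbitrary hosts.
        proxy_pass https://origin.example.com/;
        proxy_cache img;
        proxy_cache_valid 200 30d;
        # Replace whatever the origin sent with an aggressive policy.
        proxy_hide_header Cache-Control;
        add_header Cache-Control "public, max-age=31536000, immutable";
    }
}
```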



