But, isn't subsetting going to result in users now caching your subset instead of a cached copy of everything? I would think that does more harm than everyone grabbing a fully cached copy once from a CDN.
For starters, leveraging caching via a common CDN pretty much requires everyone to be using a single version from a single CDN. If you can't agree on that, then every time a new version comes out the web is split and the caching doesn't work, and every time someone decides to use another CDN (or someone provides a new one) the group is split again.
But then split that across all the fonts, formats, and compression schemes available and you'll see that the chance that a visitor has seen that font, at that version, from that CDN, using that compression scheme, in that format at any point in the past EVER is actually significantly smaller than you'd think.
Which brings us to the next point: even if you've seen it before, the chances that you'll still have it cached are pretty small. Browser caches are surprisingly small in the grand scheme of things, and people tend to clear them more often than you'd think. Add in private browsing mode and "PC cleaner" programs, and the average person's cache lasts much shorter than at least I expected it to.
But even worse are mobile caches. IIRC older Android had something like a 4MB cache!!! And until very recently Safari had something like a 50MB limit (and before that didn't cache ANYTHING to disk!). It's better now, but you're still looking at a few hundred MB of cache. And with images getting bigger, big GIFs being common, and huge numbers of AJAX requests happening all the time on most web pages, you'll find that the browser cache is completely cycled through on a scale of hours or days, not weeks or months.
IMO it's at the point where the "dream" of using a CDN and having a large percentage of your users already have the item in their cache isn't going to work out, and you're better off bundling stuff yourself and doing something like dead-code elimination to get rid of anything you don't use. That method only becomes more powerful when you start looking at custom caching and updating solutions. A few months ago I saw a library that was designed to download only a delta of an asset and store it in localStorage, so updates to the application code only need to download what changed and not the whole thing again. Sadly I can't seem to find it again.
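The delta-update idea can be sketched roughly like this (the actual library is unknown to me; the patch format here, an array of `{start, end, insert}` edits, is invented purely for illustration — a real tool would use something like VCDIFF or a rolling hash):

```javascript
// Apply a list of edits to the cached copy of an asset, so a client
// holding v1 only needs to download the edits, not the whole of v2.
// Edits must be non-overlapping and sorted by start offset.
function applyPatch(oldText, edits) {
  let out = "";
  let pos = 0;
  for (const { start, end, insert } of edits) {
    out += oldText.slice(pos, start) + insert; // keep unchanged prefix, splice in new text
    pos = end;                                 // skip the replaced span of the old text
  }
  return out + oldText.slice(pos);             // keep the unchanged suffix
}

// Simulated update flow: the client has v1 (e.g. in localStorage);
// the server sends only the edits needed to produce v2.
const v1 = "function greet(){return 'hello';}";
const edits = [{ start: 24, end: 31, insert: "'hi'" }];
const v2 = applyPatch(v1, edits);
```

The win is that `edits` is typically a tiny fraction of the asset's size, which is exactly the property that makes re-downloading a whole bundle on every release feel wasteful.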
All this common web stuff that is distributed by several CDNs (as well as separately by individuals) really suggests to me that there should be some browser feature like `<script src="jquery.min.js" sha256="85556761a8800d14ced8fcd41a6b8b26bf012d44a318866c0d81a62092efd9bf" />` that would allow the browser to treat copies of the file from different CDNs as the same. (This would nicely eliminate most of the privacy concerns with third-party jQuery CDNs as well.)
So to take it to a somewhat ridiculous (but still possible) extreme: I could probably guess what your HN user page looks like to you. From there I could serve that in an AJAX request to all my visitors with this content-based hash, and if I get a cache hit from someone, I can be pretty damn sure it's you.
And that only really solves one or two of those issues. The versioning, compression schemes, formats, number of fonts, and sizes of browser caches will still make this system's cache a revolving door, just a slightly more effective one.
And as for the security concerns of using a CDN: Subresource Integrity (which someone else here linked already) allows you (you being the person adding the <script> tag to your site) to declare the hash you expect the file to have, and browsers won't execute it if it doesn't match. So that lets you include third-party resources without fear that they will be tampered with.
Solution: using a sha256="..." attribute should only allow you to access files that were initially loaded via a tag that also had a sha256 attribute, and that attribute should only be used for resources the developer considers public.
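The proposed rule boils down to a very small cache-lookup policy; a sketch (the `entry`/`request` shapes are invented for illustration):

```javascript
// A content-addressed cache entry may only be reused by a request that
// itself declares the same hash. Plain requests never see the shared
// cache, so a page can't probe what other sites have put into it.
function canReuse(entry, request) {
  if (!request.sha256) return false;      // no declared hash: no shared-cache hit
  return entry.sha256 === request.sha256; // only an exact declared hash matches
}
```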
This not only solves the CDN issue, it also solves the problem of having to rename files manually every time someone makes a change. It just makes caching that much saner.
If you see a script tag with the URL bank.com/evil.js, the browser shouldn't assume that the bank is actually hosting evil.js. Even if the hash matches a cached copy, that file might never have existed at that URL.
The bank might be using a Content Security Policy to minimize the damage an XSS attack can do, one that only allows script tags from the same origin. But now an attacker just needs to get evil.js with a particular hash into the cache, and they can create the illusion that the site is hosting it, without having to hack the server at all.
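The policy described would be delivered as something like this response header (a minimal illustration, not any real site's configuration):

```
Content-Security-Policy: script-src 'self'
```

With that header the browser refuses to execute scripts from any other origin, which is exactly the guarantee a poisoned content-addressed cache could appear to bypass.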
It's not, because doing this hurts page load times.
You're forcing the browser to open a new, cold TCP connection to fetch 50-100KB of CSS/JS/whatever from some random server. Even over HTTP/1.1 that's usually slower than just bundling it into your own CSS/JS/whatever, and HTTP/2 only widens the gap.
Just store and serve these things yourself.
My employer manages something like 20k devices. I betcha we spend five figures annually on this crap.
Installing a ton of fonts up front takes a pretty significant amount of space; installing a subset for the user's language/preference (or letting the user manage it) makes it VERY easy to fingerprint users based on which fonts they download; and any kind of cross-origin long-term caching is a security nightmare, since it lets you start mapping out where a user has been just from what they download.
It solves a problem that people other than web designers care very little about, but costs me money and creates a slew of other problems...
Personally, I wish it was easy to just turn off!
You could compare it with HTTP/2: if you ran a survey, you wouldn't find many people who even know about it. That doesn't mean it's useless to them.
Most people already have attractive, readable fonts installed on their computers, which are likely either sensible defaults, or configured for specific reasons (e.g. minimum size to deal with eyesight). Web pages that render as blank white space for a while, or briefly tease me with text before blanking it out, give me a much more negative impression than ones that simply render in my chosen default fonts.
On 3G or shaky Wi-Fi, I've regularly given up on browser tabs because all I see is an empty page, even after half a minute of loading and when most images have finished downloading. (Maybe other browsers are better than Safari, but I won't switch just to see prettier fonts.)
I've been blocking web fonts for a while now, and it feels like I have to whitelist one site in three because it depends on icon fonts.
Another option: find the most-used icons or combinations. Group them.
Another option: similar to nandhp's suggestion, take a hash of the font selection and name the file after it. There's a very good chance a nearby proxy already has that combination stored.
Disk is cheap. Particularly disks that you don't pay for like your users' disks.
Combination of security reasons and unnecessarily coupling apps to the public internet.
On the other hand, it seems that FA themselves are building a CDN with subsetting, so they could in fact provide those shared subsets. Unfortunately (but understandably) it's paid, so most of us can't use it.