Say goodbye to resource-caching across sites and domains (stefanjudis.com)
74 points by stefanjudis on Oct 26, 2020 | 79 comments

Unfortunately security and efficiency are at odds here.

We faced a similar dilemma in designing the caching for compiled Wasm modules in the V8 engine. In theory, it would be great to just have one cached copy of the compiled machine code of a wasm module from the wild web. But in addition to the information leak from cold/warm caches, there is the possibility that one site could exploit a bug in the engine to inject vulnerable code (or even potentially malicious miscompiled native code) into one or more other sites. Because of this, wasm module caching is tied to the HTTP cache in Chrome in the same way, so it suffers the double-key problem.

Can anyone find any data on how often cache hits happened for shared resources from CDNs anyway? How useful was this actually? I'm not confident it was a non-trivial portion of bytes fetched by a typical session. But maybe it was. Has anyone found a way to measure it?

Pretty poor. Even with a widely used library like jquery, version skew meant that there was pretty limited overlap. I collected notes on the issue some time ago: https://justinblank.com/notebooks/browsercacheeffectiveness.....

Agreed. I've stopped relying on CDN caching for my projects and instead try to focus on avoiding large js payloads entirely.

For the last half decade, almost all apps have been deployed with webpack/browserify/other bundles. All the assets get smashed together into a big custom bundle that doesn't get cached across sites.

This has been really sad & a big loss for the web, in my opinion. And it's one that we were about to emerge from[1], it seems like.

Alas, if we do go back to a more old-school CDN-based style of web scripting/javascripting, powered by our new ES Modules (& hopefully Import Maps) this new sharding-by-origin change will mean that we will never ever see the CDN hit-rate benefits we once saw.

It seems like it is a necessary change, to protect the user from being tracked, but it still hurts my heart so much, that we are so near to getting back to sharing resources on the web, only to have all that sharing snatched away. Whatever metrics you are looking at today, know that they represent a very sad state of affairs, that brought great pain & suffering to the hearts of many webdevs who aspired for much much much higher hit rates.

[1] https://www.bryanbraun.com/2020/10/23/es-modules-in-producti...

I think the linked resources show that it probably never worked the way we hoped it would. The first investigation I linked was from 2011, prior to the existence of webpack.

A lot of people have high hopes that import-maps[1] will allow us to consume a variety of ES Modules from a variety of CDNs effectively. It gives us back the "bare specifiers" that CommonJS introduced, where you say `import $ from "jquery/index.js"` in the code you write. Then the import-map helps the browser understand which CDN (or other host) to reach out to in order to get that index.js file. We think this will allow ES Modules to be broadly usable & "modular", in a way that they have not been. I & a bunch of others are holding our breath on this one. It feels like it really can fix this huge hang up, giving us a way to author modules in a way that allows modular consumption.

[1] https://wicg.github.io/import-maps/
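To make the bare-specifier idea concrete, here is a rough sketch of how an import map could be consulted. It is a simplified model, not the WICG resolution algorithm, and the `cdn.example` URLs and map entries are made up for illustration.

```python
# Simplified model of import-map resolution (NOT the full WICG algorithm).
# The map contents and cdn.example URLs are hypothetical.
IMPORT_MAP = {
    "jquery": "https://cdn.example/jquery@3.5.1/dist/jquery.js",
    "jquery/": "https://cdn.example/jquery@3.5.1/",
}


def resolve(specifier, import_map):
    """Map a bare specifier to a URL: exact match first, then longest prefix."""
    if specifier in import_map:
        return import_map[specifier]
    # Keys ending in "/" act as path prefixes; prefer the most specific one.
    prefixes = [k for k in import_map if k.endswith("/") and specifier.startswith(k)]
    if prefixes:
        best = max(prefixes, key=len)
        return import_map[best] + specifier[len(best):]
    raise KeyError(f"unmapped bare specifier: {specifier}")


# The specifier from the comment above, resolved against the hypothetical map.
print(resolve("jquery/index.js", IMPORT_MAP))
```

The point of the indirection is that the authored code only names `jquery/index.js`; which host actually serves it is decided in one place, by the page that ships the map.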

Given that cache hits only work with a specific URL the results are in practice anything between pointless to only slightly good (with maybe one or two exceptions).

I mean to have a cache hit you need:

- Same CDN

- Same library

- Same library uploader/name

- Exact Same library version down to every byte of js

- Exact same way to refer to given version (e.g. if latest is 1.3.2 then foobar-1.3.2 and foobar-latest are not the same, except if foobar-latest is a temporary redirect to foobar-1.3.2. But that would induce a further round trip).

If we furthermore consider that most people mostly visit a small number of domains, it's not too hard to reason that the value gained from caching doesn't outweigh the cost for the majority of users.

My sense is that it was always pretty low. What are the odds two sites use the same exact version of jquery and same third party cdn while the cache is still warm?

I guess it really depends on the use case and the library.

I can imagine a tonne of WP sites referencing jQuery from the Google CDN, e.g.: https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.mi...

Looking at the headers the JS asset would be cached for 1 year.

There are always a few exceptions.

But you also must consider that most people most times visit a small set of domains.

Which means that most times they will have jquery and similar cached even without cross domain caching.

I guess the CDNs assume people should use libraries without version pinning, ie "latest".

I feel that this has long since lost much of its usefulness. In a time when more websites are "webapps" built using webpack and other similar tools, we've seen a big decline in the use of CDNs.

Yeah -- for all that people are worried about efficiency gains, I'm kind of doubtful that most end-users, even on slow connections, will even notice that caches are restricted to within domains.

I suspect that websites that are conscious of loading times are already testing performance with nothing cached. And websites that aren't conscious of loading times are probably using bundling techniques that would already make cross-site caches useless. In both cases, I'm having a hard time believing that loading JQuery is the reason anyone's website is slow.

There are theoretical schemes that could allow us to share libraries between sites without having the same privacy impacts, but I'm not sure it's even worth the effort of proposing them.

I'm not sure how far this is technically possible, but for people who are on connections so slow or low-bandwidth that this change is a noticeable drawback, I believe there is a better solution:

An extension keeping widely used versions of libraries preloaded as well as a small db of CDN/urls so that it can serve the pre-loaded libraries instead of the CDN ones when possible. This also could do things like collapse foobar-latest and foobar-X.Y.Z (X.Y.Z == latest) and could force load a different version with security patches. I.e. it would act kinda like a linux package manager for a limited part of common libraries.
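A minimal sketch of the lookup such an extension might perform, assuming a small bundled table and a handful of URL patterns. Every name here (`BUNDLED`, `LATEST`, `PATTERNS`, `local_replacement`, the `cdn.example` layout) is hypothetical, and this is not how any particular extension is implemented.

```python
import re

# Hypothetical bundle shipped with the extension: (library, version) -> local file.
BUNDLED = {("jquery", "3.5.1"): "bundled/jquery-3.5.1.min.js"}

# Assumed alias table used to collapse "latest" onto a concrete version.
LATEST = {"jquery": "3.5.1"}

# Hypothetical URL patterns for known CDNs; real CDN layouts differ.
PATTERNS = [
    re.compile(r"^https://ajax\.googleapis\.com/ajax/libs/(?P<lib>[\w.-]+)/(?P<ver>[\w.]+)/"),
    re.compile(r"^https://cdn\.example/(?P<lib>[\w.-]+)-(?P<ver>latest|[\d.]+)\.js$"),
]


def local_replacement(url):
    """Return the bundled file for a CDN URL, or None to let the request through."""
    for pattern in PATTERNS:
        match = pattern.match(url)
        if match:
            lib, ver = match.group("lib"), match.group("ver")
            if ver == "latest":
                ver = LATEST.get(lib, ver)
            return BUNDLED.get((lib, ver))
    return None


# Both spellings of the same version map to the same local file.
print(local_replacement("https://cdn.example/jquery-latest.js"))
print(local_replacement("https://cdn.example/jquery-3.5.1.js"))
```

Anything not in the table falls through to the network, so the hit rate depends entirely on how well the bundled versions match what sites actually request.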

Decentraleyes does exactly this.

Check out LocalCDN for a fork with actively-updated CDNs.

I use decentraleyes and it tells me it replaced network versions with local versions 395 times since installation (probably a year ago). That's not very much.

It doesn't get hits like it should. It would be nice to be able to add libraries to the cache manually because many are out of date.

Probably just like shared libraries. Ostensibly multiple applications will share the same DLL.. In practice their versions are incompatible, if software uses the same dependency to begin with.

My guess is the impact of cross site caching is negligible. We're losing nothing here.

1) Cache hit must be extremely low because of different versions/variants/CDN for each library. (Have you seen how many jquery there are?).

2) It's irrelevant outside of the few most popular libraries on the planet, maybe jquery/bootstrap/googlefonts.

3) Content is cached once on page load and the savings happen over the next tens or hundreds of pages you visit. That's where the gain of caching is (10x - 100x). Saving 1 load when changing site is negligible in the grand scheme of things.

For anyone asking what this means in numbers:

> The overall cache miss rate increases by about 3.6%, changes to the FCP (First Contentful Paint) are modest (~0.3%), and the overall fraction of bytes loaded from the network increases by around 4%. You can learn more about the impact on performance in the HTTP cache partitioning explainer. [0]

[0]: https://developers.google.com/web/updates/2020/10/http-cache...

Additional metrics: https://github.com/shivanigithub/http-cache-partitioning#imp...

This is an incredibly problematic number to report, bordering downright deceitful, because it ignores that we have not been able to use CDNs (because we have had to bundle our many JS files together, first for CommonJS, then for an ES-Module 1.0 specification with fixed/non-modular import addresses). But just recently, we have finally begun to pay off the technical debt to let us once more build web pages that use many different JS files from CDNs[1]. We are finally emerging from this long dark sad road to return to the glory of using CDNs that can cache our assets for us, let us share widely across the web.

And just just just before ES Modules finally get good & modular & helpful, we destroy the shared cache that would have made them helpful. We have been on this voyage for 10 years, starting from the pre-modules but cached days, through the long dark & violent seas of modules-but-no-caching, to finally finally get to modules-that-cache-well. We finally have standards & tools in place that would allow us to begin to cache modules effectively, across sites. Except no, not any more.

Whatever numbers you see for this, they are lies. They don't represent any honest truth. They portray only a poor reflection of a bad place that we have been desperate to escape from. We have wanted to cache modules effectively with CDNs for years, but ES Modules had not been suitable to the task. To judge the impact based only on what we can see, without projecting to the web we were all trying to make happen: that's incredibly sad. We'll never know. All possibility is being chopped off and cut down. This is an incredibly sad, incredibly tragic culmination to a long long struggle to make the web better, and frankly, I am disappointed beyond words that the teams have taken such an aggressive change of defaults so casually for such minimal proven harm. If there is a real security issue here, it should have been dealt with via more Content Security Policy flags, not by unweaving the web & making each site have its own view of the world, unique from every other site. The security atmosphere is paranoid & delusional, & nothing tops their quest for absolute security.

[1] https://www.bryanbraun.com/2020/10/23/es-modules-in-producti...

Admittedly Content Security Policy is not enough. It's not site's assets being protected here.

We are protecting users from sites coordinating their actions via the presence of resources. If I visit store.example they might cache a /big-spender resource. Then if I visit other.example, they can check to see if I have /big-spender cached.

As a user, I ought to be protected against coordinated tracking mechanisms like this. Content Security Policy might be able to let store.example protect its asset, but in this case, the problem is that store.example might be deliberately exposing the cached/not-cached state of that resource to others; it is the user, not the site's content, that needs to be secured.

Thus far the only safe way we've found to do it is to have every site have its own naive, isolated, alone view of the world. This is, alas, in my perspective, extremely unfortunate. I picture the spider web of information being cut into pieces, broken apart. But I also recognize the necessity of this. I can't stand it, but I see no alternatives. And it makes me so sad that we will never ever see modules work on the web. That ~2011 was the last & will forever be the last good year for CDNs, before CommonJS & bundling took over, before we made CDNs no longer places of sharing.

> I have mixed feelings about these changes.

I feel for those on low bandwidth and low data limit connections. Website developers should focus on bloat and address that. That doesn’t seem to be happening on a larger scale though.

> It's excellent that browsers consider privacy, but it looks like anything can be misused for tracking purposes these days.

Of course. Every bit of information you provide to a site will be misused to track and profile you. That’s what the advertising fueled web has gotten us to (I don’t blame it alone for the problems).

I wasn’t aware that Safari had handled the cache privacy issue in 2013. It seems like it has always been way ahead on user privacy (though it’s not perfect by any means). I’ve been a long time Firefox user who has always cleared the caches regularly, and I’m curious to know if any browser has consistently provided more privacy out-of-the-box than Safari.

I wonder how long it will take for browsers to go beyond the cache concept and implement an integrated package repository so I can upload my manifest + my 3kb app.js and tell the browser to download (and store) all the dependencies I need.

It will not only help with performance, but will also stop the absurd tooling madness that front-end has become.

How does that differ from cache manifest (see link [1])? It's now being replaced with service workers, but largely storing the dependencies on first refresh is what it does.

[1]: https://en.wikipedia.org/wiki/Cache_manifest_in_HTML5

This still works on a single website level. A common package manager will help every website that needs the same deps (at least in the same semver range) with the benefit of a download-once, available-to-everyone, truly immutable cache.

Edit: The most common example. Let's say you need jQuery. The browser downloads the repo once and then it will be ready and available for maybe millions of websites. Just think about the benefit of the saved bandwidth alone.

I cannot stop thinking how stupid it is to download the same assets again and again and again for every website you visit.

Yep, this is kind of npm but for browsers. But already the sheer size of npm shows how this is hardly possible: http://www.modulecounts.com/ -- I expect the npm repository to be at a size of two to three digits in gigabytes. This is quite large compared to the total hard disk cache of your browser (which also includes images, CSS, HTML, etc).

Of course you don't need to download the whole repository as with npm. But just the links of the optimized distributable assets. In short, your /dist folder.

Since NPM is now the predominant way of distributing packages, they don't usually have a /dist folder so individual packages would again need to think about this. This was one of the reasons that bower is not really used anymore.

That would make browsers the gate-keepers for the packages

Also, I feel like adding another package manager wouldn't go that was...


Do you think that those shady-tracking CDNs are better gate-keepers?

So the natural progression here is that only big sites with their own CDN solution will be fast? And for most people and companies that will mean "rely on a large service to actually host your content for you", because they are not operating their own CDN. Because speed matters when it comes to search ranking.

So they are then beholden to major platforms such as Google to host sites for them from a global cache? Similar to what AMP does, but for all kinds of content?


> is that only big sites with their own CDN solution will be fast?

No, not at all. This change gets rid of global caches.

A) Your site caches will still work, they just won't be shared across sites. A cold-cache load of Gmail will go through the exact same process as the cold-cache load of your site, and subsequent loads of both sites will be just as fast.

B) If your site's initial load time on a cold cache is unacceptable, you are already making an engineering mistake and you need to cut back on the Javascript.

C) Most large sites are already choosing to bundle their own libraries or serve them from dedicated CDNs instead of trying to coordinate with each other to make sure the same resource location is used across multiple websites.

D) Even in a theoretical world where these changes were going to make the web a lot slower (and reminder, that world doesn't exist), the risk of library domination in that world would be larger than the risk of website domination. Imagine a world where you write a competitor to JQuery that's just as good, maybe even smaller and more efficient, but nobody uses it because "we have to use the popular library that's likely to be already in the user's cache."

While we're on the subject of D, the fact that nobody says that -- that we don't see smaller JS libraries thrown aside in favor of libraries/versions that are more likely to be already cached -- is strong evidence that your site sharing a JQuery cache with someone like Google or the NYT is not an important performance concern.

> So the natural progression here is that only big sites with their own CDN solution will be fast?


This is about disabling cross domain caching. Which rarely has a cache hit already by now (see many other posts on this site).

> "rely on a large service to actually host your content for you"

This is the case anyway even with cross domain caching as the source you cache still needs to have the exact same URL including domain. The cross domain only refers to the site loading the resource.

So e.g. `foo.example/jquery-3.2.1` and `bar.example/jquery-3.2.1` were never treated as the same at any point in time. The only thing that changed is that if `foo.example` and `bar.example` both depended on `cdn.example/jquery-3.2.1` and you visited `foo.example` before `bar.example`, it might already have been cached when you visited `bar.example`. Though most times it wasn't, as e.g. `bar` used a different CDN, a different URL for the same resource, or a different version.

So this change doesn't really affect small sites more than any other site. And the effect is generally negligible.
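The difference between the old and new keying can be sketched with a toy cache. This is a simplified model under stated assumptions: the `HttpCache` class is hypothetical, and real browsers may include further components in the key beyond the top-level site.

```python
class HttpCache:
    """Toy HTTP cache; `partitioned` switches between the two keying schemes."""

    def __init__(self, partitioned):
        self.partitioned = partitioned
        self.entries = set()

    def _key(self, top_level_site, resource_url):
        # Old behaviour: keyed by resource URL only.
        # New behaviour: keyed by (top-level site, resource URL).
        if self.partitioned:
            return (top_level_site, resource_url)
        return resource_url

    def fetch(self, top_level_site, resource_url):
        """Return "hit" if already cached under this key, else store and return "miss"."""
        key = self._key(top_level_site, resource_url)
        if key in self.entries:
            return "hit"
        self.entries.add(key)
        return "miss"


CDN_URL = "https://cdn.example/jquery-3.2.1"

shared = HttpCache(partitioned=False)
shared.fetch("foo.example", CDN_URL)                  # miss: first load anywhere
print(shared.fetch("bar.example", CDN_URL))           # hit: shared across sites

double_keyed = HttpCache(partitioned=True)
double_keyed.fetch("foo.example", CDN_URL)            # miss
print(double_keyed.fetch("bar.example", CDN_URL))     # miss: different top-level site
print(double_keyed.fetch("foo.example", CDN_URL))     # hit: same site, repeat visit
```

In both schemes a self-hosted copy at `https://bar.example/jquery-3.2.1` is a separate entry, since the resource URL itself differs.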

Just as planned by a monopoly.

Assuming all browsers are going to implement this partitioning, doesn’t it give web devs even more reason to use 3rd-party CDNs? You’re not paying for the traffic and you don’t have to worry about users’ privacy.

What is the most nightmare case of private information leaking here? I can't seem to come up with anything that horrible from my own imagining, especially not worth throwing away the advantage of cross domain resource caching.

The example that they give, that you're logged into Facebook, doesn't seem very useful other than maybe fingerprinting? But even then 90 some percent are going to be logged in, so the only real fingerprinting there is on the people who aren't.

Probably finding out that people are logged into some sort of site which leads to blackmail opportunities? Imagine finding out that a straight, married politician of the strict "family first" type is logged into a gay dating site. That would lead to some interesting "opportunities" to get them to vote in ways they would not otherwise do.

There is also the possibility of leveraging this type of information in social engineering scenarios. Imagine getting compromising information on a sysadmin at a major commercial port and blackmailing a root password out of them, then leveraging that to set up a persistent threat and deleting their database every hour for a few weeks until they finally manage to lock you out again. The damage would be in the hundreds of millions. You could potentially do all the usual interesting things to foundries and/or oil refineries too if you manage to compromise insiders. Really, the sky is the limit if you use your imagination a bit.

So what we actually need is

- a decentralized way to store these libraries

- by a source with established trust (so it can't be misused for tracking)

JS/CSS library blockchain?

Unless you have very few libraries and always force everyone to the latest version, it's still quite practical to abuse this for tracking. For example, there are sites running Dojo on at least 86 versions [1], all of which are pretty uncommon. If one site causes you to load one of these versions, and another site checks which one you have in cache, that's >6 bits of information. Combine this with all the other libraries and versions, and you can easily get enough bits to uniquely identify someone. It's even worse if one site can load multiple versions of the same library: that turns 86 versions into 86 bits.

[1] 1.13.0, 1.12.3, 1.12.2, 1.12.1, 1.11.5, 1.11.4, 1.11.3, 1.11.2, 1.11.1, 1.10.9, 1.10.8, 1.10.7, 1.10.6, 1.10.5, 1.10.4, 1.10.3, 1.10.2, 1.10.1, 1.10.0, 1.9.11, 1.9.10, 1.9.9, 1.9.8, 1.9.7, 1.9.6, 1.9.5, 1.9.4, 1.9.3, 1.9.2, 1.9.1, 1.9.0, 1.8.14, 1.8.13, 1.8.12, 1.8.11, 1.8.10, 1.8.9, 1.8.8, 1.8.7, 1.8.6, 1.8.5, 1.8.4, 1.8.3, 1.8.2, 1.8.1, 1.8.0, 1.7.12, 1.7.11, 1.7.10, 1.7.9, 1.7.8, 1.7.7, 1.7.6, 1.7.5, 1.7.4, 1.7.3, 1.7.2, 1.7.1, 1.7.0, 1.6.5, 1.6.4, 1.6.3, 1.6.2, 1.6.1, 1.6.0, 1.5.6, 1.5.5, 1.5.4, 1.5.3, 1.5.2, 1.5.1, 1.5.0, 1.4.8, 1.4.7, 1.4.6, 1.4.5, 1.4.4, 1.4.3, 1.4.1, 1.4.0, 1.3.2, 1.3.1, 1.3.0, 1.2.3, 1.2.0, 1.1.1
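The arithmetic behind those bit counts, as a quick sketch. The figure of 86 versions is taken from the comment above; the world population of roughly 8 billion is an assumed round number for comparison.

```python
import math

VERSIONS = 86  # number of Dojo versions cited in the comment above

# Exactly one of 86 versions is cached: the observer learns which one.
bits_single = math.log2(VERSIONS)

# If a site may cache any subset of the versions, each version is an
# independent cached / not-cached flag, i.e. one bit apiece.
bits_subset = VERSIONS

# Bits needed to single out one person among ~8 billion (assumed figure).
bits_for_world = math.log2(8e9)

print(f"one version cached:  {bits_single:.2f} bits")
print(f"any subset cached:   {bits_subset} bits")
print(f"needed for ~8e9 ppl: {bits_for_world:.2f} bits")
```

So a single library with many versions already carries far more bits than are needed for a unique identifier, once a site is free to load several versions.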

Yes. Besides some ideas about e.g. ipfs + emulating network weather on all accesses (instead of just cached ones), the real annoyance is that there is no sane standardized JS standard library.

If we could, we should make the following best practice:

- Only use react and similar if you write a webapp, do not use such tools for websites. If your website is so complex that you need it you are doing something wrong.

- Have a js standard library which provides all the common tooling for the remaining non-webapp js use case.

- Make it have one version each year (or half year), browsers will preload it when they ship updates and keep the last 10 or so versions around.

- Have a small standardized JS snippet which detects old browsers which are not evergreen (like IE) and loads a polyfill.

Sure there are some requirements to get there. E.g. making it reasonably easy to have proper complex layouts in a reactive fashion without much JS or insanely complex CSS. (Which we can do by now due to css grid, yay).

If you're relying on browser updates, then why not just work on getting whatever JS improvements you want into browsers directly?

- Back&Forward compatibility by Versioning and shipping multiple versions with the same browser

- Easier prototyping and experimental usage of pre-releases

- Backward compatibility with older browsers on the first few versions at least

From a quick search, there's apparently already a way to have the browser verify the cryptographic hash of a resource loaded from an external source: Subresource Integrity (SRI). [0]

Can anyone comment on whether it's practical, and whether it could help here?

[0] https://developer.mozilla.org/en-US/docs/Web/Security/Subres...

SRI now solves the problem of ensuring a third-party resource doesn't unexpectedly change its contents, but it can't address the security issues of being able to time how long a request takes to tell if you already have it in your cache.
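For reference, an SRI integrity value is a hash algorithm name plus the base64-encoded digest of the resource bytes. A minimal sketch of computing and checking one follows; the helper names `sri_digest` and `matches` are hypothetical, and the script content is a placeholder.

```python
import base64
import hashlib


def sri_digest(content, algorithm="sha384"):
    """Return an SRI integrity value of the form '<algorithm>-<base64 digest>'."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches(content, integrity):
    """Check fetched bytes against an expected integrity value."""
    algorithm, _, _ = integrity.partition("-")
    return sri_digest(content, algorithm) == integrity


script = b"console.log('hello');"  # placeholder resource body
integrity = sri_digest(script)

print(matches(script, integrity))                 # unchanged content passes
print(matches(script + b"// tampered", integrity))  # any change fails the check
```

Note that this only pins the contents. It says nothing about how long the fetch took, which is the signal the cache-probing attack relies on.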

AIUI it is less practical than you'd like, for several reasons that work together to greatly mitigate its impact: 1. The remote content most vulnerable to hostile change is also remote content already being loaded precisely because it changes; ad scripts, etc. Protecting your jQuery load is nice, and if someone did compromise a CDN's jQuery it would be a big issue, but it's also not what has been happening in the field. If the content has to change you can't SRI it. 2. If you have a piece of remote content that you rigidly want to never change, it's much safer to copy it to a location you control, and it's almost as easy as SRI. 3. If a subresource fails SRI there's no fallback other than simply not loading it, so it has a very graceless failure mode. This combines with #2 to make it even more important to put it in a locally-controlled area. Once local, SRI is more-or-less redundant to what SSL already gives you.

Basically, it's one of those things that looks kinda cool at first and makes you think maybe you should SRI everything, but the real use cases turn out to be much smaller than that.

SRI solves a real problem if you want to use a CDN without opening yourself up to compromise if the CDN is compromised. But yes, there are lots of other problems that it doesn't solve.

> ad scripts

Should not exists IMHO ;=)

But yes, SRI is just for the case the CDN gets compromised.

Define "help". With the new caching rules, there is just no way two sites with different TLDs can share asset caches, so it wouldn't help in the sense that double downloads could be avoided.

You still can de-duplicate storage in the cache, just not the download.

True, but the "avoid multiple downloads" part is most likely more significant than the "avoid multiple copies on disk" part. Storage is comparatively plentiful while network bandwidth is often quite scarce, and 3 seconds longer loading time are going to be more visible than a 10 MiB larger disk usage.

I think that the developers of it have argued there's a security concern with using it to avoid looking up libraries. I know it's come up here on HN before if you can find it.

A blockchain solution is probably over the top for this and does likely have major problems with regard to all kinds of regulations.

Just content based addressing (e.g. ipfs) is good enough to be actually useful and allows local hosting and sharing caches at the same time.

Still, neither of these would fix the privacy problem. Though with something like ipfs, emulating network delays on access could work, but it would be VERY hard to get right and make immune to statistical timing analysis.

It's also one of the few ways to get some (imperfect) degree of censorship resistance without running into direct conflict with laws meant to e.g. hinder access to child pornography. Note that this is just imperfect protection, working for countries which do not officially have censorship in their law and have no reasonably effective national firewall. I.e. it wouldn't work for China, and it also wouldn't work if the political situation in some western countries gets worse. But it does work for "non-official" censorship enacted through not-so-legal pressure and harassment, or corporate censorship enacted by companies supposedly of their own will.

Like this? https://www.localcdn.org/

(Without the fancy / bs bingo technology)

So we need something like Fedora organization to make trusted distribution of javascript libraries for the web. As bonus, these libraries can be precompiled to native code ahead of time.

Or just ship the libraries with the browsers.

(Obviously also not ideal)

The browser is a good candidate.

One of the benefits of using cdn resources is that it enables prototyping with, say, bootstrap so fast because you could essentially upload a single html file instead of a bunch of css, js, and graphics. I mean, that will still be possible but there are more benefits to CDNs than just performance.

But the cost of it is making your site which normally only depends on one server depend on several. That inherently lowers the reliability of your site.

I see no problem with using remotely hosted resources for prototyping. But you should never let a link to remotely hosted fonts or scripts make it into production code.

In isolation, one is better than two. But when each of those servers is on new equipment with expert techs getting an alert if things go down, the risk gets reduced. Unless your support on the single server is on par, multiple managed servers may be safer.

It doesn't help your website if your own server has issues while the CDN has the best experts ensuring its 99.999% uptime. The probability of a failure is still a multiplicative factor of both uptimes, so it's strictly worse. Your user won't care if the CDN serving JS files is up while the website itself is unreachable. The only uptime that is relevant is that of your own server.

Edit: Typo
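The multiplicative argument can be sketched as follows, assuming the page needs every listed service to be up and that failures are independent. The 99.9% figure for the site's own server is an assumed number for illustration; 99.999% is the CDN figure from the comment above.

```python
def combined_availability(*uptimes):
    """Availability when every listed service must be up (independent failures)."""
    result = 1.0
    for uptime in uptimes:
        result *= uptime
    return result


own_server = 0.999   # assumed figure for illustration
cdn = 0.99999        # the "99.999%" from the comment above

site_alone = combined_availability(own_server)
site_plus_cdn = combined_availability(own_server, cdn)

print(f"own server only:  {site_alone:.5%}")
print(f"own server + CDN: {site_plus_cdn:.5%}")
```

Under these assumptions adding a hard dependency can only lower the product, although a highly available CDN lowers it very little. The model deliberately leaves out any effect the CDN has on the load, and therefore the uptime, of the origin server.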

> It doesn't help your website if your own server has issues while the CDN has the best experts ensuring its 99.999% uptime.

That's certainly true of the server on which your APIs, if any, reside, but isn't it typical for your website itself to also leverage the CDN for distribution?

Distributing your website on a CDN is fine, but then, all of your fonts and JS belong on that same CDN.

Basically, you should never have a production website which calls out to cdnjs.cloudflare.com or ajax.googleapis.com or fonts.googleapis.com, you should be hosting all of your site's dependencies in the same place or set of places.

As a side perk, your site also will stop looking like trash for users who use browser extensions to block such external calls. ;)

But what if you also put your static content (e.g. blog posts) onto the CDN, so that your site still operates with >50% of its features when your server is down? (Just missing comments, new posts, announcements etc. but not the main content.)

If you put 100% of the load on the local server you increase your chance of failure. Moving the bulk of the load to a cdn can reduce the load.

You are only as good as your local server. Having a cdn means you need a better local server.

I would argue if you're shoving so much additional JavaScript on top of your website that you need a better server to host it, you are serving way too many scripts.

Yup. Unless your site is so unreliable that it is likely to fail between serving the homepage and the assets the CDN will only hurt your availability.

You are forgetting that under high load the availability of local servers is likely to degrade, and serving non-small parts of your content via a CDN can noticeably reduce the load, improving local availability and potentially total availability.

Yeah, I've said it here before but I created cdnjs.com 10 years ago or so.

I use webpack and other more secure/performant strategies for all my JS needs when working in a serious environment.

But when I am building something personal/light, I still load up cdnjs.com and drop something in. It's just easier than thinking about how I will serve files etc

Even more food for thought - what if the cache is slower than the network? https://simonhearne.com/2020/network-faster-than-cache/

The negative effect is probably overblown; keep in mind that subsequent visits by the user to the same site can still use the cached version they loaded previously, and the odds of a cache hit in that case are relatively high.

I'm curious if this will result in any popular CDN folding. After a decent chunk of users update, they'll be hit with much more traffic than usual - and possibly more than they can afford in some cases.

Very unlikely IMHO.

There is still caching, just not caching of the same resource used by different domains.

Most people most times visit a small set of domains. All for which the resource still is cached from the previous visit.

Combined with the small likelihood of getting a hit on cross domain caching, the change in traffic is likely negligible.

Perhaps using content security policy headers for trusted CDNs could fix this?

One issue is privacy. With a common shared cache, a website can detect if you visited another specific website by loading the exact same resources from the same CDN and checking whether it's very fast to load because it's cached.

No mixed feelings, this is unconditionally good. Who ever thought it was ok to force your users to download stuff from unrelated, commercial servers

ipfs and bittorrent v2 solves this problem by addressing content with hashes rather than URL.

it does not solve this problem at all, and in fact has the same problem -- it is possible to detect if content has been loaded before with ipfs in the same way as this. the remapping of content-id => content-data is not only trivial, it is required for ipfs to function in the first place.

At work we can no longer load stuff from CDNs anyway, because GDPR. For customer projects that is. I guess there would be the possibility to include it into some disclaimer but then we would need to check with the CDN about their data retention policy and check with the customer and that's just not worth it.
