
Do not let your CDN betray you: Use Subresource Integrity - Sami_Lehtinen
https://hacks.mozilla.org/2015/09/subresource-integrity-in-firefox-43/
======
dspillett
This could also be used to remove the need for a CDN for common libraries like
jQuery and similar resources. If the browser knows it has a file in its cache
with the same properties (hash, size, name), even if it came from a different
site, it can be pretty sure that the content is the same, so no new request is
needed.

So your site could use the cached copy of jquery (for instance) that was
originally brought down to serve my site, or vice versa.

~~~
bugmen0t
We've been toying with this idea in earlier revisions of the spec, basically
using the hash as a cache key and not loading the same file from websiteB if
it has already been loaded from websiteA.

Unfortunately, this could be used as a cache poisoning attack to bypass
Content Security Policy.

See the section about "Content addressable storage" at
https://frederik-braun.com/subresource-integrity.html.

(If you can come up with a magical solution to this problem, join the W3C web
application security group mailing list and send us an email.)

~~~
pdkl95
Cache poisoning doesn't make sense when you are using hashes. If someone can
generate sha384 collisions in a way that allows them to substitute malicious
files in the place of jQuery, we have bigger problems.

> Content injection (XSS)

If we assume XSS, an attacker could simply inject whatever they want. The
cache isn't needed. This still wouldn't poison any legitimate cache keys.

> The client still has to find out if the server really hosts this file.

So use (URL, hash) as the key in the permanent cache. This removes most of the
bandwidth, and using a CDN allows for one GET per file across many sites.

So what exactly is the attack? I'm really not seeing how someone could attack
a permanent cache without first breaking the hashing functions that we already
have to trust.

edit: after reading
https://news.ycombinator.com/item?id=10311555

This would work in the cases where we allow XSS (which is already a
compromised scenario). Simply adding the URL (or maybe even just the hostname)
prevents this entirely, and we still get almost all of the benefits for local
resources, and we get all of the benefits when using a CDN.

edit2:

There are two issues being discussed. 1) Is the file we loaded from a
(possibly 3rd party) site correct? 2) Did we _ask_ for the correct file(s)?

Cache poisoning is when you can fool #1, while XSS attacks manipulate #2.

~~~
airza
The idea behind content-security policy is that it allows scripts to come only
from whitelisted domains. You can't inline evil scripts, and you can't link
them from an arbitrary domain. So, in the case of XSS, the attacker CAN'T just do
whatever they want. They need to make the browser think that the script is
being hosted on a whitelisted domain.

Hence, the attack here is making the victim load that keyed script on a
different page, then redirecting them to an XSS hole that links to that script
as 'hosted' by a whitelisted domain. Since it seems to be on a whitelisted
domain and match the original script's hash, it will execute on the page,
which is not ordinarily possible on a page which is running CSP.

I hope this encourages you to not immediately assume that large groups of
people working on technically complicated problems are stupid in the future.

~~~
spb
This took me a second to understand:

The scenario in question has Protected Site A vulnerable to an XSS attack, but
protected from it due to their CSP not allowing scripts from foreign domains
(only trusting scripts from `trusted.example.com`). This is what CSP is for:
it's not for what you expect to serve, it's for protecting against what you
_don't_ expect to serve.

In the theoretical attack content-addressable scripting could open up, the
user visits Malicious Site B, which loads a malicious script with the hash
`abad1dea`. The owners of Malicious Site B use their XSS attack to insert the
(simplified) HTML `<script src="https://trusted.example.com/payload.js"
hash="abad1dea">`. If Malicious Site B tried to insert a direct link to their
payload at `malicious.example.com/payload.js`, it would be blocked due to the
site's CSP - however, if the browser trusted the fact that it has seen `abad1dea`
from `malicious.example.com` as evidence that it could get the script from
`trusted.example.com`, this would open up a vector allowing Malicious Site B
to run the `abad1dea` payload in a way that would not be blocked by the CSP.
This is why the UA still has to make the request, even though it already has
the content.

With the behavior that's been specced, a request will be made to
`trusted.example.com` which will either 404 or give a different script,
causing the XSS attack to be blocked by the page's CSP.

~~~
emn13
CSP already has a mechanism for hash-based whitelisting - if this is the only
limitation, it'd be just as easy to allow cache-sharing whenever CSP is absent
and/or the specific hash is explicitly whitelisted.

------
dccoolgai
One thing about SRI that's also great - even beyond the security concerns with
3rd party scripts - is the benefit of stability. You know whoever controls the
other end of the src attribute on your 3rd party <script> tag won't quietly
change things (even with the best of intentions) in ways that break your site.
It's an out-and-out win and I hope all browsers support it soon.
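
As a concrete illustration (the URL and digest below are placeholders), pinning
a library looks like this; the digest is computed from the exact file you
tested against:

    <script src="https://cdn.example.com/library-1.2.3.min.js"
            integrity="sha384-PLACEHOLDER"
            crossorigin="anonymous"></script>

If the file behind that URL changes in any way, the digest stops matching and
the browser refuses to run it.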

~~~
lsaferite
Any idea if there is a catchable event to detect an integrity check failure?

~~~
bugmen0t
"On a failed integrity check, an error event is thrown. Developers wishing to
provide a canonical fallback resource (e.g., a resource not served from a CDN,
perhaps from a secondary, trusted, but slower source) can catch this error
event and provide an appropriate handler to replace the failed resource with a
different one."

Source:
https://w3c.github.io/webappsec/specs/subresourceintegrity/#handling-integrity-violations
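
A minimal sketch of that fallback pattern, with a hypothetical CDN URL, local
path, and placeholder digest:

    <script>
      // Defined first so the onerror handler below can find it.
      function loadLocalCopy() {
        var s = document.createElement('script');
        s.src = '/js/library.min.js'; // same-origin, trusted (but slower) copy
        document.head.appendChild(s);
      }
    </script>
    <script src="https://cdn.example.com/library.min.js"
            integrity="sha384-PLACEHOLDER"
            crossorigin="anonymous"
            onerror="loadLocalCopy()"></script>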

------
thexa4
You could probably make an addon that tries to fetch the data from IPFS
instead of the CDN. Might be a nice way to bridge the existing web.

~~~
allannienhuis
Thanks for that reference. https://ipfs.io/ is very interesting!

------
riquito
> An important side note is that for Subresource Integrity to work, the CDN
> must support Cross-Origin Resource Sharing (CORS).

If the CDN doesn't support CORS and the browser does support subresource
integrity, is subresource integrity ignored (bad, since an attacker could
disable CORS before changing the JS) or enforced, thus refusing to execute the
JS (good)?

~~~
bugmen0t
spec co-editor here.

SRI returns false (i.e. non-matching integrity) for scripts (or stylesheets)
that do not enable CORS and are not same-origin [1]. Otherwise, an attacker
could just disable CORS to bypass SRI.

 _flies away_

[1]
https://w3c.github.io/webappsec/specs/subresourceintegrity/#does-response-match-metadatalist
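
Concretely (placeholder hostnames and digest), a cross-origin script needs both
attributes, and the CDN has to answer with a matching CORS header:

    <!-- page served from https://example.com -->
    <script src="https://cdn.example.net/lib.js"
            integrity="sha384-PLACEHOLDER"
            crossorigin="anonymous"></script>

    <!-- the CDN's response must include something like:
         Access-Control-Allow-Origin: *                  -->

Without that header, the check fails closed and the script is not executed.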

------
peteretep
This could have been used to stop the GitHub DDoS from China, if I'm
understanding it correctly.

~~~
codedokode
No. They injected code into some visitor analytics script (like GA) and those
scripts are constantly updating so you cannot calculate and store the hash.

~~~
bsder
Well, seems like a good reason to block scripts that are constantly updating,
no?

I suspect this is going to become a turbo AdBlock. If the original page
doesn't sign the content, block it.

------
chaitanya
Today, whenever any resource (script/stylesheet) on an HTTPS website loads
from HTTP, the browser rightly warns the user about insecure content.

But with SRI, it should be possible to send scripts, CSS, etc. over plain HTTP,
right? As long as the landing page is HTTPS, and the hash checks out, is there
any reason for browsers to show a warning to users then?

~~~
TorKlingberg
You would lose privacy: a snooper could see exactly what you are downloading.
It may not matter in most cases, but still not something browsers want to
compromise.

This exposes a general issue that sometimes you want data integrity but not
privacy. With https it's all or nothing.

~~~
chaitanya
Yes, that's the only thing I could think of -- loss of privacy. (On the other
hand, thanks to SNI, it's easier than ever for a snooper to know which website
you are visiting even if it's completely on HTTPS; so the loss of privacy from
switching to plain HTTP here is only incremental.)

------
ikeboy
_Subresource Integrity works on both HTTP and HTTPS._

This seems like exactly the thing they were talking about when they
started deprecating HTTP [0]. Does this mean they've changed their mind?

[0] https://blog.mozilla.org/security/2015/04/30/deprecating-non-secure-http/

~~~
dspillett
No, it just means that this method works for both HTTPS and plain HTTP - they
are not stating an opinion of either protocol here. If you are using HTTP then
this would protect you from one class of problem, but it leaves all the other
potential problems open if they are relevant to your content.

------
Animats
I've been plugging subresource integrity for months on HN, and have been
modded down for it. Now Mozilla says "Don't let your CDN betray you". I've
called some CDNs "MITM-as-a-service".

Pages which use this should detect subresource integrity failures and report
them to both the browser user and a non-CDN logging machine. Subresource integrity
should put a stop to CDNs and ISPs inserting ads and spyware, because if even
a few major sites use subresource integrity, they'll get caught quickly and
will suffer bad publicity.

This encourages using a CDN for only the bulky parts of a site. Put the
important pages (entry pages, login pages, credit card acceptance) on a server
you control, with your own SSL cert. Put the resources loaded with subresource
integrity on a CDN. Now you're not trusting the CDN at all.

Tools for website maintenance will need some improvement. Files need version
info in their names; if the content changes, the URL should change, too. Maybe
use the hash as part of the URL. Such files can have indefinite cache
expiration times; they're immutable.
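
A hedged sketch of that pattern (the hashed filename, digest, and cache
lifetime are illustrative, not a recommendation):

    <!-- the filename embeds a content hash, so new content means a new URL -->
    <script src="https://cdn.example.com/js/app.3f2a9c.js"
            integrity="sha384-PLACEHOLDER"
            crossorigin="anonymous"></script>
    <!-- the CDN can then serve it with, say: Cache-Control: max-age=31536000 -->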

------
darkr
Any idea why SHA384 is used over SHA256?

Presumably it's not for collision avoidance, and it's not like anyone's going
to be hitting the maximum message size of SHA-256 with anything stored on a CDN.

Edit: So it seems that all of the main variants of the SHA-2 family must be
supported[1], and the spec supports multiple hashes being presented at once.
It's just that SHA-384 seems to be used in all of the examples I've seen so
far.

> Conformant user agents MUST support the SHA-256, SHA-384 and SHA-512
> cryptographic hash functions for use as part of a request’s integrity
> metadata, and MAY support additional hash functions.

> When a hash function is determined to be insecure, user agents SHOULD
> deprecate and eventually remove support for integrity validation using that
> hash function. User agents MAY check the validity of responses using a
> digest based on a deprecated function.

1: http://www.w3.org/TR/SRI/#cryptographic-hash-functions
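
For example (placeholder digests), the integrity attribute accepts several
space-separated tokens, and the browser is expected to use the strongest
algorithm it supports:

    <script src="https://cdn.example.com/lib.js"
            integrity="sha256-PLACEHOLDER sha384-PLACEHOLDER"
            crossorigin="anonymous"></script>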

~~~
bugmen0t
https://github.com/w3c/webappsec/issues/477

~~~
joveian
It doesn't look like anything new AFAICT. I don't think they ever recommended
SHA-256 for TOP SECRET documents, and with a quick scan I didn't see any new
discussion in FIPS 180-4 itself.

------
popasmurf
About a year or two ago I saw someone here on HN suggest doing this very thing.
If my memory is correct, I think it was after jQuery's CDN was compromised.

[Edit] I think this was the discussion, just over a year ago! -
https://news.ycombinator.com/item?id=8359223

~~~
jimktrains2
http://jimkeener.com/posts/http is my
concept from 2013 (hash attribute) and I'm sure I'm not the first, either.
It's an old idea and I'm _very_ glad to see someone doing it!

EDIT: I just reread my post and there are some wonky ideas mixed in with (what
I think are) decent ones. Sorry!

------
joveian
Hopefully it will soon be possible to use this for software downloads even if
the files themselves are served over http or ftp (including, sadly, many open
source projects; the thing I like best about the popularity of github is that
you know you can clone over https). It is depressing how much software can be
downloaded from any of dozens of mirrors without a hash anywhere (unless your
antivirus software checks hashes of such downloads, which it sounds like many
on Windows do). Of course, these sites could already have an https page with
hashes for manual checking but almost never do. Hopefully automated checking
would convince a few to do it.

P.S. BLAKE2s would be a great additional hash to support.

------
kazinator
Or, here is another idea: host the Javascript you need in your own damn
domain!

------
progmal1
On its face this looks like a good idea, but given the hassle of injecting the
new checksum into the base file for every change, I doubt it will be done by
anyone but the most security-conscious companies.

~~~
mapgrep
1. Not many people hand-write HTML tags these days; they tend to get
generated at the level of apps like WordPress (which already has an SRI
plugin) or frameworks like Rails (which can already do things like
javascript_include_tag :application, integrity: true).

2. The people who do hand-write HTML tags tend to be precisely the type of
people who would go out of their way to generate a checksum on the
command line, or write a script to post-process their HTML files.
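
For instance, a hedged sketch of generating the value with Node's built-in
crypto module (the filename is hypothetical):

    // compute-sri.js: print an SRI value for a local file
    const crypto = require('crypto');
    const fs = require('fs');

    const file = fs.readFileSync('jquery.min.js'); // hypothetical filename
    const digest = crypto.createHash('sha384').update(file).digest('base64');
    console.log('sha384-' + digest); // paste into the integrity attribute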

------
nateabele
> _An important side note is that for Subresource Integrity to work, the CDN
> must support Cross-Origin Resource Sharing (CORS)._

So, an extra request per resource, in other words?

~~~
tveita
CORS doesn't require preflight requests for simple GETs and POSTs, since you
can always trigger those anyway. The browser can just do the request and check
the headers on the response.

~~~
nly
Wouldn't doing this on POST be dangerous? POSTs are not expected to be
idempotent, so you're trusting the server to understand and check the Origin
header etc.

~~~
codedokode
You can send a POST to any server using a JS-submitted form, so it doesn't
introduce a new attack vector.

~~~
malft
You get slightly more power in that you can post malformed multipart stuff.

------
emn13
Why require CORS? What good is that, here? If you're a malicious CDN, you'll
just not support CORS and so no one can use this to protect themselves. If the
CDN is benign but gets hacked, this just leaves the (unlikely but possible)
option of the hackers breaking CORS, waiting for the inevitable "I guess I'll
just disable SRI", and _then_ altering the subresource.

Requiring CORS simply makes security more difficult to achieve.

~~~
bugmen0t
The browser must _not_ know the content (or the hash of the content) of files
on your intranet (or any other domain that is not the one you are visiting
right now.)

See https://annevankesteren.nl/2015/02/same-origin-policy and
http://w3c.github.io/webappsec/specs/subresourceintegrity/#cross-origin-data-leakage

~~~
emn13
If the hash functions are secure, the only way for this to leak information is
if the attacker can make good guesses as to what the resource is a priori, and
then use this to verify it.

Fair enough, that's some information leakage, but it's certainly not _easy_ to
exploit. Normal cross-origin limitations still apply, so you'd need to get
creative to even get the information in the first place, and if you can, it's
not clear what this adds over a timing attack.

I'm still a little skeptical such a heavy handed restriction is necessary to
maintain the current level of security; but then again - why take the risk?

------
z3t4
I think it's just better to download the files and host them yourself.

The only reason you would want to include third-party scripts is to get
automatic updates, bug fixes, etc., right?

~~~
jszymborski
I have to disagree. Typically, having a 3rd-party CDN-hosted library like
jQuery change on you is most definitely unexpected behaviour and not desired.

The main purpose of CDN-hosted scripts is to allow them to be cached in your
browser, reducing latency via nearby edge servers and cutting load times
across websites via caching.

This proposal prevents malicious changes, as well as cache poisoning, which was
a very scary threat up until this announcement due to attack vectors like the
one described in this DEF CON talk:
https://www.youtube.com/watch?v=kLt_uqSCEUA

(PDF slides here:
https://defcon.org/images/defcon-20/dc-20-presentations/Alonso-Sur/DEFCON-20-Alonso-Sur-Owning-Bad-Guys-Using-JavaScript-Botnet.pdf)

~~~
simoncion
> The main purpose for CDN hosted scripts is allowing them to be cached in
> your browser...

Hopefully, subresource integrity schemes will eventually allow browsers to
fetch from local cache based on the checksum of the file contents, rather than
the URI from which the resource is served. :)

------
ihsw
Next can we apply it to whole HTML documents (including iFrames)? Then
malicious ISPs/malware/etc would be unable to inject their ad spam code.

~~~
matt_heimer
This works because you trust the HTML provider (as a source of a valid hash)
but not the CDN. If you don't trust your ISP then getting trusted hashes
becomes more interesting. You'd need an encrypted/unmodifiable connection to a
3rd party repository of hashes for HTTP content but that'd only work for
shared identical content (like the stuff in a CDN). For personalized content
(any website with a login) you need a personalized hash. An HTTPS response
could have a header with a hash (RFC 3230: Instance Digests in HTTP), but you
can only trust that if you trust HTTPS; and if you trust HTTPS, then what does
the header really add? You probably need two ISPs, or at least VPNs, if you want
to start detecting tampering by an ISP.

~~~
ihsw
The idea is to provide protection even in HTTP-only environments (as HTTP is
enforced in some places).

It would add overhead to verify hashes in the manner that you have mentioned
but I think it's worth it.

------
travjones
Makes sense. Subresource integrity looks very easy to implement for developers,
but what about CDNs? Is it difficult for them to implement CORS?

~~~
bugmen0t
For static resources, it's nothing more than sending this HTTP header:
"Access-Control-Allow-Origin: *".

We reached out to jQuery, and code.jquery.com has been doing this for a few
months now.
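
If you serve static files yourself rather than through a CDN control panel, a
minimal sketch in Node (illustrative only; no path sanitization or error
handling):

    const http = require('http');
    const fs = require('fs');

    http.createServer(function (req, res) {
      // The one header SRI needs for cross-origin integrity checks:
      res.setHeader('Access-Control-Allow-Origin', '*');
      res.setHeader('Content-Type', 'application/javascript');
      fs.createReadStream('./static' + req.url).pipe(res); // sketch only
    }).listen(8080);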

------
Ellipsis753
I still don't really understand why we can't just run any files in cache (from
other sites) with the same hash.

If I have the sha-256 of an exe file, I'm perfectly happy to run any exe file
with the same sha256 simply because collisions don't happen. Why is this
different for JavaScript?

If an attacker can inject HTML script tags into your website haven't you
already lost?

~~~
airza
Content Security Policy is a defensive technology which makes the answer to
your last question "no." Attackers still need to have their script appear to
execute from a whitelisted domain, which is only possible if the system you
propose is enacted. I.e., have your own random webpage which loads a script
with hash X, then redirect to an XSS hole on the victim's site which appears to
load that script from a whitelisted domain. Since it is cached from the first
site, it will be loaded as if it were hosted on the second site and thus
bypass CSP.

------
neoCrimeLabs
Perhaps I'm missing something here. So the browser can check the integrity of
the included scripts like this:

        <script ... integrity="sha384-..."></script>

How will the browser confirm that the source HTML requesting the script isn't
modified by the CDN?

Well-known CDN services, such as CloudFlare, are known for modifying the HTML
-- this is not uncommon.

~~~
Sidnicious
This is probably meant for pages where you serve the HTML, but JavaScript
libraries and styles may come from a CDN.

------
ff7c11
Why is this better than the Content Security Policy header where you can
specify a hash or the hash attribute on a script tag?

~~~
Someone1234
I'm a big fan of CSP, but even I have to admit that CSP is quite
large/expensive for an HTTP header field.

It might be different for other sites/stacks but on ours we deliver CSP at the
whole site level, meaning it is delivered with every response we send.

Script integrity is only sent when that specific script is used, and it means
our workflow doesn't have to change to rewrite an HTTP header dynamically with
each page (based on which scripts are or aren't on that specific page).

I legitimately have no idea how I would implement CSP with hashes for the
scripts on that specific page. It would require me to actually patch the
software stack upstream. I do however know exactly how I'd use the integrity
field on a script block and could implement it with just raw HTML.

PS - Not to mention that few browsers support level 2:
http://caniuse.com/#feat=contentsecuritypolicy2
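
Roughly, the difference looks like this (placeholder digests; a hedged sketch,
not working config): CSP Level 2 puts the hash in a site-wide response header,
where it whitelists a matching inline script, while SRI puts the hash on the
individual tag that loads an external file.

    Content-Security-Policy: script-src 'self' 'sha256-PLACEHOLDER'

    <script src="/js/page-specific.js" integrity="sha384-PLACEHOLDER"></script>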

~~~
codedokode
The CSP header doesn't need to be added to every response, e.g. for images or
JS files.

------
kra34
Wouldn't it be faster (in terms of the browser being able to use the resource)
to just skip the CDN and host the resources yourself?

~~~
deathanatos
Not necessarily. Take jQuery: _many_ sites use it. If both sites A and B load
it from a CDN, and I've already visited site A, then when I visit site B,
jQuery is already in my cache: we might not even need to request it, despite
never having been to site B before.

However, traditionally, the CDN now controls the content of your JS, and could
inject whatever they want into it. That's where this proposal comes in…

(Of course, if it _isn't_ in the cache, then you might need as much as a DNS
lookup + TCP connect + a TLS handshake to another host… tradeoffs. HTTP/1.x is
also limited to n connections per DNS name at a time, so you can parallelize
requests by hosting across multiple domains, such as a CDN, but I find this
argument less compelling.)

~~~
codedokode
Different sites might use different versions of a library. Also, the cache gets
polluted very quickly because many sites add caching headers to every request
and cache size is limited.

If the average page size is 500 KB and jQuery is 30 KB gzipped, you do not
save much by hosting it on a CDN. What you get is more DNS requests, more
downtime when that server fails or stalls, and you give out data about your
users.

I think it is easier just to host everything on your own server.

------
jszymborski
This is absolutely excellent! I always hold off on using 3rd party hosted
scripts when feasible, but this'll change things for sure!

Congratulations Mozilla!!! This is one of the few recent changes I've seen to
browsers that fundamentally changes web-page security in a simple and novel
way.

------
theandrewbailey
Mozilla docs:
https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity

------
Olap84
Chrome already has this, for your information:
https://www.chromestatus.com/feature/6183089948590080

------
nodesocket
Here is a nice Node.js package to generate an SRI hash from a file:
https://www.npmjs.com/package/node-sri

------
nly
Great! Now if only we can get asymmetric document signing going, including a
"must be signed" header to complement HSTS...

SRI does little for trust when your CDN is also proxy-caching your HTML (e.g.
Cloudflare).

------
killnine
Simple, yet seemingly effective. As long as the hash the browser is checking
matches the hash the server produced, the only ways to beat this that I see
are a hash collision or bypassing the mechanism.

------
ape4
I like the concept, but there will probably be more failed integrity checks
because of procedural mess-ups than actual attacks.

------
carsonreinke
I wonder about situations where this is used with 3rd parties (such as Facebook)
and how they could distribute changes.

~~~
MichaelGG
I doubt Facebook or any third parties will find much benefit to this. They
will just say that they will protect their own CDN. Otherwise they effectively
kill their ability to update their scripts. In fact, I'd expect them to
deliberately disable this by updating the script with a random byte every so
often.

If a site wants to pin a third party to a specific version, they'll need to
copy the file themselves. Though I'm not sure if this can be detected and
"fixed" by the script author. I've noticed that Stripe's js file logs a
warning if it thinks it's being loaded from another domain.

~~~
carsonreinke
I mean, if Subresource Integrity becomes commonplace, someone might do this with
a third party. The spec does not seem to address this.

------
ycitera
or just don't use CDNs... they are just another privacy issue on top of all
other problems...

------
betimsl
Though, I don't see why CDNs wouldn't replace the hash with the hash of the
injected file.

~~~
sarciszewski
The CDN doesn't control the hash, your website does.

~~~
rubbingalcohol
Some sites host their static HTML on a CDN too. Example: anyone who uses
Cloudflare.

~~~
codedokode
If you use a proxy server that terminates HTTPS connections, you have to trust
it. There is nothing you can do.

------
codedokode
That is absurd. To prevent malicious modifications and data leaks you can just
host the code on your own server. Using a free CDN (without an SLA) you just
increase your site's downtime and get nothing in return (well, except that the
company and the NSA can now collect the IPs, UAs, and referrers of your users).

CDNs are used to optimize traffic costs, but hosting just a single JS library
there won't save you much.

Calculating hashes is bothersome and requires modifying the app, so probably
nobody is going to use it.

I never use libraries hosted at free CDNs.

EDIT: I cannot think of a scenario where this feature can be useful.

EDIT 2: And you cannot use this feature for scripts like Google Analytics
because they can be modified anytime.

~~~
err4nt
It's not just about CDN-hosted files; it's also a way to not execute untrusted
JS. Let's say you have a high-traffic WordPress site with a dev environment.
You might have a clean theme and know all your JS, but be using plugins other
people have written.

Using this, it looks like you could put the hash in your markup, and if their
JavaScript code changed it wouldn't execute. For some people, having that dead
man's switch might be better than an always-execute policy on JS that you
aren't writing yourself.

~~~
codedokode
There is CSP (
[https://en.wikipedia.org/wiki/Content_Security_Policy](https://en.wikipedia.org/wiki/Content_Security_Policy)
) to protect againgst inclusion of unapproved third party scripts. Hash check
won't help against malicious WP themes because they would not output integrity
attribute into generated HTML.

