Presumably, the "[Warning] Do not copy or self host this file, you will not be supported" notice is there because they change the script reasonably often, meaning that using SRI would require sites to update their hashes on every linked page every time the script changes or it will stop working --- probably not what they want.
Yeah, the best way to handle this is with a version in the path and then the host can knowingly/willingly upgrade the library version and SRI hash at the same time.
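Roughly what that looks like (hypothetical host, and the digest is just a placeholder for the real one):

    <script src="https://cdn.example.com/somelib/1.2.3/somelib.min.js"
            integrity="sha384-PLACEHOLDER_DIGEST_OF_THAT_EXACT_FILE"
            crossorigin="anonymous"></script>

Bumping to 1.2.4 then means changing the path and the integrity value together in the same deploy, so nothing changes out from under you.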
That doesn't really solve this problem, it just slows down both the rollout of the subverted version and the rollout of the fix. People aren't auditing the (minified) javascript they put onto their sites.
SRI is good for use with a CDN, where the same entity controls both the HTML that references the JS and the JS being referenced. In that case it keeps someone who subverts the CDN from being able to XSS the site.
That depends on what "this problem" is -- lots of people here would say that the problem is websites that depend on unaudited, untested 3rd party resources. I can tell from your other comments that you think it's safe to trust 3rd party resources from places like Google. So there's a disagreement that is worth talking about explicitly.
Most sites are built on lots of unaudited, untested (by them) third party code server side. Adding some client side isn't great, but it also isn't a fundamental change to the dynamic.
Uh, so that's terrible added to terrible, but it doesn't really address what I just said!
If I were the mean person at IBM who buys startups and makes them "right", it's a lot more work to fix terrible added to terrible than plain terrible. And my last startup got bought by IBM, so I've experienced that pain personally.
If there’s a version in the path, then the website referencing the script can include a hash, secure in the knowledge that the resource they reference should never change.
> Want to know how you can easily stop this attack? What I've done here is add the SRI Integrity Attribute and that allows the browser to determine if the file has been modified, which allows it to reject the file.
SRI does not fix this problem. If you put an integrity attribute on the script, then the next time BrowseAloud releases to prod, their script will stop working on your site.
This is a product that works by running a script on your page to make changes. There's no option for defense in depth here: either you trust that their processes are secure enough that they're not going to XSS you, or you shouldn't run their code on your site at all.
The bar for including javascript from other sites should be a high one, but there are times when the tradeoff is reasonable. For example, I have Google Analytics [1] on my site, and I trust them to handle this responsibly.
>The bar for including javascript from other sites should be a high one
Yes, but that's not how most developers think these days. Try browsing the Web with NoScript and you will routinely witness dozens of domains in the block list.
>The bar for including javascript from other sites should be a high one
>but that's not how most developers think these days.
The decision to include third party javascript is sometimes not even up to developers these days.
"A deal has been signed with company X, put their widget on the site" is something I've now heard a few times.
Arguments about the third party code greatly increasing page load time, page size, introducing security vulnerabilities etc then fall on deaf ears. High developer turnover seems to co-occur in these environments.
> For example, I have Google Analytics [1] on my site, and I trust them to handle this responsibly.
I don't. And I don't appreciate people injecting Google beacons on their website. Same thing with Facebook, Twitter, Disqus and all the other shit-scripts.
I use uMatrix and it’s given me a pretty good lesson on how much stuff sites load from 3rd-parties. Many sites need me to play whack-a-mole to get them to display: which 3rd party sites do I need to allow to get the content to show up?
I’m really torn by this sort of thing.
On one hand, when many sites use jQuery (for example), there’s huge benefits (bandwidth, speed, etc) in having it loaded from one location relatively infrequently and cached for many pages. This is exactly the promise of shared libraries, just on a much wider scale.
On the other hand, why aren’t web site operators delivering all of the code that is needed for the site to function? If they want all their users to execute that code, why aren’t they willing to serve it themselves?
I’m not sure what the right answer is, but this incident is a pro for pulling in all of your dependencies. The limited data plans that a lot of people have are a big pro for hosting libraries centrally.
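For what it's worth, the shared-hosting approach and SRI aren't mutually exclusive. The usual pattern (the version here is just an example, and the digest is a placeholder for the one published alongside the file) looks like:

    <script src="https://code.jquery.com/jquery-3.3.1.min.js"
            integrity="sha256-PLACEHOLDER_DIGEST_PUBLISHED_FOR_THIS_FILE"
            crossorigin="anonymous"></script>

The crossorigin="anonymous" part matters: for a cross-origin script the browser can only verify the integrity value if the file is fetched with CORS, otherwise the check fails and the script is blocked.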
> which 3rd party sites do I need to allow to get the content to show up
This is a game I'm increasingly unwilling to play. The number of domains has got out of hand, and they're increasingly vaguely named: whether one is a CDN, ads or tracking is often almost impossible to judge.
Life's too short, now that sites frequently call out to 15 or more domains, for me to jump through hoops just to get some text to display. Images I'm usually happy to dispense with. So I increasingly favour sites that don't need these games. There's still more than enough of them, for now at least!
> On one hand, when many sites use jQuery (for example), there’s huge benefits (bandwidth, speed, etc) in having it loaded from one location relatively infrequently and cached for many pages.
Does it really decrease bandwidth usage and load times? Is there a study that looked at that specifically in recent months, or is it just an assumption? I'm getting the feeling that including common JS libraries from shared hosts is on the decline, and more and more developers are bundling node.js packages into their frontend code. Not to mention that there are quite a few CDNs all providing the same service for the most common frameworks, which splits the cache and works against this theory.
Load times, I believe yes, unless you have your own CDN. But you’re right that the odds that any particular version is in the user’s cache are low. I gathered some links on the subject a while ago: http://justinblank.com/notebooks/browsercacheeffectiveness.h....
> On one hand, when many sites use jQuery (for example), there’s huge benefits (bandwidth, speed, etc) in having it loaded from one location relatively infrequently and cached for many pages. This is exactly the promise of shared libraries, just on a much wider scale.
This is an interesting comment.
So, packing all the JS into one file + optionally removing unused symbols would be similar to statically linked binaries. If optimized well, this would be the smallest amount of code / binary a system has to load to execute the program, but if you have multiple binaries, you might load the same library code multiple times. And security updates of libraries require you to recompile everything. But, in this context, you have to trust basically one JS-file from the host you're accessing.
On the other hand, dynamic libraries minimize the bandwidth / memory required to hold the code for a set of binaries, because libc/jquery/... just gets loaded once, instead of once per application. And, technically, shared libraries simplify security updates - update openssl, restart services, and hope that RHEL tested well so you don't end up with a mess. This would be really cool, if you think about it. Depend on magic-js-provider.acme.org/jquery/2.1.x.js, and get security fixes like that. Except the testing would be a nightmare.
Near the end of the post, the author suggests using SRI to prevent the compromised-CDN problem. However, that only works for linking to static files (e.g. a specific version of a JS library).
The right answer is to specify a cryptographic hash when serving js from third parties.
CSP/SRI already enables this.
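(On the CSP side it's only a draft so far: there's a require-sri-for directive, roughly

    Content-Security-Policy: require-sri-for script

which tells the browser to refuse any script that lacks an integrity attribute. Last I checked support was experimental and the directive's future was uncertain, so treat that header line as a sketch rather than something to rely on everywhere.)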
Ideally you should be able to serve content from your own domain + specify the hash, and let browsers optimize things by re-using cached content from another domain. There’s no easy/standardized way of doing that today.
I think you can already build a browser extension that intercepts loads and gets things with script integrity from a cache, another domain, ipfs, whatever.
There is/was discussion in the standards body of using the SRI hash for exactly this purpose. It sounds really promising but iirc there was a privacy kink to work out.
Could you limit it to widely used libraries? e.g. browser ships with the last N versions of jQuery etc., and avoids the request if it's a known hash. If it's widely used, it seems like the amount of information leaked would be low, and if the browser _always_ ships with it (rather than caching on first use), it would only identify the browser and not browsing history anyway.
As a bonus, developers could use it knowing that there is much less likely to be a performance hit from fetching it.
Which, incidentally, exposes an info leak. Timing the load time for a cached script reveals if you've loaded that script before. Value low for extremely common libraries, higher for less common libraries.
Use of integrity hash validation is pretty limited -- I see 90k sites in the top 10 million.
It's a shame this isn't more popular, I'd love to build a browser add-on that uses the integrity hash as the name of the script, and load them from ipfs or something.
Top sites: gov.uk, nhm.ac.uk, change.org, blogs.worldbank.org, handbrake.fr, army.mil, genome.gov, ...
SRI[0] is still a fairly new technology with limited browser support. I expect its usage to grow, but it's worth keeping in mind that the use of SRI does not matter at all if the client doesn't support it.
As a guy who crawls, indexes, and archives websites, subresource integrity matters to me whether or not the client supports it. You probably had end-users in mind.
Yeah, exactly, I was thinking of end users. Since support in end users' clients dictates what websites will implement, I think we'll see use of SRI increase over the coming years.
There are a bunch of crawlers that aggregate that kind of info -- I'm building a new search engine, and I'm not in the business of publishing stuff that would encourage anyone to block my crawler.
builtwith.com is an example, but they're being pretty strict with what you can see for free these days. And hey, they only know about 2k sites using script integrity, so maybe I've got a bigger crawl than they do! :-)
We are clearly at the antivirus level of protection (blacklists), kept for backwards compatibility, instead of a whitelist of allowed domains or features.
This makes me wonder what happens when a popular nodejs library gets used in this way. What could hackers do with thousands of compromised nodejs servers?
Yeah, the complete lack of signing on npm libs, plus the large dependency trees that can trip you up (e.g. the left-pad problem), is only going to cause more issues as attackers move on to that as a vector.
Something like JS 'permissions' could help here, i.e. load a script but limit how much CPU/GPU access it has, plus control whether it has DOM access, XHR access, etc. So you load the script and tell it what it can do. Also, developers taking security more seriously would help, but it is rarely a priority, since risk reduction is not paid for today.
I like this option, similar to how apps request permissions on a mobile OS. There could be a default list of modifications an external javascript script can make; any further access must be explicitly whitelisted by the developer. Then even if the script were malicious, at least the impact would be controlled.
for instance, maybe the default is that an external script can read the DOM but needs extra permissions to write/modify it?
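The closest thing the platform offers today (not per-script permissions, but in the same start-from-nothing spirit) is the sandbox attribute on iframes, where capabilities have to be whitelisted explicitly. A minimal sketch with a made-up embed URL:

    <iframe src="https://widget.example.com/embed.html"
            sandbox="allow-scripts"></iframe>

The embedded code can run script, but it gets a unique origin, can't touch the parent page's DOM, and can't submit forms or open popups unless those are explicitly allowed. It only helps for code you can isolate in a frame, though; a script included directly on the page still gets everything.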
It’s similar. If I remember the CCN playbook correctly, it was about taking over one of your JS dependencies via npm. This is about cross-hosted JS, i.e. loading remote code.
A useful tech it is. The problem is that 3rd party code from adnets changes all the time, and they will never tell you about it, because they hide all kinds of anti-clickfraud tricks in there: from intentionally broken JS syntax, to intentionally broken Unicode, to actual 0day exploits.
When I first read this discussion, I just assumed that BrowseAloud was some sort of ad-tech or analytics code. But it's assistive technology for screen readers. The web sites that were compromised were ones trying to do the right thing to help disabled users.
Ad-tech is a giant security hole that can't be fixed without burning it all down, but BrowseAloud could be fixed.
How would I get Mozilla to warn me when a script is loaded without Subresource Integrity? I'd like to avoid being caught unawares by this type of security hole, especially if it lets third parties execute code in my browser without any controls.
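The closest I've found so far is CSP in report-only mode. It won't flag a missing integrity attribute as such (that was the point of the draft require-sri-for directive mentioned elsewhere in the thread, which doesn't seem to be reliably supported), but it will at least tell me every host my pages pull script from. Something like this response header, with a made-up report endpoint:

    Content-Security-Policy-Report-Only: script-src 'self' https://www.browsealoud.com; report-uri /csp-reports

Any script loaded from a host not on that list generates a JSON violation report to /csp-reports instead of being blocked, so an unexpected third party shows up in the reports without breaking the page.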
> What I've done here is add the SRI Integrity Attribute and that allows the browser to determine if the file has been modified, which allows it to reject the file
Wouldn't this negate one of the benefits of a 3rd party hosted SaaS?
Wouldn't you then have to redeploy every time your provider updates their lib?
I haven't seen it used much with 3rd party service providers' code that sits somewhat outside your code boundaries; take two different popular 3rd party SaaS products that are included as JS in sites as an example.
Am I weird in that I'd prefer to host a known version of a dependency rather than effectively hot-linking like this?
Avoiding this sort of attack is just a side benefit to the main premise of knowing that it's always going to work consistently.
If you use 3rd parties such as analytics and chat scripts, you don't have control over whether they legitimately change the response. I get it, they are supposed to use versioning and bump, but you'd be surprised.
All your hard coded hashes fail and your 3rd party analytics, chat, etc all stop working. Is there a solution to that?
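The only partial answer I know of is that the integrity attribute accepts more than one digest, so during an announced rollover you can list both the old and the new hash and the browser will accept a file matching either (placeholder digests here):

    <script src="https://chat.example.com/widget.js"
            integrity="sha384-OLD_PLACEHOLDER_DIGEST sha384-NEW_PLACEHOLDER_DIGEST"
            crossorigin="anonymous"></script>

That still only works if the provider versions their releases and tells you the new hash ahead of time; it does nothing for a vendor that silently swaps the file under the same URL.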
There are still large numbers of people on older browsers that don’t support the integrity attribute. It’s not foolproof, but it’s one of those things you can do to improve the experience and security, with no side effects for older browsers and benefits for new(er) ones.
I suspect what we'd actually end up with is miners PLUS ads. There's little incentive to not use both forms of monetisation, as neither prevents the other from being used.
I also don't think it's a great trade-off for a variety of reasons (environment impact, possible wear and tear on user's machines, battery life on mobile devices etc.).
No. It. Won't. The issue isn't PoW/PoS but the loading of infected code into browsers. PoS may remove the need for this particular JS to be injected, but it won't stop any other JS from being injected.
In principle, sites should be secure. In practice, putting an implicit bug bounty on every widely-used Javascript library does produce more exploitation.
I think you missed an opportunity to engage your parent comment more productively.
There is hardly even a nod to security, no defense in depth, and no cryptographic protections. There is widespread loading of untrusted unvetted code. The operating principle of the web seems to be, "it's okay to do this, everyone else is."
I'm sorry, I'm still not following. How can CSP reporting be used by marketers? It requires a header sent from the server, so it's not like a tracking pixel that can be added by a third party. And I'm not sure about local debugging only, locally it offers no benefits over just viewing the devtools console, whereas it offers a lot of benefits when enabled on users of your sites.
> when visiting the ICO website
That is... amusingly ironic.