
My idea to significantly speed up web browsers with an HTML “checksum” attribute - dheera
https://code.google.com/p/chromium/issues/detail?id=554981&thanks=554981&ts=1447349693

Downloading numerous file dependencies can slow down the web browsing
experience. These include frameworks such as Polymer, Angular, jQuery, and
jQuery UI, which are often huge, comprise tens of separate HTML, JS, and CSS
files, and are duplicated across multiple servers, often out of necessity.

CDNs partially solve this problem, but there are still numerous competing
CDNs, as well as places (e.g. China) where one's choice of CDN may not work
well.

I propose an HTML attribute "checksum" where one can specify the checksum of
a file, e.g.

    <link rel="import" href="/js/polymer/polymer.html" checksum="44194508fca6f3187c5db3602c72fb03bb6055b8">

    <script src="/js/jquery.min.js" checksum="43dc554608df885a59ddeece1598c6ace434d747"></script>

    <img src="/images/some_huge_image.jpg" checksum="cb3c80c5d6f14bcd5a134399253c0566951af38e">

Web browsers should then keep track of the SHA-1 checksums of all cached
files, and if a file with a matching checksum is found, REGARDLESS OF ITS
ORIGIN, the file is not requested again from another server.

This would further speed up the web experience as we move forward with
Angular, Polymer, Web Components, and other technologies that bring a huge
number of file dependencies.
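
To make the intended behavior precise, here's a rough sketch of the lookup a
browser could perform (plain JavaScript; `contentCache`, `fetchWithChecksum`,
and `toHex` are hypothetical names I made up for the sketch, not an actual
browser API):

    // Hypothetical content-addressed cache shared across origins:
    // maps a SHA-1 hex digest to a previously fetched body.
    const contentCache = new Map();

    async function fetchWithChecksum(url, checksum) {
      // Hit: any origin's copy with a matching digest is reused,
      // so no network request is made at all.
      if (contentCache.has(checksum)) {
        return contentCache.get(checksum);
      }
      // Miss: fetch normally, verify the digest, then cache by digest.
      const body = await (await fetch(url)).arrayBuffer();
      const digest = await crypto.subtle.digest('SHA-1', body);
      if (toHex(digest) !== checksum) {
        throw new Error('checksum mismatch for ' + url);
      }
      contentCache.set(checksum, body);
      return body;
    }

    function toHex(buffer) {
      return [...new Uint8Array(buffer)]
        .map((b) => b.toString(16).padStart(2, '0'))
        .join('');
    }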
======
27182818284
The first time I read your post I thought you might be referring to some of
the stuff addressed in
[http://www.html5rocks.com/en/tutorials/security/content-security-policy/](http://www.html5rocks.com/en/tutorials/security/content-security-policy/),
with its SHA-1 hashes for scripts.
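
For reference, that CSP mechanism allowlists individual inline scripts by
hash in a response header; the digest below is a placeholder, and current
browsers accept sha256/sha384/sha512 sources:

    Content-Security-Policy: script-src 'sha256-PLACEHOLDER_BASE64_DIGEST='

An inline `<script>` block is then only executed if its contents hash to one
of the listed values.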

There is also the idea of bundling popular scripts into the browsers
themselves. There are a few discussion threads on StackOverflow about it,
like
[http://stackoverflow.com/questions/8287607/why-is-jquery-not-integrated-within-the-browser](http://stackoverflow.com/questions/8287607/why-is-jquery-not-integrated-within-the-browser)

~~~
dheera
The problem with bundling popular scripts into web browsers is that they
evolve too fast. There would also be inevitable politics (e.g. Microsoft and
Apple each inventing and bundling their own equivalent of Google's Polymer)
and a lack of standardization.

On the other hand, organizations like the W3C move too slowly in
standardizing widely and imminently needed features (hence band-aid
solutions and meta-languages, whether server- or client-side: jQuery, SCSS,
CoffeeScript, etc.). HTML still doesn't have a standardized <input
type="slider"> and it's 2015. So these popular scripts, frameworks, and
their ever-evolving versions are often a necessary evil: new technologies
and their markup needs will keep appearing, and standards bodies will never
catch up.

Checksums for caching would solve the problem while effectively achieving
this "bundling" in a dynamic, democratized fashion: whatever a user
encounters gets "bundled" (i.e., cached) as they browse, without having to
update the browser software itself.

The same strategy also applies to commonly used webfonts, images, icon sets,
and so on. For example, although there are attempts to centralize fonts,
such as Google Fonts, I'm still forced to serve a local copy to keep my site
accessible in China. Checksums would solve the same problem while
decentralizing the whole initiative: as long as the user previously got the
same font from Google Fonts, or even from another site's local copy, and the
checksum matches, the browser wouldn't waste bandwidth fetching my local
copy.
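
Under the proposal, the markup would be the same as for scripts, e.g. (the
path and digest here are placeholders):

    <link rel="stylesheet" href="/fonts/roboto.css" checksum="PLACEHOLDER_SHA1_HEX_DIGEST">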

------
rnhmjoj
This is basically the core idea of ipfs: [https://ipfs.io/](https://ipfs.io/)

~~~
diggan
Yes and yes for using IPFS for this! It's a perfect use case: easy to
distribute, easy to verify that you're actually getting the right thing, and
the filename (really a hash) is derived from the content. It would be
trivial to write a library that helps with this, and then people just have
to throw up IPFS nodes and we'll have a community CDN running in no time.
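
As a sketch of how that would look (the CID below is a placeholder; `ipfs
add` prints the real, content-derived one):

    $ ipfs add jquery.min.js
    added QmPlaceholderCid jquery.min.js

    <script src="https://ipfs.io/ipfs/QmPlaceholderCid"></script>

Any gateway (or a local node) can serve the file, and the hash in the URL is
exactly the integrity check the parent post asks for.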

------
spankalee
See
[https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity](https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity)

~~~
dheera
Thanks! It seems that the "integrity" attribute is aimed at security:
ensuring that a script fetched from a third party (e.g. a compromised CDN)
hasn't been tampered with. I would totally support that as well. Perhaps
browsers could also take advantage of the "integrity" attribute for
cross-site caching, in addition to its intended integrity-checking purpose?
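
For reference, SRI markup looks like this (the digest is a placeholder; real
values are base64-encoded SHA-256/384/512 hashes of the file, and
`crossorigin` is required for cross-origin fetches):

    <script src="https://cdn.example.com/jquery.min.js"
            integrity="sha384-PLACEHOLDER_BASE64_DIGEST"
            crossorigin="anonymous"></script>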

~~~
vorotato
Cross-site caching adds all the risk you intend to take away by adding the
integrity attribute. What if someone makes a Jquery.js file that hash-
conflicts with the version you have, but has a malicious payload. You visit
bob.com, get the infected jquery with the hash 0xcafebabe, which just so
happens to match with jquery's 0xcafebabe. You might say wow how unlikely, but
with entire botnets dedicated to guessing matching hashes, something like this
could be devastating.

