
Building for HTTP/2 - saidajigumi
http://rmurphey.com/blog/2015/11/25/building-for-http2
======
krschultz
Nice to see an in-depth article on a topic working its way up Hacker News.
There's no hyperbole, just solid recommendations based on facts.

------
holloway
It's been found that resource concatenation is still more efficient:
[http://engineering.khanacademy.org/posts/js-packaging-http2.htm](http://engineering.khanacademy.org/posts/js-packaging-http2.htm)

And 'building for http2' seems to be advocating hashes in filenames without
necessarily walking the dependencies, which can cause cache problems:
[http://holloway.nz/r/hashes-in-filenames/](http://holloway.nz/r/hashes-in-filenames/)

ES6 imports are static (you can't use JS to dynamically include a file), so
transitive dependencies need to be resolved at build time, not by a manifest
file that resolves paths client-side. This also applies to (e.g.) SVGs that
reference JPEGs, because you probably don't want JS involved there either.
Manifest files are a bad idea, except perhaps for instructing webservers to
tell browsers to preemptively download files. AFAIK there isn't a standard for
this yet, so a manifest doesn't solve the problem.
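
To illustrate the static constraint (the module and filenames below are made
up): an ES6 import specifier must be a string literal, so a build tool has to
rewrite it to the hashed filename; it can't be computed at runtime.

    // The build tool must rewrite this string literal to the hashed
    // name; there is no way to compute it here at runtime:
    import { formatDate } from './vendor-3455345ABCDFFF.js';

    // This is a SyntaxError (the specifier can't be an expression),
    // so a client-side manifest lookup can't help with static imports:
    // import { formatDate } from manifest.resolve('vendor.js');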

PS: I've got a gulp build for HTTP/2-style packing and I'll release it on my
GitHub in a few weeks (same account as my username).

~~~
magicalist
> _It's been found that resource concatenation is still more efficient
> [http://engineering.khanacademy.org/posts/js-packaging-http2.htm](http://engineering.khanacademy.org/posts/js-packaging-http2.htm)_

...for a page with 2.4MB of JS spread over 296 files.

And, notably, the article doesn't advocate serving individual files in the
first place. It says that some level of grouping is likely advantageous, but
that HTTP/2 opens up new approaches which will need to be explored, like
grouping by level of churn in libraries or by units of functionality.

> _And 'building for http2' seems to be advocating hashes in filenames without
> necessarily walking the dependencies_

I believe you misread. They use a shallow tree in their scout JS file: all of
the resources a page will need are declared there, and nothing is loaded in
succession.
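
For what it's worth, here's a minimal sketch of the scout pattern being
described, with hypothetical filenames: a tiny, short-cached file that
declares every hashed resource the page needs up front.

    // scout.js: short cache time, tiny, and the only unhashed URL.
    // Everything it names is content-hashed and cached for a long time:
    requirejs([
      'vendor-3455345ABCDFFF.js',
      'application-45345345345.js'
    ], function () {
      // boot the app once both files have loaded
    });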

> _Manifest files are a bad idea, except perhaps for instructing webservers to
> tell browsers to preemptively download files. AFAIK there isn't a standard
> for this yet so a manifest doesn't solve the problem_

I'm not sure what this means. The manifest is generated by the build process.
There is no standard for dependency manifests, sure, but there's no reason you
can't whip a reasonable one up that a simple server module could understand if
you wanted to add push support. Eventually the cow paths will be paved and
we'll come up with a more standardized approach (or approaches) to that kind
of manifest.
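
As a sketch of what such a manifest might look like (the shape and names here
are invented, not from any standard): the build emits a JSON map from entry
points to their hashed dependencies, and a small server module turns it into
preload hints that an HTTP/2-aware front end could use for push.

    // manifest.json, emitted by the build (a hypothetical shape):
    // {
    //   "application.js": [
    //     "/static/application-45345345345.js",
    //     "/static/vendor-3455345ABCDFFF.js"
    //   ]
    // }
    var manifest = require('./manifest.json');

    // Build a Link header; HTTP/2-aware servers and proxies can turn
    // rel=preload hints like this into server pushes.
    function preloadHeader(entry) {
      return (manifest[entry] || [])
        .map(function (url) { return '<' + url + '>; rel=preload; as=script'; })
        .join(', ');
    }

    // e.g. res.setHeader('Link', preloadHeader('application.js'));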

~~~
holloway
> for a page with 2.4MB of JS spread over 296 files.

Fair point, there is always a sweet spot in benchmarks like this. That said,
they mention some delays which seem to be independent of the number of
requests.

> _I believe you misread. They use a shallow tree in their scout js file; all
> of the resources a page will need are declared there_

Being shallow doesn't fix the problem once resources are split up at all, and
splitting is exactly what the article (correctly) suggests in order to benefit
from HTTP/2 features. As we move away from a single bundle.js file we need to
think more about how to manage splits. E.g. the article talks about splitting
into `application` and `vendor` files, and you mention grouping by churn or
functionality. So presumably we agree that being able to split files without
significant speed loss is a major feature of HTTP/2.

All I'm suggesting is that the filename hashing strategies need to be improved
to avoid application errors when splitting.

If we don't improve filename hashing strategies then we'd still have to rely
on short HTTP cache durations, or we'd get temporary errors while caches
expire, which is a problem because the whole point of the article is to have
extremely long cache durations. As they say,

"Files loaded by the scout can have extremely long cache times because [...]"

Then as the article says,

"This can be solved by using hashes of the file contents rather than version
numbers: vendor-d41d8cd98f.js. If a file has not changed, its hash will remain
the same."

This is bad advice because they don't talk about dependencies affecting
hashes. Even if a file hasn't changed, if it depends on another file that has,
then its hash needs to change too, or else application.js will have different
contents depending on when it was loaded. E.g.

application.js has a requirejs import of vendor.js

requirejs(["vendor.js"], function(vendor){ });

So if we generated a naive hash as suggested then application-45345345345.js
might look like,

requirejs(["vendor-3455345ABCDFFF.js"], function(vendor){ });

Then vendor.js changes but application.js doesn't, and so
application-45345345345.js is updated to read,

requirejs(["vendor-f2ab4cde6f55.js"], function(vendor){ });

And now you can see the problem: application.js didn't change but its
dependencies did, so application-45345345345.js will have different data
depending on when you access it, which is exactly what you don't want with
long HTTP cache durations. This can cause mismatches in which version of a
file is imported, and cause application errors.

The only way to have permanent URLs with content that doesn't change (a
prerequisite for long HTTP cache durations) is to include dependencies in the
calculation of filename hashes. Or you could have a dynamic resolver (e.g.
based on a manifest file), but ES6 and SVG have static imports. More to the
point though, if you can resolve it at build time, why not?
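
A minimal sketch of what that looks like, assuming the build has already
walked imports into a dependency tree (and that the tree is acyclic): fold
each dependency's hash into its dependents' hashes, so a change anywhere below
a file changes that file's name too.

    var crypto = require('crypto');
    var fs = require('fs');

    // Hash a file's contents together with the hashes of everything
    // it depends on. `node` is { file: 'vendor.js', deps: [...] }.
    function contentHash(node) {
      var h = crypto.createHash('md5');
      h.update(fs.readFileSync(node.file));
      (node.deps || []).forEach(function (dep) {
        h.update(contentHash(dep));
      });
      return h.digest('hex').slice(0, 10);
    }

    // vendor.js changes -> vendor's hash changes -> application.js
    // gets a new name too, even though its own bytes are untouched:
    // 'application-' + contentHash(applicationNode) + '.js'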

When they say "This can be solved by using hashes of the file contents", this
is a fundamental mistake and simply bad advice.

~~~
vive-la-liberte
When you update application.js, the hash of application.js itself will change
as well. If you include the hash of application.js in its filename, I don't
see the problem.

If you do not include the hash of application.js in its filename then I think
the main problem is whoever decided to use hashes for _some_ scripts but _not_
for others.

That being said, I don't use gulp and all that stuff. It seems to me that some
people get so caught up in technology that they

1) Reinvent the wheel a million times. I admit that make has its shortcomings,
but I think a similarly general build tool is the way to go. Look into tup,
they know what's up.

2) Don't notice that they are sitting atop a tower of abstractions and that
their tower is starting to become unstable.

3) Are pulling in massive amounts of scripts as dependencies to build
something that could've been implemented with little to no scripting at all,
and with their high-bandwidth workstations and latest computers and devices
they don't notice that while they are trying to reduce load times (which by
itself is a worthy goal), the browsers of their visitors are crashing because
rendering these websites takes more RAM than is available on many devices.
Jesus, fuck!

So that was a bit of a rant. Sorry about that, I guess.

~~~
holloway
> If you include the hash of application.js in its filename, I don't see the
> problem.

Then, and I don't mean to sound harsh here, but you don't understand how to
generate permanently cacheable URLs, and that's what's required here.

If dependencies aren't included in calculating that hash then some users may
run several different versions of the application at once (with all the errors
that entails).

------
deftnerd
I've been trying to use Subresource Integrity on all my sites, as well as
Content-Security-Policy headers. This makes it better to link to static
assets than to embed JavaScript or CSS within the page.
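
For example, a rough sketch of generating the integrity value at build time
(the tag in the comment is the standard SRI form; the filename is made up):

    var crypto = require('crypto');
    var fs = require('fs');

    // Compute a Subresource Integrity value for a static asset:
    function sriValue(file) {
      var digest = crypto.createHash('sha384')
        .update(fs.readFileSync(file))
        .digest('base64');
      return 'sha384-' + digest;
    }

    // Emit in the page as:
    // <script src="/static/vendor-d41d8cd98f.js"
    //         integrity="sha384-..." crossorigin="anonymous"></script>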

Using HTTP/2 has meant that there are no longer any performance downsides to
multiple include directives. I'm pretty happy about that.

------
rmdoss
He mentioned CDNs not supporting HTTP/2. I am pretty sure a few of them do.

Including [https://keycdn.com](https://keycdn.com),
[https://sucuri.net](https://sucuri.net), Akamai and likely a few more.

------
percept
"Files loaded by the scout can have extremely long cache times because the
scout loads resources from versioned URLs: when a resource is updated, it is
hosted at a new URL, and the scout is updated to load the resource from that
new URL."

------
meddlepal
I thought TLS wasn't a requirement of HTTP/2? Did that change?

~~~
alexdom
It is not mandatory in the specification, but all major browsers have decided
to require it. [1][2]

[1]
[https://en.wikipedia.org/wiki/HTTP/2#Encryption](https://en.wikipedia.org/wiki/HTTP/2#Encryption)
[2]
[https://en.wikipedia.org/wiki/HTTP/2#Encryption_2](https://en.wikipedia.org/wiki/HTTP/2#Encryption_2)

