
A Guide to HTTP/2 Server Push - okket
https://www.smashingmagazine.com/2017/04/guide-http2-server-push/
======
mozumder
_> As of now, Nginx doesn’t support HTTP/2 server push, and nothing so far in
the software’s changelog has indicated that support for it has been added.
This may change as Nginx’s HTTP/2 implementation matures._

Forget Apache or Nginx. Use the H2O server instead:
[https://h2o.examp1e.net](https://h2o.examp1e.net)

I've been using it on a production server with a Django app for about a year
now, with HTTP/2 push support, and it's been great. It includes an advanced
feature that detects whether a web browser already has pushed content in its
cache, so it doesn't resend it. Its architecture seems to offer the best of
Nginx's multiprocessing capabilities as well, and configuration is as simple
as Nginx's. (I went from Apache -> Nginx -> H2O.)
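
For reference, a minimal H2O configuration along these lines might look like
the sketch below. The hostname, certificate paths, and backend address are
placeholders, and `http2-push-preload` is the directive I recall for turning
`Link: rel=preload` headers into pushes; check the current H2O docs before
relying on it:

```yaml
# Minimal H2O config sketch: TLS listener (HTTP/2 is negotiated automatically),
# reverse-proxying to a local Django backend. H2O can turn the backend's
# "Link: ...; rel=preload" response headers into HTTP/2 pushes.
listen:
  port: 443
  ssl:
    certificate-file: /etc/ssl/example.pem   # placeholder
    key-file: /etc/ssl/example.key           # placeholder
hosts:
  "example.com":
    paths:
      "/":
        proxy.reverse.url: "http://127.0.0.1:8000/"   # Django app server
        http2-push-preload: ON   # convert rel=preload links into pushes
```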

~~~
vfaronov
H2O also allows an HTTP/1.1 backend to push by sending links [1] with
rel=preload — these links are automatically converted to pushes by H2O. This
can be further sped up with early hints [2].

[1] [https://tools.ietf.org/html/rfc5988](https://tools.ietf.org/html/rfc5988)
[2] [https://tools.ietf.org/html/draft-ietf-httpbis-early-hints](https://tools.ietf.org/html/draft-ietf-httpbis-early-hints)
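
As a sketch of the backend side of this: the app stays plain HTTP/1.1 and only
emits `Link` headers, leaving the actual pushing to the front server. A minimal
WSGI app (the asset paths and markup are illustrative placeholders):

```python
# Framework-agnostic WSGI sketch: the backend emits one "Link: rel=preload"
# header per resource (RFC 5988 link format). A front server like H2O can
# translate these into HTTP/2 pushes alongside the HTML response.
def app(environ, start_response):
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        # One Link header per resource to push:
        ("Link", "</static/app.css>; rel=preload; as=style"),
        ("Link", "</static/app.js>; rel=preload; as=script"),
    ]
    start_response("200 OK", headers)
    return [b"<html><head>...</head><body>Hello</body></html>"]
```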

------
jakelarkin
Couple gripes with HTTP2 ...

1) HTTP2 Server Push is a hobbled protocol. It was designed for serving static
assets to desktop browsers for pre-caching. Most internet traffic is now
moving to mobile apps or rich browser clients. Developers would greatly
benefit from a true bi-directional API that could be fully leveraged by event-
based or reactive frameworks. The community could even build its own APIs if
the browsers just exposed the DATA frame primitives of HTTP2. But they don't.

2) Open-source proxy servers like NGINX, HAProxy and many key IaaS providers
like Cloudflare, Google Cloud do not support full duplex HTTP2 connections
(one-side only at best, translated to HTTP1 on the backend). This is 1 year
after RFC 7540 was officially released and 5 years after SPDY. Why even bother
baking considerations for HTTP2 into your application layer when your edge
infrastructure won't let you leverage it anyway?

~~~
manigandham
The lack of HTTP/2 on the backend to origin connections is a big sad mystery.

If anything it would greatly increase performance, especially for smaller
servers and longer distances, with the multiplexing and stream priority
settings alone. Required TLS would also increase security.

~~~
voidlogic
>The lack of HTTP/2 on the backend to origin connections is a big sad mystery.

Sad, sure, but it's not a mystery. As of HTTP/1.1, these proxies already pool
TCP/IP connections between the proxy and its origin (the backend server), so
they don't gain the same connection multiplexing/windowing/cardinality
benefits that HTTP/2 to the end client gets. This means that for many proxy
code bases, H2 to origin is just a nice-to-have. Obviously, by prioritizing
like that you miss other nice-to-haves like push.

~~~
manigandham
HTTP/1 connections are limited to one request at a time and suffer from
head-of-line blocking. HTTP/2 connections will also be pooled, but are able to
sustain many more concurrent requests because of multiplexing, while also
letting certain requests be marked as higher priority. It's a major
improvement for high-volume services.

The performance and features are why Google's gRPC uses it as the foundation
for low-latency microservice communication at scale.
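
The difference can be illustrated with a toy simulation (plain Python with
simulated round-trip delays, not real HTTP): ten requests serialized on one
connection pay ten round trips, while ten multiplexed requests overlap on one
connection and pay roughly one:

```python
import asyncio

RTT = 0.05  # simulated round-trip time, in seconds

async def request(_i):
    # One request/response exchange costs one round trip.
    await asyncio.sleep(RTT)

async def serial(n):
    # HTTP/1.1-style: one request at a time on the connection.
    for i in range(n):
        await request(i)

async def multiplexed(n):
    # HTTP/2-style: all n requests share the connection concurrently.
    await asyncio.gather(*(request(i) for i in range(n)))

async def timed(coro):
    loop = asyncio.get_running_loop()
    start = loop.time()
    await coro
    return loop.time() - start

async def main():
    return await timed(serial(10)), await timed(multiplexed(10))

t_serial, t_mux = asyncio.run(main())
```

Serial time is about 10x RTT; multiplexed time stays near a single RTT.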

~~~
voidlogic
>HTTP/1 connections are limited to one request on the connection at a time and
head-of-line blocking.

This is true, but this limitation just increases the size of the connection
pool needed for the same concurrency. I'm not saying HTTP/2 isn't better; I'm
saying it's less of a win for middleware than for end-user connection
optimization.

~~~
derefr
If you have 100k people who've opened e.g. websocket connections to your
backend, I doubt you're going to structure things as a 100k-socket connection
pool open to your backend. You've probably either:

A. got that scaled across at least 100 backend servers, even if those servers'
CPUs are almost entirely idle; or

B. have chosen a completely different, extremely roundabout architecture
involving client async requests with HTTP 202 responses + client polling of a
"status of server-side promises" endpoint; or

C. are effectively manually doing what HTTP2 does automatically, by having a
"stateful-connection load-balancer server" (e.g. SocksJS) that mediates
between long-lived client connections, and short-lived RPC requests with async
responses from your backend.

Whereas, with HTTP2, that could very easily be one machine, taking in those
100k nearly-idle connections, and passing them over _one_ TCP socket to _one_
backend.

------
cflat
Using `Link: rel=preload` is interesting, but it misses the real opportunity:
getting the critical resources out before the HTML, and growing the congestion
window early. At best you are in a race with the browser's
preloader/speculative parser. At worst, your pushed resources are competing
with the base HTML request.

The better solution is to push these resources _before_ the HTTP headers and
status code come back from the application framework. The real opportunity is
to get the content downloaded early, grow the congestion window ahead of
receiving the HTML, then yield the socket to the HTML, and then continue
pushing content until the browser starts making its own HTTP requests. As with
all things, mileage will vary. Sometimes this is a large opportunity (eg: a
user in Australia requesting content from NY, or DR failover). Other times,
the push opportunity might be negligible (eg: a cached HTML page). On high-RTT
networks, this saves you the full round trip for the request.

For more details on this check out my talk at Velocity Amsterdam:
[https://youtu.be/GjWD1pOkxUk?t=1534](https://youtu.be/GjWD1pOkxUk?t=1534)
[https://speakerdeck.com/colinbendell/promise-of-push](https://speakerdeck.com/colinbendell/promise-of-push)

I also built a tool (on top of webpagetest.org) that helps you evaluate the
potential for push here: [https://shouldipush.com](https://shouldipush.com)

------
alex_duf
So wait, if the server pushes assets before the client requests them, does
that mean the server pushes them regardless of the cache status on the client?

If so, that should really be limited to small assets to be beneficial.

~~~
Ajedi32
As I understand it, the client has an opportunity to proactively cancel pushes
of assets it already has cached.

The server will send a "push promise" which basically says "I'm going to send
this file to you", and then the client can come back and say "don't bother, I
already have it". And this all happens in parallel with the download of other
assets (like the main page), so it doesn't really slow anything down.

Here's an article I read which goes into how server push works in a lot more
detail: [https://hpbn.co/http2/#server-push](https://hpbn.co/http2/#server-push)
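
That promise/cancel exchange can be modeled with a toy sketch (plain Python,
not a real HTTP/2 implementation; in the real protocol the client cancels a
promised stream by sending RST_STREAM with the CANCEL error code):

```python
# Toy model of the PUSH_PROMISE / cancel exchange: the server announces each
# resource it intends to push, and the client cancels promises for resources
# it already has in its cache, accepting the rest.

def server_push_promises(assets):
    """Server announces the resources it intends to push."""
    return list(assets)

def client_filter(promises, cache):
    """Client cancels promises for resources already in its cache."""
    accepted, cancelled = [], []
    for url in promises:
        (cancelled if url in cache else accepted).append(url)
    return accepted, cancelled

promises = server_push_promises(["/app.css", "/app.js", "/logo.png"])
accepted, cancelled = client_filter(promises, cache={"/logo.png"})
# accepted == ["/app.css", "/app.js"]; cancelled == ["/logo.png"]
```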

~~~
jimktrains2
Doesn't that require more round trips than just having the client ask for it?

~~~
Ajedi32
No, even in the worst case it's exactly the same number of round trips.

Without server push:

1. Client requests main page ->
2. Client receives main page <-
3. Client requests subresources ->
4. Client receives subresources <-

With server push:

1. Client requests main page ->
2. Client receives push promises and main page <-
3. Client cancels promises it doesn't need ->
4. Client receives subresources <-

The only difference is that with server push, steps 2, 3 and 4 happen in
parallel, and step 3 can be omitted entirely in the event that the client
doesn't need to cancel any pushes.

~~~
jimktrains2
Or, what may be a more common case, depending on caching method:

Without server push:

1. Client requests main page ->
2. Client receives main page <-

With server push:

1. Client requests main page ->
2. Client receives push promises and main page <-
3. Client cancels promises it doesn't need ->

~~~
Ajedi32
Ah, fair point. Now I see what you were trying to say.

Assuming the server has no way of determining which assets the client has
cached (which, depending on the implementation, may not be the case), you're
of course correct. However, after step 2 the page has already fully loaded in
both cases, so step 3 doesn't really slow anything down.

------
lobster_johnson
I was surprised when I read that Server Push wasn't intended for event
sourcing. Anyone know why?

These days people use Server Sent Events and WebSockets for this: The client
requests data from the server (possibly through a "subscription" type of
request), and the server keeps sending messages (e.g. chat messages) until the
client cancels the request or closes the connection.

Not having looked deeply into Server Push, it sounds ideal for this sort of
thing, and then we could do away with both SSE and WebSockets, neither of
which are very HTTP-y.
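
For contrast, the SSE mechanism mentioned above is just a long-lived response
whose body is a stream of text events. A minimal formatter for the EventSource
wire format (the function name is illustrative):

```python
def sse_event(data, event=None, event_id=None):
    """Format one Server-Sent Events message: optional "event:" and "id:"
    fields, one "data:" line per line of payload, terminated by a blank line.
    A server keeps writing these to the open response as messages occur."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    for part in str(data).splitlines() or [""]:
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"
```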

~~~
jaffathecake
SSE is just a long-running connection, so it fits with HTTP/2 pretty well (and
HTTP/1 for that matter).

H2 push needs to be linked to an initial response, and I don't think it's
acceptable to push resources once the initial response is complete. Also, an
H2 push is pretty much a request/response pair that can be cached, so it
doesn't really fit the SSE model.

You're right about websockets though. At some point they'll be replaced with
something more H2-like, but it won't be push-based.

~~~
dullgiulio
If the next protocol iteration is something like QUIC, then WebSockets could
just be implemented over QUIC's UDP-based streams.

------
niftich
It's true that the _mechanics_ of triggering HTTP/2 Push are done with the
Link rel=preload directive (defined by the W3C [1]), but the advice to
hardcode HTTP headers on outgoing responses that point to file references is
simplistic at best -- which shows that this whole thing ( _still_ ) isn't
thought out too well.

Most of the content of my earlier post on this topic [2] still stands; since
then, Caddy has implemented rule-based ways of specifying what to push [3][4]
(although not via Google's push manifest [5], despite being asked to). It'd be
far more productive if the community coalesced around one clear declarative
way of specifying what to push -- Google's push manifest does this well enough
-- which could be ingested by server software and applied as needed.

If this were done, the problem becomes generating that manifest, which can be
done by hand, by scraping resources, or as the output of a more complex tool
[6] that knows more about the info-space of the resources.
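
For illustration, such a manifest is roughly a JSON map from each page to the
resources to push for it. The shape below follows my recollection of Google's
http2-push-manifest output, and the paths are placeholders; check the tool's
README for the exact schema:

```json
{
  "/index.html": {
    "/css/app.css": { "type": "style",  "weight": 1 },
    "/js/app.js":   { "type": "script", "weight": 1 }
  }
}
```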

And of course, keep in mind the advice that Google has prepared as part of the
research they've done on deployments of HTTP/2 Push [7].

[1] [https://www.w3.org/TR/preload/#server-push-http-2](https://www.w3.org/TR/preload/#server-push-http-2)

[2] [https://news.ycombinator.com/item?id=12722383](https://news.ycombinator.com/item?id=12722383)

[3] [https://github.com/mholt/caddy/issues/816](https://github.com/mholt/caddy/issues/816)

[4] [https://github.com/mholt/caddy/pull/1215](https://github.com/mholt/caddy/pull/1215)

[5] [https://github.com/GoogleChrome/http2-push-manifest](https://github.com/GoogleChrome/http2-push-manifest)

[6] [https://github.com/webpack/webpack/issues/1223#issuecomment-...](https://github.com/webpack/webpack/issues/1223#issuecomment-232311846)

[7] [https://docs.google.com/document/d/1K0NykTXBbbbTlv60t5MyJvXj...](https://docs.google.com/document/d/1K0NykTXBbbbTlv60t5MyJvXjqKGsCVNYHyLEXIxYMv0)

------
rb808
What I'd love in the HTTP protocol is some way to control the IPs used by the
client - eg fail over to a different IP if the first is unresponsive. Such a
small client-side change would make back ends much simpler.

~~~
jeroenhd
There's already a system to do so: it's called DNS.

You can already check for timeouts using XHR, and if a server doesn't respond,
having it send a new IP wouldn't work anyway.

The problem with using DNS that way is the usual 30-second timeout. On the
other hand, values lower than 30 seconds can cause problems for many users on
slow or mobile connections.

Honestly I can't think of a reason why the HTTP specifications should contain
anything about IP addresses.
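
As a sketch of what clients can already do with DNS alone: resolve every
A/AAAA record for the host and fail over between addresses with a per-address
timeout (the host, port, and timeout below are placeholders):

```python
import socket

def connect_with_failover(host, port, timeout=3.0):
    """Try each resolved address in turn, falling over to the next on failure.
    DNS returning multiple A/AAAA records is what makes this work."""
    last_err = None
    for family, socktype, proto, _name, addr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(addr)
            return sock  # first reachable address wins
        except OSError as err:
            last_err = err
            sock.close()
    raise last_err or OSError("no addresses to try")
```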

------
username223
Looking at the graphs, all of the variations come in at 2-4 seconds, with a
10-25% average difference in most cases. So it's all slow, but not unbearable;
it seems like a bit of a wash.

------
TYPE_FASTER
Flashback to the mid-nineties and push media.

[http://www.javaworld.com/article/2077287/marimba-software--p...](http://www.javaworld.com/article/2077287/marimba-software--pushes--information-over-internet.html)

~~~
jgrahamc
Literally, nothing like that.

