We use CloudFlare, so most of our users get HTTP2 even though our own infrastructure is still HTTP1.1 (however some corporate customers have proxies, which usually downgrade the browser connection to HTTP1.1).
The end result is that HTTP/2 is an improvement for most common workloads, but not all; especially in app-type scenarios with lots of mobile users with suboptimal connections and comparatively few requests (e.g. because you already do batching rather than sending zillions of requests), HTTP/2 can regress typical performance.
WebSockets over HTTP/2 is now specified in RFC 8441; not sure what the implementation status of that is. That solves one of the main problems.
My understanding is that HTTP/3 (with UDP-based QUIC instead of TCP) then resolves all remaining known systemic regressions between HTTP/1.1 and HTTP/2. So yeah, HTTP/1.1 to HTTP/3 should be pretty close to “free speed”.
But even then, it changes performance and load characteristics, and requires the appropriate software support, and that means that many users will need to be very careful about the upgrade, so that they don’t break things. So it’s not quite free after all.
Policy: What is allowed architecturally and what isn't? Are there regulatory requirements? Do you have strict enforcement mechanisms?
Instrumentation: Do you need to watch traffic going over the wire? Will your network filters flag it? Do you have application proxies that route traffic based on payload? How is it going to handle multiplexing if existing solutions don't take it into account? Are you using any proprietary stuff?
QA: Every client, server and intermediary may be using different implementations, and that means bugs. Have you certified all the devices in the chain to make sure they operate correctly? (It doesn't matter, until it really matters)
Operation: Each implementation needs to be upgraded one at a time, so the extent of your technology will determine how long and potentially error-prone all this will be. It will be different for each org, but definitely take a long time for really big ones.
* There is no chance someone will approve this server to run a nginx instance someone compiled themselves
* There is no chance someone will approve this server to run anything but nginx as that's the company standard for proxy servers.
* There is no chance someone will approve this server to install software from a third-party yum repository. (And even that is far more likely than someone allowing the firewall in front of that server to permit the outgoing internet connections that installing from third-party repos would require.)
In the end there were likely two ways to get http/2 support for that service:
* Pay some third party to make it happen and be responsible for that server.
* Wait until nginx in RHEL (the epel repository, which was approved and mirrored internally) supported http/2.
We did the latter, which happened many months later.
Putting http2 here is a pain because you probably don't want https. You'd have to have nginx decrypt and reencrypt all the traffic, and you'd have to deal with certificates etc.
In theory the client can cancel the response for a resource it's already got, but by the time the response bytes reach the client it's really too late.
I always found this feature interesting but weird.
But then Cache Digests moved onto Cuckoo Filters - https://github.com/httpwg/http-extensions/pull/413
On the Apache front, mod_http2 didn't ship until 2.4.17; again, CentOS 7 and other RHEL7-based distributions lag behind on 2.4.6.
Sure, that doesn't mean you couldn't compile / install your own version, but for a lot of people that's just not likely to happen. Sticking with the distribution version keeps you within any support contracts, gets you security patches etc. and all the information you need to keep auditors and the like happy.
you can also install the latest version of Apache from Red Hat's Software Collections repo, which supports http2, but it throws everything into /opt/rh/rh-httpd24/ which is a bit weird.
Now, if you want to take advantage of HTTP/2 features like server push that's another story.
Unfortunately, neither of those currently supports HTTP2.
For serving Python, I think your best bet right now is uWSGI behind NGINX.
It's easy to complain about almost anything - it tends to be a lot harder to make a proposal for its replacement.
A justification for the complaint is necessary to be taken seriously, though. And the OP didn't provide one.
but stateful connection management comes with a cost, especially on tcp.
TCP has stateful connections, but both HTTP versions are being sent over TCP anyway so in that sense the transport was always stateful.
It is not currently. HTTP/2 header compression is stateful.
HTTP/2 is a protocol that appears stateless to the end user.
Http/2 is multiplexed, unlike http/0.9-1.1, and while that has some overhead, it being a binary protocol probably makes up for it.
of course the "user layer" is stateless, but the whole connection handling is a state machine (which actually http/0.9-1.1 wasn't)
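To make the statefulness point concrete, here's a toy sketch of the idea behind HPACK header compression. This is not the real HPACK wire format (names and token shapes are invented for illustration); it only shows why both peers must carry per-connection state:

```python
# Toy illustration of why HTTP/2 header compression (HPACK) is stateful.
# NOT the real HPACK wire format: it only shows the core idea that both
# peers keep a per-connection dynamic table of header fields, so a
# repeated header can be sent as a tiny index instead of the full pair.

class ToyDynamicTable:
    def __init__(self):
        self.entries = []  # most recently added first

    def encode(self, name, value):
        # Known pair: emit just an index. Unknown pair: emit the full
        # literal and remember it (the decoder will remember it too).
        pair = (name, value)
        if pair in self.entries:
            return ("indexed", self.entries.index(pair))
        self.entries.insert(0, pair)
        return ("literal", name, value)

    def decode(self, token):
        if token[0] == "indexed":
            return self.entries[token[1]]
        _, name, value = token
        self.entries.insert(0, (name, value))
        return (name, value)

encoder, decoder = ToyDynamicTable(), ToyDynamicTable()
t1 = encoder.encode("user-agent", "toy/1.0")  # first use: full literal
t2 = encoder.encode("user-agent", "toy/1.0")  # repeat: just an index
```

The catch is that encoder and decoder must process tokens in the same order; lose that shared table (e.g. on a connection reset) and an index means nothing. That per-connection state is exactly what's being discussed above.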
Keep-alive isn't any better. In Apache and nginx, keep-alive and http/2 parallel requests are handled on separate threads and hardly add any noticeable load.
No, http2 is not better. We actually did the tests. We're not quite Google-scale, but having to handle tens of thousands of requests per second put us in the 'high load' camp.
Basically, HTTP/2 is tuned for a very specific case of Google traffic which pretty much never happens in places that aren't Google.
Pipelining might amplify it, but it is always there, especially with unreliable mobile connections.
On pipelined requests it's not too bad: you're not supposed to pipeline requests that aren't safe to retry. But pipelining ends up being somewhat rare in practice. Reusing an inactive connection is actually pretty risky: the server may have shut it down, or your network may have silently dropped the connection already (some NAT timeouts are really short, I've seen cases in real mobile networks where the timeout was under a minute!).
I'm not thrilled with multiplexing in http/2, but the sensible stream closure would be really nice to have. If you see a goaway, you know whether it saw your request or not, so you can resend it with a clear conscience.
If you are trying to build a robust system, in which requests don’t get lost, the difference is in quantity but not in quality - you must robustly handle the uncertainty in both cases.
But most people don't realize the byzantine hell we all inhabit; http(s) client library defaults for retrying apparently idempotent requests will often work well enough, but a server idle timeout configured shorter than the client's is much easier to trip over.
Without keepalive, you create a new connection and pay the latency costs of doing so. With keepalive, there's a chance you try to reuse an old connection, and it fails, which requires round trips to learn about, and you still have the latency of creating a new connection. So more total latency in that case.
It seems it would improve the average case but make the worst case slightly worse. If your keepalive timeout is short, maybe it would come up often enough to matter.
The problem is many servers don't send this header when closing idle connections. nginx is a notorious example. But well behaving servers should be sending that header if they intend to close the connection after a request.
However, when the server holds the connection open for some amount of time and then decides to close it, it's not permitted for the server to send a response header, because there's no request to respond to. I would love to be wrong, but I don't think I am, because this scenario is mentioned in the RFC, "For example, a client might have started to send a new request at the same time that the server has decided to close the "idle" connection. From the server's point of view, the connection is being closed while it was idle, but from the client's point of view, a request is in progress." 
An example chain of events is:
t0 client opens connection (syn)
t1 server accepts connection (syn+ack)
t2 client sends first request
t3 server sends response and keeps connection open
t63 client sends second request
t63 (simultaneously, within a margin of the one-way trip time) server closes connection because it's been idle for 60 time units
t64 client receives FIN
t64 server receives data on closed socket and sends RST
t65 client receives RST
http/2 improves this greatly because in this example, a compliant server will send goaway with last-stream-id 1 prior to closing the connection, and the client will know the second request was not processed and should be retried. It still suffers a latency penalty because it has to start a new connection, and it already wasted somewhere between a one way trip and a round trip.
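On the HTTP/1.1 side, about the best a client can do with this race is treat a failure on a reused connection as probably-idle-close and retry once on a fresh connection; a sketch (host, port, and path are caller-supplied, and only idempotent requests should go through this):

```python
import http.client

def get_with_retry(host, port, path):
    """Sketch: retry a GET once on a fresh connection when a reused
    keep-alive connection turns out to be dead. Only safe for
    idempotent requests, because unlike an http/2 goaway, nothing
    tells us whether the server processed the first attempt."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    for attempt in range(2):
        try:
            conn.request("GET", path)
            return conn.getresponse()
        except (http.client.RemoteDisconnected,
                ConnectionResetError, BrokenPipeError):
            conn.close()
            if attempt == 1:
                raise  # fresh connection also failed; give up
            conn = http.client.HTTPConnection(host, port, timeout=10)
```

Note that this still pays the latency penalty described above: a failed reuse costs up to a round trip to discover, plus a new connection setup.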
The Keep-Alive header  is optional, but has parameters timeout, indicating the idle timeout, and max, indicating the number of allowed requests. Max is useful for pipelining, to avoid sending requests that won't be processed; timeout is very helpful for avoiding sending requests when the server is about to close the socket.
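A client that wants to use those parameters can parse them and stop reusing a connection as the advertised limits approach; a minimal parser sketch (header syntax as in the old Keep-Alive drafts, e.g. "timeout=5, max=100"):

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value like 'timeout=5, max=100' into
    a dict with int values; unknown or malformed parameters are
    ignored rather than treated as fatal."""
    params = {}
    for part in value.split(","):
        name, _, val = part.strip().partition("=")
        if name.strip().lower() in ("timeout", "max"):
            try:
                params[name.strip().lower()] = int(val.strip())
            except ValueError:
                pass  # malformed value: skip it
    return params
```

A client would then avoid dispatching a request on a connection that has been idle for close to `timeout` seconds, or that has already carried `max` requests.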
The Keep-Alive response header is specified to have two optional parameters, timeout and max.
For the keep-alive header, you are right, I wasn't aware of it.
In my scenario at time N, the connection is idle -- both sides have received all data the other has sent, all requests have received a response.
If the server half-closes (through shutdown) and sends a FIN, simultaneously with the client sending a new request; that enables the server to read the request, but not respond to it, so I don't see how that is helpful?
The problem from the client side is it's sent a request, and seemingly in response the socket is closed. That could indicate the server crashed on the request, or the server closed the socket because it was idle. If you have a request that you know or suspect shouldn't be made more than once, you shouldn't retry it on a new connection. Assuming you can see the tcp packets from the server, you can actually take a good guess at causality, because the ACK number and TCP Timestamp indicate whether the server saw your last transmission, but that information isn't exposed through the normal socket API; you could maybe guess based on round trip time too, but it is nicer in http/2 (or other protocols), where there is an explicit close message.
shows they removed it in Chrome due to issues, and weirdly this Firefox ticket:
(last updated 5 years ago), seems to show they're not going to enable keep alive for HTTP 1.1, but Firefox is most definitely utilising it.
Firefox most definitely is utilising it fully, and obeys the Keep-Alive params specified, which I can't see any of the other browsers doing.
FWIW, this is the reply header lighttpd sends:
HTTP/1.1 200 OK
Last-Modified: Wed, 16 Sep 2015 08:50:41 GMT
Date: Fri, 08 Mar 2019 10:34:12 GMT
That sounds like a bug with your server implementation. The web would melt down if Chrome's keep-alive support didn't work.
Firefox does this perfectly. Chrome sends a FIN ACK in response to the response from the server, and it closes the connection its side, triggering the next recv() within the server to return 0 (or poll() to fail, depending on how I do it). But then it opens another connection after that with another request which could have been within the previous connection.
I also can't see Chrome doing it on non-HTTPS websites in general when monitoring through Wireshark, which is why I don't believe it's an issue with my server's implementation: I haven't actually observed Chrome utilise it at all on several computers on both Linux and Mac OS.
- Serve real content, say, a basic HTML page with a couple of external resources.
- Serve same and monitor (turn on all the logs) with apache.
- Use the Chrome dev tools. In the network tab, right click on the column headers in the request list and enable the 'Connection ID' column. Keep in mind any modern browser will open concurrent connections in the base HTTP 1.1 case.
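For a known-good keep-alive peer to compare against, Python's stdlib server speaks persistent HTTP/1.1 once you set protocol_version; with this running, successive requests in the Network tab should share a Connection ID (a test-harness sketch, unrelated to the poster's own server):

```python
import http.server
import threading

class KeepAliveHandler(http.server.SimpleHTTPRequestHandler):
    # Declaring HTTP/1.1 makes persistent connections the default, and
    # SimpleHTTPRequestHandler sends Content-Length, so the connection
    # can legitimately stay open between requests.
    protocol_version = "HTTP/1.1"

server = http.server.HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
print("serving on port", server.server_address[1])
```

It serves the current directory, so pointing it at a page with several local images gives the browser something to reuse the connection for.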
The web server I've written is now serving several websites I'm hosting, some of which have thousands of images, so there's plenty of scope for connection re-use.
Each connection ID is different within dev tools in Chrome (often sequential, sometimes with huge gaps), although errors show connection ID 0. And this is the same with HTTP websites on the net (just tried www.briscoes.co.nz and am still seeing the same thing in Chrome 72).
Using Wireshark I can see that Firefox is doing the correct thing, so I'm not sure why you think what I'm looking at is too low-level: I think it's pretty clearly something the client's doing at the TCP level to close the connection its end after it's received the content length.
It may well be that I'm not sending a header Chrome's looking for, but I'm not sure what it is, as in its request it's sending "Connection: keep-alive", and the response has the same, and Firefox works fine.
It's almost like it's a client configuration or something, but I don't know what that would be, as I've tried on several machines.
The "Keep-Alive" header was something tacked onto http/1.0 and doesn't really mean anything these days.
"Sending a 'Connection: keep-alive' will notify Node.js that the connection to the server should be persisted until the next request."
The article seems to confirm this behavior.
So clients have to account for non-RFC-compliant servers.
MDN's documentation on this header references https://tools.ietf.org/id/draft-thomson-hybi-http-timeout-01... for the parameters, but this is an experimental draft that expired in 2012.
Which is to say, I can't really fault Safari for not respecting keep-alive parameters that never made it out of the experimental draft phase.
I've had apache refuse new requests because old connections were holding slots.
The client would be in the middle of sending a new request, but the server would have already decided to close the connection, and the request would fail.
I believe this is a common problem, and yet the spec has nothing to address this obvious race condition.
But the fact that the underlying HTTP connection is kept alive by default doesn't necessarily mean that the client is going to actually re-use that connection for multiple HTTP requests. And, in fact, in Node.js the connection is not reused by default.
Request count is really not a big deal with HTTP/2 multiplexing.
By forcing base64, you're eliminating all the caching and using much more CPU power to parse that back into a binary image. You're also making the page load slower as the initial payload is bigger and image data has to be handled in line rather than asynchronously.
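The inflation is easy to quantify: base64 encodes every 3 bytes as 4 ASCII characters, so inlined data grows by roughly a third before any transfer compression:

```python
import base64

raw = bytes(range(256)) * 12  # 3072 bytes standing in for image data
encoded = base64.b64encode(raw)
# 4 output characters per 3 input bytes: 3072 -> 4096, ~33% larger
assert len(encoded) == len(raw) * 4 // 3
print(len(raw), "bytes ->", len(encoded), "base64 characters")
```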
Do any static site generators rewrite image links as data URIs?
I'm using Hugo, and at least it's not the default behaviour; not sure though if there is a switch for that.
The SSL connection init costs are real, although SSL session re-use can help there even without keep-alive.
I was burned hard by this in Azure. It seems that the default expiry time is around 4 minutes for the TCP load balancers. You can bump it to 30 min, but if I recall the default interval on Linux is 2 hours. Any long-standing idle TCP connections would get into a state where both sides believed they were connected, but the packets would get dropped to the floor. When the LB timed out, it didn't emit any FIN or RST packets, so neither side knew it had been torn down.
Fun debugging on that one. During the day there was enough activity to keep the connections alive, but at night they'd break. The overall behaviour was that the service worked great all day, but the first few actions out-of-business-hours would fail due to application-layer timeouts, and then everything would work great again until it had sat idle for a while.
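One mitigation for this kind of silent middlebox timeout is enabling TCP keepalives with an idle threshold shorter than the load balancer's; a sketch using Linux socket options (the numbers are illustrative, and the TCP_KEEP* constants are platform-dependent, hence the guards):

```python
import socket

def enable_keepalive(sock, idle=60, interval=15, count=4):
    """Make an idle TCP connection send probes, so a silently dropped
    flow (e.g. a cloud LB with a ~4 minute idle timeout that emits no
    FIN or RST) gets detected instead of hanging until an
    application-layer timeout fires."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):   # seconds idle before first probe
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):  # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):    # failed probes before giving up
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock
```

With an idle threshold of 60 seconds the probes keep the flow inside even a 4-minute LB timeout, rather than relying on the Linux default of 2 hours.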
(i.e. To handle the case of "HTTP-Request", "huge delay", "final response". Rather than a streaming/chunking reply that is very long/slow.)