So is nginx with HTTP/2 enabled vulnerable too? Caddy? Or should I not worry about this, because a small (by Cloudflare's scale) botnet could DDoS a single server completely anyway?
I'm curious to learn more. How much work is it to establish a stream and close it? It feels like something that could be done very quickly, but it also involves setting up some state (stream buffers) that could be a problem too.
Well, it requires almost an order of magnitude more energy to serve HTTP/3 than HTTP/1, so maybe?
Why do I say this? Because it breaks nearly every optimization that's been made to serve content efficiently over the last 25 years (sendfile, TSO, kTLS, etc.), and requires that the server's CPU touch every byte of data multiple times (rather than never, as with HTTP/1). It's basically the "what if I do everything wrong" case in my talk here: https://people.freebsd.org/~gallatin/talks/euro2022.pdf
Given enough time, it may yet get close to HTTP/1. But it's still early days.
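To make "the CPU touches every byte" concrete, here's a rough sketch of the two data paths (my own illustration, not from the talk, assuming a simple file-serving workload; the function names are made up):

```go
package sketch

import (
	"io"
	"net"
	"os"
)

// Over plain TCP, io.Copy can hand the transfer to the kernel via sendfile(2),
// and with kTLS even encryption can stay out of userspace: the CPU may never
// touch the payload bytes at all.
func serveOverTCP(conn *net.TCPConn, f *os.File) error {
	_, err := io.Copy(conn, f) // sendfile path
	return err
}

// Over QUIC/HTTP/3 the server has to pull the file into a userspace buffer,
// encrypt it in userspace, and write UDP datagrams itself, so the CPU touches
// every byte at least once and usually several times.
func serveOverQUIC(send io.Writer, f *os.File) error {
	buf := make([]byte, 64<<10)
	for {
		n, err := f.Read(buf)
		if n > 0 {
			// userspace TLS record protection + QUIC packetization would happen here
			if _, werr := send.Write(buf[:n]); werr != nil {
				return werr
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}
```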
Bit of a leading question, since you're assuming that this is "complexity bloat" and not just "a feature that people use", but yes, HTTP/3 has streams, so it should be vulnerable.
HTTP/3 is not vulnerable to this specific attack (Rapid Reset), because there is an extra confirmation step before the sender can create a new stream.
HTTP/2 and HTTP/3 both have a limit on the number of simultaneous streams (requests) the sender may create. In HTTP/2, the sender may create a new stream immediately after sending a reset for an existing one. In HTTP/3, the receiver is responsible for extending the stream limit after a stream closes, so there is backpressure limiting how quickly the sender may create streams.
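Roughly what that difference looks like on the wire, sketched with golang.org/x/net/http2's Framer (illustration only; the connection preface and SETTINGS exchange are omitted, and this is nobody's production code):

```go
package sketch

import (
	"bytes"
	"net"

	"golang.org/x/net/http2"
	"golang.org/x/net/http2/hpack"
)

// In HTTP/2, nothing stops a client from opening a stream and resetting it in
// the same packet flight; from the client's point of view the reset frees the
// concurrency slot immediately, so the loop below never has to wait.
// In HTTP/3 the equivalent loop stalls once the client runs out of stream
// credit and has to wait for a MAX_STREAMS frame from the server.
func requestAndCancelLoop(conn net.Conn, rounds int) error {
	framer := http2.NewFramer(conn, conn)
	var hdrBuf bytes.Buffer
	enc := hpack.NewEncoder(&hdrBuf)
	streamID := uint32(1)
	for i := 0; i < rounds; i++ {
		hdrBuf.Reset()
		enc.WriteField(hpack.HeaderField{Name: ":method", Value: "GET"})
		enc.WriteField(hpack.HeaderField{Name: ":path", Value: "/"})
		enc.WriteField(hpack.HeaderField{Name: ":scheme", Value: "https"})
		enc.WriteField(hpack.HeaderField{Name: ":authority", Value: "example.com"})
		if err := framer.WriteHeaders(http2.HeadersFrameParam{
			StreamID:      streamID,
			BlockFragment: hdrBuf.Bytes(),
			EndStream:     true,
			EndHeaders:    true,
		}); err != nil {
			return err
		}
		// Cancel it right away and move on to the next stream.
		if err := framer.WriteRSTStream(streamID, http2.ErrCodeCancel); err != nil {
			return err
		}
		streamID += 2 // client-initiated streams are odd-numbered
	}
	return nil
}
```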
A number of people have expressed concerns about making the relatively simple protocol more and more complicated in the name of performance. This looks like it's going to be their "Ha, told you so!" moment.
It reminds me of Meltdown/Spectre: you have a pipe, and instructions need to flow through it in a single file line. Let's increase performance by allowing things to be sent/processed out-of-order!
That's a good example, because it would be an incredibly bad decision to drop speculative execution since it leads to a massive performance improvement.
Technically the issue is speculative execution, not out-of-order execution (i.e. "allowing things to be sent/processed out-of-order!"). Most high-performance processors have both, but you can have one without the other.
My problem is that it often seems like significant complexity is added in order to chase marginal performance gains. I suppose performance is relatively easy to measure while complexity is not.
A vulnerability is a flaw in the implementation that allows an attacker to trigger some kind of unexpected result. The behavior in this case is defined in an RFC; it is 100% working as intended.
What is the vulnerability anyway? I skimmed the linked article twice and could find no explanation of how it works, beyond "request, cancel, request, cancel" and that it's called Rapid Reset. Why is HTTP/2 in particular vulnerable? Are all protocols supporting streams vulnerable? How is it possible to vomit such a long article with so little information?
I visited a few common sites and they seem to use HTTP/2. I'm not sure what the point of arguing it's not fundamental is; a cursory glance shows HTTP/1 is bottlenecked by not being able to serve multiple resources over the same TCP connection at once (something HTTP/2 fixes). Is there ire against HTTP/2 adoption, and for what reasons?
I'm not an area expert, but common issues raised over the years:
- HTTP/2 as implemented by browsers requires HTTPS, and some people don't like HTTPS.
- HTTP/2 was "designed by a committee" and has a lot of features and complexity; most of those features were never implemented by most servers/clients; many of the advanced features that were implemented were very naive "checkbox implementations" and/or buggy [0]; some were implemented and then turned out to be more harmful than useful, and got dropped (HTTP/2 push in browsers [1]); etc.
HTTP/1.1 connections can be reused, including with pipelining, and a client can open multiple sockets to make requests in parallel. HTTP/2 allows out-of-order responses on one socket. Is it worth the complexity? HTTP/1.1 is over 20 years old and battle-tested.
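For a sense of what that means client-side, a small sketch using Go's standard http.Client (the 6-connection cap is just a browser-like assumption, nothing normative):

```go
package sketch

import (
	"crypto/tls"
	"net/http"
)

// With HTTP/1.1, connections are reused via keep-alive, but each in-flight
// request still occupies its own TCP connection, so parallelism means more
// sockets (the 6 here is an assumed, browser-like per-host cap).
var h1Client = &http.Client{
	Transport: &http.Transport{
		// A non-nil empty TLSNextProto map disables HTTP/2 in net/http.
		TLSNextProto:    map[string]func(string, *tls.Conn) http.RoundTripper{},
		MaxConnsPerHost: 6,
	},
}

// With HTTP/2, the same concurrent requests are multiplexed as streams over a
// single TCP connection, and responses can complete out of order on that socket.
var h2Client = &http.Client{
	Transport: &http.Transport{
		ForceAttemptHTTP2: true,
	},
}
```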
Actually even the diagrams are wrong because they focus on a single connection to explain the problem, carefully omitting the fact that a client can easily open many connections to do the same again. I agree it's mostly marketing and press-releases.
Yes, the attackers will obviously open many connections. In fact, they've always opened as many connections as they have resources for.
But establishing a connection is extremely expensive compared to sending data on an already established channel. With this method they need to open far fewer connections for the same qps.
There's no need to confuse the issue by trying to diagram multiple connections at the same time.
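Back-of-envelope, with assumed numbers just to show the shape of the trade-off:

```go
package sketch

import "fmt"

// Very rough comparison with assumed numbers: a fresh HTTPS connection costs a
// TCP + TLS handshake (round trips plus a few KB of certificates and key
// exchange), while a request/cancel pair on an existing HTTP/2 connection is a
// small HEADERS frame plus a 13-byte RST_STREAM frame (9-byte frame header +
// 4-byte error code).
func main() {
	const (
		handshakeBytes = 4000 // assumed TCP + TLS 1.3 handshake cost
		headersBytes   = 60   // assumed HPACK-compressed minimal GET
		rstBytes       = 13   // RST_STREAM frame size per RFC 9113
	)
	fmt.Printf("bytes for one handshake ≈ bytes for %d request/cancel pairs\n",
		handshakeBytes/(headersBytes+rstBytes))
}
```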
tl;dr HTTP/2 allows clients to DDoS backends much more effectively by using the multiple-stream feature of HTTP/2 to amplify their attack directly inside the reverse proxy (which typically translates HTTP/2 to HTTP/1).
> When Cloudflare's reverse proxies process incoming HTTP/2 client traffic, they copy the data from the connection’s socket into a buffer and process that buffered data in order. As each request is read (HEADERS and DATA frames) it is dispatched to an upstream service. When RST_STREAM frames are read, the local state for the request is torn down and the upstream is notified that the request has been canceled. Rinse and repeat until the entire buffer is consumed. However this logic can be abused: when a malicious client started sending an enormous chain of requests and resets at the start of a connection, our servers would eagerly read them all and create stress on the upstream servers to the point of being unable to process any new incoming request.
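A minimal sketch of the loop that paragraph describes (the pattern, not Cloudflare's actual code; dispatch/cancel stand in for whatever talks to the upstream):

```go
package sketch

import (
	"io"

	"golang.org/x/net/http2"
)

// Read frames in order, dispatch requests upstream, tear down on reset.
// The problem is that nothing here bounds how many dispatch/cancel cycles a
// single buffer of client bytes can trigger.
func serveConn(rw io.ReadWriter, dispatch func(streamID uint32), cancel func(streamID uint32)) error {
	framer := http2.NewFramer(rw, rw)
	for {
		frame, err := framer.ReadFrame()
		if err != nil {
			return err
		}
		switch f := frame.(type) {
		case *http2.HeadersFrame:
			// request arrives: forward it upstream immediately
			dispatch(f.StreamID)
		case *http2.RSTStreamFrame:
			// request cancelled: tear down local state, notify upstream
			cancel(f.StreamID)
		default:
			// SETTINGS, WINDOW_UPDATE, DATA, ... elided in this sketch
		}
	}
}
```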
Sticking with HTTP/2, or going with grpc/similar is also possible. It depends on which corner of the Internet you inhabit. (Cloudflare isn't the whole Internet, yet)
As soon as you start to implement a proxy that supports H2 on both sides, that's something you immediately spot, because setting too-low timeouts on your first stage easily fills the second stage, so you have to cover that case.
I think the reality is that some big corp had several outages due to these attacks, and it makes them look better to their customers to say "it's not our fault, we had to fight zero-days" than "your service was running on half-baked stacks", so let's just go make a lot of noise about it and announce yet another end-of-the-net.
I remember noticing this from the HTTP/2 RFC, maybe 8y ago. I was studying the head of line blocking issue on a custom protocol atop TCP and was curious to compare with HTTP/2. I think I might even have chatted with a coworker about it at the time, as he was implementing grpc (which uses HTTP/2) in Rust.
It never occurred to me that it could be used nefariously!
> The exploit has more to do with their implementation than the protocol.
Is it? I imagine that implementations can do things like make creating/dropping a stream faster but how would an implementation flat out mitigate this?
There is a maximum bandwidth at which data can arrive. Simply make sure you can always process it faster than the next packet can arrive, or implement proper mitigation in cases where you cannot.
It's called programming under soft real-time constraints.
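For a concrete sense of what "proper mitigation" tends to mean here, a sketch with made-up thresholds (the shipped fixes I've seen described are broadly in this spirit: count client resets per connection and close abusive connections, e.g. with GOAWAY):

```go
package sketch

import "errors"

// Per-connection accounting (threshold and ratio are made up for illustration):
// tolerate ordinary client cancellations, but close the connection once most of
// its streams are being reset by the client, which is not normal behaviour.
type connStats struct {
	opened    int
	cancelled int
}

var errAbusiveConn = errors.New("too many client-cancelled streams; send GOAWAY and close")

func (c *connStats) onStreamOpened() { c.opened++ }

func (c *connStats) onClientReset() error {
	c.cancelled++
	if c.opened >= 100 && 2*c.cancelled > c.opened {
		return errAbusiveConn
	}
	return nil
}
```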
Well yeah, that's just how DoS kinda works with these sorts of vulns. "Be faster" is obviously a good strategy, but is it viable? Is setting up and canceling a stream something that can be done at GB/s speeds? Maybe, idk.
If you push an arbitrary amount of pressure through a pipe that can only handle 1000 psi, you need a valve to release the excess pressure, or it will blow up.
In the real world, pipes cannot deliver arbitrary pressure, so your constraint is more bounded than that. If you receive 2000 psi but your pipes can only handle 1000, you just need a small component that can handle the 2000 and split the pressure in two, and you can handle it all without releasing any.
The same applies to digital logic; it's always possible to build something such that you can guarantee processing within a bounded amount of time by optimizing and sizing the resources correctly.
As the term "digital logic" suggests, these sorts of guarantees are more often applied when designing hardware than software, but they can apply to either.
> Simply make sure you can always process it faster than the next packet can arrive
This is pretty much impossible unless you make the client do a proof-of-work so they can't send requests very quickly. Okay, you could use a slow connection so that requests can't arrive very quickly, but then the DoS is upstream.
From what I can tell, people were already talking about reset flooding back in July, and not as a novel thing either. It's just one of several known vulnerabilities, which suggests they've known about this kind of thing for a while.
This sounds like an IP spoofing issue: it's an IP/layer-3 problem where ISPs don't filter spoofed addresses from their users. There are technical solutions, but what should also happen is cutting off these ISPs from the internet as a whole when there is a large-scale DDoS affecting global network performance.
No. This is not an ISP problem, and the ISP cannot solve it; it's not even visible to the ISP for encrypted connections. This is a problem with HTTP/2 itself that web servers / load balancers / proxies need to account for.
This attack just spams requests to a web server. The novel part of the attack is that it also spams packets to cancel those requests to bypass any concurrency limits that may be in place.
The novel HTTP/2 'Rapid Reset' DDoS attack - https://news.ycombinator.com/item?id=37830987
The largest DDoS attack to date, peaking above 398M rps - https://news.ycombinator.com/item?id=37831062