QUIC is now RFC 9000 (fastly.com)
561 points by blucell on May 28, 2021 | 235 comments

> The internet transport ecosystem has been ossified for decades now, and QUIC breaks out of this ossification

But it's still just a layer on top of UDP, and still implemented at the application, like in the past. So how is the ossification broken?

Every app has to implement it itself rather than calling a syscall and letting the OS deal with its complexities (same as for TLS, making fewer apps implement it without a lot of extra work). Which also increases context switching. In the future more protocols will be built on top of QUIC, expanding the user-space stack, increasing fragmentation of application-space IP stacks. And are network cards now going to start implementing it?

It's painful to watch us stride headlong into the future depending on band-aids because surgery is too complicated.

The “ossification” that QUIC deals with is primarily about internet routers who decide to dig into their IP packets. https://http3-explained.haxx.se/en/why-quic/why-ossification Many have optimisations around TCP and long-standing TCP characteristics, which they are able to do because the TCP headers are unencrypted. This led to TCP being difficult to improve, because of all the implementations out there making assumptions and doing these optimisations that weren’t compatible with new developments, or worse, dropping traffic with new, unrecognised TCP options.

In comparison QUIC is almost 100% encrypted, and exposes comparatively little data for routers to ossify on. So even if network cards start optimising, they can only do so much damage because they are given very little information. For example, there is no way for them to do their own special per-stream flow control, because the stream identifier is encrypted. The only things visible are the UDP src/dest port, the QUIC connection ID, and a minimal set of flags. Most important is that the flow control information is encrypted, which is one of the main things TCP couldn’t improve on due to ossification.
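
To make "very little information" concrete, here is a minimal Python sketch of what an on-path device can reliably parse from a QUIC long-header packet. It follows the version-independent invariants in RFC 8999. The names `LongHeader` and `parse_long_header` and the sample bytes are made up for illustration, and everything after the connection IDs is treated as opaque.

```python
from typing import NamedTuple


class LongHeader(NamedTuple):
    version: int   # 0 means Version Negotiation (RFC 8999)
    dcid: bytes    # destination connection ID
    scid: bytes    # source connection ID
    rest: bytes    # version-specific, mostly encrypted; opaque to the path


def parse_long_header(datagram: bytes) -> LongHeader:
    """Parse only the version-independent fields of a QUIC long header."""
    if len(datagram) < 7:
        raise ValueError("too short to be a long header packet")
    if not datagram[0] & 0x80:
        # Short header: the destination connection ID follows, but its length
        # is not on the wire, so an observer cannot even delimit it reliably.
        raise ValueError("short header packet")
    version = int.from_bytes(datagram[1:5], "big")
    pos = 5
    dcid_len = datagram[pos]
    dcid = datagram[pos + 1:pos + 1 + dcid_len]
    pos += 1 + dcid_len
    if len(dcid) != dcid_len or pos >= len(datagram):
        raise ValueError("truncated destination connection ID")
    scid_len = datagram[pos]
    scid = datagram[pos + 1:pos + 1 + scid_len]
    pos += 1 + scid_len
    if len(scid) != scid_len:
        raise ValueError("truncated source connection ID")
    return LongHeader(version, dcid, scid, datagram[pos:])


# Hypothetical sample: long header, version 1, 8-byte DCID, 4-byte SCID.
sample = (bytes([0xC0]) + (1).to_bytes(4, "big")
          + bytes([8]) + bytes(range(8))
          + bytes([4]) + b"\xaa\xbb\xcc\xdd"
          + b"opaque")
print(parse_long_header(sample))
```

There is no stream ID, sequence number, or window field to key on here, which is the point being made above.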

That’s not to say the number of implementations stops being a problem — just means that you can actually use QUIC version numbers and negotiate features at the endpoints and ignore anybody along the routes. So you can get adoption of new ideas much more quickly.

Quick illustration of the things no longer visible to network operators: https://blog.apnic.net/wp-content/uploads/2019/03/quic-fig2...., from https://blog.apnic.net/2019/03/04/a-quick-look-at-quic/

Ah, the old middleboxes excuse. Changing TCP would break the middleboxes, so instead we'll invent a brand new protocol which is (theoretically) ignored by the middleboxes. The idea being if you slip something in via UDP, nobody will notice, and then you encrypt it so the middleboxes give up on it.

This assumes that lack of transparency is the only way forward. I'd argue that IP and TCP ossified not because of transparency, but because the design didn't make it easy enough to simultaneously remain compatible while adding new functionality. Yet we know a transparent, grokable, mungeable, backwards-and-forwards-compatible protocol that plays well with middle boxes is doable: HTTP. We could design lower-level protocols to have the same design properties, with both long-lived backwards compatibility and the ability to adopt new functionality. If the problem is with middleboxes, then the solution shouldn't be "how can we avoid them", but "how can we make a protocol that works well with them".

(As a side note, a lack of transparency has chilling effects. Since the adoption of TLS v1.3, more and more TLS security has been broken in order to provide the necessary IT functions of traffic inspection, caching, routing, classifying, shaping, and blocking. Nation states will need to continue creating their own fragmented walled gardens, and continue invading privacy more, precisely because we're giving them less control over how they manage these functions which are a matter of national policy. You can't software your way out of a process problem.)

You're arguing for the "smart network" approach. This approach already failed. Some HN readers might not even have been born when that happened. The dumb network won. Perhaps the smart network can be pulled off successfully, but it won't be by humans, and I don't see anybody else around here trying.

The supposed "chilling effects" don't work out in practice. You will sometimes see people saying oh, China blocks TLS 1.3. Nope, many of the services you use now have TLS 1.3 and yet still work fine in China. "They just push people to a fallback". Nope, Downgrade Protection means that's exactly the same as just dropping all TLS connections.
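
For readers who haven't seen how Downgrade Protection works: a TLS 1.3-capable server that ends up negotiating an older version stamps a fixed sentinel into the last 8 bytes of ServerHello.random, and a TLS 1.3-capable client that sees it must abort (RFC 8446, section 4.1.3). A rough Python sketch of the client-side check follows. The function name `must_abort` is hypothetical, and the sketch assumes the caller has already noticed that the server picked TLS 1.2 or below.

```python
# Sentinel values defined in RFC 8446, section 4.1.3.
DOWNGRADE_TO_TLS12 = b"DOWNGRD\x01"
DOWNGRADE_TO_TLS11_OR_BELOW = b"DOWNGRD\x00"


def must_abort(server_random: bytes) -> bool:
    """Client-side check after being offered TLS 1.2 or below.

    Returns True if the server signalled that it actually supports TLS 1.3,
    which means something on the path forced the lower version.
    """
    if len(server_random) != 32:
        raise ValueError("ServerHello.random is always 32 bytes")
    return server_random[-8:] in (DOWNGRADE_TO_TLS12,
                                  DOWNGRADE_TO_TLS11_OR_BELOW)
```

So forcing a fallback between two TLS 1.3-capable endpoints produces an aborted handshake rather than a weaker connection.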

TLS 1.3 doesn't change the correct way to interpose, it just makes yet more of the bad ideas that shouldn't work actually not work. If you were doing any of those and they broke, that's not TLS 1.3, that's because you were doing something that we already warned you shouldn't work. Stop doing things that can't work.

And it's true, such people keep demanding "transparency". I think they'd probably avoid some of their problems by just admitting that what they want is at best eavesdropping and that this isn't what other people think "transparency" means. There was a whole thread about this on the TLS working group list earlier in May. We get it, you would like to eavesdrop on people, but you don't like to admit this and you'd prefer if there was some way to pretend everything is fine, while, on the other hand, actually eavesdropping.

What's funniest is that the big players in this game want to play both sides so badly they can't even stop contradicting themselves long enough to write a document. The last version of draft-camwinget-opsec-ns-impact that I reviewed has an obvious contradiction, which I called out: it wants everybody else to use RSA key exchange, so that it can snoop all their TLS traffic, but of course implementers of the draft must forbid RSA key exchange, so that nobody snoops them.

> The dumb network won.

If the network was actually dumb perhaps we wouldn't have as many issues with routers fiddling with TCP as the GP described.

If it was really, really dumb perhaps we could have managed to deploy SCTP and DCCP.

I wonder how much NAT interferes with new protocols, and if 'simple' stateful inspection with IPv6 would (have) made things easier.

Nah, middleboxes are garbage. The security features they claim to offer are all snakeoil checkbox-ware, businesses and government agencies/state-controlled telcos use them because they have a nominal compliance obligation to do a thing so they buy a middlebox that they can wave their arms at when the auditor/commissar comes around.

Generalized "optimizer" middleboxes don't actually work in real world performance testing, with the possible exception of machines that apply dumb global queue discipline to fix a defective buffering device somewhere else in the network, but these don't need to do packet inspection at all, just byte-counting/early-drop.

Traffic shaping mostly only works in the endpoint, but aside from that, all you need is src/dest/byte count to do as good a job of traffic shaping as is possible without being circumvented trivially (by the far end, without the user's knowledge, thanks to the web letting the server script the client).

If you want to run a totalitarian state, do what actually competent totalitarian states do and take over the endpoints.

I'll give you that a lot of the middlebox problems are because they are garbage, however...

Dismissing the security features they are ostensibly delivering as snakeoil is more than a bit unfair. There is a legit security context where absolute transparency & auditability is the right design objective, rather than absolute privacy. I'd argue the greatest failing we've seen with the evolution of TLS was the failure to recognize and address that reality.

For TLS if you build a middlebox which obeys the protocol invariants (e.g. see RFC 8446 section 9.3) then you're golden. This isn't so hard, it's almost embarrassing that anybody needed these writing down, but evidently they did.

This is true for QUIC too. Obey the invariants. You don't need to read all these complicated documents, just the invariants.

Now of course there are obstacles. First, doing this is expensive. Doing things that don't obey the invariants (many of which also destroy your security) was cheaper. Too bad. If your supposed "legit security context" is so valuable you'll afford this.

Second, doing this in effect requires confessing to the people you were "legitimately" giving "transparency & auditability" that you are snooping on them. Too bad; if your supposed "legit security context" really was what you pretend, this conversation will be easy for you.

Yes, yes, you must obey the invariants. The trick is in the consideration of the invariants.

The problem isn't the cost or the "confession", because those are already part of the context. The problem is that in the process it creates new security risks that needn't be there but for the choices made by TLS.

In particular, the lack of separation of concerns between protections from eavesdropping, tampering and forgery proves to be quite problematic for environments where forgery & tampering are a serious concern, but eavesdropping is the nature of the context. It's why you see proposals for ugly hacks of TLS like eTLS; it's doing more harm than good.

Sure, in terms of raw features it looks like being able to uncouple the integrity verification would enable a client to choose to say "You can see what I'm doing, but you can't tamper with it". There are researchers who would like to do that, and last I looked the TLS working group invited them to turn that into an actual draft.

But it's essential in practice to remember that overwhelmingly the environment people are interested in is the Web, and on the Web those aren't actually different things again, in practice.

For example, suppose you're OK with me eavesdropping your Amazon purchases. You're buying a Lego robot for your daughter's birthday, you work for the investment bank, it's nice that we even let you do that from work and clearly we can't just let you do whatever you want with no oversight or you might rob us blind. We have a hypothetical TLS 1.4 which lets you opt in to this eavesdropping but nothing else, and you've done exactly this.

We eavesdrop your purchase, we were prevented from tampering with it. But, this is the Web. The eavesdropped TLS session contains a secret cookie value, which identifies you to Amazon. We can use this to impersonate you and do whatever we want, just as if we were able to tamper with your TLS session. The apparently more limited permissions are a mirage.

> But it's essential in practice to remember that overwhelmingly the environment people are interested in is the Web, and on the Web those aren't actually different things again, in practice.

If the problem is the web, then the problem should be addressed by the web, not TLS.

...and of course, part of the problem is that we've conflated nearly everything to the "web", so thinking that the web is somehow removed from this problem is a mistake.

> We eavesdrop your purchase, we were prevented from tampering with it. But, this is the Web. The eavesdropped TLS session contains a secret cookie value, which identifies you to Amazon. We can use this to impersonate you and do whatever we want, just as if we were able to tamper with your TLS session. The apparently more limited permissions are a mirage.

Yup, that's a legit problem, though I'd argue it actually further reinforces my point. This is a good example of a problem that stems from conflating multiple security concerns, rather than addressing them as separate concerns.

If there's an acknowledgement that your authenticated & tamper-proof session is fully exposed to eavesdropping, then you start using the authentication mechanism itself to identify people or you develop an additional identification mechanism that is resilient to replay attacks (both of which are entirely possible) or people just inherently become aware of the reality that buying stuff on Amazon from an eavesdropping environment means that they will be impersonated on Amazon. What you don't do is think a "secret cookie value" is a secret when it so clearly is not.
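
As one illustration of an identification mechanism that survives eavesdropping, here is a minimal challenge-response sketch in Python. It assumes a pre-shared symmetric key that never crosses the wire, which is an assumption made for brevity and not something proposed in this thread. `Verifier` and `respond` are hypothetical names.

```python
import hashlib
import hmac
import secrets


class Verifier:
    """Server side: issues single-use nonces and checks the responses."""

    def __init__(self, shared_key: bytes) -> None:
        self._key = shared_key
        self._outstanding = set()  # nonces issued but not yet answered

    def challenge(self) -> bytes:
        nonce = secrets.token_bytes(16)
        self._outstanding.add(nonce)
        return nonce

    def verify(self, nonce: bytes, response: bytes) -> bool:
        if nonce not in self._outstanding:
            return False  # unknown or already-used nonce: a replay
        self._outstanding.discard(nonce)
        expected = hmac.new(self._key, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


def respond(shared_key: bytes, nonce: bytes) -> bytes:
    """Client side: prove possession of the key for this one nonce."""
    return hmac.new(shared_key, nonce, hashlib.sha256).digest()
```

An eavesdropper sees both the nonce and the response, but replaying them fails because each nonce is accepted only once. A bearer cookie has no such property.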

The reality is that cellular data, satellite data, rogue WiFi, and fixed wireless all exist.

Transparency and Auditability are real, but they absolutely require endpoint security.

If you don't control the endpoint you don't know if it has an LTE chip, and if you do control the endpoint you don't care.

DPI security middleboxes for the enterprise are always and without exception snakeoil and/or checkboxware.

> DPI security middleboxes for the enterprise are always and without exception snakeoil and/or checkboxware.

If you don't care that your broker may be selling you down the river, or whether information is being leaked to an intelligence operative, that's a fair assessment. Unfortunately, much as we might wish otherwise, there can be conflict between privacy solutions and security solutions. The trick is to find a way to service both without compromising either.

If it's legitimate security it should've been done via an explicit proxy, not an attempt to break into every conversation and hope it works right 5 years from now.

Yeah, it's that kind of lack of understanding & consideration of the context that got us in to this mess.

I deploy these kinds of network security solutions to large enterprises as part of my job (typically healthcare customers, from a network MSP/MSSP VAR perspective these days) and as much as I like being paid to continually "fix" broken implicit systems it's not the way to go and never should have been. Either you control the systems or you don't, if you don't fully control you're going to fail at reliably breaking into the conversations to a meaningful level and you probably shouldn't have such critical information on systems you don't control anyways. That or you're trying to control/inspect someone else's system which is exactly what these things are trying to prevent.

> Yet we know a transparent, grokable, mungeable, backwards-and-forwards-compatible protocol that plays well with middle boxes is doable: HTTP

Not all traffic is web traffic, maybe not even most of it. There are so many things that would have absolutely wretched performance if you tried to implement them with HTTP instead of TCP. HTTP is way too generic to be truly performant.

I don't know that I'd hold up HTTP as an example of the way forward. ;-)

That said, the TLS v1.3 thing has been interesting. I tend to look at it more as a case of projecting one security context onto everyone else's security context, and the consequent problems involved. Security is tough like that.

> The “ossification” that QUIC deals with is primarily about internet routers who decide to dig into their IP packets.

Being 'opaque' to the network is not a foolproof choice, since it also makes stuff like NAT, congestion control and QoS a lot less robust. By exposing connection identifiers to the network, SCTP gains a lot of flexibility wrt. such scenarios compared to QUIC.

> it also makes stuff like NAT, congestion control and QoS a lot less robust

That's a feature, not a bug. NAT is cancer. It killed the open internet by enabling a particular lazy form of "security" that forced everyone to hand control of their most important data to cloud services.

QoS is worse but has less fallout due to less deployment.

>NAT is cancer. It killed the open internet by enabling a particular lazy form of "security" that forced everyone to hand control of their most important data to cloud services.

Sorry, I don't follow. What Cloud Services?

All cloud services.

Personal computers and devices are not allowed to talk to each other on the internet. They are only allowed to talk to specially designated servers -- specially designated by virtue of having a public IPv4 and open port. Enforcement happens through liability: if you put data in a cloud service and the cloud service gets hacked, it's the cloud service's fault, but if you run a program on an open port and the program gets hacked, it's your fault, not the program's fault. This is an arbitrary social choice that follows directly from enshrining NAT as a "best practice." Even the term "open port" presupposes that there is something unnatural about plebian computers talking to each other on the open internet! This effectively forces services to be centralized, or at least pushes very strongly in that direction.

Of course, we can quibble: what about video conferencing? What about gaming? They use direct connections! Yes, but they pay an extraordinarily steep complexity and reliability price to obtain them and the solutions still wind up using centralized servers. They are exceptions that prove the rule because they allow us to observe how "stepping out of line" is discouraged. It's an accidental line drawn by monkeys rather than an intentional line drawn by lizard people, but it's strongly enforced and society-shaping all the same.

It's totally wild to consider the social and market consequences of NAT, the unassuming lazy and kludgy security hack. It got out of hand and reshaped the internet, the technology market, and society from the ground up! That's hardly even an exaggeration!

NAT is literally the only thing stopping every single personal device in the world from being hacked. It may be a kludge, but it's the best kludge in all of technology. And what do you have against QoS? Would it be better if your VoIP calls stuttered every time you loaded a fat web page?

> NAT is literally the only thing stopping every single personal device in the world from being hacked.

This is so wrong that it must be satire. NAT provides the very most basic protection possible, easily replicated (and surpassed) by every firewall that comes stock with every modern OS, and is completely insufficient (and not even necessary) to prevent personal devices being hacked.

I'll buy this for firewalls in end-user routers, but if you put the firewall on the personal device it's gonna get disabled the first time it supposedly interferes with a videogame or a movie stream or whatever. :(

NAT enables dogshit endpoint security, yes, but that's the problem. Without NAT as a crutch, endpoint firewalls and service security wouldn't suck. Compared to other feats of security engineering the industry has undertaken in order to make centralized cloud services happen, namely JavaScript and the ability to safely run untrusted code from the nastiest corners of the internet at high performance, this would be downright trivial. It just never happened because we had NAT.

It's really not. Like, really, really not. I won't try to make the case here, but just know that assertions like that aren't going to come across as credible. The systemic effects from NAT have actually made things worse.

No it isn't, that's basic stateful firewalling. NAT in no way provides security. At all.

The fact that a kid with a port scanner can't remotely connect to and exploit your printer is due to NAT not being able to route the traffic, whether you have a stateful firewall or not. The majority of internet-addressable devices in the world do not have firewalls. Hence, NAT keeps most devices safe from drive-by RCE. There are plenty of attacks to get around that, but by default, nothing else protects random devices like NAT does.

> The majority of internet-addressable devices in the world do not have firewalls. Hence, NAT keeps most devices safe from drive-by RCE.

According to System Preferences, my Mac's firewall is currently turned off. Here are the current non-NATed IPv6 addresses on it:

    $ ifconfig en1 | grep inet6 | grep -v fe80
    inet6 2607:f2c0:93c7:fe00:144f:b9a3:bcd4:ef6 prefixlen 64 autoconf secured
    inet6 2607:f2c0:93c7:fe00:e04c:f356:f2:b9e2 prefixlen 64 autoconf temporary

You should be able to ping(6). Can you connect to tcp/22? If not, it shows that my router's stateful packet inspection (SPI) works even when NAT isn't present.

* http://www.ipv6scanner.com/cgi-bin/main.py

The results of an nmap of one of those addresses from the system itself:

    22/tcp   open  ssh     syn-ack
    111/tcp  open  rpcbind syn-ack
    1022/tcp open  exp2    syn-ack
    2049/tcp open  nfs     syn-ack

If you have a device doing NAT, then that device is already doing SPI, and so the NAT part is redundant. It is the SPI that gives you the security.
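
If you'd rather script the "can you connect to tcp/22?" check than reach for nmap, a minimal Python sketch could look like this. The helper name `tcp_port_open` is made up. Note that a `False` result cannot distinguish a port that is closed on the host from one filtered by a firewall on the path.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to (host, port) completes in time."""
    try:
        # create_connection resolves the host and handles IPv4 and IPv6.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or no route for this family.
        return False


# Example call, using one of the addresses posted above:
# tcp_port_open("2607:f2c0:93c7:fe00:144f:b9a3:bcd4:ef6", 22)
```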

Well first, IPv6 doesn't need NAT, and mostly never uses NAT. If IPv6 devices used NAT, they would be protected. But since they aren't using NAT, they aren't protected. You're right that devices that enable IPv6 (and network routers that give out IPv6 addresses) are not protected by NAT, but luckily it's still a small amount of traffic/devices worldwide. But this is changing, and so NAT won't help people for much longer.

Second, SPI isn't used for security here. It's true that SPI is used by NAT, and that SPI can be used to prevent, say, a specially crafted packet from passing from a public network through a router into a private network. But something other than SPI has to actually enforce that. That "something" is usually extra features of the network stack, like disabling source-routing, reverse-path filtering, ARP filtering, etc. All SPI is doing is helping the NAT engine track the connections it is translating.

For "good faith" traffic (that is, not crafted by an attacker to work around crappy NAT routers) NAT still provides security. Actually it's not NAT at all that's providing the security - it's just the routing. You can't pass traffic from a public network into a private network, and almost all devices behind NAT are on private networks. So NAT provides security merely by keeping devices on a private network.
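
The "it's just the routing" part is easy to check for any given address. Here is a small sketch using Python's standard `ipaddress` module (the wrapper name `globally_routable` is hypothetical). It only says whether an address sits in globally routable space, not whether a firewall in front of it would let anything through.

```python
import ipaddress


def globally_routable(addr: str) -> bool:
    """True if addr is in globally routable space (IPv4 or IPv6)."""
    return ipaddress.ip_address(addr).is_global


for a in ("192.168.1.10",                             # RFC 1918, behind NAT
          "10.0.0.5",                                 # RFC 1918, behind NAT
          "fe80::1",                                  # IPv6 link-local
          "2607:f2c0:93c7:fe00:144f:b9a3:bcd4:ef6"):  # address posted above
    print(a, globally_routable(a))
```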

> But since they aren't using NAT, they aren't protected.

I just demonstrated that protection exists without NAT.

> […] but luckily it's still a small amount of traffic/devices worldwide.

I don't think this is correct. Just about all devices on a cellular/mobile network nowadays are probably using IPv6 natively, with CGNAT for IPv4. That's a lot of devices.

* https://blogs.akamai.com/2016/06/preparing-for-ipv6-only-mob...

An experiment: temporarily disable Wifi on your cell phone, go to your browser, and search for "what is my ip address". Chances are that you'll see an IPv6 address there: if using Google, they'll return an IPv6 address at the top, and a list of different web sites that do the same thing. If you go to a website you'll probably get an IPv6 and an IPv4 address.

> But something other than SPI has to actually enforce that.

Connection tracking is sufficient for most modern protocols (i.e., not (active) FTP):

> A stateful firewall keeps track of the state of network connections, such as TCP streams, UDP datagrams, and ICMP messages, and can apply labels such as LISTEN, ESTABLISHED, or CLOSING.[2] State table entries are created for TCP streams or UDP datagrams that are allowed to communicate through the firewall in accordance with the configured security policy. Once in the table, all RELATED packets of a stored session are streamlined allowed, taking less CPU cycles than standard inspection. Related packets are also permitted to return through the firewall even if no rule is configured to allow communications from that host.

* https://en.wikipedia.org/wiki/Stateful_firewall

* https://en.wikipedia.org/wiki/Firewall_(computing)#Connectio...

> Actually it's not NAT at all that's providing the security - it's just the routing.

Yes, that's exactly my point. And as part of routing packets one can inspect their headers and drop or allow them as desired.
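
To illustrate how connection tracking alone can drop unsolicited inbound traffic with no address translation involved, here is a toy Python sketch. It is not how any particular router implements it. `Flow` and `ConnTracker` are hypothetical names, the addresses come from the 2001:db8::/32 documentation prefix, and the idle timeout is an arbitrary choice. Because state is keyed on the 5-tuple plus a timer, the same idea works for UDP even though UDP has no handshake.

```python
import time
from typing import NamedTuple


class Flow(NamedTuple):
    proto: str
    src: str
    sport: int
    dst: str
    dport: int

    def flipped(self) -> "Flow":
        """The same flow as seen in the reply direction."""
        return Flow(self.proto, self.dst, self.dport, self.src, self.sport)


class ConnTracker:
    def __init__(self, idle_timeout: float = 30.0, clock=time.monotonic):
        self._timeout = idle_timeout
        self._clock = clock
        self._table = {}  # outbound Flow -> time the flow was last seen

    def outbound(self, flow: Flow) -> bool:
        """Inside-to-outside traffic is allowed and creates/refreshes state."""
        self._table[flow] = self._clock()
        return True

    def inbound(self, flow: Flow) -> bool:
        """Outside-to-inside traffic is allowed only as a reply to live state."""
        key = flow.flipped()
        seen = self._table.get(key)
        if seen is None:
            return False  # unsolicited: nobody inside started this flow
        now = self._clock()
        if now - seen > self._timeout:
            del self._table[key]
            return False  # state expired
        self._table[key] = now
        return True
```

With this policy an inbound SYN to port 22 is dropped unless the inside host opened that exact flow first, and no address is rewritten at any point.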

I think we're sort of agreeing on the same things. NAT protects devices, and stateful firewalls protect devices, and if you have neither, you're not protected. My point was that more devices exist behind NAT than come bundled with a stateful firewall, hence NAT protects more devices. The NAT devices/routers might additionally have a stateful firewall (which really only protects the router, not the other devices), but I wouldn't trust it as much as the security of a non-routable network.

> My point was that more devices exist behind NAT than come bundled with a stateful firewall, hence NAT protects more devices.

The popularity of a mechanism does not necessarily correlate with the effectiveness of its security. The fact that we're using IPv4 with NAT+SPI versus IPv6 with SPI is simply an accident of history.

If IPv4 had been designed with 64-bit addresses, or even 48-bit ones (like Ethernet MACs, which we still haven't run out of), then NAT probably would not have been invented, and we'd be using 'simple' (SPI) firewalls (see Cheswick 1994).

> […] but I wouldn't trust it as much as the security of a non-routable network.

Non-routable networks are not more secure than routable ones with SPI; at best the two are equal IMHO. Non-routable networks may actually be worse because of a false sense of security: all it may take is one end-point compromise and the enemy is on the other side of the moat.

Yes I agree with this but do note that in your gp you stated "every single personal device", not "half of all personal devices"

Even stateful firewalls don’t protect you from being hacked. Hacks occur when you run untrusted code on your machine. That untrusted code can easily bypass your stateful firewall by initiating a connection to a C&C server.

Thankfully, this is going away now with IPv6.

Unfortunately not. Most consumer and SOHO routers have reflexive ACLs enabled by default, and your average Joe doesn’t know or care to turn it off.

Oh, interesting, what is it exactly?

Surprising that after over a year of discussions about this subject I'm only hearing of it now.

Would this be the reason why at least tens of millions of consumer routers don't have IPv6 firewalls (technically, opt-in firewalls, but you know how that goes...), and don't have their networks hacked?

Wait, does it even work with UDP, which doesn't have a concept of "connection"? (Especially IPv6 (no NAT) UDP?)

I want to believe!

Well, IPv6 seems to be today (both in popularity and speed of adoption) where UTF-8 was about a decade ago. So, hopefully in a decade..?

NAT is a hack that only ever existed due to IP address exhaustion, and I’ve yet to find a good reason why congestion control and QoS handling in a router needs anything beyond what is already in the IP headers. SCTP’s flexibility is irrelevant since nobody uses it, arguably because of how much the Internet has ossified around TCP and UDP.

Hey, that's not true. People use it... in the telecom industry and everyone using WebRTC's DataChannel (SCTP over DTLS actually).

The protocol is probably mostly fine, but there is no quality library for using it in a general purpose way. Because of that, no one can really use the protocol to its full designed capabilities anyway (and the ossified network nodes don't help either).

It doesn't matter if SCTP has good features if no one is willing to implement quality libraries for it. The protocol has existed for a while and only one general purpose library exists for it that's widely used. And it's not without a lot of flaws and limitations.

There are good reasons no one wants to invest in it, and I'm sure they did consider it.

I am sure there are even more implementations that I am not aware of.

* https://github.com/pion/sctp

* https://github.com/aiortc/aiortc/blob/main/src/aiortc/rtcsct...

* https://source.chromium.org/chromium/chromium/src/+/main:thi...

* https://github.com/sctplab/usrsctp

People don't make these decisions for technical reasons only. Career wise it is a bad choice to spend your time working on pre-existing technologies. You don't become a distinguished engineer by iterating on existing technologies. You become one by being the creator of something new.

I think QUIC is great and does a good job solving the problems it was designed to solve. It is disingenuous to pretend these decisions were made only for technical reasons.

3 of those are related to DataChannels in WebRTC, which at this point is baked into the standard and can't be replaced by another protocol. The SCTP in DataChannels is quite limited, that subset is reasonable to implement somehow.

Only the last one is really general purpose and has had a lot of security issues as well.

Fun fact: I do work on one of those implementations and with another one, and it's out of necessity rather than a career choice.

I know you understand libusrsctp, Florent. I created Pion, and I feel pretty comfortable with SCTP and WebRTC in general as well.

RtcQuicTransport tried to replace SCTP and it didn't work. It would be possible to replace SCTP if something better was available. SCTP has lots of great stuff like FORWARD-TSN. QUIC didn't offer anything compelling over it. Replacing it would also come with a huge cost: making WebRTC larger and losing interop with all the existing clients.

libusrsctp has had security issues, but I don't think the protocol is the problem. The issue is C/C++. QUIC implementations are going to have the same class of bugs. Chrome/libwebrtc has plenty of security issues in other areas besides SCTP. Rust/Go doesn't fix everything, but one less thing to worry about at least.

If Google has a chance to use an industry standard or use their clout to create a new standard, you get one guess at what they're going to do.

Having a standard doesn't mean you're not allowed to innovate and create another one.

SCTP dates from RFC2960, first draft was published in 1999. It had enough time to get traction, and it didn't. Why would anyone build something new with it now, knowing that it doesn't answer some of the issues with the current Internet?

Even WebRTC's DataChannels were introduced at the IETF in 2012, that's 9 years ago (although ironically, the RFC just got published in January...).

Correct me if I'm wrong, but bypassing NAT, congestion control and QoS is also a benefit when you want to build things like P2P services where you want a direct connection between peers and full control over the connection itself. QUIC will help a lot with those use cases.

I'm happy whenever anyone brings up SCTP. Such a maligned protocol. Congestion control and QoS, however, are a loser's game. If you're running into realistic issues meeting demand due to statistical multiplexing breaking down to the point you must rely on caring about the internal state of traffic streams, you should have thrown more link hardware at the problem long ago. Many dumb pipes > fewer smarter pipes.


Oh, didn't you know, IPv6 was going to solve everything. No more NAT. We're just going to assign everyone an IP at birth so people can continue to ignore the User-Agent/User distinction.

QoS has its uses, and you don't have to look any deeper than TCP/UDP headers for it to be good enough. It allows VOIP/IPTV services to continue working under DDoS conditions for example (ISP level).

IP TOS flag and TCI field in Ethernet header (802.1Q header) should be sufficient, without looking at TCP/UDP/SCTP headers.
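
For what marking at the IP layer looks like from the sending side, here is a minimal Python sketch that sets the DSCP bits through the `IP_TOS` socket option. The helper names are made up. Whether the mark is honoured, rewritten, or ignored is entirely up to the networks in between, and some operating systems restrict or ignore this option.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice (RFC 3246)


def dscp_to_tos(dscp: int) -> int:
    """DSCP occupies the upper six bits of the old IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    return dscp << 2


def open_marked_udp_socket(dscp: int = DSCP_EF) -> socket.socket:
    """Create an IPv4 UDP socket whose outgoing packets carry this DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

A router can then prioritise on that one header byte without ever looking at the TCP/UDP/SCTP layer.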

NAT does not need anything more than port numbers or UDP cannot be used with it.

As for QoS, going beyond port number, source IP and rate of packets is asking for trouble.

I wonder: IPs are considered PI (heh). So does this mean that badly implemented QoS can end up violating GDPR?

Are deep packet inspection and modification boxes really that prevalent outside of the Great Firewall of China and its imitators (Russia has something similar now)?

If QUIC gets in the way of these countries they will simply set up a decrypt-and-recrypt TLS proxy and block connections that don't honor its certificate. Lots of banks and trading firms already do this to their employees for regulatory purposes; there is no technical hurdle.

> there is no technical hurdle

As someone who has been forced to use ZScaler, there is a slew of technical challenges. 80% of the traffic can be handled "easily" with a custom CA cert, and the rest have varying issues ranging from client problems to server problems to problems in the proxy itself. Just trying to clone a GitHub repo over HTTPS breaks with ZScaler.

I can't comment on this specific product, but if the transparent proxy is causing you so much grief why not just explicitly configure your machine to use the HTTPS proxy?

HTTPS proxies are not voodoo. There are plenty that work quite well.

Actually there is "voodoo" in both transparent and non-transparent proxies, but the bigger problem is that ZScaler's entire raison d'etre is to be a transparent proxy using Carrier-grade NAT. But the point is, ZScaler is a widely-deployed commercial solution, and it does indeed have technical hurdles. Whether it should theoretically work or not, I'm telling you in practice there are problems, and this shouldn't be dismissed out of hand as an outlier.

You really seem to have a beef with this one particular commercial product that I don't have access to.

I've used plenty of other HTTPS proxies and the well-written ones all worked fine. Many are open source. I think you need to direct these comments to whoever is charging you money for this "Zscaler" thing.

But isn’t the “damage” these middleware layers do actually good optimizations/features that people pay for? The ossification is the problem, not the ability to optimize IMHO.

> Every app has to implement it itself rather than calling a syscall and letting the OS deal with its complexities (same as for TLS, making fewer apps implement it without a lot of extra work). Which also increases context switching

Nowhere does RFC9000 say that it has to be implemented in user-space. A kernel-space implementation of QUIC would be conforming. Any OS kernel project (whether open source or proprietary) which would like to develop one is free to do so.

And why does it have to be handled in the OS, if a library would do just fine and provide more flexibility?

Well you know that someone is gonna implement it in eBPF sooner rather than later.

As someone with experience in eBPF and QUIC I can tell you it will be a tough one. QUIC is very complicated and requires some rather sophisticated (and long running) algorithms, and eBPF is [intentionally] rather limited on what it can do.

However there are some interesting use-cases for eBPF in the context of QUIC. E.g. it can definitely be used to route packets by connection IDs.
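As a rough illustration of routing by connection ID (in Python rather than eBPF, purely to show the parsing involved): the destination connection ID is in the unencrypted part of the header. In a long header it is length-prefixed. In a short header its length is not encoded, so the router has to know it out of band. The function names, the 8-byte default and the modulo-based backend choice are all assumptions for the sketch.

```python
from typing import List, Optional


def extract_dcid(datagram: bytes, short_header_dcid_len: int = 8) -> Optional[bytes]:
    """Return the destination connection ID of a QUIC packet, or None if truncated."""
    if not datagram:
        return None
    if datagram[0] & 0x80:
        # Long header: 1 byte of flags, 4 bytes of version, 1 byte of DCID length.
        if len(datagram) < 6:
            return None
        dcid_len = datagram[5]
        if len(datagram) < 6 + dcid_len:
            return None
        return datagram[6:6 + dcid_len]
    # Short header: the DCID follows the first byte; its length is an assumption
    # shared between the endpoint and whoever routes for it.
    if len(datagram) < 1 + short_header_dcid_len:
        return None
    return datagram[1:1 + short_header_dcid_len]


def pick_backend(dcid: bytes, backends: List[str]) -> str:
    """Hypothetical routing rule: the same connection ID always maps to the same backend."""
    return backends[int.from_bytes(dcid, "big") % len(backends)]
```

The point is only that this step needs no cryptographic state and no long-running logic, which is why it is a plausible fit for a restricted environment.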

You're right. I think overall we're going to have to learn to restrain/harness eBPF somehow, since it will keep growing.

We're already starting to see some complex applications appear over tc+eBPF and I believe I've seen some TCP congestion algorithms prototyped there? Those can be somewhat complex already. Crypto could be offloaded to dedicated 'nodes' (as you find in DPDK for example).

But, since you can now chain eBPF programs I can imagine some kind of task-graph programming model emerging there allowing for very complex applications. Maybe even specific complex kernel modules that can only be called through eBPF...

In fact I already tried putting most of a high-throughput packet processor there and it's really fun, this feeling of being in 1998 with my first turbo pascal + asm programs. Hard to debug, though.

I hope we'll be able soon to converge on sw archs similar to tbb::flow or cudagraphs or vpp/dpdk pipelines.

The next years are going to be very, very interesting.

> I can imagine some kind of task-graph programming model emerging there allowing for very complex applications

This is why I hate technology. It doesn't need to be this complicated.

It is not really a question of OS vs library, but a question of standardized API vs ad-hoc API. The most basic networking API is standardized (sockets in the POSIX API).

OSes defining the "standard" is very much a 1990s thing; at this point libraries and languages are much more relevant to standardizing things. If you're writing rust, or ruby, or javascript, or C#, or python, or whatever, you're going to be using what your language ecosystem provides you. OS libraries are usually only relevant to C developers writing something low level, and usually they need to write a lot of wrappers and #ifdefs to handle every platform's quirks anyway.

How do you think those languages actually implement those features? Because I can assure you that they aren't bundling their own userspace TCP stacks.

What you're saying is that you personally don't need to write network code that performs socket(), bind(), listen(), accept(), send(), recv(), etc. But the programming language and library you're using is using those calls, to talk to the kernel, which is doing a lot of the work, so your language/library doesn't have to. And it does that using a standard API.
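A small sketch of that point using Python's standard `socket` module, which is a thin wrapper over exactly those calls. The loopback address, the one-shot uppercase "server" and the name `serve_once` are made up for the illustration.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection and echo its data back uppercased."""
    conn, _addr = listener.accept()          # accept()
    with conn:
        data = conn.recv(1024)               # recv()
        conn.sendall(data.upper())           # send()


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket()
listener.bind(("127.0.0.1", 0))              # bind(); port 0 lets the kernel pick
listener.listen(1)                           # listen()
port = listener.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

# create_connection() performs socket() and connect() on the client side.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

worker.join()
listener.close()
```

Nothing here parses a TCP segment, tracks sequence numbers or retransmits anything; all of that happens behind the API.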

This enables less context switching, faster processing, more uniform behavior, and more portable code. It also means your language/library does not need to implement a TCP parser, which can be quite complicated, and it means the kernel can maintain the more complex network settings that affect your TCP connections, benefit from a shared buffer/cache, network filters, routing, etc. Off-loading protocols into the kernel (or even better, the network card) is a boon to the whole OS and to your individual application.

Because if it's not in the OS, people are going to cheat more often on congestion control to prioritize their own streams. It's a tragedy of the commons. It's only because TCP is ossified and usually in the kernel that we haven't seen a race to the bottom already.

I'm pretty sure that's not how congestion control works... My ISP doesn't honor my congestion control flags. Whatever congestion control I set up ONLY changes things on my side of the ISP connection. And yes, within my LAN, I definitely can prioritize various streams.

Why do you think it is that we do congestion control at all? Why doesn't every program spam UDP packets as fast as it can write them to the NIC?

ever heard of http.sys? because they can

The ossification has nothing to do with the distinction between userspace vs kernel. The kernel can implement layer 5 (there are in-kernel HTTP servers!), and userspace programs can implement layer 4 (using raw sockets).

It has to do with the distinction that middleware boxes make between layer 4 and layer 5. NATs and firewalls and assorted traffic optimizers tend to vomit if you pass anything over IP that isn't TCP, UDP, or ICMP.

UDP is, by luck and design, a very thin shim on top of which a "layer 4.5" protocol is very easy to implement. Whether this layering is done in the kernel or in a userspace library is an implementation detail, and one that will probably be obscured even from many userspace programs.
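For a sense of how thin that shim is, here is a sketch that builds the entire UDP header: four 16-bit fields, eight bytes in total. The function name and the port and payload values are arbitrary. The zero checksum is a simplification that is only permitted over IPv4; real stacks normally compute it.

```python
import struct

UDP_HEADER_LEN = 8


def build_udp_header(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Pack source port, destination port, length and checksum in network byte order."""
    length = UDP_HEADER_LEN + len(payload)  # the length field covers header + payload
    checksum = 0                            # 0 means "not computed" (IPv4 only)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


header = build_udp_header(50000, 443, b"x" * 1200)
```

Everything else a transport needs (ordering, retransmission, flow control, encryption) is left to whatever is layered on top.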

Ironically, one of the proponent documents of QUIC and HTTP/3 (https://http3-explained.haxx.se/en/the-protocol/feature-udp) actually calls out how QUIC doesn't solve this problem either. Middleware boxes are blocking or deprioritizing UDP packets already, so UDP based 4.5 protocols can't solve the very problem they claim to.

Circumventing badly behaving middleware is obviously not actually possible. You can't account for every protocol-breaking/limiting decision every middleware vendor has ever made, and the "they'll come around" argument is a better argument for NOT using QUIC.

> Every app has to implement it itself

No? I'll just link against one of the cross platform libraries that implements it, the same as for any other networking protocol. (Ex https://github.com/microsoft/msquic)

Also see: https://github.com/quicwg/base-drafts/wiki/Implementations

> same as for TLS, making fewer apps implement it without a lot of extra work

... do you not just link against a relevant networking library at this point? If not, why?!

my guy out here rolling his own TLS implementation

I don't know about you but I implement TLS in x86 assembler for breakfast.

The IETF doesn't care about OSes.

What difference is it for the caller if the interface is inside the kernel or in a userspace library? What if they can't tell?

Why can't QUIC be implemented in the Kernel? Just because something can be implemented in user space does not mean that it has to.

Anything can be implemented in a kernel. But unless it's natively supported in other kernels, networks, and systems, that just means that one kernel is easier to work with.

The big idea behind TCP/IP is that it's a stack which you can use anywhere and everywhere. Not just any network, but any host. To do that, you need it to be portable and ubiquitous. And there is no protocol above OSI layer 4 that is ubiquitous. By nature of being above UDP, QUIC is layer 5 or higher. And it's got more features than other protocols do, meaning it needs more functions.

So not only is it unlikely that it'll be supported like the layer 4 protocols, it's unlikely to be supported by the Berkeley/POSIX sockets API, the one portable network interface specification for every operating system. One solution to all this would be to make QUIC a layer 4 protocol and update the sockets API. But nobody wants to do this because it's hard.

So, we could just have everyone add a layer 5 protocol to their stack, implement it in kernels, NICs, routers, etc, update the sockets API, and wait a few years for adoption. But will that happen? Or will the industry just go "let userspace handle it" like with TLS?

I really do not see how the layers matter. They are just an artificial categorization. E.g. SCTP is both a layer 4 protocol and a layer 5 protocol (SCTP-over-UDP) and as far as I can tell from some quick googling at least both Linux and FreeBSD implement both in their kernels. What matters is if various libcs add QUIC or not.

And the Linux kernel implements other layer 5 protocols like WireGuard.

SCTP can also be run over DTLS over ICE over UDP. How many layers is that?!

> The big idea behind TCP/IP is that it's a stack which you can use anywhere and everywhere

For what it's worth, Windows 3.x did not come with a TCP/IP stack installed; it was an optional extra. (In the days before Windows 3.x you had to pay for the TCP/IP stack from your network card vendor as an optional extra.)

It wasn’t until Win95 that the OS came bundled with TCP/IP by default, and you could still install other networking protocols like Novell NetWare or AppleTalk.

TCP/IP didn’t become the default stack until late into the ‘90s and the “everywhere by default” wasn’t really true until just before the turn of the millennium.

So while QUIC might be in the “early adopters” space for now, everything started that way once. I have no doubt that by 2030 we will see QUIC as a kernel feature in many major operating systems.

I think some versions of Windows 3.x did come with a TCP/IP stack, though it wasn't installed by default even with networking. You had to explicitly install it.

windows for workgroups 3.11


> By nature of being above UDP, QUIC is layer 5 or higher.

QUIC could also be encapsulated into IP without UDP, with a small modification: always use the framed mode. Hence it is actually a layer 4 protocol.

> By nature of being above UDP, QUIC is layer 5 or higher.

That doesn't seem like a fair characterization. In some TLS-based VPNs, IP is sent over TLS – does that make it a layer 7 protocol?

This is a tunneled scenario. By definition, you cannot sensibly use it to draw conclusions about network layers.

Would it be very wrong to consider QUIC a new type of layer 4 protocol encapsulated in another one (UDP) for compatibility with NATs and other middle boxes?

SCTP is also usually encapsulated in UDP when used in WebRTC data channels, yet I've never heard it being described as anything other than a transport layer protocol.

> it's unlikely to be supported by the Berkeley/POSIX sockets API, the one portable network interface specification for every operating system. One solution to all this would be to make QUIC a layer 4 protocol and update the sockets API. But nobody wants to do this because it's hard.

I can't see why a user-space library couldn't hook the Berkeley sockets API calls and redirect them to a user-space implementation for IPPROTO_QUIC sockets. You may need to use some trick to distinguish native kernel sockets from user-space-implemented ones. (e.g. if you know the kernel only allocates FDs in a given range, use numbers outside that range to identify your user-space-implemented sockets). Someone could add a framework for doing this to the C library.

Actually I believe Windows has a built-in facility for doing this. To implement a new transport protocol in Winsock, you need to provide a transport provider DLL. Normally you'll also have a kernel-mode device driver, which the transport provider DLL calls, and the meat of the protocol implementation will be in that kernel-mode device driver not in the transport provider DLL. But I don't think it has to be that way – I don't believe there is anything stopping a Winsock transport provider from being written entirely in user-space.

Our QUIC implementation MsQuic can run in both kernel and user mode on Windows. A PAL allows the core protocol logic to be agnostic.

I think from their perspective the (Windows and Android) kernel is what causes ossification so moving transport into the app, along with encryption, solves it.

There are also the "middle boxes" that networking researchers talk a lot about. Such devices sit in the middle of a link and easily become unhappy if the packets transmitted do not fit some (possibly outdated or buggy) predefined scheme. Think of corporate firewalls with "deep packet inspection" that intelligently shut down connections they do not like. Once all middle box vendors start to assume a certain way that a protocol (say, TCP) should behave, it's impossible to change the protocol because it will break the middle boxes.

Encrypting QUIC datagrams prevents middle box vendors from assuming anything about QUIC (at least the encrypted part), so that QUIC can change if there's a need in the future without worrying about supporting legacy middle boxes. Although I do agree using UDP does not allow QUIC to break out from any ossification in UDP itself.

Yes, exactly. See this lwn article for more discussion of ossification: https://lwn.net/Articles/745590/

> SCTP has been around for years, but middleboxes still do not recognize it and tend to block it. As a result, SCTP cannot be reliably used on the net. Actually deploying a new IP-based protocol, he said, is simply impossible on today's Internet.

Well, IPv6 shows that it's hard, but possible.

Also see how new broadcast "protocols" were forced on TV manufacturers by various governments worldwide.

Most enterprise security vendors at the moment are advising their customers to block QUIC at the perimeter to force fallback to HTTPS so their TLS decryption can function.

There are valid, ethical reasons for an organization to want to see unencrypted network traffic at their perimeter, and until that problem is solved, you better not go QUIC-only if you are in the business to make money.

You're forgetting the rest of userspace that exists between the application and the kernel.

> moving transport into the app, along with encryption, solves it

This is only a good idea if the application's development moves faster than both the kernel and the rest of the userspace platform - which is a fine strategy if you're targeting old or obsolete platforms, but probably a bad idea if you're targeting currently-supported operating systems with a long support lifetime (like Windows).

It's a bad idea when a major security issue or compatibility problem with a transport or encryption library pops-up (which happens all the time, btw) - as it's part of your application then you don't benefit when the OS vendor ships a security update that your application gets for free. If you used dynamic-load+linking at runtime instead of a statically-linked encryption library then you might get lucky with a drop-in .so/.dll replacement without needing a rebuild, but YMMV.

Doing everything in the app instead of letting the parent platform handle it by default is kinda the software-engineering equivalent of libertarian ideological thinking: it only benefits you if you (and your team) really are far better than the state - otherwise you'll quickly run into problems - and it isn't suitable for the vast majority of the ecosystem.

Let's be honest; we're talking about Chrome here. Chrome auto-updates and Windows (pre-10) did not so bugs will be fixed faster in Chrome. Everybody else is along for the ride.

Right - so in Chrome's case then Google using QUIC as an in-app service is probably fine, as indeed, I trust Google to ship fixes and updates much faster than Microsoft. Also it's far more likely that enterprises have Windows Updates blocked or delayed than Chrome updates - and with Microsoft updates you have to wait for Patch Tuesday every month unless it's something really bad.

My point still stands though, that if you aren't prepared to support an "evergreen" software product for the life of your users, then you should still support OS-provided services, even if you do have your own QUIC support baked-in to your product.

I agree that for updatability it should be something that can be shutdown, reloaded, restarted as simply and with least risk possible. Now I'm thinking that it'll be easier to checkpoint/restore a quic connection than any other thing based on tcp or hidden in the kernel, so you could even do zero-downtime update.

I'm thinking let's just use something like netty or zmq over a simple ip+icmp interface, socket or not. Instead of having everyone learn some ossified quic kernel api... And since everything will soon move to io_uring let's wrap that thing in high-level APIs.

There is also SCTP at layer 4 but it doesn't get much use outside certain domains, so certain OSes have deprecated it.

Also as others pointed out applications will just use a library as they do now with OpenSSL.

because of the ossification that makes it hard/impossible to use.

I assume you're referring to the disuse of SCTP?

That's not because of ossification, it's because of middleware boxes like NATs and firewalls that will refuse to pass anything that isn't TCP, UDP, or ICMP.

Which is an example of ossification, in this case at the IP level.

From context, I assumed detaro was referring to an ossification of SCTP.

Ah, yes, that wasn't quite clear, sorry. No, I was referring to the ossification of middleboxes/OSes/knowledge/... that made it unviable to support a new IP protocol alongside the existing ones for the general internet case.

SCTP over UDP is a thing and still hasn't gained much general public use over the net... Sadly!

SCTP over DTLS over ICE over UDP is used quite a lot though!

That's how DataChannels in WebRTC were specified a while ago, and it's the only mechanism to do browser to browser communication.

Oh yes, when I discovered SCTP I felt so sad for all the time I and hundreds of others had spent implementing parts of its feature set over the last 30 years... I mean, it was packed with great features like separate streams, packet-based delivery, multi-homing and multipath. It felt so right playing with it. Alas, even in my company I failed to convince anyone it was a great replacement for layers upon layers of working UDP code. I think it came too late...

> Every app has to implement it itself rather than calling a syscall and letting the OS deal with its complexities

I think for almost all developers, library support for features is much more relevant than OS support anyway. How many of you are directly writing winsock code? Implementing it on UDP just means that you don't need to update a lot of ancient routers and firmware.

I don't see "ossification" being a problem. Things haven't changed much at the transport layer because largely it just works.

> Things haven't changed much at the transport layer because largely it just works.

There is a reason this isn't built on top of TCP, because TCP has so many problems the moment you try to do anything non trivial with it.

UDP on the other hand is kind of redundant since you already have IP packets, so you have a tiny packet-based protocol on top of a different packet-based protocol. Also you don't get any ordering information in UDP, which is kind of weird since the IP layer has to handle ordering to deal with packet fragmentation, which is a fun feature that some hardware sees as a request to reorder your packets. I may have some PTSD from dealing with "smart" switches and a bit of software that did not expect packet reordering issues on a tiny in-house network.

« Every app has to implement it itself rather than calling a syscall » It actually will be part of the Windows kernel similar to Http.sys

I think the intent is that normally you would do a lot of syscalls in order to do networking; with QUIC you would currently be forced to do zero-copy networking and pass the frames to the kernel via PF_RING; things don't become simpler, as the network layers will have to be handled by a user mode library, however they would potentially be much faster.

> forced

?? why shouldn't QUIC be able to use a UDP socket?

You can do that, if you can accept the overhead of a system call per outgoing/incoming frame.

Those work and will reduce the system call overhead. But testing showed that it isn't actually the main culprit (e.g. you might gain 5% efficiency by going for it).

A far bigger bottleneck with the kernel stack is that e.g. the route lookup, iptables rules and similar things will be evaluated per packet, which is taking up most of the time. That will happen regardless of whether you deliver one packet per system call or several.

UDP Generic segmentation offload (GSO - https://lwn.net/Articles/752184/) reduces that overhead by amortizing all in-kernel and driver operations over batches of datagrams. It makes a far bigger difference in efficiency than purely reducing syscalls (e.g. to +100% efficiency - but it will all depend on the other work the application and QUIC stack does and what drivers support).

You can always use sendmmsg(2), writev(2) or other iovec-based APIs.
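A sketch of the iovec-style path from Python, with a caveat: the standard library does not expose sendmmsg(2), so this shows `socket.sendmsg`, which gathers several buffers into one datagram in a single syscall rather than sending several datagrams at once. It assumes a Unix-like platform; the header and payload bytes are placeholders, not a valid QUIC packet.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

header = b"\x40" + b"\x01" * 8   # placeholder bytes standing in for a packet header
payload = b"application data"

# One syscall, two buffers: the kernel gathers them into a single datagram,
# so userspace never has to concatenate header and payload first.
sent = sender.sendmsg([header, payload], [], 0, receiver.getsockname())

datagram, _addr = receiver.recvfrom(2048)

sender.close()
receiver.close()
```

Batching several datagrams per call (sendmmsg) or per segment-offloaded buffer (GSO) is the part that actually amortizes per-packet work; reaching those from Python would need ctypes or an extension module.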

> Every app has to implement it itself rather than calling a syscall and letting the OS deal with its complexities (same as for TLS, making fewer apps implement it without a lot of extra work)

TLS is available in the Linux kernel.


That is merely offloading for the transfer after userland has handled the entire connection setup, key exchange etc.

Have previous protocols (TCP, UDP, etc) been in widespread use before their respective RFCs?

The QUIC Wikipedia page[0] makes it sound like a substantial amount of traffic, due largely to Facebook, Google, and the Google Chrome browser, uses QUIC already. Facebook claims 75% of their traffic (for the app?) is via QUIC[1]

[0]: https://en.wikipedia.org/wiki/QUIC

[1]: https://engineering.fb.com/2020/10/21/networking-traffic/how...

> Have previous protocols (TCP, UDP, etc) been in widespread use before their respective RFCs?

As other people have noted, yes.

One of the interesting facts about the IETF is that they actually require two existing implementations in order to advance from an RFC to an official "Internet Standard":

The IETF Standards Process (RFC 2026, updated by RFC 6410) requires at least two independent and inter-operable implementations for advancing a protocol specification to Internet Standard.

I suspect that this requirement for two independent implementations that can actually talk to each other is one reason the IETF has a relatively solid track record.

There are less than 100 Internet Standards (https://www.rfc-editor.org/standards).

With hindsight, it's kind of funny that 8 of them are about telnet.

I've been dealing with various things that use Telnet, mostly "legacy" services that have always used Telnet and there is no reason to switch them. Why is it funny? Telnet is still widely used and has been widely used for a long long time.

> Have previous protocols (TCP, UDP, etc) been in widespread use before their respective RFCs?

In general, yes: modern RFCs should address existing practice.

The very early RFCs were just that: requests for comments, i.e. memos. Sometimes reflection on existing practice, sometimes design discussions.

For your specific examples (TCP and UDP): they were part of a redesign of the Internet protocols (well a redesign of the ARPANET's protocols TBH, NCP, with the intent of enabling smooth internetworking). So those RFCs reflected the specification of new protocols, though some experimentation had been done.

I remember the transition day. The first machines, IIRC were ITS machines that some had hoped would not make the transition at all! I don't believe there were any Unix machines on the net that day -- Berkeley sockets hadn't yet been released.

> Have previous protocols (TCP, UDP, etc) been in widespread use before their respective RFCs?

Yes. Many early RFCs were simply documenting things as they were, not specifications of how they should be.

Yes. The first TCP RFC supplanted ~8 previous IEN specifications. All kinds of protocols have found wide use before their RFC.

TCP is documented several times, in different states, starting at least in 1974. So depending on how you define "widespread" and "respective RFCs" you could probably argue for either position.

Is it possible to compile the quicly cli (referenced in the blog post) with musl instead of glibc? I had to add signal.h and it then compiled successfully (using openssl-1.1k) but I got "illegal instruction (core dumped)" when executing cli.


There are a few Rust alternatives for QUIC. Has anyone tried them and can comment?

Microsoft has an excellent implementation, too: https://github.com/microsoft/msquic

Why not https://github.com/quictls/openssl

This is what the Microsoft implementation relies on

That is just the tls portion.

Let's say I wanted to add HTTP3 support to my web service - perhaps a NodeJS one, for example. What's that look like right now, practically speaking?

Is it supported at all yet in stable web servers? Is it as simple as flipping a config property in my web framework of choice? Does it only work on Linux, or all major OSes? Or none of them yet out of the box? Does it perform as well or better on all axes, or are we waiting on hardware support or library refinement? How much of a problem will ISP infrastructure be? If I get it working, will it significantly benefit users right now?

Not expecting comprehensive answers, just trying to get a feel for the current state of things

> Let's say I wanted to add HTTP3 support to my web service - perhaps a NodeJS one, for example. What's that look like right now, practically speaking?

Probably the same as it did/does with HTTP/2 - stick a reverse proxy in front that supports it. I think Nginx and Caddy have experimental support.

Wake me up when multipath support is included... Till then I prefer TCP b/c I can use MPTCP to utilize all of my rural LTE bandwidth for each single flow (e.g. downloading a file)

It will never ever be included. That's not how protocols & specs work. They are modular and build off of one another. The linked article includes some examples: RFC9001 is "Using TLS to secure QUIC", RFC9002 is "QUIC Loss Detection and Congestion Control." These are expected capabilities of most users, but they are still defined out of the core spec.

Similarly, there is steady progress on a multipath extension to QUIC[1]. There is also a website https://multipath-quic.org [2] covering advances in multipath quic. Multipath capability, like TLS, like the standard congestion control, will never be included in RFC9000 / QUIC core. But it's advancing.

And, I'd guess, based off the connection-less nature of QUIC (and its UDP underpinnings), it stands a good chance of being significantly better than MPTCP.

[1] https://datatracker.ietf.org/doc/draft-deconinck-quic-multip...

[2] https://multipath-quic.org/

Sorry for the incorrect wording. I don't need multipath to be present in the same spec as the QUIC core. What I meant is when I will be able to get slightly less than 150Mbps over QUIC for e.g. a single file if I have 3x50Mbps connections. MPTCP can do that for me, even if the server doesn't support it, as I can use something like OpenMPTCPRouter [1]. I hope multipath QUIC will be able to do that too, but it's not ready for production use yet, if I understand correctly.

[1] https://www.openmptcprouter.com/

Seems like there are quite a few options that show up pretty quickly when searching for multipath udp e.g: https://github.com/angt/glorytun

Would that work for you?

Posted a detailed explanation slightly above... Glorytun will not work too well with TCP b/c it doesn't provide separate congestion control for each path like MPTCP does.

I just really wish Google would have built QUIC on top of SCTP. They had the clout, and opportunity to push for real end-to-end SCTP support across the internet. Tunnel it over UDP for IPv4 and userspace implementations but use native SCTP where possible like over IPv6 before middleboxes abound that only support TCP, UDP, and ICMP. There's too many NAT implementations, firewalls, etc that don't support SCTP on IPv4 but that's not set in stone for IPv6 yet.

SCTP would have been great for all sorts of applications other than QUIC too. It's got built in multihoming for seamless handoff in mobile environments to keep a persistent connection across e.g. WiFi and cellular. Want multiple streams with no HoL blocking? SCTP is message based and can deliver multiple "streams" simultaneously over a single connection and you get free message boundaries to boot instead of just a plain stream like TCP. Want unordered datagrams? It's got that too. Even partial reliability for a subset of messages in a connection, i.e. reliable metadata for an unreliable live low latency video stream. The four way handshake also basically eliminates the potential impact of a SYN flood.

They wanted a protocol as opaque to middleboxes as possible. SCTP probably isn't, that was not an issue back when it was created.

Most of the point is removing round trips from setup. SCTP doesn't do that.

What Google wants for itself should be irrelevant to the broader Inter-Network standards.

Plenty other people and organizations want that too. QUIC at IETF was far from just Google doing something, and large parts of the relevant communities agree on the middlebox issue, people pushing "but we need to be able to mess with traffic!!!" are luckily waay in the minority there.

> It's got built in multihoming for seamless handoff in mobile environments to keep a persistent connection across e.g. WiFi and cellular.

From what I can tell from the QUIC RFC, it also supports this. See section 9, "Connection Migration" (https://www.rfc-editor.org/rfc/rfc9000.html#name-connection-...).

When your phone or whatever switches network, it can initiate a connection migration. Assuming it's the client, which would be the common case.
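A toy sketch of why that works: the endpoint looks connections up by connection ID rather than by the (IP, port) 4-tuple, so a packet arriving from a new address still finds its state. All names here are hypothetical, and a real stack would validate the new path (PATH_CHALLENGE / PATH_RESPONSE) before switching, which this sketch skips.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Address = Tuple[str, int]


@dataclass
class Connection:
    connection_id: bytes
    peer: Address
    migrations: int = 0


class ConnectionTable:
    """Hypothetical server-side table keyed by connection ID, not by address."""

    def __init__(self) -> None:
        self._by_cid: Dict[bytes, Connection] = {}

    def open(self, cid: bytes, peer: Address) -> Connection:
        conn = Connection(cid, peer)
        self._by_cid[cid] = conn
        return conn

    def on_packet(self, cid: bytes, source: Address) -> Optional[Connection]:
        conn = self._by_cid.get(cid)
        if conn is None:
            return None
        if source != conn.peer:
            # Simplification: adopt the new address immediately.
            # A real implementation validates the path first.
            conn.peer = source
            conn.migrations += 1
        return conn
```

A TCP-style table keyed on the 4-tuple would have found nothing for the second packet, which is why a Wi-Fi to cellular switch kills a plain TCP connection.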

Both ends need to grok MPTCP. Which services and devices meet the requirement?

I use OpenMPTCPRouter [1]. It's a bit of a kitchen sink so I'm thinking of replacing it with something simpler, but the basic idea is that you have a TCP->MPTCP converter running on your router and an MPTCP->TCP converter running on some cloud instance.

[1] https://www.openmptcprouter.com/

Wouldn't QUIC work all the same with that setup, considering that you're not using MPTCP end-to-end? Or do you have a SOCKS or HTTP proxy on the other side of the aggregated links?

Also, why do you need MPTCP for this? Aren't there simpler methods for this kind of setup, like LACP? Or is OpenMPTCProuter, like OpenVPN, one of those things that everybody uses because everybody else does, with more informal documentation and help?

I haven't done this myself but it looks like OpenBSD would support LACP out-of-the-box w/ the aggr(4) or trunk(4) devices. See https://man.openbsd.org/aggr.4, https://man.openbsd.org/trunk.4, and https://undeadly.org/cgi?action=article;sid=20190710071440. I presume standard Linux distros have similar support.

EDIT: To be clear, AFAIU you'd also need to combine the LACP pseudo-devices w/ something like tpmr(4) and etherip(4) to actually bridge the peers. See https://www.undeadly.org/cgi?action=article;sid=201908022358... which describes tpmr(4) as providing a simpler complement for use in LACP-over-IP setups, where the normal bridge(4) device would require too much additional configuration. Again, I presume Linux has analogs.

OpenMPTCPRouter uses shadowsocks [1] and its ss-redir as a kind of transparent TCP proxy. The advantage of shadowsocks is that it creates a separate outgoing TCP connection for each one it proxies, resulting in substantially better performance than SOCKS. And, of course, shadowsocks has MPTCP[2] support.

The problem OpenMPTCPRouter solves is this: you do

curl -O https://somewhere.example.com/largefile.tar.gz

(just a single TCP flow) with 3x50Mbps LTE links, or, better, 80Mbps+40Mbps+30Mbps links and you get a bit less than 150Mbps download speed.

LACP doesn't solve it at all for several reasons.

1) LACP assigns each flow to a single network interface so as to avoid packet reordering

2) Even if you use another bonding mode that can spread the flows over several network interfaces, you'll get low throughput due to the aforementioned reordering and also due to the congestion control, more on that below.

Bonding works well when you have several "equal" links, which is not the case with LTE, where the speed may be quite different across the links and also vary with time.

Last but far from least, there is congestion control mechanism which is present both in QUIC and (MP)TCP, which makes sure the transfer is as fast as possible without harming other connections on the same link(s). This mechanism as it is used in QUIC and plain TCP is not very good at handling multiple aggregated unequal links, so no matter which way you do the aggregation on L2 (bonding) or L3 (things like glorytun), the performance is going to be suboptimal.

What MPTCP does is establish a separate subflow (think TCP subconnection) for each of the available paths, with its own separate congestion control, and spread the main flow over multiple subflows byte-by-byte in an optimal way. This gives the maximum possible utilization of all of the available links, and prevents congestion in an optimal way, too.
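A rough sketch of the difference, under simplifying assumptions: the link capacities are just example numbers in arbitrary units, the CRC32 hash stands in for whatever hash policy a real bond uses, and the multipath ceiling ignores scheduling and congestion-control overhead.

```python
import zlib

LINKS = [80, 40, 30]  # hypothetical link capacities, arbitrary units


def hashed_link(flow: tuple, n_links: int) -> int:
    """Per-flow hashing: every packet of a flow lands on the same link."""
    key = "|".join(map(str, flow)).encode()
    return zlib.crc32(key) % n_links  # illustrative hash, not a real LACP policy


def single_flow_ceiling_hashed(flow: tuple, links: list) -> int:
    """Best case for one flow under per-flow hashing: one link's capacity."""
    return links[hashed_link(flow, len(links))]


def single_flow_ceiling_multipath(links: list) -> int:
    """Best case for one flow split into a subflow per link: the sum."""
    return sum(links)
```

So a single download is capped at one link's capacity with per-flow hashing, while a subflow per path can in principle approach the sum of all of them.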

[1] https://shadowsocks.org/

[2] https://www.multipath-tcp.org/

I don't know about services, but MPTCP is available on Apple's OSes and out of tree patches for Linux and FreeBSD (at least I don't think it's been merged into the main tree for either of those).

Apple took a little bit too long to make it available for applications on iOS, so it didn't line up with my access to large numbers of end users and full control of the server stack, so I sadly don't have any experience with it, but it's there.

From reading the specs a while ago, I believe QUIC ended up with support for changing IPs, but not multiple IPs simultaneously, and I believe the server only can change IPs during session establishment, not with an established session.

It is a shame it’s not more widely deployed. It makes so much sense to be able to seamlessly handle transition from Wifi to Cellular.

It feels like Google is supporting some notion of this given Meet has seamless handoff as you leave Wifi.

I actually believe it's their AI gap filling approach that may account for better perceived handoffs there. It was introduced on Duo, and was supposed to hit Meet sometime.

What ?! You mean they're using this standardisation excuse to gain a competitive advantage with a proprietary extension? Shocking....

> QUIC was developed through a collaborative and iterative standardization process at the IETF after almost five years, 26 face-to-face meetings, 1,749 issues, and many thousands of emails.

26 face-to-face meetings in 5 years is an extremely low number; I'm glad to see that the developers of the Internet embrace remote work :)

IETF work is conducted mostly on email lists, hence the "many thousands of emails".

For some newer work like QUIC, GitHub is used to maintain a more to-the-minute shared view of the documents, and then again as mentioned in the text you quoted, GitHub Issues and PRs are used to manage the document, particularly by the most active participants.

https://github.com/quicwg/base-drafts - of course raising issues or PRs for them now won't do anything useful for you, because these RFCs were published. But you can see there were thousands of commits, one of the last being Martin Thompson's minor typographical tweaks summarised as "DOES IT NEVER END?!?".

These face-to-face meetings are probably the IETF meetings (https://www.ietf.org/how/meetings/), which seem to be around 3 per year (https://www.ietf.org/how/meetings/past/), plus a few extra quic-specific meetings (https://www.ietf.org/how/meetings/interim/).

I wish the QUIC design was datagram-first... so many client/server implementations don't implement the datagram extension properly because it's... well an extension, and for most applications, reliable sequencing should be eschewed in favor of client-side reassembly. IMO QUIC was far too influenced by the legacy of how HTTP works when there was an opportunity to leapfrog a generation of bad decisions.

> so many client/server implementations don't implement the datagram extension properly

I'm genuinely interested in what aspects you think are (or were) not implemented properly.

I wrote most of the implementation of quiche's datagram extension and, well, if there's something that can be improved, I'm all for it.

(aside: I think the standard should've considered an option for datagrams to opt-out of congestion control, given that most applications which want to take control over datagram sending probably also want to take control of available bandwidth and control congestion differently - e.g. by dropping encoding bitrate instead of queuing - but it is what it is).

TBH I haven't looked at the quiche implementation specifically (not being as familiar with rust) so I should check that out.

The issues I saw with other implementations were mainly of the performance/optimization variety. The "blessed path" was definitely connection streams and I was seeing throughput drops when experimenting with datagrams.

> IMO QUIC was far too influenced by the legacy of how HTTP works when there was an opportunity to leapfrog a generation of bad decisions.

I don't think this is fair. IETF QUIC doesn't have any references to HTTP anymore. It mostly provides bidirectional flow-controlled data streams - just like TCP. From an application point of view QUIC is pretty much TCP with the ability to have multiple concurrent streams after the connection was established. And reliable streams empower most higher layer networking protocols - not just HTTP. Implementing reliable flow controlled streams is not an easy task, therefore I think it was the right call to standardize this.

Regarding the datagram extension: It's very easy to implement compared to streams. If you have a need for it, consider contributing it to the QUIC implementation of your choice.

Can you elaborate more? Is there some more information about this datagram-first?

Very much like HTTP 2/3, it solves primarily Google's problems.

Users' problems loading all those bloated JavaScript-heavy sites

I am still wondering, isn't TCP hardware accelerated?

Maybe QUIC would also mean the CPU working harder, but I'm not really sure.

It would be nice to have a thoughtful conversation about the security implications of QUIC: Wikipedia says packets are encrypted individually.

I'm also curious: is QUIC exclusively aimed towards HTTP2 or can it be used for real time applications, like real time gaming or video conferencing?

You may be interested in this: https://techcommunity.microsoft.com/t5/networking-blog/makin.... The gist is that yes QUIC has higher CPU usage but all OS platforms are investing in UDP hardware offloads and optimizations to level the playing field. While the only IETF standard that will come out is HTTP/3, our implementation MsQuic powers both HTTP/3 and SMB proving the general purpose nature of the transport. We are not there yet in terms of an application that's only powered by QUIC because UDP reachability is not 100%, so you need a fallback. Most apps will either use HTTP/3 and fallback to HTTP/2 or use QUIC directly and have to fall back to secure L7 over TCP.

For UDP, Linux supports GSO and GRO, allowing applications to send and receive "super"-packets which are split up and reassembled by the kernel or NIC, depending on what's available. Whether or not QUIC implementations utilize it, I'm not sure.

I may be wrong, but I believe someone explained to me a while back that one of the points of QUIC was that it was meant to be implemented purely in software as opposed to hardware.

Should have waited for 9001.

Oh my mistake, they have several RFCs from 8999-9002.

Sure, and that's one good meme. But we also get to make some HAL 9000 jokes about Google now, and that's even better (though ironic, given that it was intended to be IBM jokes).

Borg vs HAL9000 - that may be an absolute blockbuster ;)

From the article :)

> The IETF just published QUIC as RFC 9000, supported by RFC 9001, RFC 9002, and RFC 8999

Haven't heard this meme for a long time now. Reminds me of the old days when my main email was muahahaha9001@inbox.com simply to get laughs out of people. (That email service shut down roughly 5 years ago.)

Yeah, they form a "cluster" of documents: https://www.rfc-editor.org/cluster_info.php?cid=C430, so that the earlier-numbered RFCs can reference later ones. For instance, rfc9000 normatively refers to rfc9001 and rfc9002.

Curious, why isn't it numbered the other way? My programming brain is telling me that if you're reading rfc9000 you can't reference rfc9001 because it hasn't been initialized yet. :D

It's in a lot of ways more about not being defined (but possibly declared) yet, rather than initialized. You'd be super sad if you could not reference a function defined later in a source file.

Yeah, I'm absolutely great at parties!

Ok, that explanation works for me. And if there was ever an appropriate time to be pedantic it would be in a discussion on RFCs. :)

> You'd be super sad if you could not reference a function defined later in a source file.

Would you? Most of the environments I'm in require you to define the thing before you can use it, and I'm not that sad about that. In fact, that seems to make a lot more sense than being able to define something after you use it; I'd expect that to lead to a compiler/interpreter error.

So no recursion?

You totally can.

Remember, it's all about tokenizing. You're associating chunks of machine code to tokens that tell the machine where to go to get the actual definition, or where to plop the machine code if statically linking.

Nobody said that RFC numbering implied any sort of temporal order. You are the Linker. Be the Linker.

Depends on the compiler/interpreter. Function body could be parsed/evaled after the initial function definition.
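For what it's worth, here is a small Python sketch of that distinction (all function names are made up for illustration): a function body may mention a name that is defined further down the file, because the lookup happens when the body runs, not when the `def` is executed. Calling it before the later definition exists still fails.

```python
def is_even(n: int) -> bool:
    # is_odd does not exist yet when this def statement is executed; the
    # name is only looked up when is_even actually runs.
    return True if n == 0 else is_odd(n - 1)


def is_odd(n: int) -> bool:
    return False if n == 0 else is_even(n - 1)


def call_too_early() -> str:
    try:
        not_defined_yet()  # resolved at call time
    except NameError as exc:
        return type(exc).__name__
    return "no error"


# At this point not_defined_yet has not been defined, so the call fails.
early_result = call_too_early()


def not_defined_yet() -> None:
    pass
```

After the last `def` has run, `call_too_early()` succeeds, which is roughly the "cluster of RFCs" situation: the references only need to resolve by the time someone reads them.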

What environments are those?

E.g. Python

Perhaps I should clarify: rfc9001 normatively refers to rfc9000 (and vice versa). There are cyclical references between RFCs, so there isn't any topological ordering.

It works out because RFCs convey ideas, not code. The authors and editors take care to make sure the cyclical dependencies don't imply circular definitions or other incoherences.

It works in code too. It's the difference between declaration and definition. Very commonly the existence of a symbol is declared in a header so that it can be referenced; only later does the real implementation get linked in.

Is QUIC really worthy of the over 9000 meme? I can’t help but feel we wasted a once in a lifetime opportunity.

I mean every RFC over 9000 is an "over 9000" opportunity.

To be used as a 9000 meme, it should have been "over 9000".

As GP’s edit pointed out, it actually is RFC 8999-9002 - so yes, it’s over 9000.

No, that's just a sine-wave centered around 9000 where someone forgot to take instrument precision into account.

It appears they've fixed the link to https://www.fastly.com/blog/why-fastly-loves-quic-http3.

A quick Google search surfaces the correct link. My 10 second root cause analysis: they changed the title of the blog post.

No, for some reason they just left the /blog/ part out of the path.

Off topic, but doesn't GDPR stipulate[0] that cookie consent dialogs cannot have non-essential options that are by default opted in? This page seems to have those.

> The use of pre-ticked opt-in boxes is invalid under the GDPR. Silence or inactivity on the part of the data subject, as well as merely proceeding with a service cannot be regarded as an active indication of choice.

[0] https://edpb.europa.eu/sites/default/files/files/file1/edpb_...

Huh, I didn't know that, I'd wondered why so many sites defaulted them all to unchecked.

Fastly's one seems pretty dark-patterned in general, throwing up a primary-styled "allow all" button (and simultaneously pushing the switches out of view) if you disable one of the options.

I now pre-judge a site based on its styling of checked/un-check-all functionality. Things that try hard to make me check all (or hard to uncheck stuff) immediately seem skeezy.

After all the TCP and UDP specific attacks (SYN Floods and Reflection-Amplification), I am really curious which QUIC-specific DoS attacks will be created by some talented hackers.

At least the security considerations of the QUIC RFC are ~18 pages long, so this might take some time.

A core intent is that you should not be able to cause a QUIC peer to spew vast quantities of stuff to somebody it hasn't authenticated, so that it can't be used for a large amplification attack. If you emit even very small malformed packets perhaps from a forged address, a QUIC peer is still more or less obliged† to tell you there's a problem, and so if the malformed packet's origin address was forged, the real owner of that address gets an unexpected QUIC message saying there's a problem, and it might respond in kind - it's important this doesn't end up like two half-deaf retirees both shouting "What?!" at each other forever because one of them thought the other said something.

† What WireGuard does here is just ignore everything that doesn't authenticate. This makes diagnostics tricky. Why doesn't it work? No idea. Maybe try again?
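As a sketch of the anti-amplification idea above: RFC 9000 limits what an endpoint may send to an address it has not yet validated to three times the bytes it has received from that address. The class and method names below are illustrative, not taken from any real QUIC stack.

```python
AMPLIFICATION_FACTOR = 3  # RFC 9000 limit before the peer address is validated


class UnvalidatedPath:
    """Byte accounting for a peer whose address has not been validated yet."""

    def __init__(self):
        self.bytes_received = 0
        self.bytes_sent = 0
        self.validated = False

    def on_receive(self, size: int) -> None:
        self.bytes_received += size

    def can_send(self, size: int) -> bool:
        if self.validated:
            return True  # the limit only applies before validation
        budget = AMPLIFICATION_FACTOR * self.bytes_received
        return self.bytes_sent + size <= budget

    def send(self, size: int) -> bool:
        """Account for a send; return False if it would exceed the budget."""
        if not self.can_send(size):
            return False
        self.bytes_sent += size
        return True
```

A forged 1200-byte packet therefore buys an attacker at most 3600 bytes aimed at the victim, rather than an unbounded response.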

And anyone who ever operated a stateful firewall cried a little bit inside.

I would argue that with 200 Gbps NICs starting to become mainstream, the era of stateful firewalls in general is over...

No see, the NIC now includes a stateful firewall so you can implement it at each server at line rate.

Exactly. But every flow still uses memory, and with TCP you know when the connection has ended and can reuse it immediately. With general UDP (and QUIC because they intentionally put this in the encrypted part) the only way to do it is with timers. So you have both problems of state sticking around for a lot longer than needed if the timer is too long, as well as the state going away too early and breaking flows that wouldn't be broken otherwise because the timer's too short.
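A minimal sketch of the timer problem, assuming a hypothetical `UdpFlowTable` with a single idle timeout. With no visible end-of-connection signal, the only way to free an entry is to wait for silence.

```python
class UdpFlowTable:
    """Toy flow table that can only expire entries on an idle timer."""

    def __init__(self, idle_timeout: float):
        self.idle_timeout = idle_timeout
        self.last_seen = {}  # flow tuple -> timestamp of the last packet

    def on_packet(self, flow: tuple, now: float) -> bool:
        """Record a packet; return True if the flow was already known."""
        known = flow in self.last_seen
        self.last_seen[flow] = now
        return known

    def expire(self, now: float) -> list:
        """Drop flows idle for longer than the timeout; return what was dropped."""
        dead = [f for f, t in self.last_seen.items()
                if now - t > self.idle_timeout]
        for f in dead:
            del self.last_seen[f]
        return dead
```

Pick the timeout too long and dead flows pin memory; pick it too short and a quiet but live flow loses its state and its next packet looks like a new, unsolicited one.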

I get why they made it hard for middleboxes to modify the connections, but they didn't have to make it hard to observe the state of the connection in order to do that.

NIC vendors will happily help you offload your stateful firewall to hardware.

Will "Unreliable Datagrams" in QUIC support some sort of TLS?

Sure does. The protocol only transfers authenticated encrypted data by design, which is slightly different from how TLS behaves but with the same ultimate outcome (your traffic is encrypted).

Wow that's a huge RFC. I wanted to read it but now I'll have to set aside the better part of a day. Not to generalize but the best RFCs are amazingly concise.

If we could magically not care about compatibility,

how could we redesign/rewrite the internet now in order to make it better?

What could've been done better?

How much better could performance be?

I would start by:

- Not allocating giant slices of IPv6/4 space to corporations or government agencies. Everything must be IPv6.

- Adding proper authentication to BGP.

- Taking off the super powers of CAs

- Not piling up standards on DNS.

- Killing ICANN because it turned out to be a greedy senseless organization.

- HTTPS is the standard, and email servers just follow the auth/encryption just like HTTPS.

- Curve/Ed25519 over P256/384 and RSA.

- TLS 1.3 is the standard, and remove all compromises made to keep the middle boxes happy.

- No PSL. Browser same-origin model everywhere.

If you want a PKI, the Certificate Authority role comes with that. If Alice wants to know something about Bob, either Alice must have this a priori knowledge, or else some third party Trent trusted by both Alice and Bob is involved. The Certificate Authority is Trent.

TLS 1.3 has the "compromises made to keep the middle boxes happy" built into it. A semantically equivalent protocol, which is spelled differently (and thus incompatible with middle boxes in the real world so you could never deploy it) would have the exact same provable security properties, but that isn't TLS 1.3. ekr's "Compact TLS" is roughly this.

The trick most often seen in clients (particularly web browsers) implementing TLS 1.2 and earlier, of downgrading and trying again, does not exist in TLS 1.3.

If you call a TLS 1.3 server, falsely believing that it is only allowed to speak TLS 1.2 to you and so you should not claim to know TLS 1.3, the server will, unsolicited, signal that it does know TLS 1.3 anyway and your client drops the connection. This feature, called Downgrade Protection, was not enabled in Chrome for about a year because - of course - middlebox vendors can't do anything right, but after a 12 month "fix it or else" warning Chrome enabled the feature.
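Roughly how that signal works, as a sketch: per RFC 8446, a TLS 1.3 capable server that ends up negotiating an older version writes a fixed sentinel into the last 8 bytes of its ServerHello random. The function name and arguments here are made up, and it is meant to be applied only when the ServerHello selected TLS 1.2 or below.

```python
# Sentinel values from RFC 8446 (last 8 bytes of ServerHello.random).
DOWNGRADE_TLS12 = b"DOWNGRD\x01"            # server negotiated TLS 1.2
DOWNGRADE_TLS11_OR_BELOW = b"DOWNGRD\x00"   # server negotiated TLS 1.1 or below


def downgrade_detected(server_random: bytes, client_supports_tls13: bool) -> bool:
    """Check a ServerHello random for the downgrade sentinel.

    Assumption: the caller has already seen that the negotiated version
    is TLS 1.2 or below.
    """
    if len(server_random) != 32:
        raise ValueError("ServerHello.random must be 32 bytes")
    if not client_supports_tls13:
        return False  # an older client does not know to look for the sentinel
    return server_random[-8:] in (DOWNGRADE_TLS12, DOWNGRADE_TLS11_OR_BELOW)
```

A client that sees the sentinel knows something between it and the server stripped the newer version, and aborts the handshake.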

Well, I expected more radical/foundational changes than "upgrade to $modern standards" :P

e.g. change TCP to $YOLO PROTOCOL, get rid of OSI for something different

My own radical ideas would be:

1. Change the order of the header fields to have the destination first, then the source;

2. Make the behavior on too large packets be "truncate but still forward" (setting a flag saying this was done), instead of fragmenting or dropping with an ICMP message.

Interesting RIPE labs article about a more foundational change that could be made: https://labs.ripe.net/author/hausheer/scion-a-novel-internet...

Is the reason why this got the number 9000 anything to do with the BFG 9000 weapon in Doom?

What needs to change to make use of this? Do any opportunities exist for open source projects?

The link to QUIC in that article goes to a 404 page.

'QIC' is quarter-inch cartridge. I approve of the name-change. Two computer technologies with the same name is A Bad Thing.

... Vegeta memes incoming

Guys, we all miss the ITS OVER 9000 memes.

Google is not Requesting For Comments at all. It is imposing its things. Despite the fancy number, this is not an RFC at all, in the ancient sense of the acronym, nor does it have the same design principles of simplicity and orthogonality.

QUIC has been an IETF draft for years and they were definitely accepting contributions and feedback, and a lot of efforts were not driven by Google at all.

As far as orthogonality goes, it seems to be pretty orthogonal to different kind of uses, I'm not sure why you're saying it's not.

On simplicity it's definitely not simple, but given the problems it is trying to solve it's probably as simple as it could be.

Although Google did come up with the original specification their influence over the IETF specifications is not overarching nor absolute; gQUIC is substantially different from IETF QUIC which has caused Google themselves to have implementation and interop issues in their deployments. OS vendors, CDNs, academic researchers, and individuals have all contributed to the resulting specifications that have emerged in a standards body that is by far the most open as the requisite to participate is a functioning email address. All matters and decision making are publicly available for observation and scrutiny.

There is no requirement for people to implement this protocol and it won't be appropriate to every use case that exists today. TCP, UDP, HTTP/1.1, and HTTP/2 are not going away - all continue to undergo standardisation, research, and new implementations.

(Disclosure: I'm not a Google employee, but I have participated in the QUIC and httpbis working groups)

I have spoken with Daniel Stenberg about this and if I recall correctly he disagreed and said that the QUIC which ended up being standardized is quite different from Google's QUIC.

I'd be interested in knowing what major design directions were changed/deleted/added.

For example, gQUIC has its own custom encryption "QUIC crypto" while as you will see in these RFCs the final IETF QUIC design uses the existing TLS 1.3 (existing now, it didn't exist when Google made gQUIC) for encryption.

Or at a philosophical level, gQUIC is an HTTP replacement, it doesn't do anything else except a better HTTP, so it doesn't need to care about any features you might want for, say, SMTP or IMAP or IRC unless it wanted them for HTTP. gQUIC knows about HTTP headers. But the IETF QUIC is a replacement for TCP not HTTP. That's why there's an entire HTTP/3 standard in this bundle of RFCs which sits on top of QUIC as its first "customer".

Thanks for the reply!

It's coming back now; I definitely remember reading some early drafts and thinking "boy it's sad this is just HTTP 3.0-rc1". I always liked SCTP; I'm glad IETF-QUIC has moved back towards being (sort of) SCTP-over-UDP as well as a better HTTP.

On the encryption front I don't remember reading about a non-TLS version (I probably missed it). I'm kinda sad that got dropped. TLS is just crazy complicated in ways that are unnecessary for non-web things like ssh and wireguard.... X.509 and the whole certificate authority trainwreck.

I really hope we get a "QUIC with non-TLS encryption" at some point in the future.

I don't buy the theory that the Web PKI or some equivalent is "unnecessary for non-Web things". If anything I think there have been a few places where there's a nasty security gap because somebody didn't just stuff the Web PKI in there, even though it wasn't necessarily a perfect fit, because what we got instead was nothing, i.e. no security.

An example of that first: Does your phone try to connect to SomeBrand WiFi networks? How does it know if this is a "real" SomeBrand WiFi network? It doesn't. In some scenarios this means you leak valid credentials for the "real" SomeBrand to bad guys. In most cases you at least leak identity information. That's not great news. If the WiFi network was named example.com the WiFi AP could of course present a certificate for example.com, leveraging the Web PKI.

Now, there are people who want to do QUIC without TLS, mostly what they want to do is NOISE. Like this: https://www.cryptologie.net/article/467/quic-noise-nquic/ But again, I think something like SSH is the exception and the Web PKI is more applicable to most uses for QUIC.

@KirillPanov -- you might find this article useful: https://www.fastly.com/blog/maturing-of-quic

Thanks! After reading the link you provided, I found out that gQUIC's FEC (Forward Error Correction) was removed from IETF-QUIC, which is a minor bummer.

But it appears that the removal is not permanent, and that it was removed mainly because it is something that can be added later ("QUIC 1.1") in a backward-compatible way.

The other really interesting part was the "spin bit". It's intuitively clear how it could be used by ISPs to monitor the latency impact of their decisions. Unfortunately the cynic in me knows somebody will find a way to use it for fingerprinting. Room 641A really ruined the ISPs' credibility in a permanent way.
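For the curious, a sketch of how an on-path observer could turn the spin bit into latency samples. The idea is that the bit flips about once per round trip, so the time between flips seen in one direction approximates the RTT. The function name and input format are assumptions for illustration, and loss and reordering are ignored.

```python
def rtt_samples_from_spin(observations):
    """Estimate RTTs from spin-bit edges.

    observations: iterable of (timestamp_ms, spin_bit) for packets of one
    connection seen in ONE direction, in arrival order.
    """
    samples = []
    last_bit = None
    last_edge = None
    for ts, bit in observations:
        if last_bit is not None and bit != last_bit:
            # The bit flipped: this packet marks an edge.
            if last_edge is not None:
                samples.append(ts - last_edge)
            last_edge = ts
        last_bit = bit
    return samples
```

The observer needs nothing but timestamps and one cleartext bit per packet, which is exactly why it is useful to operators and also why it draws suspicion.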
