Hacker News
HTTP/3: from root to tip (cloudflare.com)
340 points by jgrahamc 87 days ago | 94 comments

Here’s another good intro, from a few days ago, by Daniel Stenberg of curl fame: https://daniel.haxx.se/blog/2019/01/23/http-3-talk-on-video/

He also has a free book about HTTP/3, over here: https://daniel.haxx.se/blog/2018/11/26/http3-explained/

Is it the case that HTTP/3, like HTTP/2, will not change the semantics of HTTP in any way?

One thing that drives me nuts, for example, is that Expect: 100-continue exists but has no criticality, so a user-agent can't tell at all if the 100 continue will be coming, or if a 2xx, 3xx, 4xx, or 5xx response will be sent immediately. In principle the user-agent could use OPTIONS to detect if the server implements Expect: for the target resource, but there is no guarantee that a server that supports Expect: will note that in responses to OPTIONS. Some clients implement a timeout then begin sending the request body, but suppose when they do so the server's 3xx/4xx/5xx response is already in-flight? Then the server will expect a new request, not a request body, next from the user-agent. What a mess.

And, of course, Expect: is a very important feature. It's needed to avoid sending data that cannot be recreated, or to avoid having to cache the request body in order to resend it if the request fails with a 401, say.

What I'd like to see is a hard requirement that OPTIONS must indicate whether the server supports Expect:. I'm not sure much more than that can be done to fix the problem. But obviously, a new transport for HTTP can't change this sort of semantics, as one might be gatewaying to/from HTTP/1.1.

Is that really an HTTP/[2|3] problem, or a problem of those servers?

As far as I understood the HTTP/2 spec (I have read that one, but not HTTP/3), servers are allowed to send an arbitrary number of informational responses (incl. 100 Continue) before sending the actual headers and response body. So everything needed to handle these kinds of requests should be in place.

If the server ignores the "Expect: 100-continue" there is obviously nothing the other side can do, but that's the same in HTTP/1.1.

What I am a bit unsure about is how things like "transfer-encoding: chunked" or the various "connection" headers are supposed to be handled in servers/clients/libraries that support both HTTP/1.1 and HTTP/2. Is this supposed to be set automatically at the lower level, depending on the actual protocol, with the application-level HTTP client or server handler just seeing the body as a stream? I'm not sure whether that happens in common libraries today, or whether they would forward these headers when a client is connected via HTTP/1.1 but not when it's connected via HTTP/2.

The real issue is that the server is allowed to defer decisions about authentication and authorization until it has seen some or all of the request body. Therefore the server might not respond until it has seen part of the body, but the user-agent wants to know whether it can get an immediate response to a POST/PUT, in order to avoid sending the body until it knows the server is ready to sink it, and the user-agent may be resource-constrained and unable to cache the request body, say:

  - the user-agent wants an early 2xx/4xx from the server
  - the server is allowed to defer
  - the user-agent has no reliable way to force the
    server to decide early -- the user-agent can only
    ask nicely for a 100-continue / early 2xx/4xx
  - and the user-agent has no reliable way to determine
    if the server does support Expect: 100-continue
What the user-agent wants is for the server to send any 3xx/401 responses early, else commit to not-401 by sending 100-continue. The server can still defer 2xx/403 responses, though preferably not 403s, until it sees some/all of the request body.

The only fix I can think of is to REQUIRE that OPTIONS responses indicate whether the server supports this or not. An HTTP/2 or 3 gateway to an HTTP/1.1 server (or vice versa) could be configured to indicate support / no support for Expect: 100-continue.

In particular, there is no way for a new version of HTTP to give a user-agent a way to force the server to support early decisions. The servers exist already, and they are as they are.

> What I am a bit unsure about is how things like "transfer-encoding: chunked" or the various "connection" headers are supposed to be handled in servers/clients/libraries that support both HTTP/1.1 and HTTP/2. Is this supposed to be set automatically at the lower level, depending on the actual protocol, with the application-level HTTP client or server handler just seeing the body as a stream? I'm not sure whether that happens in common libraries today, or whether they would forward these headers when a client is connected via HTTP/1.1 but not when it's connected via HTTP/2.

A section of RFC 7540 [0] covers most of what you ask, except that it doesn't explain how chunked transfer-encoding maps to HTTP/2; section 8.1.3 [1] shows an example of mapping chunked transfer-encoding onto HTTP/2 data frames.

Basically, in HTTP/2 all transfers are necessarily "chunked" into DATA frames [2] (necessarily because of the multiplexing of streams!), and Content-Length, when present, merely indicates what all the chunks must add up to.

  [0] https://tools.ietf.org/html/rfc7540#section-
  [1] https://tools.ietf.org/html/rfc7540#section-8.1.3
  [2] https://tools.ietf.org/html/rfc7540#section-6.1
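
To make the "everything is chunked" point concrete, here is a rough Python sketch of the DATA frame layout from RFC 7540 section 6.1: a 9-byte frame header (24-bit length, 8-bit type, 8-bit flags, reserved bit plus 31-bit stream identifier) followed by the payload. Padding and flow control are omitted; this is not a full implementation:

```python
import struct

# Frame type and flag values from RFC 7540
TYPE_DATA = 0x0
FLAG_END_STREAM = 0x1

def h2_data_frame(stream_id, chunk, end_stream=False):
    """Serialize one HTTP/2 DATA frame: 9-byte header + payload.
    The last frame of a request/response body sets END_STREAM."""
    flags = FLAG_END_STREAM if end_stream else 0x0
    header = struct.pack(">I", len(chunk))[1:]           # 24-bit length
    header += bytes([TYPE_DATA, flags])                  # type, flags
    header += struct.pack(">I", stream_id & 0x7FFFFFFF)  # R bit = 0
    return header + chunk

# A body is sent as any number of DATA frames on one stream;
# frames from different streams can be interleaved on the connection.
frames = [h2_data_frame(1, b"hello "),
          h2_data_frame(1, b"world", end_stream=True)]
```

So there is no "transfer-encoding: chunked" header to map; the framing itself plays that role.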

I am aware how bodies are streamed in HTTP/2 (I have written an implementation). What I am not sure about is how libraries on the client and server side are expected to handle this kind of information. A lot of servers and client libraries might display "transfer-encoding: chunked" to their user, even though that header wouldn't be seen if the connection were established via HTTP/2. It's more of a question of "how to hide those protocol differences from the user while still allowing them to set and access all information if required". And the question whether the HTTP ecosystem already embraces that, or whether most libraries still expose all the HTTP/1.1-only ways to the user.

> What I am a bit unsure about is how things like "transfer-encoding: chunked" or the various "connection" headers are supposed to be handled in servers/clients/libraries that support both HTTP/1.1 and HTTP/2

As you suspected these headers are 'special' and don't exist in HTTP/2.

It's annoying, but the "check and then attempt" pattern just doesn't cut it. There are going to be unexpected issues, so you end up always having to handle exceptions anyway.

You always have to handle exceptions, but if you can insist on early 3xx/4xx (at least 401) responses, then you can avoid caching any part of the request body. For example, consider something like:

  $ huge_fast_source | curl --negotiate -X POST ...
Here curl might respond to 401s by authenticating to the server, and if it can know that a 401 is coming, then it can simply not read from stdin (thus flow-controlling the huge, fast source) until it has seen 100-continue, which might require a second request. This is very nice.

The fact that it can't be had reliably is rather annoying.

I see your hypothetical, but I can't say it's something I've missed.

If reads are not repeatable or can't be lost, then you really should be caching them.

If you can reset your stream, then just start the firehose. Presumably things like a 401 are rare so the waste is minimal.

I could see a use case for this when network IO is prohibitively expensive and needs to be minimized. You're right that there isn't a great solution.

Great article about the HTTP history!

Only one thing I would change: the article mentions the term "syntax" very often, but I don't think it is the best choice. When I read "syntax", I think about something human-readable, e.g. programming language syntax. What is really meant is "wire-level encoding" or "serialization format". I think [wire-level] encoding might be the most commonly used term for protocols.

Which is also syntax. It’s the correct word choice.

This is not my understanding of the word "syntax". Indeed, if you read https://en.wikipedia.org/wiki/TCP/IP_Illustrated you'll not see the word "syntax" used to describe the protocol layout. So, basically I agree that the word is misused in the post. I think "layout" might work better.

It's one of those words whose meanings in the industry vary a fair bit contextually.

So, for example, ASN.1 defines syntax for expressing schemas (ITU-T x.680 and others in the x.68x series), and also encoding rules (x.690 and others in the x.69x series).

XDR is one standard for both, syntax and encoding.

In HTTP land syntax is used to refer to encoding. In HTTP/1.x, because the protocol is textual and specified in ABNF, syntax can refer to both things equally well. In HTTP/{2,3} this is no longer the case but it doesn't matter, we continue to use "syntax" to refer to the encoding.

While it may be correct as per the definition of the word, its usage in that manner is (as far as I can tell) extremely rare if not nonexistent in the industry (embedded systems, computer engineering, microcontroller work, etc.). Therefore, even if it is correct in the literal sense, its usage this way introduces so much confusion (due to it meaning something else in those fields) that it's effectively incorrect.

Quite the contrary, inside the field of computer science it is exactly the correct term used. Practitioners may use different words, but theoreticians describe protocols as instantiations of a language, whose encoding is defined by syntax. If you work on the formal verification of protocols, for example, as many people developing a spec would be, you would speak in terms of the properties and tradeoffs of various syntax.

Tangentially related: like HTTP, WebSockets currently operate over TCP.

Are there plans to port them to QUIC as well? It's been many years since my networking class, but I'd assume WebSockets would stand to gain quite a bit as well.

Websockets (the protocol) are mostly defunct in comparison to HTTP/2 and QUIC. Porting websocket based code should mean converting it to HTTP{2,3}. The main reason that's not possible today is due to large existing library support for websockets on both the server side and directly in browsers.

I disagree. Websockets sweet-spot is still the "server wants to send messages to the client" usecase. This does not work (as well) with http/2. Dunno about /3, but it's really a fundamentally different model than the reques/reply of http.

Websockets are largely obsolete because nowadays, if you create an HTTP/2 fetch request, any existing keep-alive connection to that server is reused, so the overhead that pre-HTTP/2 XHR connections needed is gone, and with it the advantage of Websockets.

Honestly curious: how would you use HTTP{2,3} to push a real-time message from server to client without a WebSocket? Are we back to just long polling?

Exactly. I think most people didn't use Websockets due to the theoretical performance improvement over HTTP 1.1, but because it's a different programming model.

I think it's very shortsighted to assume people should (or want) to roll their own streaming protocol when Websocket is already a great fit for many applications.

You can use server-sent events or do some custom streaming format inside the response body. The latter might not work well, due to the fact that arbitrary streaming APIs might not yet be exposed by all browsers (ReadableStream on fetch API: https://caniuse.com/#feat=streams).
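
For the server-sent events option, the wire format (`text/event-stream`) is trivially simple. A minimal Python sketch of formatting one SSE message (helper name is mine, not any library's API):

```python
def sse_event(data, event=None, event_id=None):
    """Format one Server-Sent Events message. Each field is a
    `name: value` line; a blank line terminates the event. Multi-line
    data becomes multiple `data:` lines, rejoined by the browser."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    if event_id:
        lines.append(f"id: {event_id}")
    lines += [f"data: {line}" for line in (data.splitlines() or [""])]
    return "\n".join(lines) + "\n\n"
```

The server just keeps the response open and writes one such chunk per message; the browser consumes it via `EventSource`.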

I thought all XHRs support streaming, and they should run over HTTP/{2,3}. Is that not right?

Afaik XHR allows you to access received data in a streaming fashion, but the XHR type will always buffer the full received payload internally. So if you try to stream a large amount of data over it you will eventually run out of memory. If you want to consume a potentially endless stream with custom encoding you need the fetch API with streaming support for that (https://developer.mozilla.org/en-US/docs/Web/API/Streams_API...), as far as I understood. Or websockets.

That is one reason that e.g. gRPC streaming support doesn't work trivially in browsers.

gRPC is already HTTP/2.0 and has bidirectional streams. I imagine it would work like that.

Browsers are almost at the point where they can just talk gRPC to the backend, but gRPC uses HTTP trailers which browsers do not support.

I don't think, from a networking standpoint, that websockets was really necessary. A browser and a web server already have a socket open that they send messages in either direction on. It's just so much of the web was designed around "client asks for something, server sends the result and closes the connection" that some transition period was necessary to distinguish between the two cases. (You didn't necessarily provision your web server for maintaining hundreds of thousands of open connections.) But now that people are used to the idea of connections sticking around in the context of HTTP, it's no big deal to send arbitrary data over that connection.

But can the client in HTTP/2 send more data to the server after it has received something, without having to send a full HTTP request? If not, I don't see how something like agar.io, which requires constant updates to the server on every mouse movement, can work without tremendous header overhead.

Not really; gRPC is RPC and WebSocket is not. I looked into this for a project and neither HTTP/2 nor gRPC does the same thing: a fully bidirectional, long-lived stream with no weird semantics or hacks.

There is still a place for WebSockets in HTTP/2. The fairly recent RFC 8441 defines a mechanism to reserve a single HTTP/2 stream for bidirectional WebSockets.

It is feasible for a similar mechanism to also work for HTTP/3; no one has gone to the effort yet.


Thanks, this is basically what I was looking for.

With QUIC I'm guessing you could have multiple WebSocket and HTTP/3 streams within one connection and have them not block each other on packet loss, which is pretty neat.

It's true that HTTP/2 could eventually replace lots of websocket use-cases, and integrate better with other HTTP code (e.g. frameworks, authentication, etc).

However, if you currently need streaming and message-ordering guarantees in both directions, you are still limited to websockets. HTTP might become more useful once body streaming APIs in both directions are available in browsers.

A thing I always notice about new tech is a failure to simply and concisely explain what it is. I had to click through a few links and I still couldn’t find a one-paragraph summary of what QUIC was. If it doesn’t have a large number of influencers’ mind share it won’t move quickly (don’t mean that absolutely)

The second paragraph opens up with a “what is quic” and links to another post. Perhaps give that a read?

You can read about IETF QUIC here https://datatracker.ietf.org/doc/draft-ietf-quic-transport/

HTTP over QUIC is now HTTP/3. "HTTP/3 is just a new HTTP syntax that works on IETF QUIC, a UDP-based multiplexed and secure transport"

QUIC is likely to be implemented first by large internet service providers (Google (who made it), Amazon, Cloudflare, etc.) and then slowly come to all the various open source projects and various software installed on every computer.

Just having the major internet services and browsers implementing it already makes a sizable difference.

A crude TL;DR is that QUIC is just TCP over UDP.

That’s all I wanted lol thanks

I'm curious why the upgrade mechanism to tell the server you understand and prefer HTTP/3 is different from the one used to upgrade to a WebSocket connection. It doesn't seem to use the "Connection: Upgrade" and "Upgrade: XXX" headers but rather "Alt-Svc"...

I'm a total newbie on the topic, so I'm curious about this.

Because you aren't just switching the 'format on the wire', you are switching from TCP to UDP, which requires a new IP connection.
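
Concretely, the server keeps answering the current request over TCP and merely advertises via Alt-Svc (RFC 7838) that the same origin is also reachable over QUIC; the client can then try the new transport on its own. A rough Python sketch of building that header value (the defaults here are illustrative, not mandated):

```python
def alt_svc_header(protocol="h3", port=443, max_age=86400):
    """Build an Alt-Svc header value (RFC 7838). `protocol` is the
    ALPN identifier of the alternative, `port` the alternative
    endpoint's port, and `ma` how long (seconds) the client may
    cache the advertisement."""
    return f'{protocol}=":{port}"; ma={max_age}'

# Sent in a normal response, e.g.:  Alt-Svc: h3=":443"; ma=86400
```

An Upgrade-style in-band switch can't work here because the existing TCP connection can't "become" a UDP one; Alt-Svc only tells the client where to open a new connection.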

The proliferation of future protocols that will never fully be adopted is simply creating a fractious and overly complicated mess.


IPv6 migration. This is the best case, given that it was a necessity to combat the exhaustion of the IPv4 address pool. Even so, only ~25% client adoption after 10 years. And worse support from websites in the Alexa top 1,000,000 (https://www.internetsociety.org/resources/2018/state-of-ipv6...)

HTTP/2. Roughly 35% adoption (by site) in the Alexa top 1,000,000. https://http2.netray.io/stats.html

HTTP/3. Obviously less.

Now I'm not trying to make this point from the position of the grumpy old man who doesn't want to learn something new. I love new tech, and obviously the research and development of new protocols over the past 30'ish years has created an explosion of new industry and communication patterns that have indisputably changed the world and shaped modern life.

However... the marginal benefit of each protocol advancement is directly related to its usage (so I assume). The benefit the world saw (from the user perspective) when switching from HTTP/1.1 to HTTP/2.0 was arguably zero (I'm not saying it was exactly "zero", but I could argue that it was imperceptible). With bridge technologies like WebSockets and BOSH/Comet/"hanging POSTs" it could be argued that HTTP/2 was effectively a bandwidth optimization in most users' lives (those users that use the sites that offer support).

However, the additional complexity for developers to leverage these optimizations is detracting from the development of otherwise useful service offerings in the form of opportunity cost. This is because nearly every optimization inherently incurs a complexity cost, e.g. supporting both HTTP/1.1 and HTTP/2 is obviously more complex than simply supporting the former. And you are supporting it even if your deployment process abstracts the details away from you, e.g. debugging customer issues.

Formulating HTTP/3 when support for HTTP/2 is below 30% is simply an example of standards bodies working on a tiny "easy" problem (make the web go faster... which is important) rather than facing the bigger problem, i.e. giving the world's service providers a seamless migration path to adopt said new technology.

/rant from a grumpy old man

I think adoption could be improved so much if the standards committee released a JavaScript and C library for whatever they end up deciding. The JavaScript implementation should be heavily commented, super easy to understand and basically be the documentation of the standard, and the C library (or libraries) should be performance-tuned tiny modules that can be called upon by other languages. You may ask: why JavaScript? Well the reasons are too many to enumerate and immediately 50 people would come here to disagree and say that they prefer their own language, but basically it boils down to there being no fancy features to get in the way of understanding an algorithm. The reasons for C are almost the same: it's easy to understand, but with the added advantage of being interoperable with just about anything on the planet and fast.

Even for http2 after many many years and lots of searching, you basically have to use nghttp2 as the only usable http2 C library and its source code is mixed together with various c++ components. Isn't it always like this? I mean don't these standards committees have to write the algorithms out anyway, why not make a usable module out of it in source code form? I suspect it's because in reality the standard gets sponsored by Google whose engineers have already written c++ implementations that are tied to dozens of their own libraries and subcomponents and integrated into the chrome build system etc etc. Well that still doesn't change the fact that it would be better for everyone else if they released a simple to understand implementation in js and C.

I'm not sure what you're talking about. Supporting HTTP/2.0 for my site was literally just adding the "http2" word to the listen directive of nginx. Since it's also used as a reverse proxy, the backend app server remained HTTP/1.1.
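
For reference, that change is just the one token in the listen directive (the server name and backend address here are placeholders):

```nginx
server {
    listen 443 ssl http2;       # adding "http2" is the whole change
    server_name example.com;    # placeholder

    location / {
        # backend app server can stay HTTP/1.1 behind the proxy
        proxy_pass http://127.0.0.1:8080;
    }
}
```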

And you still notice the difference in site loading speed if your site has a lot of static assets.

How much traffic does your site serve? I'm not saying that turning the feature on is difficult. Anyone reading hacker news can go and configure their webserver to start serving this traffic....

but I can tell you that a site serving serious production loads (let's just say 100 qps)... nobody is just "flipping a config" and sending the protocol out the door.

HTTP/2 only just seems to be gaining traction. Isn't even a default on NGINX (hidden behind a server config option) and now we're moving to HTTP/3?

I think what makes HTTP/3 really desirable and convincing to migrate to is its improvements for mobile networking, which is prevalent nowadays.

If you're interested on working on HTTP/3, QUIC, and the author of this post, we're hiring a Product Manager for this team in our London office: https://boards.greenhouse.io/cloudflare/jobs/1251515?gh_jid=....

Did anybody look into how this is going to work with proxies? It seems fairly naive to standardize this without having a solution for that.

QUIC connections are always encrypted and authenticated, so they should work just fine with a CONNECT-style proxy that supports UDP. (I don't know for sure; I'm just reasoning from first principles.)

EDIT: after thinking a few minutes, I'm not sure a "UDP proxy" is a meaningful concept except in very limited circumstances such as forwarding only to a single destination, because there's no connection to establish that a proxy server could know anything about. Maybe proxy servers don't have a place in a QUIC world.

Do you mean forward proxies? The HTTP/3 draft currently specifies how the CONNECT method is used. At present, it only allows you to create an onward TCP connection. So you have a QUIC connection between client and proxy, and a TCP connection between proxy and server. Like so

        QUIC         TCP
client <----> proxy <----> server

Unfortunately, this means that an HTTP/3 client has to tunnel HTTP/1.1 or HTTP/2 to the target server. I realised this was a shortcoming last year and prepared an I-D called HiNT that explores the problem space and some candidate solutions:


This was presented, along with some nicer tunnel diagrams, at the IETF 102 meeting. Slides in pdf: https://github.com/httpwg/wg-materials/blob/gh-pages/ietf102...

Could be client HTTP/3 --> proxy HTTP/1 --> backend

gcloud load balancer docs: https://cloud.google.com/load-balancing/docs/https/#QUIC

TL;DR: The new QUIC standard being produced changes the way in which certain HTTP concepts are encoded; it's not just an added layer upon which existing HTTP/2 bits can be written. Because the format changes, the protocol number must be bumped as well.

Does CloudFlare support HTTP/2 back to the Origin yet? Or are they still afraid this will destroy the selling point of their Railgun feature?


Clever way of avoiding the question.

How do you see argo tunnel fit in this scope? It obviously has other features (cert management, internal routing) - but you'd have to keep competing against http standards with your protocol, no?

Argo Tunnel doesn't 'compete against web standards'. I'm really surprised that people are somehow claiming that Cloudflare is against the open web in some way when we've done so much to progress it (e.g. through early support for SPDY and HTTP/2 and now with QUIC including an open source QUIC implementation in Rust).

Argo Tunnel does a specific thing: make a connection from a client's server to Cloudflare (rather than the other way around). This is helpful from a scaling perspective (why have a load balancer when we can do it?) and a security perspective (no open ports). If there was a standard that did the same thing as Argo Tunnel we'd adopt it.

I didn't mean to imply that it is competing with open web standards; I apologize if it came out that way. I was trying to make the case that, for instance, performance benefits in HTTP/3 wouldn't apply to the connection that Argo maintains to origin, since it is a different protocol, and to ask what your thoughts were on this.

Ah. That makes sense. Thanks for the clarification.

We actually use standard protocols for the Argo Tunnel connection and we'd look at HTTP/3 as a possible mechanism for it. I agree that if there's a persistent connection the benefits of QUIC (and HTTP/2) are lessened compared to a browser doing a drive by on a web site. This is one reason the OP above is wrong about Cloudflare, HTTP/2 and Railgun.

Yes, the railgun comment was why I wanted to elaborate on the subject. Thanks for your insight.

This is bullshit. Entrenching TLS in web standards is just handing the keys to the internet over to malicious government entities.

We need security without TLS.

Ok, what do you want to replace it with? You can't complain about the existing system without offering an alternative. Saying the existing system sucks doesn't help move the Internet towards something better.

>You can't complain about the existing system without offering an alternative.

Yes, you can, and yes, you should. Or are you saying you shouldn't file bug reports unless you can provide a patch?

Get real.

You could embed public keys in DNS records for one. But I'll admit I was hoping someone who understands security better than I do would jump in.

All I know is that centralized certificate authorities act as choke-points malicious governments can use to censor information.

What has that to do with TLS? TLS doesn't specify where you get information which certs to trust from (indeed your "put keys in DNS" proposal exists as a standard, to be used with TLS, but used by nobody for HTTPS because DNS is either untrusted or secured by even more centralized structures)

And protocols are typically stacked cleanly enough that it'd not be a big problem to replace TLS if another standard comes along to replace it.

That exists, and is called DNS-Based Authentication of Named Entities (DANE) [0], and there are proposals for making use of that in TLS. There is no need to start over for transport security. TLS 1.3 is very well designed and will eventually support DANE.

  [0] https://tools.ietf.org/html/rfc6698

The "DANE" proposal has been around for over a decade, and was actually implemented (probably twice, but at least in Chrome) and then rescinded, because it in practice doesn't work. But even if it did work, it would still represent a further centralization of trust on the Internet, this time with the central authorities that matter being effectively controlled by governments. DANE is a dead letter.

[ For the record, not an invitation to a futile further debate, given your long-standing immutably-held views on DNSSEC. Since you'll probably say the same about my work to get DANE for SMTP up and running, we can stop here. All I can add is that those who say it can't be done should not get in the way of the people doing it... :-) ]

No, the major providers did not get together to do MTA-STS because DANE was bad. They did it because their existing DNS geo-balancing kit for e.g. google.com and yahoo.com does not offer an easy upgrade path to DNSSEC. Note that Microsoft has a dedicated domain (outlook.com) for email hosting, and can more easily do DNSSEC there without impacting their other "web properties". Note also that Google now MX-hosts many customer domains on "googlemail.com" rather than google.com.

So things are starting to change. Furthermore, there are now over 1 million DANE-enabled DNSSEC domains. MTA-STS is far behind, is not downgrade-resistant on first contact and uses weak CA-leap-of-faith DV authentication. It will probably be enabled at the biggest providers by the end of this year, but as you yourself said elsewhere, these providers are the threat, and if so, securing email delivery to the user surveillance empires is not necessarily that important. Mind you, they can play a useful role by enabling validation and helping to keep the TLSA records of receiving systems valid, and perhaps surveillance is not their business for paying customers...

The IETF draft itself says that the primary motivation for MTA-STS is to provide transport security when DNSSEC is "undesirable or impractical". You don't have to take my word for it.

There are a million DANE-enabled DNSSEC domains because there are registrars, particularly in Europe, that enable it automatically. Who cares? First of all, DNSSEC managed by your registrar is security theater, but, more importantly, the overwhelming majority of those domains do not matter. Who cares if some landing page in the Netherlands has TLSA records?

Meanwhile, the domains that really do matter --- the ones managed by the major mail providers --- are doing MTA-STS.

SMTP is not a success case for DNSSEC.

gmx.de, comcast.net, freenet.de, mailbox.org, posteo.de and tutanota.de come to mind as counter-examples as well as various universities, the German parliament, various Dutch government domains, and a couple of thousand self-hosted SOHO domains. But DNSSEC is axiomatically evil, so I must be wrong...

Versus Google Mail, Yahoo, and Microsoft? And Comcast is also an author of MTA-STS. So yeah, I'm pretty comfortable arguing that the verdict is in on this.

Recall: the argument you're responding to (you drew it up from downthread) is that DANE "definitely works for SMTP". Does it, now?

Yes, DANE works for SMTP. Let's talk again in 2020. Ciao...

Oh, I forgot; I made a helpful infographic for this last week:


Yawn, would you also like to arm wrestle? I concede that smug superiority gets more karma points than doing the hard work to make a difference.

Yes, only ~9 to 10 million domains are presently signed, and most of the larger ones are not (but comcast.net and cloudflare.com are not tiny, and gmx.de has over 10 million email users). Changing this takes time and effort. Users still need better software tools that make deployment easier and there needs to be less KSK deployment and rollover friction at the registrars and registries (i.e. CDS support). Some DNS hosting providers with outdated software need to upgrade their stacks, ... this does not happen overnight. Let's compare notes in 2020 or 2021. Infrastructure upgrades happen slowly...

Cloudflare sells DNSSEC services. Comcast is actively participating in standards that moot the most (or second most) important modern application of DNSSEC. You keep talking about 9-10 million domains "presently signed", but, as I keep pointing out, those were signed at registrars and are overwhelmingly irrelevant zones. The point of the infographic is that zones people actually care about --- not coincidentally, zones with giant, smart security teams --- resolutely avoid DNSSEC.

> Let's compare notes in 2020 or 2021. Infrastructure upgrades happen slowly...

I'll stop now and we'll both find something actually productive to do.

They sure do! This one has taken (checks notes) 24 years.

DANE definitely works for SMTP. Chrome's efforts at stapling DANE failed, but I think for the wrong reasons.

No. DANE did such a bad job for SMTP that the major email providers banded together and created an alternate, non-DNSSEC-dependent standard, MTA-STS, to work around it.

I'm sorry, but that's nonsense. MTA-STS is about sites that are having trouble deploying DNSSEC. DNSSEC != DANE. DANE depends on DNSSEC.

No, sorry, the MTA-STS draft itself is clear about this distinction. Also, your argument is circular.

DANE definitely does not "work" for SMTP.

The necessity for DNSSEC in SMTP is, I think, a desperate trope† that recognizes that the Web PKI has moved past considering DANE and begun investing in direct hardening for the X.509 system. SMTP is the other mainstream protocol for which transport security couldn't be guaranteed; it is mired in the late 1990s technologically and doesn't inherit modern browser- and server-based protection. So: SMTP! SMTP is the reason we need DNSSEC! We must get DNSSEC deployed immediately so everyone can have secure SMTP!

Except, you know who doesn't agree with you? The people who run the most important email services. Hence: MTA-STS --- a standard whose introduction spells out its raison d'être: to avoid DANE! --- and the mooting of that last fragile argument for deploying DNSSEC.

DANE is a dead letter.

I'm choosing words carefully.

Certificate authorities are bad, but fixes are possible.

First of all, we have Certificate Transparency. This allows people to notice CAs behaving badly.

Secondly, we need certs to be signed by many different CAs. Note that this is just my idea, and I know of no intentions to standardize it. Currently, rescinding trust in a CA is a difficult and slow process, because rescinding a CA breaks a lot of infrastructure. If instead certs were signed by multiple CAs, we could just drop a CA, and most infrastructure would still have valid signatures from other CAs.

Moreover, if I decided I didn't trust the Hong Kong Post Office as a CA, I could drop them from my CA store without breaking the web. Perhaps I could even mark them as 'suspicious' and get notified when a cert is only signed by suspicious CAs.

By combining the two, when a CA is compromised, Certificate Transparency should tell us about it within about a week, after which trust in that CA can be dropped almost immediately.

CT alone isn't enough. Consider what would happen if CT showed Let's Encrypt or DigiCert was compromised. We'd be forced to drop trust slowly, to allow many sites time to migrate away. Without the ability to drop a CA "like it's hot", Certificate Transparency is toothless. It defends against CAs acting selfishly, because getting caught by CT is bad for a CA. However, CT does not defend against coercion or compromise of CAs by third parties. Those third parties don't care whether the CA gets damaged. Under the current system they can get at least a month of signed certs for whatever domain they want.
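The multi-signature idea above can be sketched in a few lines. This is a toy illustration of the proposed policy, not any real PKI implementation; all CA names and the `threshold` parameter are hypothetical:

```python
# Toy sketch of the "certs signed by multiple CAs" proposal: a cert is
# accepted only if at least `threshold` of its signing CAs are trusted
# and not flagged as suspicious. CA names here are made up.

TRUSTED = {"ca-alpha", "ca-beta", "ca-gamma"}   # current trust store
SUSPICIOUS = {"ca-gamma"}                        # trusted, but flagged

def accept_cert(signing_cas, threshold=2):
    """Return True if enough trusted, non-suspicious CAs signed the cert."""
    good = [ca for ca in signing_cas if ca in TRUSTED and ca not in SUSPICIOUS]
    return len(good) >= threshold

# A cert signed by two clean CAs survives even after ca-gamma is flagged:
print(accept_cert({"ca-alpha", "ca-beta", "ca-gamma"}))  # True
# A cert that leans on the flagged CA for its second signature does not:
print(accept_cert({"ca-gamma", "ca-alpha"}))             # False
```

The point of the sketch: dropping or flagging one CA no longer breaks certs that carry independent signatures, which is what would let trust be rescinded quickly.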

DNS is even more centralized and tied to governments than CAs are.

The WebPKI is easy to compromise because there are so many CAs and NameConstraints cannot be reliably used so the whole thing is not hierarchical. Certificate Transparency (CT) provides the ability to detect some CA abuses, but is not a magic bullet.

DNSSEC is absolutely hierarchical and has strong name constraints. Today it has a single root run by ICANN, and run in such a way that hopefully the NSA and friends do not have access to the root key. Nothing stops other countries from legislating national alternative root zones, and nothing stops individuals from maintaining their own alternate root zones. We can expect eventually to see national alternate root zones, and we'll know then that those states that impose them are almost certainly interested in being able to MITM. The ability to perform recursive lookups locally and use a root of one's choice will make MITMing more difficult.

DNSSEC, like the WebPKI, is only as strong as the authentication of communications between domain owners/admins and CAs/registrars. With Let's Encrypt, the WebPKI is quickly becoming equally reliant on the security of communications between domain owners/admins and registrars anyway. There are efforts to improve that, and reason to believe they'll succeed.

So at least as to non-nation-state actors, DANE is already more secure than the WebPKI.

QName minimization, and DNS privacy extensions (I wish the IETF would standardize DJB's solution for privacy), will make it very difficult for nation states with access to DNSSEC root zone keys to mount targeted MITM attacks. Non-targeted MITM attacks will be easier to notice than targeted ones.

In short, there's no silver bullet. The introduction problem is simply a very difficult one. But DANE looks to have better properties than WebPKI.

> NameConstraints cannot be reliably used

I wish that TLS 1.3 had mandated support for this going forward. Not just for public CAs, but also, I'd love to say "here is my company's internal CA root, only trust it for *.thecompany.example domains".
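For what it's worth, RFC 5280 name constraints can already be expressed when minting a CA certificate; the sticking point has been inconsistent client enforcement, which is what "cannot be reliably used" refers to. As a rough sketch (the section name and domain are hypothetical), an OpenSSL extension section for an internal root constrained to one domain subtree might look like:

```ini
# Hypothetical openssl.cnf extension section for a constrained internal CA.
# The nameConstraints extension limits which DNS names any cert chaining
# up to this root may cover (the domain itself plus its subdomains).
[ v3_constrained_ca ]
basicConstraints = critical, CA:TRUE
keyUsage         = critical, keyCertSign, cRLSign
nameConstraints  = critical, permitted;DNS:thecompany.example
```

Such a root could be self-signed with something like `openssl req -x509 -new -key ca.key -out ca.crt -config openssl.cnf -extensions v3_constrained_ca`. Whether relying parties honor the constraint is the real issue: historically some TLS stacks ignored name constraints entirely or mishandled certificates that matched on the CN rather than a SAN.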

And DNS can be easily poisoned. China's GFW does DNS poisoning as the first step of blocking websites.

> Entrenching TLS in web standards is just handing the keys to the internet over to malicious government entities.

As opposed to cleartext, which hands your content over to everyone.

Also, certificate transparency means that it's going to be impossible for someone to MITM a TLS connection by forging a valid CA certificate for the domain without giant warning bells going off everywhere.
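The mechanism behind those warning bells is the log's Merkle tree (RFC 6962): a log can hand out a short audit path proving a certificate is included, and anyone can recompute the signed tree head from it. Below is a minimal sketch with a hand-built four-leaf tree and the standard 0x00/0x01 domain-separation prefixes; it assumes a perfect binary tree for simplicity, whereas real logs also handle unbalanced trees:

```python
# Minimal sketch of RFC 6962-style Merkle inclusion verification.
# Toy data only; a real CT client fetches the audit path and signed
# tree head from a log server.
import hashlib

def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()      # 0x00 prefix: leaf

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()  # 0x01: interior

def verify_inclusion(leaf: bytes, index: int, path: list, root: bytes) -> bool:
    """Recompute the tree root from a leaf and its audit path."""
    h = leaf_hash(leaf)
    for sibling in path:
        if index % 2:                  # we are a right child
            h = node_hash(sibling, h)
        else:                          # we are a left child
            h = node_hash(h, sibling)
        index //= 2
    return h == root

# Build a 4-leaf tree by hand and check leaf 2's audit path:
leaves = [leaf_hash(bytes([i])) for i in range(4)]
n01 = node_hash(leaves[0], leaves[1])
n23 = node_hash(leaves[2], leaves[3])
root = node_hash(n01, n23)
print(verify_inclusion(bytes([2]), 2, [leaves[3], n01], root))  # True
```

Because every issued certificate must appear in such a tree to be accepted by the browser, a forged cert either shows up in public logs (and gets noticed) or fails validation.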

Could you expand your thesis a bit?

I think maxk42 is worried about the TLS certificate root store, which contains many governments. Just look here [1] and search for "government"; you'll find various governments mentioned directly. Others are present indirectly, like the German government through the D-TRUST certificates operated by the 100% federally-owned company Bundesdruckerei.

In the days before things like Certificate Transparency and HSTS preload lists, this was very worrisome, because any of these governments could freely impersonate any website they wanted in order to decrypt access to it. TLS is very vulnerable to this. SSH still has the TOFU model, which is less vulnerable, and TLS+TOFU was possible with HPKP, but Chrome removed it. I can understand the reasons for the removal, but it still had negative consequences. However, Google/Chrome is a strong proponent of Certificate Transparency, and many high-value domains are in the HSTS preload list with hardcoded CAs, so we are in a better situation than just 5 years ago, and I think it is going to improve going forward.

So no reason to discard TLS entirely with some kind of nirvana fallacy, but of course, being aware of the problems is always a good thing.

[1]: https://ccadb-public.secure.force.com/mozilla/IncludedCACert...

Mozilla trusts eight "government" CA roots, associated with 5 countries in this list, China (Hong Kong), Spain, Taiwan, The Netherlands, and Turkey.

The Turkish CA is constrained by Mozilla (and so in Firefox, but may not be constrained in your downstream software that wasn't written by Mozilla and just uses their trust lists wholesale) to just Turkish official sites, government, education, that sort of thing. The others are not subject to any notable constraints for a global CA.

If you don't run Firefox or a Free Unix you probably trust a lot more government CAs. For example Microsoft's list includes 35 root CAs from Australia to Uruguay. It's hard to do a cursory up-to-the-minute examination of Apple's status because they provide a summary of information from the certificate itself which will often have been written by some nerd twenty years ago, leading to descriptions like "Government Root Certification Authority" and "CA Root" (those are both real examples from Apple's actual list) which don't tell a reader anything of value.

I don't regard it as morally useful to count a company owned by the government as "the government". Consider the film production company Film 4. It is wholly owned by Channel 4, which is in turn wholly owned by the British government. So we could say the British government made "Four Weddings and a Funeral". OK, delightful English romantic comedy, sure, maybe drives tourism. "Trainspotting". Maybe it's a warning about how dangerous heroin is? "A Field in England" - um, now we're struggling; I guess it is set in Britain at least. But wait, "The Motorcycle Diaries" is about the early life of Che Guevara, and "12 Years a Slave" is about a guy who was enslaved, in America. No, Film 4 is just a film production company. It happens to be owned by the British government, but that's even less relevant than which billionaire owns your favourite hockey team.

Thanks. That was helpful. With Let's Encrypt it feels like this problem is not the problem it used to be, because most certificates are now self-asserted.

While Let's Encrypt is great, one more CA in the root store isn't a direct improvement of the situation. It's important to realize that any single CA in the root store can, in theory, without asking anybody or doing anything else, create fake TLS certificates for any website they want and then use those certificates to perform man-in-the-middle attacks.

You don't need to specifically trust the CA that signs your website's certificate: they don't get your private key or anything, only the public key, which everyone gets anyway. Instead, you need to trust every single CA in the root store.

That's at least the idea without Certificate Transparency (CT). Chrome's current SCT policy, for example, increases [1, 2] the number of entities that must vouch for a certificate before it is considered valid, and thus any attack needs the cooperation of multiple such entities as well, making attacks harder to mount.

[1]: https://www.entrustdatacard.com/blog/2018/february/chrome-re...

[2]: https://fpki.idmanagement.gov/announcements/chromect/

Again thanks. Most clear.

I think his point is about censorship and mostly about the US, where it's done covertly and there is a lot of influence over organizations that provide tech infrastructure. The government can therefore force those organizations to "voluntarily" participate in censorship programs or deny the service to someone effectively censoring it for almost everyone. Google, Facebook, Twitter already participate in such programs, but of course avoid calling them government censorship.
