
Update Regarding ACME TLS-SNI and Shared Hosting Infrastructure - okket
https://community.letsencrypt.org/t/2018-01-11-update-regarding-acme-tls-sni-and-shared-hosting-infrastructure/50188
======
mdhardeman
It looks like Let's Encrypt has come up with the best plan possible for the
goal of balancing mitigation of the security risks with compatibility for
users.

If their plan works out as they've laid out, and as certain recent commits in
the boulder source code would suggest...

It would seem they intend to:

1\. Allow renewal of already-issued certificates to same account holder to
revalidate via TLS-SNI-01 for some limited time period.

2\. Whitelist certain shared infrastructure providers who have lots of LE
certs in force and who have demonstrated that they are not vulnerable. I'm
betting especially for those that manage the whole certificate retrieval and
deployment process. It's not clear if this is temporary but longer than #1 or
"temporary" but very long term or "temporary" until we come up with another
inband process.

3\. Otherwise TLS-SNI-01 is gone.

Meanwhile, they'll work with the ACME WG to see if they can't figure out a
better TLS-SNI method which would not be vulnerable. That looks less and less
workable absent some special TLS extensions or ALPN and server support for
those.

For those who have options, it's worth pointing out that the purely DNS based
validation methods are literally closest to the facts that the CA's validation
process wishes to prove. Domain Control validated means that you're showing
effective control of the domain. Nothing says I control the domain like the
ability to add and remove data from the authoritative DNS servers for the
domain label (and children thereof) in question.

~~~
will_hughes
> Nothing says I control the domain like the ability to add and remove data
> from the authoritative DNS servers for the domain label (and children
> thereof) in question.

Unfortunately this makes life difficult for us.

We run a whitelabelled platform with tens of thousands of users - it's hard
enough to get many people to understand setting a CNAME record (vs just
setting an A record). Requiring them to update a TXT record _once_ would be a
big enough challenge, but doing it every 30-90 days is never going to happen.
The smaller customers would probably be happy with us hosting their DNS, but
we're not going to do that - the larger ones wouldn't.

TLS-SNI being gone and DNS being unworkable means we're left with HTTP only.

~~~
mdhardeman
But if you're a web host with 10k+ users, what's the problem with the HTTP-01
challenge?

You just allow .well-known/* to be passed on to reflect the challenge
responses you've generated for the client, while 301 redirecting everything
else to their https:// site.
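
A minimal sketch of that port-80 routing, assuming the host keeps a lookup table of the challenge responses it has generated. The function name and the `challenge_responses` table are hypothetical; the only protocol detail relied on is that http-01 fetches `/.well-known/acme-challenge/<token>`.

```python
# Hypothetical port-80 routing for a shared host; all names are illustrative.
ACME_PREFIX = "/.well-known/acme-challenge/"


def route_http_request(host, path, challenge_responses):
    """Return (status, payload) for a plain-HTTP request.

    challenge_responses maps (host, token) to the response body the
    platform generated for that pending http-01 challenge.
    """
    if path.startswith(ACME_PREFIX):
        token = path[len(ACME_PREFIX):]
        body = challenge_responses.get((host, token))
        if body is None:
            return 404, ""
        return 200, body
    # Everything else is redirected to the customer's HTTPS site.
    return 301, "https://" + host + path
```

Only the challenge path is ever answered over plain HTTP; every other request gets the redirect.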

I'm confused how that would be harder for a web host at that scale?

EDIT: I get people trying to run a server off their cable modem / rtr public
IP, and 80 might be taken by something other than the target the port forward
for 443 is going to -- and that's a problem for those use cases -- but that
kind of concern wouldn't exist in a significant hosting infrastructure.

~~~
will_hughes
> what's the problem with the HTTP-01 challenge?

Nothing, yet.

But who's to say that another similar bug won't be found in common
shared-hosting platforms that forces LE to turn that challenge off too?

~~~
mdhardeman
True.

Because almost all of the CAs utilize a web control mechanism, with many of
them probably having processes not as rigorous as HTTP-01, it is likely that
there would be significant backlash and a lengthier migration away from the
method for that case.

That said, anyone who can would be well advised to figure out how their DNS
based mechanism would work if it were ever needed.

As I and others have pointed out, there are clever and fully supported hacks
for validating dns-01 without dynamic control of the full domain zone. (CNAME
to another zone for the _acme-challenge labels, NS delegation to refer each
_acme-challenge label as an independent zone at a different NS, etc.)

------
niftich
The tls-sni challenges rest on the assumption that a hosting provider will
somehow ensure that a self-signed cert uploaded by the user contains
"truthful" information, even though the second half of the cert is blatantly
fake and is being abused to carry data. This is a bold assumption to make,
considering the entire point of CAs is to say that the information being
presented in a cert has been vetted, so why presume that anyone will vet any
claims in a self-signed cert?

That aside, Let's Encrypt had an exemplary response to this issue throughout,
and has made the right call here. A best-effort whitelist will enable a
smoother transition for those on some known-good hosts, while mitigating this
vulnerability and keeping their cert ecosystem uncompromised.

The work will now begin to add support to the other validation methods in ACME
software that's lacking them, and to engage with the other custodians of the
ACME protocol to rectify this particular flaw in design.

While the http-01 challenge is the recommended migration path, and likely the
easiest to automate with greenfield software, the dns-01 challenge is the one
with the fewest intermediate assumptions -- such as the ones made
when designing tls-sni-*, which in this case turned out to be faulty -- and
represents the one most likely to be futureproof. After all, what better way
to prove you own a domain itself than being able to add arbitrary records to
it that all nameservers then echo back?
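
For reference, a rough sketch of what a dns-01 client has to publish, as I understand the ACME draft: a TXT record at `_acme-challenge.<domain>` whose value is the unpadded base64url SHA-256 digest of the key authorization (`token`, a dot, then the account key thumbprint). The helper names are made up for illustration, and the inputs would normally come from the ACME server and the account key.

```python
import base64
import hashlib


def dns01_record_name(domain):
    """Name of the TXT record the CA looks up for `domain`."""
    return "_acme-challenge." + domain.rstrip(".") + "."


def dns01_txt_value(token, account_thumbprint):
    """Unpadded base64url SHA-256 digest of the key authorization."""
    key_authorization = token + "." + account_thumbprint
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

Nothing here depends on a web server or a TLS stack, which is the point being made about intermediate assumptions.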

~~~
mdhardeman
I completely agree on the dns-01 challenge. Those who are migrating off of
TLS-SNI-01 and have a capability to standardize on dns-01 will be better off
in the long run.

As you point out, domain control validation is best performed by having the
application demonstrate control of the domain.

~~~
regecks
The problem with dns-01 is that, much of the time, it requires granting far
too much privilege to the system that requests the certificate.

This is because the great majority of DNS hosts do not provide sufficiently
granular permissions to only allow changes to _acme-challenge RRs.

e.g. Cloudflare, as far as I can tell, only gives you one API key which grants
all access to all zones.

e.g. Most domain registrars who offer DNS hosting who provide an API grant
access to all sorts of management functions, not just DNS zone changes.

e.g. Route53 IAM doesn't let you restrict to a single RR, you expose
modifications to the entire zone.

I am really not comfortable giving my web application these kinds of powers.

TLS-SNI was useful because it was relatively protocol agnostic, so some of
that flexibility is now gone.

~~~
schoen
> The problem with dns-01 is that, much of the time, it requires granting
> far too much privilege to the system that requests the certificate. This is
> because the great majority of DNS hosts do not provide sufficiently granular
> permissions to only allow changes to _acme-challenge RRs.

There's a cool solution to this that I learned from someone else on the Let's
Encrypt forums (where I often help do support). The Let's Encrypt DNS-01
validator will follow CNAMEs. Therefore, you can make _acme-challenge be a
CNAME to an arbitrary text record which can be in another zone (including a
zone dedicated for this purpose). For example, you could say

_acme-challenge.example.com. IN CNAME foo.acmevalidation.example.net.

Now an application can just have API keys to update RRs under
acmevalidation.example.net, which does not need to be used for any other
purpose (or even necessarily hosted on the same infrastructure as
example.com's own DNS). The CNAME can be created manually at the outset and
does not need to be updated for renewals.
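
To make the lookup path concrete, here is a toy model of a validator following the CNAME before reading the TXT record. It uses an in-memory dictionary instead of real DNS queries, and the TXT value is a placeholder, so treat it as an illustration of the idea rather than of how the Let's Encrypt validator is implemented.

```python
# Toy zone data; names and the TXT value are placeholders.
RECORDS = {
    ("_acme-challenge.example.com.", "CNAME"): "foo.acmevalidation.example.net.",
    ("foo.acmevalidation.example.net.", "TXT"): "placeholder-challenge-value",
}


def lookup_txt(name, records, max_hops=8):
    """Follow CNAMEs from `name`; return (final_name, txt_value_or_None)."""
    for _ in range(max_hops):
        target = records.get((name, "CNAME"))
        if target is None:
            # No further alias: this is where the TXT record must live.
            return name, records.get((name, "TXT"))
        name = target
    raise RuntimeError("CNAME chain too long")


final_name, txt = lookup_txt("_acme-challenge.example.com.", RECORDS)
```

In this model the application only ever writes to `foo.acmevalidation.example.net.`; the `example.com` zone is touched once, by hand, to create the CNAME.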

This has been possible for a long time, but if it becomes more widely known
and more widely supported by client applications and DNS providers, it should
make use of DNS-01 authentication much more practical, and safer, for a pretty
wide range of people.

~~~
BillinghamJ
Another similar option would presumably be to delegate
_acme-challenge.example.com to different nameservers with an NS record, then give
your application the required privileges to control solely that nameserver.

~~~
mdhardeman
Yes. Or even to the same name server, breaking out each whole label starting
with _acme-challenge as its own independent zone, with its own access
policies.

~~~
teddyh
Yes, but there’s no need to have separate zones; you can grant update access
to subdomains and have the CNAMEs point into one zone with a subdomain
dedicated to each separate actor which needs access.

Like so: Assume that Actor 1 has example.com and example.net. You then add
this to the example.com and example.net zones, respectively:

    
    
      _acme-challenge.example.com.  CNAME  example.com._.actor1._.your-special-domain.com.
      _acme-challenge.example.net.  CNAME  example.net._.actor1._.your-special-domain.com.
    

Then you give update access to Actor 1, not to the whole
“your-special-domain.com” zone, but only to the
“_.actor1._.your-special-domain.com” subdomain.
The ACME system would then be configured to send updates to the correct
subdomains of that subdomain. Or “your-special-domain.com” could even be a
subdomain itself of another domain; it doesn’t matter.
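
The naming scheme and the access check can be sketched as follows. This is only an illustration of the layout described above; the helper names are invented, and a real DNS server would enforce the restriction through its own update policy rather than application code.

```python
BASE_ZONE = "your-special-domain.com."


def actor_subtree(actor, base_zone=BASE_ZONE):
    """Subdomain under which `actor` is allowed to make updates."""
    return "_." + actor + "._." + base_zone


def cname_target(domain, actor, base_zone=BASE_ZONE):
    """Target of the _acme-challenge CNAME for `domain` owned by `actor`."""
    return domain.rstrip(".") + "." + actor_subtree(actor, base_zone)


def may_update(name, actor, base_zone=BASE_ZONE):
    """True if `name` lies strictly inside the actor's own subtree."""
    return name.endswith("." + actor_subtree(actor, base_zone))
```

With this layout, `cname_target("example.com", "actor1")` yields the same target as the first record above, and an update for a name under another actor's subtree is rejected.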

------
jo909
From my point of view the big advantage of TLS-SNI is that it uses the same
protocol and port as 90%+ of certificate users want to use with the issued
certificate: HTTPS.

That is especially useful for webserver plugins. Also this is much better when
there are security policies that (for maybe misguided but well-intentioned
reasons) completely block or redirect all HTTP traffic.

What would be insecure about a https-01 challenge, that essentially works
identically to the http-01 challenge but allows any certificate?

~~~
pfg
> What would be insecure about a https-01 challenge, that essentially works
> identically to the http-01 challenge but allows any certificate?

There's a specific reason http-01 is HTTP-only, and it's actually quite
similar to the tls-sni-01 situation. In many of the major web servers,
including apache and nginx, the web server will use the first HTTPS vhost in
its configuration for any unmatched domains, unless you explicitly specify a
default vhost. In practice that means an attacker on the same hosting
environment used by the victim could get themselves in a position where they
control this default vhost and obtain a certificate for their domain. The
vhost order is often based on the alphabetic order of the domain, so that's
fairly easy to pull off. http-01's predecessor did allow HTTPS, but this
attack came up during the IETF ACME standardization process and, IIRC, was
fixed before Let's Encrypt entered public beta[1].
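
A toy model of that fallback behaviour, assuming a host that writes its HTTPS vhosts in alphabetical order and has no explicit default. The domain names and owners are invented, and this is a simplification rather than the actual apache or nginx selection logic.

```python
def select_vhost(requested_host, vhosts):
    """Pick the vhost for a request.

    vhosts is a list of (server_name, owner) in configuration order.
    With no explicit default, an unmatched host falls through to the
    first entry.
    """
    for server_name, owner in vhosts:
        if server_name == requested_host:
            return server_name, owner
    return vhosts[0]


# The host generates its config sorted by domain name.
vhosts = sorted([("shop.example", "alice"), ("aaa-attacker.example", "mallory")])

# victim.example points its DNS here but has no HTTPS vhost configured,
# so a validation request over HTTPS would be answered by mallory's vhost.
served_by = select_vhost("victim.example", vhosts)
```

Here `served_by` is mallory's vhost, which is the position an attacker wants to be in.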

http-01 does permit the CA server to follow redirects to HTTPS, including to
ones with self-signed, expired or otherwise invalid certificates, so common
setups with HSTS and redirects to HTTPS are fine, you'll only be in trouble if
you can't use HTTP on port 80 at all.

[1]:
[https://mailarchive.ietf.org/arch/msg/acme/B9vhPSMm9tcNoPrTE...](https://mailarchive.ietf.org/arch/msg/acme/B9vhPSMm9tcNoPrTE_LNhnt0d8U)

~~~
jo909
But that behavior is true and exploitable for HTTP as well, isn't it? It is a
risk if there is no specific vhost config for the validated domain, which
means a customer pointed the DNS to the shared host without also configuring
that host to serve content for his domain from his account.

I realize in current real-world setups you would normally start with an
HTTP-only config and only later or maybe never configure HTTPS for that domain, or
configure both protocols simultaneously. And almost never the opposite where
you configure HTTPS only and someone else would be able to grab your HTTP
traffic. So that's still a good argument to do HTTP only, thank you for
explaining it.

I did not know http-01 would follow redirects to HTTPS; that is also really
good to know and should be a good option for some setups.

------
rgbrenner
Happy to see this. I was very critical of their plan to re-enable TLS-SNI and
I’m happy to see they reconsidered. They made the right call here.

------
mholt
For the record, I am pretty sure Caddy will be unaffected by this. Any
programs using xenolf/lego as their ACME client should be fine as well, as
long as one other validation method is still available. (lego also uniquely
supports a wide variety of DNS providers for automated negotiation of the DNS
challenge. Caddy supports them too, as long as they are plugged in and
configured.)

I've been asked if we'll turn off TLS-SNI in Caddy and the answer is no; as
their announcement says, some accounts will still be able to use TLS-SNI for a
limited timeframe until it is turned off completely. Caddy won't try the
TLS-SNI challenge as long as the ACME server doesn't advertise it in an exchange.
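
The general idea of only attempting what the server offers can be sketched like this. It is not Caddy's or lego's actual code, and the default preference order is an arbitrary assumption made for the example.

```python
def choose_challenge(offered_types, preference=("http-01", "dns-01", "tls-sni-01")):
    """Return the first preferred challenge type the server offered, or None.

    offered_types: challenge type strings the ACME server listed for an
    authorization. A type that was not offered is never chosen.
    """
    for challenge_type in preference:
        if challenge_type in offered_types:
            return challenge_type
    return None
```

If the server stops listing `tls-sni-01`, a client written this way simply never selects it, with no client-side change needed.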

~~~
mdhardeman
One would think that most of the Caddy users who previously used TLS-SNI-01
could also avail themselves of HTTP-01 validation.

~~~
mholt
Yeah, I imagine so. Many sites still have port 80 open to redirect to 443.

------
zalmoxes
RIP

I have a few services which were using Go’s acme/autocert package. I now need
to update them to the HTTP challenge.

~~~
mholt
xenolf/lego arguably has the widest support in the sense of ACME verification
methods, but autocert might get other methods too:
[https://github.com/golang/go/issues/21890](https://github.com/golang/go/issues/21890)

------
AdamJacobMuller
I don't understand why TLS-SNI can't work fine if you just have the response
certificate be entirely distinct from the server name.

EG: LE sends a challenge of a.b.c.acme.invalid

You must reply with a certificate of d.e.f.acme.invalid

Doesn't that entirely mitigate the shared hosting issue since any shared
hosting setup will require SNI to match the certificate name that you reply
with?

~~~
mdhardeman
No, that’s the problem. A lot of shared hosts will allow any customer to onboard
a new website — as long as it’s not already taken on that hosting provider —
then allow you to upload any TLS cert for that.

So attacker requests to validate for a name that you have pointed via DNS to
that hosting infrastructure.

The names that you need to respond on and have certs for are then calculated
by attacker. Attacker, who is also a customer of same hosting service creates
the necessary “sites” and uploads the matching challenge response certs, and
successfully receives a cert for your domain.

~~~
AdamJacobMuller
None of what you're saying is an issue in my outline.

If I'm able to upload a certificate for a.b.c.acme.invalid, the validation
TLS-SNI request for a.b.c.acme.invalid will reply with a certificate for
a.b.c.acme.invalid and thus fail.

If I'm able to upload a certificate for d.e.f.acme.invalid, the validation
TLS-SNI request for a.b.c.acme.invalid will not match my uploaded certificate
and the challenge will thus fail.

I may well be misunderstanding the situation, but, I just don't see how.

~~~
mdhardeman
It is still an issue, actually.

You misunderstand how the TLS balancer chooses which certificate to present.

When the TLS connection comes in and presents a SNI name of
"a.b.c.acme.invalid", the balancer checks its configuration to see if the host
has a "website" called "a.b.c.acme.invalid". It discovers that it does. It
looks at what certificate in the database was uploaded for that website
configuration.

It doesn't actually check the certificate details at all....

It presents the certificate that was uploaded by the "owner" of the "website"
a.b.c.acme.invalid.

And if that name needs to present a certificate that says "d.e.f.acme.invalid"
then that is the certificate that the attacker will have uploaded for his
"a.b.c.acme.invalid" site.

There are numerous web hosts who would permit this and it would work just like
that.
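
A toy model of that lookup, to show why requiring a different name in the certificate does not help. The class and function names are invented, certificates are reduced to a list of names, and no real hosting platform's code is being quoted.

```python
class SharedHost:
    """Toy shared host: picks the certificate by site name only."""

    def __init__(self):
        self.sites = {}  # site name -> (owner, names in the uploaded cert)

    def add_site(self, name, owner, cert_names):
        # The only check is that the site name is not already taken.
        if name in self.sites:
            raise ValueError("site name already taken")
        self.sites[name] = (owner, list(cert_names))

    def certificate_for_sni(self, sni_name):
        # The uploaded certificate's contents are never inspected.
        entry = self.sites.get(sni_name)
        return None if entry is None else entry[1]


def validator_accepts(host, sni_name, expected_cert_name):
    """Validator connects with sni_name and looks for expected_cert_name."""
    names = host.certificate_for_sni(sni_name)
    return names is not None and expected_cert_name in names


host = SharedHost()
# The attacker registers the challenge name as a "site" and uploads a
# self-signed cert carrying the other name the validator expects.
host.add_site("a.b.c.acme.invalid", "attacker", ["d.e.f.acme.invalid"])
attack_succeeds = validator_accepts(host, "a.b.c.acme.invalid", "d.e.f.acme.invalid")
```

In this model `attack_succeeds` is true, because nothing ties the site name to the certificate the host hands back.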

The mechanism you're describing is similar to the changes in TLS-SNI-02. It
has already been determined that TLS-SNI-02 is deficient as it is vulnerable
to the attack I've parroted here.

The trouble is that the people who wrote both TLS-SNI-01 and TLS-SNI-02
apparently had little knowledge of the vast breadth of behaviors exhibited by
a plethora of shared web hosts. The assumptions they made, upon which all the
current TLS-SNI-0x protocols rely to provide security, simply are not upheld
and honored by the real world marketplace of web hosts.

~~~
AdamJacobMuller
Ok, that makes sense, thanks for explaining it a bit more.

That seems like an absurdly broken implementation, I get why LetsEncrypt feels
the need to disable it but I really hope they come up with some alternate
solution.

I really like the tls-sni authentication method as it keeps authentication
entirely inband to the final SSL goal. With HTTP you need to listen on/control
80+443, with DNS you need to control DNS. With tls-sni you need only control
port 443. I'm a huge fan of x/crypto/acme/autocert.

Custom ALPN-signaled protocol should be doable and should solve all of this, I
hope they do it.

~~~
mdhardeman
It is a rather limited protocol. Really, the implementation isn't so bad... In
a perfect world.

It's naive.

It imagines there was a whole different set of operating circumstances at
shared web hosts than the reality exhibits.

I actually had not read the protocol specification for that challenge as I
utilize http-01 and dns-01 on all my various systems. Then, when the early
report without details was released, I read the protocol and realized almost
immediately that there were several circumstances in the field which could
yield actual vulnerability.

They also made the mistake of failing to align to a promise which the other
mechanisms do make: the other mechanisms tie the validation directly to the
target domain label being authorized or a known child thereof. The TLS-SNI-01
and TLS-SNI-02 don't do that. And they knew that, because they wanted to be
able to perform a TLS-SNI validation without having to change server software.
I believe that was a bad decision.

The proposed TLS-SNI-03 ALPN "acme" extension that Mr. Rudenberg has put forth
will not be resilient to these attacks, ultimately. I think they should do a
real ALPN protocol and do the validation through that. But let's assume time
to market for that would exceed a year. In the mean time, people reliant on
TLS-SNI-01 are likely going to need to do something else.

In short, a mechanism which would achieve much the goal of getting validation
off of a single TLS port running the right software can happen, but I believe
it should borrow pretty much nothing from the current TLS-SNI-0x proposals.

It should be a whole new real ALPN protocol.

~~~
AdamJacobMuller
> I think they should do a real ALPN protocol and do the validation through
> that.

Agreed.

> But let's assume time to market for that would exceed a year.

I'm sure it would take a year or more for good packages for most languages to
exist, but, it doesn't seem so complex that it should take a year for it to
exist and for it to be usable if you're sufficiently motivated (EG: you're
willing to write your own software).

> In the mean time, people reliant on TLS-SNI-01 are likely going to need to
> do something else.

I just moved to using a commercial wildcard certificate. I didn't particularly
want to, but, for this and a few other reasons, LetsEncrypt became non-viable
for me.

