Godaddy has issued at least 8850 SSL certificates without validating anything (groups.google.com)
327 points by 0x0 192 days ago | 44 comments

Interesting tidbit: The CA/B Forum passed a change to the Baseline Requirements attempting to standardize the methods of domain ownership validation back in August of last year[1]. Prior to that, it was essentially up to the CAs to come up with secure methods. The methods described in that change contained mitigations against this vulnerability.

The change never went into effect (practically speaking - it's actually a bit more complex) because a number of CAs in the Forum filed patent exclusion notices, and wouldn't you know it: GoDaddy was one of them[2]. Hope it was worth it.

[1]: https://cabforum.org/2016/08/05/ballot-169-revised-validatio...

[2]: https://cabforum.org/wp-content/uploads/GoDaddy-Ballot-169-E...

Finder here, this is the history:

    12.12.2016: First contact with MS
    27.12.2016: Answer, saying it's not a bug. (Notice the promise that you get an answer in 24 hours)
    02.01.2017: Explaining the issue again in more detail
    03.01.2017: Opening the ticket, saying I will get more information if something is available.
    12.01.2017: No answer from MS, and randomly seeing this post on HN
I hope I will get some details from MS soon so I can keep you guys up to date.

tl;dr: they requested a URL and wanted it to echo a token passed in the query string. They accepted 404 pages that echoed the token as valid, too!
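To make the failure mode concrete, here is a minimal sketch. It is not GoDaddy's actual code; the function names, the token, and the URL shape are all made up for illustration. It shows how a "token appears somewhere in the body" check passes against a site whose error page simply echoes the requested URL:

```python
def render_not_found(url: str) -> str:
    """Hypothetical error page that echoes the requested URL back."""
    return (
        "<html><body><h1>Not Found</h1>"
        f"<p>The requested URL {url} was not found.</p></body></html>"
    )


def naive_validate(body: str, token: str) -> bool:
    """Flawed check: accept if the token shows up anywhere in the body."""
    return token in body


token = "example-token-123"        # made-up value
url = f"/validate?token={token}"   # made-up URL shape
body = render_not_found(url)       # the applicant never placed anything on the site
print(naive_validate(body, token))  # prints True
```

The applicant did nothing on the server, yet the check succeeds, because the token travelled in the request and came straight back in the error page.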

"In case anyone is wondering why this is problematic, during the Ballot 169 review process, Peter Bowen ran a check against the top 10,000 Alexa domains and noted that more than 400 sites returned a HTTP 200 response for a request to http://www.$DOMAIN/.well-known/pki-validation/4c079484040e32... [1]. A number of those included the URL in the response body, which would presumably be good enough for GoDaddy's domain validation process if they indeed only check for a HTTP 200 response.

[1]: https://cabforum.org/pipermail/public/2016-April/007506.html "

They needed the page to return just the token or something with the token in it? Because then I could validate this: http://google.com/this_is_my_token

Just something with the token in it. So, yes. You probably wouldn't have been able to validate google.com, as GoDaddy would have (hopefully) flagged any domains containing major trademarks like "google" for manual review, but that's hardly a strong protection.

Everybody who's used the Internet for more than two days knows that many 404 pages have the URL inside the html (Apache default 404 comes to my mind), so I suppose this was made by someone who simply didn't give a fuck.

Which is kinda big deal if you're generating SSL certs...

1. It sounds like this was originally implemented correctly, and a code change caused the check for status 200 to stop working. (See the first message, sentence starting "A configuration change to the library")

2. There are a number of "404" pages that are sent with HTTP status 200 and also echo the URL inside the HTML. See https://cabforum.org/pipermail/public/2016-April/007506.html , which Patrick Figel linked to in the thread.

So, in this particular exploit, LetsEncrypt offers more verification, correct? How does the ACME spec mitigate this?

The way the http-01 challenge in ACME mitigates this is by not putting the whole value they'll look for in the request. Basically, they request example.com/.well-known/acme-challenge/<random_token>, and the response body has to be <random_token>.<account_key_fingerprint> for the challenge to pass. Since the account key fingerprint is not part of the request, a 404 page echoing back the token would not be enough (even if it's a 404-served-as-200).
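A rough sketch of how that expected body can be computed, assuming the key authorization format later standardized in RFC 8555 (token, a dot, then the base64url SHA-256 JWK thumbprint of the account key per RFC 7638). The JWK values and the token below are placeholders, not a real key:

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 thumbprint over the required JWK members, sorted, no whitespace."""
    required = {"RSA": ("e", "kty", "n"), "EC": ("crv", "kty", "x", "y")}[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in required},
        sort_keys=True,
        separators=(",", ":"),
    )
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, account_jwk: dict) -> str:
    """Body the CA expects at /.well-known/acme-challenge/<token>."""
    return f"{token}.{jwk_thumbprint(account_jwk)}"


# Placeholder account key: the x/y values are not real curve points.
account_jwk = {"kty": "EC", "crv": "P-256", "x": "placeholder-x", "y": "placeholder-y"}
token = "example-token"  # made-up challenge token

expected = key_authorization(token, account_jwk)
echo_page = f"<p>/.well-known/acme-challenge/{token} was not found</p>"
print(expected in echo_page)  # prints False
```

An error page can only echo what was in the request, and the thumbprint never is.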

Based on what I know of a high level overview of ACME:

The client says "Hey, I have a cert here, and I want to verify I control domain.com"

The server says "Ok, take this [set of challenge data], encrypt it with your cert, and put it at .well-known/acme-challenge."

The client takes the challenge data, encrypts it, puts it at the challenge point, and informs the server.

The server verifies that the encrypted data matches what they were expecting to get. They then sign the cert because only a person with the webserver/domain and the cert could pull off this magic.

The important part here is that since the client already has the cert in hand when the verification is taking place, it can be used as part of the verification. Use the cert to do cert stuff. Only that cert can do those things. If I can't even get a cert until AFTER I've proven I control a location, then this obviously doesn't work and another "proof" has to be made - hence GoDaddy's issue.

Google Apps just asked you to upload random text files to certain locations or add DNS TEXT records to verify you own the domain. So does Amazon with SES. You just make an account with them and they give you the stuff.

Why isn't that good enough? No need to use a cert to verify it.

Well, this is "upload random text files to certain locations."

Let's Encrypt is meant to be renewed by cronjob, and most web servers probably don't have programmatic control over their DNS records.

That would work. The point of ACME is to not need to generate certs. They just verify them. That removes a bit of computational burden from them. The ACME challenge is probably "random text files" too, just generated using the certificate at hand to verify ownership of domain and certificate. Certbot basically is just a client to do all those things you described for you.

The difference is between "prove you own this domain and we will provide you a signed certificate for it" and "prove you own this domain and this certificate at the same time, and we will sign it for you".

It's really not that different, except they have opted to make it such a way that it is easily automated. And it's open source, so if they wanted to, Google, Amazon, or even GoDaddy could set up an ACME server that validates SSL/TLS certs... but would they be able to charge for it if you had to use the Let's Encrypt process anyway?

What you wrote about not needing "to generate certificates" is quite wrong, though I suppose I get what made you think this.

ACME uses public key cryptography in (at least) three separate places, and the key pairs are different in each place. Like completely different - one could be 2048-bit RSA, one could be a shiny Elliptic Curve key, and one 4096-bit RSA.

1. Access to the HTTPS ACME service over SSL/TLS. Let's Encrypt has a private key, users of their ACME service get a public key, and this allows them to be confident they are dealing with the genuine Let's Encrypt ACME service.

2. ACME accounts and associated challenges. Every would-be subscriber using an ACME service must create an account, this account has a private key (which they know, usually stored in a file on their computer) and a public key, which they send to the ACME service.

The ACME private key is used to sign data in ACME challenges. This proves that whoever is passing the challenges (e.g. by putting files on a web server) is also making the ACME requests. This provides the confidence to issue certificates to the account holder.

3. The SSL certificates have a public key baked inside them, which corresponds to a private key owned by the subscriber. The subscriber proves they control the private key and want a certificate by signing a Certificate Signing Request (CSR) with their private key.

The Certbot software often used with Let's Encrypt automates most of this, making the key pair for (2) automatically and just requesting you OK a PDF describing the rules for Let's Encrypt. By default it will make the key pair for (3) as well BUT if you're getting a certificate for some device that issues its own CSRs, you can hand over a CSR and it will use that one. If it didn't allow this, you couldn't get the right certificate for a device that only issues CSRs, because those devices provide no way to upload a different private key.


Finally, we expect that third parties will want to offer ACME services where a mandatory challenge is like "Fax us all the paperwork and pay $50 verification fee" and then you get an EV certificate instead of the DV certificates that Let's Encrypt issues. Also, never underestimate the willingness of big businesses to pay money for things.

I see how what I said may have been confusing, but I meant that the ACME server does not generate a certificate for each user. They have their own cert for validating their service (not generated per user). And the user/client generates the certificate. The server isn't (necessarily?) generating the keys, the client is.

The ACME server absolutely does generate the end entity certificates for their subscribers (e.g. a certificate on your web server). Code which does this is part of Boulder, the Let's Encrypt implementation of an ACME server, so you can go see for yourself (obviously their production system config is not on display, but it's the same code).

You are correct that Let's Encrypt don't generate the subscriber's key pair. This is (or should be) true for every public CA offering Web PKI certificates today. The subscriber puts their public key in a Certificate Signing Request, and signs it with the corresponding private key. Let's Encrypt receives the CSR, verifies that it is correctly signed, establishes that the requesting Let's Encrypt user account is authorised for all the DNS names listed in the CSR, and other parameters are acceptable, then it creates a signed certificate for those names, with the public key from the CSR.

[Edited to add: If you use Certbot or similar to just issue a certificate all the CSR creation and signing is done for you. But it still happens, and Let's Encrypt are still the ones actually making and signing the certificate]

It is essential that Let's Encrypt create the certificate. Under the current Baseline Requirements they're obliged to ensure that the serial number is unpredictable yet unique across all their issued certificates. The only practical way to achieve these goals is for them to create the certificate from scratch and sign it.
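As a rough illustration of "unpredictable yet unique" (not how Boulder or any real CA does it): draw the serial from a CSPRNG and check it against what has already been issued. The 64-bit width and the in-memory set standing in for the CA's database are assumptions of this sketch; check the current Baseline Requirements for the actual rules.

```python
import secrets

issued_serials = set()  # stand-in for the CA's record of issued serial numbers


def new_serial(bits=64):
    """Return a positive, CSPRNG-derived serial not used before."""
    while True:
        serial = secrets.randbits(bits)
        if serial > 0 and serial not in issued_serials:
            issued_serials.add(serial)
            return serial


serials = [new_serial() for _ in range(1000)]
print(len(set(serials)) == len(serials))  # prints True
```

Only the party that keeps the issuance records can do the uniqueness check, which is one reason the CA has to build the certificate itself.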

The situation where a tbsCertificate (the certificate document before it is signed) is transmitted anywhere is unusual. The only example I can think of recently was where Symantec were arranging SHA-1 exception certificates and Google's process deliberately involved showing everybody the tbsCertificate before it was signed so that we (the community) could examine it and ask for anything unusual to be explained. If you go back and look at those discussions you'll see it was me who first asked for the weird OU value in certificates for TSYS to be explained or removed -- they were subsequently removed before the certificates were issued.

Thanks, that's good to know. I was operating under the assumption that the key pair associated with the cert and the cert were essentially the same.

That does show a fairly easy way that this could have been prevented in GoDaddy's case though. Simply provide both a chunk of data as a token and a location that it should be placed.

My understanding is specifically that wouldn't have helped, because all locations that didn't already have content would have shown the token -- the token being present in the 404 response.

What parent is saying is: ask to put STRING1 at path STRING2; this way, the server has no way of unintentionally showing STRING1, as it is different from STRING2.

I should have been clearer, but corecoder has it right: the token in this case would be a different string/file than the location, so you wouldn't run into the issue of some 404 pages including the token, because the data GoDaddy would be looking for is completely unrelated to the URL they're checking.
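The STRING1-at-STRING2 idea can be sketched like this. It is a hypothetical scheme, not any CA's real one; the helper names are made up, and the path prefix is borrowed from the pki-validation URL quoted earlier in the thread:

```python
import secrets


def new_challenge():
    """Return (path, expected_content); the two random values are unrelated."""
    path_token = secrets.token_urlsafe(16)      # STRING2: where to put the file
    content_token = secrets.token_urlsafe(32)   # STRING1: what the file must contain
    return f"/.well-known/pki-validation/{path_token}", content_token


def validate(status, body, expected_content):
    """Require a 200 and a body that is exactly the expected content."""
    return status == 200 and body.strip() == expected_content


def echoing_server(path):
    """Hypothetical 'soft 404': status 200 plus a page that echoes the path."""
    return 200, f"<p>The page {path} was not found.</p>"


path, content = new_challenge()
status, body = echoing_server(path)
print(validate(status, body, content))  # prints False
```

Even a server that answers 200 for everything and echoes the path cannot pass, because the expected content never appears in the request.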

From my reading of the spec, LE requires a valid JSON response (and only that valid JSON). Looks pretty hard to accidentally spit out.

An HTTP 200 response is not a 404 page. Why are websites returning 200s on their 404 pages?

    > Why are websites returning
    > 200s on their 404 pages
The behaviour for the user is identical in most cases, and "user" here includes "inexperienced or non-diligent person who set it up"

Sure, but they're referencing 'page not found' pages as 404 pages, like most people do. I wouldn't be surprised if a majority of sites returned the wrong status codes for some pages.

> Prior to the bug, the library used to query the website and check for the code was configured to return a failure if the HTTP status code was not 200 (success). A configuration change to the library caused it to return results even when the HTTP status code was not 200. Since many web servers are configured to include the URL of the request in the body of a 404 (not found) response, and the URL also contained the random code, any web server configured this way caused domain control verification to complete successfully.

I'd bet that the library in question was libcurl, and they forgot to set CURLOPT_FAILONERROR[1].

[1]: https://curl.haxx.se/libcurl/c/CURLOPT_FAILONERROR.html
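Whatever the library actually was, the regression described in the quote can be sketched as follows. This is a simplified stand-in, not libcurl and not GoDaddy's code; `fetch_body`, `token_found`, and the `fail_on_error` switch are hypothetical names modelling an HTTP client that either raises on a non-200 status or hands the body back regardless:

```python
class HttpStatusError(Exception):
    """Raised by the hypothetical fetch helper on a non-200 status."""


def fetch_body(status, body, fail_on_error):
    # Stand-in for an HTTP client: `status` and `body` are what the server sent.
    if fail_on_error and status != 200:
        raise HttpStatusError(status)
    return body


def token_found(status, body, token, fail_on_error):
    """Domain check as described: fetch, then look for the token in the body."""
    try:
        return token in fetch_body(status, body, fail_on_error)
    except HttpStatusError:
        return False


token = "example-token-123"  # made-up value
not_found_page = f"<p>/check/{token} was not found on this server.</p>"

print(token_found(404, not_found_page, token, fail_on_error=True))   # prints False
print(token_found(404, not_found_page, token, fail_on_error=False))  # prints True
```

Note that restoring the status check only covers real 404s; a "not found" page served with status 200 would still pass this check either way.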

This sounds like responsible handling, disclosure, and remedying of the problem.

- request http://example.com/<path or query containing random token>

- if random token is echoed in the response, and the HTTP response code is 200, they consider that the applicant has control over the requested FQDN

Do I understand their validation method correctly? If so, I wouldn't consider it very secure.

I believe this is essentially what ACME does when using the domain challenge, granted it likely requires a certain set of content to ensure your misconfigured webserver doesn't return 200 for all paths.

ACME does not include the expected response in the path or query. So it can't be accidentally included in the response.

EDIT: quote by Nick Lamb in the discussion linked:

> ACME http-01 won't get tripped here because the checked content of the URL is very much not the random string (it's a JWS signature over a data structure containing that random string, thereby proving it was made by whoever the ACME server is talking to). But yes, doing something that _looks_ superficially like the ACME style of validation without such subtlety will trip you up.

As ploxiln mentioned, there's a lot more to ACME.

ACME uses challenge-response validation, which is significantly more complicated than an echo. You've already sent a CSR to LetsEncrypt that contains your server's generated public key. LetsEncrypt then issues a challenge that your server must sign using the associated private key that it generated, and that only it has access to.

Then LetsEncrypt validates the signature using the public key and checks that a) the signed data matches the nonce they sent as a challenge and b) the signature is valid given the known public key.

This sufficiently guarantees the fact that you control both the domain name and the key associated with your submitted CSR. At that point, they sign and issue your certificate and send it back to you.


The way you've described only works if the ACME user also has access to the private key for the CSR. But that isn't true if you're using CSRs from some appliance you bought which doesn't let you change or view its private keys and doesn't implement ACME itself. That's a good secure design for the appliance so we don't want to discourage it.

Thus, ACME doesn't do it that way. It has a separate key pair for the ACME protocol itself, the "account key" pair. Your challenge signature is signed with the private key for the ACME account. The ACME client never needs to know the private key for a CSR it presents, although in the easy-to-use default modes in popular clients like Certbot they do generate these keys too.

I haven't looked at ACME, but any properly-designed system should require the entire response to match some expected value instead of just searching for the value as a substring.

Several of my customers were hit by this. What concerns me possibly even more is that GoDaddy, having revoked the certificates, then managed to "un-revoke" them on request with a grace period. This is unsettling; it's not how the CRL system is supposed to work!

That's... troubling. You should consider mentioning it on the mozilla.dev.security.policy thread:


EDIT: GoDaddy themselves say they will never do this:


"The process cannot be reversed."

The first part of the story definitely checks out.

https://crt.sh/?id=29236482 This certificate absolutely was revoked by GoDaddy.

However, that certificate is _still_ revoked right now. A _new_ certificate for the same names was issued on the 12th of January, presumably once the re-validation was completed. This isn't in violation of any policies. Sites on that new certificate such as https://royalduchy.co.uk/ do indeed work fine.

Can you update the medium story to reflect this? I mean, not your feelings about GoDaddy, say whatever you feel, but the facts aren't as portrayed in that story so far as I am able to see.

I got an SSL certificate when I bought a domain from NameCheap. The SSL cert is provided by Comodo. I recently lost my box, so I also lost my cert and had to revoke the existing one and ask for it to be re-generated. I would have needed to wait ~3-6 hours before a new cert was available (ugh), so I decided to go to Let's Encrypt. While the initial setup is quite confusing for a beginner, I managed to create one regardless and I haven't looked back.

I can't help but think that whoever designed this challenge scheme must have thought of this potential risk, but probably shrugged it off because "most people won't be able to come up with this method."

According to TFA, the certs have already been revoked.

Great article! Just the sort of thing I come to HN for.

I note it's a year old, though - has much changed since then? OCSP stapling sounds like a Very Good Idea, I hope it's been catching on.

Not all that much has changed. SSL Pulse[1] shows that use of OCSP Stapling went up about 3% in the past 12 months, currently putting it at 25% of observed sites. OCSP Must-Staple is now available in Firefox and can be added as an extension to certificates issued by Let's Encrypt. No other browsers have plans for Must-Staple AFAIK. Otherwise, OCSP is still soft-fail for everything but EV in most scenarios.

[1]: https://www.trustworthyinternet.org/ssl-pulse/

+3%? Well, that's less inspiring news than I'd hoped. Still, it's something, I suppose.

Anyway, that's another very interesting link you've got there - I knew about SSL Labs, but hadn't come across that report before. Looks like something worth keeping an eye on. Cheers!

The problem is that Apache, a major web server, has a basically broken OCSP stapling implementation. It does a bunch of unexpected things and you'll likely not want to enforce stapling with it.
