CA Root expired on 30 May 2020 (sectigo.com)
260 points by gionn on May 30, 2020 | 150 comments



Quick reminder from your friendly local SRE: never, ever issue certificates that expire on a weekend. Make certs expire in the middle of the afternoon on a business day wherever your operators live and work. The cert in question expired at 10:48:38 GMT on May 30, 2020, which smells suspiciously like a fixed interval after the cert was generated rather than a well-chosen point in time.


All my applications use a component that watches the configured certs (everything in the cert and trust stores) and returns a warning in the application's telemetry if any of the certificates is less than a week from expiration. This is checked periodically while the application runs.

This not only makes sure we don't miss an expiration but also ensures we don't forget to configure any of the applications.

We had a situation where the cert was replaced but the file was placed at the wrong path and was not actually used by the app. Having the app report on what is actually in use is the best way to prevent this from ever happening.
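A minimal sketch of that kind of check in shell, assuming a PEM cert at a hypothetical path; `openssl x509 -checkend` exits non-zero when the cert expires within the given number of seconds:

    # warn if the cert expires within 7 days (604800 seconds)
    # /etc/myapp/server.pem is a hypothetical path
    if ! openssl x509 -checkend 604800 -noout -in /etc/myapp/server.pem; then
        echo "WARNING: certificate expires within a week" >&2
    fi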


Good old "cert replaced but apache/nginx failed to reload" has bitten me more than once...


Me too! Especially with the short expiration times of LetsEncrypt. But I really don't want to put `nginx -s reload` in cron, in case I'm tinkering with the configs and they suddenly go live (which only really happens in staging or at home, of course, but still).


You can use `nginx -t && nginx -s reload` for that.

It will first check the configs/paths, and only then, if successful, signal nginx to reload.


That's what I usually do. My problem is that I might be adding a location and nginx reloads between that and my adding access restrictions (e.g. because I took a break to google).


Certbot has deploy hooks, which is where I'd put the nginx reload command. The hooks run automatically when a new certificate is issued.
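A sketch of that setup; `certbot renew --deploy-hook` is standard certbot usage, and the hook only fires when a certificate was actually renewed:

    # run periodically from cron or a systemd timer; reload nginx only after a successful renewal
    certbot renew --deploy-hook "nginx -t && nginx -s reload"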


Oh, that's a great idea! Thanks :)


I've used this https://manpages.debian.org/testing/nagios-plugins-contrib/c...

After one scrambling emergency with a cert expiring in the middle of the day, a constant check with warnings and alerts a couple of weeks before expiry turned a matter of defensive organization into something trivial.
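A typical invocation of that plugin, as a sketch (the thresholds are days until expiry; the host is an example):

    # WARNING 14 days before expiry, CRITICAL at 7
    check_ssl_cert -H www.example.com -w 14 -c 7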


There is just no substitute for Reality!


If you get to the point where the exact expiration date on the certificate matters, you've already lost the game.


Engineering for failure is important: you should always set yourself up with several lines of defence, each of which can fail. Some lines of defence make failing "impossible"; others make a failure softer, even when you think failing is impossible.


Defense in depth.


it’s more like blue m&m’s than an actual requirement


>it’s more like blue m&m’s than an actual requirement

Did you mean Van Halen's famous "WARNING: ABSOLUTELY NO BROWN M&Ms" clause?

https://www.snopes.com/fact-check/brown-out/


Probably. It’s an honest mistake. For Gen Xers and early millennials blue m&ms are memorable for being added in the 90s with much ludicrous publicity.


Great tip. Did you notice that the cert in this case was issued 20 years ago? It’s crazy to me that it was still being used to sign certs as recently as last week (according to twitter)


Of course, but that doesn't really excuse them. My first experience with middle-of-Sunday-night SSL certificate expiration was in December 1998, and it was already a well-known doctrine by then. I'd expect a commercial certificate authority to have these kinds of things squared away.


My experience with commercial CAs is that they set the expiry exactly one year from creation. Doesn't matter if it's a weekend or a holiday.


Generally that's for server certs; roots and intermediates will be valid for multiple years from what I've seen, roots in particular 10+ years.


One year is a relatively recent thing. Previously you were able to buy server certs with a five-year expiration.


It's actually worse. The new root (good I believe until 2038) uses the same key as the now expired certificate. It has to or it would not be possible to validate the certificates that were issued. And this new one is a root certificate installed in browsers!

What "should" happen is that no certificate should be issued with an expiration date later than the issuing certificate. Then as the issuing certificate gets closer to expiration, a new one, with a new key pair, should be created and this new certificate should sign subordinate certificates.


Sorry to reply to my own comment. But I want to clarify. Two certificates (at least) expired. The root named "AddTrust External CA Root" and a subordinate certificate with a subject of "USERTrust RSA Certification Authority." Both expired around the same time.

The "USERTrust RSA Certification Authority" certificate signed yet another layer of intermediate certificates.

The "USERTrust RSA Certification Authority" certificate was promoted to a self-signed certificate, now in the browser trust stores, using the same key pair as the original certificate that was signed by "AddTrust External CA Root." It has an expiration of 2038 (although that concept is a bit vague in a root certificate).


There's actually a third certificate for "USERTrust RSA Certification Authority", also using the same key pair, signed by a different root called "AAA Certificate Services". It looks like the intended replacement for the expiring one is this one, rather than the one where it's the root itself.


It is explicitly not a replacement, but some kind of legacy fallback that they don't want you to use, which exists for enterprise customers whose clients absolutely can't get the newer trust anchors.


Are you sure? That's the path that InCommon has been providing me for new certificates since they switched away from the expiring one.


On anything with a modern TLS stack, I see this trust chain:

- Leaf cert (your cert)

- InCommon RSA Server CA

- USERTrust RSA Certification Authority (this is/should be the final point)


That's what my browser shows me too, but it's just because it's ignoring the cross-signed one that chains to AAA. The server is sending it, per InCommon's setup instructions.


That's correct

The old TLS (versions 1.0, 1.1, 1.2) specifications said that the certificates supplied are to form a chain, starting from a leaf and leading back towards a root.

Pretty much all clients, once they can see a way to a root they trust, will give up following the provided chain and trust that - but sadly not all of them, so "over-specifying" the chain can cause problems.

Modern clients tend to go further, they still assume the first certificate is a leaf, but all other certificates are just potential hints that might be helpful in working out an acceptable trust path. TLS 1.3 actually specifies that clients must tolerate certificates supplied on this basis rather than a strict "chain".
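To see what a server actually presents (as opposed to the path your client ends up building), a quick sketch with openssl; the host is a placeholder:

    # list subject/issuer of every certificate the server sends
    openssl s_client -connect example.com:443 -showcerts </dev/null 2>/dev/null \
        | grep -E '^ *[0-9]+ s:|^ *i:'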

I'm actually surprised at the number of claimed clients which don't have vaguely modern trust stores but do understand SHA256.


> I'm actually surprised at the number of claimed clients which don't have vaguely modern trust stores but do understand SHA256.

All the clients that were limited to SHA-1 have already been forced off https; CAs in the CA/Browser Forum weren't permitted to issue SHA-1 certs valid past Jan 1, 2017, and you had to have gotten those issued before Jan 1, 2016. Browsers were showing warnings on SHA-1 certs depending on expiration throughout 2015, so you had to either put up with a warning (and the customer service burden thereof), ditch your old clients and go SHA-2 only, segregate traffic, or build custom software to send SHA-1 certs to some people and SHA-2 certs to others.

Microsoft added support for SHA-2 certs in the OS system stack with XP Service Pack 3, released in 2008, and Microsoft was always pretty slow with support on things; other platforms may have supported this earlier. A CA bundle from around 2005-2008 is going to be fairly limiting today. A lot of CAs back then had a 20-year validity period, which may have started 5-10 years before the bundle date. Of course, a lot of bundles today end in 2038, so we'll be screwed then.
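Checking what actually signed a given certificate is a one-liner (a sketch; the filename is an example):

    # SHA-1 chains show "sha1WithRSAEncryption" here; SHA-2 shows "sha256WithRSAEncryption"
    openssl x509 -in cert.pem -noout -text | grep 'Signature Algorithm'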


Certificates issued with this CA will have been cross-signed by the newer root certificate, but our CA (Sectigo) was sending the old chain in issuing emails as late as April this year, despite the cross-signed root being available for a long time.


> It’s crazy to me that it was still being used to sign certs as recently as last week (according to twitter)

It's likely because it was issued 20 years ago. People have been using it for 20 years and no-one realized it was about to stop working.


> rather than at a well-chosen point in time

So, you're saying that "I'm not going to be working here anymore by then... hahahaha" isn't well-chosen?


Also make a Calendar placeholder (like a fake meeting), invite a lot of folks or a distribution list, and turn on an alert for 24 hours ahead.


Long-lived certificates will outlive your calendar tech. Bit rot may leave the events themselves fine, but anything fancier, like notifications, will fade away.

Source: BTDT 3 times in 7 years, and it was all with "Big Enterprise" grade products.


> Make certs expire in the middle of the afternoon on a business day wherever your operators live and work.

If I could predict that 20 years into the future, I wouldn't be in the SRE business.


I think it was foreseeable even in the dark days of the year 2000 that this certificate would expire outside business hours for much of the globe.

But your statement is really pointing out that nobody should be making long-lived certificates.


If you were to issue certs for short durations, and also maintain a calendar of cert expiry, those certs could be renewed in a timely manner.

In other scenarios where one would want to issue fresh certificates (receiving ops control from other orgs, credential refreshes for whatever reason), one can still do so without waiting for the current certificates to expire.


"Middle of the afternoon" .. for who?


If you're going to ask a question about someone's comment, at least finish reading the sentence to make sure it's not immediately answered.

Don't be part of the death of internet discourse.


I read the comment several times.

I am just of the opinion that "make it expire in the afternoon" doesn't apply to root certificates that are used across the entire world (i.e. the topic of discussion).


This one bit me today and abruptly ended my day at the beach.

The certificate reseller advised my customer that it was okay to include the cross-signing cert in the chain, because browsers will automatically ignore it once it expires, and use the Comodo CA root instead.

And that was true for browsers I guess. But my customer also has about 100 machines in the field that use cURL to access their HTTPS API endpoint. cURL will throw an error if one of the certs in the chain has expired (may be dependent on the order, don't know).

Anyway, 100 machines went down and I had a stressed out customer on the phone.


HTTP clients in programming languages are not as smart as web browsers when it comes to verifying SSL certificate chains. For example, if the chain presented by the server is missing intermediate certificates, modern web browsers are able to fetch those intermediates without issue. Most HTTP clients do not do that, though, and instead throw a cryptic error, something along the lines of "unable to get local issuer certificate". This is known as an 'incomplete chain' error.

Earlier this year I added SSL verification to a website uptime monitoring service I run (https://www.watchsumo.com/docs/ssl-tls-monitoring) and it wasn't anywhere near as simple as I thought it would be. There are so many edge cases regarding verification, languages usually don't expose the full errors in exceptions, and then you have errors like this which only affect a subset of clients.
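The incomplete-chain failure is easy to reproduce with openssl directly (a sketch; the file names are hypothetical):

    # verifying the leaf alone fails with error 20,
    # "unable to get local issuer certificate"
    openssl verify leaf.pem
    # supplying the intermediates out-of-band makes it pass
    openssl verify -untrusted intermediates.pem leaf.pem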


Hi Luca, Just tried this out. I added a URL which has an expired root cert, but it passed your test.

Let me know if I can help with more info.


Sounds like a good test case for exercising those otherwise useless "million dollar insurances" that some certificate vendors flash in their sales materials?


Have you ever read the terms? I don't know if they even publish them anymore, but I read one many years ago. TLDR;

1. The CA must misissue a cert.

2. The misissued cert is used by a malicious party to impersonate you.

3. Every user (your users) must prove their damages and claim individually.

4. There might have been a low maximum, per-user claim, but I can't remember.

I'd be amazed if there's a single person on the internet who's been paid out by that warranty.


Yes, it's useless insurance. The interesting thing is that useless insurance is illegal to sell in lots of places† - to consumers, but here the insurance was sold to the root CAs which are huge corporations so they don't care that it's useless because they only bought it as a PR exercise.

†This is the root of a huge scandal in the UK that resulted in banks refunding people years of fees for a product called PPI which they should never have been sold. As a secondary effect this resulted in annoying spam from firms who'd help you claim your money back. By the end I almost felt sad I hadn't fallen for the original scam, because I was being inconvenienced by all the spam but (since I hadn't lost anything) not getting a pile of cash as a refund.


I had this problem with mediaarea.net. Actually, cURL and openssl s_client didn’t complain, but wget and APT failed because a certificate in one of the certification paths had expired. Had to contact them to fix it.


Is that a cURL bug?


It seems only to affect older versions of curl, or curl built against OpenSSL older than 1.1.1. My MacBook's curl fails, but my Arch Linux box's curl works fine.


When I started getting reports from customers on my company's website about issues, I was baffled for a few minutes because all my tests with OpenSSL and cURL, etc. were passing...on my Arch Linux install. Then I switched over to my MacBook and ran the official /usr/bin/curl version (instead of my brew install curl version) and I understood what was happening. Gotta love when this happens.


My guess is openssl, since we experienced this problem with a lot of our internal services and our monitoring, both of which make heavy use of Perl and LWP::UserAgent, which build on OpenSSL. CentOS 7 boxes had problems (easily shown through the lwp-request util, which can often be used like curl's CLI tool), but not on CentOS 8.


Honestly, certificates should never expire, or should expire daily. If certificate revocation works, then it's pointless to have expiring certs. It's just a mechanism for CAs to seek rent.

If certificate revocation doesn't work, then certs need to expire super frequently to limit the potential damage if compromised.

A certificate that expires in 20 years does absolutely nothing for security compared to a certificate that never expires. Odds are that in 20 years the crypto will need to be updated anyways, effectively revoking the certificate.


Exactly. Certificate expiration has never really been about security. It's purely for practicality, so that CRLs won't grow without bound.

This is especially true now that we have OCSP stapling. From a security perspective, a short-lived certificate is exactly equivalent to a long-lived certificate with mandatory OCSP stapling and a short-lived OCSP response, but the latter is much more complicated.

And in this case since it's a root, it goes even further than that. Root CA's can't be revoked anyway, so if they're compromised, a software update to distrust it is required. There's really not a good reason for them to expire at all.
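You can check whether a server staples OCSP responses with openssl's `-status` flag (a sketch; the host is a placeholder):

    # prints the stapled OCSP response, or "no response sent"
    openssl s_client -connect example.com:443 -status </dev/null 2>/dev/null \
        | grep -i 'OCSP'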


It’s not true that expiration is not about security. Dan Geer’s talk in 1998, noted at https://cseweb.ucsd.edu/~goguen/courses/275f00/geer.html , is just as relevant today in the design of key management systems.

Expiration is not “just” about cryptographic risk either; there are plenty of operational risks. If you’re putting your server on the Internet, and exposing a service, you should be worried about key compromise, whether by hacker or by Heartbleed. Lifetimes are a way of expressing, and managing, that risk, especially in a world where revocation has a host of failure modes (operational, legal/political, interoperability) that may not be desirable.

As for Root expiration, it’s definitely more complicated than being black and white. It’s a question about whether software should fail-secure (fail-closed) or fail-insecure (fail-open). The decision to trust a CA, by a software vendor, is in theory backed by a variety of evidence, such as the CA’s policies and practices, as well as additional evidence such as audits. On expiration, under today’s model, all of those requirements largely disappear; the CA is free to do whatever they want with the key. Rejecting expired roots is, in part, a statement that what is secure now can’t be guaranteed as secure in 5 years, or 10 years, or 6 months, whatever the vendor decides. They can choose to let legacy software continue to work, but insecurely, potentially laying the accidental groundwork for the botnets of tomorrow, or they can choose to have legacy software stop working then, on the assumption that if they were receiving software updates, they would have received an update to keep things working / extend the timer.

Ultimately, this is what software engineering is: balancing these tradeoffs, both locally and in the broader ecosystem, to try and find the right balance.


I don't see anything about expiration in that talk.

If you don't have a strong revocation system, then your host is vulnerable whether or not you have expiration, since attackers aren't going to wait until the day before your key expires to try to steal it.

In general, when a CA's root certificate expires, it creates a new one and gives it to browser and OS vendors. What's the difference between the CA continuing to guard their old private key, and starting to guard the new private key?


> If you don't have a strong revocation system, then your host is vulnerable whether or not you have expiration, since attackers aren't going to wait until the day before your key expires to try to steal it.

We don't have a strong revocation system. Maybe one day OCSP stapling will be mandatory, although OCSP signatures are reusable within an expiration window, so we still have the question of expiration.

> In general, when a CA's root certificate expires, it creates a new one and gives it to browser and OS vendors. What's the difference between the CA continuing to guard their old private key, and starting to guard the new private key?

Their new key is fresh --- the public key hasn't been floating around being factored for the last 20 years. Maybe it's longer too. It certainly wouldn't be on disk of hardware they discarded before the new key was generated. Of course, they should have taken proper precautions with their discarded hardware, but maybe someone slipped up.

Frequent expiration is a way of limiting the damage of key theft, not a way to prevent it. In some (many?) cases, key theft is not detected, so limiting the amount of time it could be used is helpful.

OTOH, what do you do for devices which are shipped with a CA bundle, and never updated. They may be a problem for other reasons, but at some point, they don't have any valid roots and they turn into a pumpkin. (Fun times if the manufacturer realizes and tries to update, but doesn't get the update distributed before the certs expire; there was an Amazon Kindle update like that once).


Search for “Needham & Schroeder”

It’s not either/or expiration vs revocation; they are the same thing. Expiration is natural revocation and a ceiling function to the overall cost.

The statement “when a CA’s root certificate expires, it creates a new one” is not a general statement. That’s the exception, rather than the rule, as evidenced by just watching the changes to root stores over the past 30 years. More CAs have left the CA business / folded / been acquired than have carried on. A classic example of this is the AOL root, for which the long-standing scuttlebutt is that no one knows what happened to the key after AOL exited the CA business. The reason it’s scuttlebutt, as opposed to being a Sky is falling DigiNotar, is that the certificate expired. Or, for that matter, look at how many CAs have been distrusted. Expiration fits as a natural bound for legacy software that doesn’t receive updates, failing-secure rather than failing insecurely.


When I search for that, all of my hits are about a key-transport protocol that doesn't seem related to certificates at all.

Expiration and revocation are far from the same thing. If my site's private key gets stolen, I want clients to stop trusting it today, not next year.

Expiring roots means that if a device stops getting updates from its vendor, it will gradually become a brick even if no CAs do anything wrong.


Expiration may be useful but how is expiration in 2038 useful?


It isn’t, but then again, in 1995 we might have said the same for expirations in 2015, and yet so, so many poorly managed CAs were expunged in the past 5 years.

A healthy root store would set expiration at a much more aggressive period; say, every five years. Every three years, the CA applies to have its new root trusted, which gives two years to distribute that root to clients that need it, while having the old root sign the new root to support an immediate transition. Among other things, this encourages a more robust and healthy CA ecosystem, because you don't end up with lopsided balances based on "who has the oldest root." That imbalance encouraged poor behavior that got CAs distrusted in the past, because they behaved in a manner that assumed they were too big, by virtue of being too ubiquitous, to fail.


Someone on Twitter (forgot whom, maybe swiftonsecurity?) suggested lately in a tongue-in-cheek way that the certs should not hard-expire, but instead add an exponentially-increasing slowdown at TLS handshake.

Once the slowdown is too big, someone will notice and have a look.


Unfortunately, given crypto algo evolution and Moore's Law, the reverse is more likely true. Though that would be a neat hack.


I don't understand how this is relevant. We're talking about a deterministic timeout, based on the diff between cert exp date and current date.

If Chrome added e.g. a 20 second slowdown to connect to the page for every user in the world one day after the cert expired, surely there would be some users who would ping the company that the site is unbearably slow (on social media, by email, whatever). Or someone in the company would notice. Or analytics would drop like hell.

There are myriad ways a non-abandoned website would learn about it, directly or indirectly.

Of course that seems like a giant hack, but a grace period of 1-7 days with browsers doing something less scary than a giant error screen would be more than welcome.


My point, such as it was, is that at present the workfactor penalty favours less-effective crypto, the opposite of the suggestion.

Of course a specifically-implemented timeout might be incorporated. That faces the challenge of bad actors (or incompetent / unaware ones) bypassing such a mechanism.

Incorporating the cost into the algorithm itself (say, requiring more rounds based on time since first release, according to a mandatory and well-established protocol, just off the top of my head, with both client and server agreeing on minimum required rounds) might work.


To revoke a certificate you must keep a list of revoked certificates. Without expiration dates, that list would grow without bound. And that list would have to be downloaded periodically by every entity that wants to verify certificates.


They said "certificates should never expire or should expire daily". Roots already can't be revoked, so they should never expire. Intermediates and leaves should expire daily. Since currently, OCSP responses are often valid for that long, there'd be no need for revocation anymore then.


What if your CA is down for a day? Imagine Let's Encrypt being down for 24 hours and all of its certificates going invalid. That would be millions of websites unavailable.


This is no different than an OCSP server going down for a day. Either the site becomes unreachable, or clients take a risk by accepting a certificate that might be revoked.


When OCSP is down, nothing happens with most browsers. Expect-Staple might worsen it a bit, but how many use that?


My point is that connecting with OCSP down carries the exact same risk that accepting an expired certificate does. In both cases, the risk is that the certificate might have been revoked without you knowing it.


If I operate a website, I might have some confidence that my key hasn't been stolen in the last year. But I might have much less confidence that my key hasn't been stolen in the last 20 years.

Certificate expiration means I don't need to worry about that second case.


That’s only true if your key is regenerated each time you request an updated certificate. This is not mandatory, and there are lots of guides on the internet for generating a csr from an existing key.


Sure but I don't think that's generally done or recommended. I think people only do that if they have certificate pinning, which on the web is pretty rare and getting rarer as browsers drop support for HPKP.


I tend to agree. It seems dealing with expiration dates is just another burden without real security benefit. If something goes wrong you have to revoke now, not wait another year until the cert expires.


Andrew Ayer has a write-up about this at https://www.agwa.name/blog/post/fixing_the_addtrust_root_exp...

At the core, this is not a problem with the server, or the CA, but with the clients. However, servers have to deal with broken clients, so it’s easy to point at the server and say it was broken, or to point at the server and say it’s fixed, but that’s not quite the case.

I discussed this some in https://twitter.com/sleevi_/status/1266647545675210753 , as clients need to be prepared to discover and explore alternative certificate paths. Almost every major CA relies on cross-certificates, some even with circular loops (e.g. DigiCert), and clients need to be capable of exploring those certificates and finding what they like. There’s not a single canonical “correct” certificate chain, because of course different clients trust different CAs.

Regardless of your CA, you can still do things to reduce the risk. Using tools like mkbundle in CFSSL (with https://github.com/cloudflare/cfssl_trust ) or https://whatsmychaincert.com/ helps configure a chain that will maximize interoperability, even with dumb and old clients.

Of course, using shorter lived certificates, and automating them, also helps prepare your servers, by removing the toil from configuring changes and making sure you pickup updates (to the certificate path) in a timely fashion.

Tools like Censys can be used to explore the certificate graph and visualize the nodes and edges. You’ll see plenty of sites rely on this, and that means clients need to not be lazy in how they verify certificates. Or, alternatively, that root stores should impose more rules on how CAs sign such cross-certificates, to reduce the risk posed to the ecosystem by these events.


Given you mention OpenSSL is currently terrible at verifying "real" certificates: why doesn't e.g. Google just throw a bit of money at them and fix their bugs when they're clearly so well-known? It seems like such an obvious thing to do for a company whose entire business is built on the web. Is there really too little benefit to justify the cost of the engineer(s) it would take even for big companies? Or are the projects somehow blocking help?


Google has, in the past. Look at the ChangeLog for 1.0.0 - the massive improvements made (around PKITS) were sponsored by Google.

Google has a healthy Patch Rewards program ( https://www.google.com/about/appsecurity/patch-rewards/ ) that rewards patches to a variety of Open Source Projects.

Google also funds a variety of projects through the Core Infrastructure Initiative ( https://www.coreinfrastructure.org/ ), which OpenSSL is part of: https://www.coreinfrastructure.org/announcements/the-linux-f...


Andrew Ayer's tip on getting Debian sorted may have saved me hours.


Great thread by Ryan Sleevi tracking the many (and growing) reports of issues caused by this root expiring: https://twitter.com/sleevi_/status/1266647545675210753

Top offender so far seems to be GnuTLS.


This issue is largely caused by people still stuffing old root certificates into their certificate chains and serving those to their users.

As a general rule of thumb:

1) You don't need to add root certificates to your certificate chain

2) You especially don't need to add expired root certificates to the chain

For additional context and the ability to check using `openssl` what certificates you should modify in your chain, I found this post useful: https://ohdear.app/blog/resolving-the-addtrust-external-ca-r...
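A common idiom for auditing exactly which certificates your server sends (a sketch; the host is a placeholder): dump the presented chain, then print each certificate's subject and issuer and look for the expired AddTrust root.

    # capture every certificate the server presents
    openssl s_client -connect example.com:443 -showcerts </dev/null 2>/dev/null \
        | sed -n '/BEGIN CERT/,/END CERT/p' > chain.pem
    # print subject/issuer pairs for each certificate in the chain
    openssl crl2pkcs7 -nocrl -certfile chain.pem | openssl pkcs7 -print_certs -noout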


You shouldn't need to send the root certificate (unless the clients are _really_ dumb, but I've worked with a lot of dumb clients and did not see any issues with only sending the intermediates and the entity cert). However, a fair number of cert chain verifiers are fairly dumb and won't stop when they get to a root they know, which makes things tricky.

If some of your clients don't have the UserTrust CA, but do have the AddTrust CA, up until today, you probably wanted to include the UserTrust CA cert signed by AddTrust. Clients with the UserTrust CA should see that the intermediate cert is signed by UserTrust and not even read that cross signed cert, but many do see the cross signed cert and then make the trust decision based on the AddTrust CA.

It's hard to identify clients in the TLS handshake to give them a cert chain tailored to their individual needs; there's some extensions for CA certs supported, but they're largely unused.


It depends what clients you need to support. ssllabs test for the server will tell you which ones you're compatible with. You may get some surprises with old Androids and XP. (whether you're interested in being available to them is another question)


Any guess at what percentage is this versus the case where these certs are cross-signed with a newer root but older clients with outdated bundles do not trust the newer root?

(At Cronitor, we saw about a 10% drop in traffic, presumably from those with outdated bundles)


Hard to say, as we don't have any insights into the client-side. But we can say that only ~2% of our clients had expiring root certificates in their chain in the last few weeks, so it's definitely a minority.

Since you don't control the clients in any way, it might be that there are clients that haven't updated their local certificate stores in ages and don't yet trust the new root certificates.


This appears to have caused our Heroku managed apps to go offline for 70+ minutes.

https://status.heroku.com/incidents/2034

Anyone that was already connected was able to continue accessing the sites but new connections failed. This mostly affected web users.

Our main app server continued to crank along thankfully (also on Heroku) and that kept the mobile traffic going which is 90% of our users.

Edit: adding Heroku ticket link


I have never really wanted to go "serverless" until today.

TIL that I can buy a cert that expires in a year that is signed by a root certificate that expires sooner. Still not sure WHY this is the case, but this is definitely the case.


Because the certificate authority paradigm is LITERALLY INSANE.


It’s the PKI paradigm that creates most of the insanity. Authentication is still an unsolved issue with PKI, there’s many ways that you can perform authentication, but all of the different approaches lead to one form of insanity or another. The CA system has its share of insanity, but it is the most successful PKI implementation in history, and by a long way.


PKI authentication is only insane when delegated to a third party. There is a built-in assumption within the CA system that no two parties can ever trust each other and intermediaries are always needed. A world of strangers who never learn anything about or get to know each other. It is either impractical or impossible for the first party to trust the second, using this system, without third party intervention. What reason is there that a website owner should never send a CSR to an end user who creates her own CA cert? Why are third parties the only ones permitted by websites to sign their certificates? Welcome to the world wide web of middlemen.


Well you can’t perform authentication over an insecure channel, and you can’t have a secure channel without authentication. Either you trust an authority, or you authenticate manually yourself. There’s a reason TLS uses the CA system, and not PGP.


You can authenticate outside of the insecure channel. There is a real world outside of the internet.

It is this "manual authentication" that the CA system does not account for. It is not an option. Why is it that, in practice, the only certificates an end user's "CA" can sign are the end user's server certificates?


> You can authenticate outside of the insecure channel. There is a real world outside of the internet.

Exactly, and you can look at how much of a failure PGP has been to see how successful that approach is.

> Why is it that, in practice, the only certificates an end user's "CA" can sign are the end user's server certificates?

CAs can sign any X.509 certificate. They only authenticate domain control or business ownership (via “EV”), though. CA certs also aren’t only used for TLS. You can get a code signing cert from a CA for instance.

You can write a very long list of perfectly valid complaints about the CA system. However it is undeniably the most successful PKI ever implemented, and not just by a little bit.

This isn’t because CAs are bad at what they do. It’s because there is absolutely no elegant solution to that problem. If you want to authenticate identity manually, then I wish you luck finding one or two other people to join you. If you want to securely communicate with people you don’t know personally, or who don’t know how to/can’t be bothered to maintain their own set of private keys, then you’re going to need to establish trust via a 3rd party authority.


"CAs can sign any X.509 certificate."

Please explain how a user who creates a CA pem file with openssl can sign the certificate from example.com. Not a faked up certificate for example.com but the real one the owner of the example.com domain name got from Digicert.


This is the no true Scotsman fallacy. If I told you how to sign a certificate with your own CA, you'd tell me the result was "faked up".


If the owner of the example.com website creates the CSR and sends it to the user, then the result is not "faked up". I use the term "faked up" only to refer to a scenario where the user generates a CSR for a domain name that is not under her control.


Fair enough. This command would do it then:

    # sign the supplied CSR with your own CA; the resulting cert is valid for one year
    openssl x509 -req -days 365 -in example.com.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out example.com.crt


As far as I understand your certificate is still valid but you need to remove the intermediate certificate from your bundle. That was the case for me anyway.


If your traffic comes from a browser you are fine with this, but if it's coming from e.g. cURL you will find that you need to include an intermediate chain.

(The reason for the difference being that browsers stay up to date; many old client systems do not.)

We ended up getting a new cert from a different provider.


You only need to remove the cert if you want to support buggy clients. If none of your clients are buggy, it will be fine to leave the expired cert in the chain.


That’s not the case — all certs are cross-signed with a newer root. The real problem is that certificate issuers have been giving people the old CA chain instead of the new one.


Yep. Got woken up early today for this. We renewed our cert about a month and two days ago. Namecheap, the vendor, sent us the bad AddTrust cert in the bundle. They weren't updating the bundles until two days after we renewed the cert.


Same exact thing happened to me (Namecheap, PositiveSSL, renewed roughly a month ago). I went the reissue route on Namecheap and that fixed it (and I ended up with a certificate chain that is one certificate shorter).


DataDog failed this morning because of the root CA issue.[0] Was a fun Saturday morning with 5000 alarms blowing up my phone.

[0] https://status.datadoghq.com/incidents/6bqpd511nj4h


Yeah, took me a while to figure this out, the alerts were not welcome.

Found it ironic that the top of their page advertises "Security Monitoring now available".


Datadog has shit the bed for us multiple times in the last six months. Unannounced breaking API changes, unaddressed bugs, and now their embedded cert expired.

Our org is currently divided over further commitment to the service, or leaving them entirely. They've made it very hard to argue in their favor.


Their pricing doesn't scale well either, IMO. We have several hundred hosts running, and for some of the smaller instance types it costs just as much to monitor a machine as it does to run it.


Datadog failed, and our WAF provider failed at the same time too (internal services). It was... rather confusing, as it seemed like the sky was falling D:


Thanks for mentioning this, since it caused me to go check metrics and find they weren't coming in... Luckily only a couple of my alarms come from metrics via the agent itself.


Any predictions how much the usage of CURLOPT_SSL_VERIFYPEER, false will increase in the next 7 days?


IMO, there's a bit of a design flaw with curl here. There should be an easy flag to say "trust the particular certificate with this hash, no matter what's wrong with it", but there isn't, so people instead use the one that says "trust whatever certificate you get, no matter what's wrong with it".


Trusting a specific hash would blow up when the service rotates its self-signed certificate, defeating the point of ignoring certificate errors.


If you're rotating a self-signed certificate, then how do you suppose that clients securely trust it? Or if you just mean replacing it when it expires, then this could instead be tied to the underlying public key alone, which can be reused.
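For what it's worth, curl does expose something close to this: `--pinnedpubkey` pins the server's public key (so it survives certificate renewal as long as the key pair is reused), and combined with `-k` it becomes "trust exactly this key, ignore everything else". A sketch; the hash and URL are placeholders:

    # the sha256// value stands in for the real base64-encoded public-key digest
    curl -k --pinnedpubkey 'sha256//placeholder0000000000000000000000000000000=' https://example.com/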


If your clients support "rotating" self-signed certs just like that, it's a huge MITM vulnerability and makes HTTPS as secure as a TSA checkpoint.


Yikes... yeah, if you're going to do this, consider wrapping it in an `if (date < 2020-06-15)` and be sure to fix it properly before then. This reduces the ability to just forget about it (or have the fix constantly deprioritized) and leave your software with a security vulnerability.



Stripe Webhooks are currently failing "for some users" https://twitter.com/stripestatus/status/1266756286734938116 -- some chance that's related.

Edit: for https://www.circuitlab.com/ we saw all Stripe webhooks failing from 4:08am through 12:04pm PDT today with "TLS error". Since 12:04pm (5 minutes ago), some webhooks are succeeding and others are still failing.

Edit 2: since 12:17pm all webhooks are succeeding again. Thanks Stripe!


For backwards compatibility, I updated our intermediate certificates to provide the AAA Certificate Services signing https://censys.io/certificates/68b9c761219a5b1f0131784474665... to replace the expired 2nd intermediate certificate. (Modifying the "GandiStandardSSLCA2.pem" file in my case.)


I was wondering why Lynx started spouting some nonsense:

    $ lynx -dump https://wiki.factorio.com/Version_history
    
    Looking up wiki.factorio.com
    Making HTTPS connection to wiki.factorio.com
    SSL callback:certificate has expired, preverify_ok=0, ssl_okay=0
    Retrying connection without TLS.
    Looking up wiki.factorio.com
    Making HTTPS connection to wiki.factorio.com
    SSL callback:certificate has expired, preverify_ok=0, ssl_okay=0
    Alert!: Unable to make secure connection to remote host.
    
    lynx: Can't access startfile https://wiki.factorio.com/Version_history


Are we going to experience the same bug next year for all LetsEncrypt certificates when the DST Root CA X3 expires? I guess modern devices could deal with LetsEncrypt issuing directly from their own modern ISRG Root X1, but would that leave legacy clients completely stranded (iOS <10, older versions of Windows and Android...?)


You'll get two different but related bugs but yes, assuredly something will break and somebody will be angry about it.

The first thing that'll happen is Let's Encrypt's systems will tell systems by default to present certificate chains which don't mention DST Root CA X3. Lots of systems will, as a result, automatically switch to such a chain when renewing and you'll see a gentle trickle of weird bugs over ~90 days starting this summer unless Let's Encrypt moves the date.

Those bugs will be from clients that somehow in 2020 both didn't trust the ISRG root and couldn't imagine their way to using a different trust path not presented by the server. Somebody more expert in crap certificate verification software can probably tell you exactly which programs will fail and how.

Then there will be months of peace in which seemingly everything is now fine.

Then in September 2021 the other shoe drops. Clients that didn't trust ISRG but had managed to cobble together their own trust path to DST Root CA X3 now notice it has expired on services which present a modern chain or no chain at all.

Those sites which deliberately used the legacy DST Root CA X3 chain to buy a few more months of compatibility likewise see errors, but hopefully they at least knew this was coming and are expecting it.

But there are also sites using crappy ACME clients that didn't obey the spec. They've hard-coded DST Root CA X3 not because they wanted compatibility at all costs and are prepared for it to end in September, but because they just pasted together whatever seemed to work without obeying the ACME spec, and so even though Let's Encrypt's servers have told them not to use that old certificate chain, they aren't listening. Those services now mysteriously break too, even in some relatively modern clients that would trust ISRG, because the service is presenting a chain that insists on DST Root CA X3 and they aren't smart enough to ignore that.

On the upside, lots of Let's Encrypt certs are just to make somebody's web site work, and an ordinary modern web browser has been battle-tested against this crap for years, so it will soldier on.



Site24x7's SSL monitor caught this for us yesterday. And I thought they were wrong, as we purchased this last certificate just a few months back.


Some users on Safari (probably old versions) appear to be getting bad cert warnings for https://www.playsaurus.com. REALLY glad I found this post here, it was driving me nuts.


FWIW I'm getting cert errors on that site on the latest chrome.


I don't. Isn't Chrome using the system's CA store and encryption infrastructure if possible? At least on Windows it's using Windows' built-in certs.


CloudAMQP (managed RabbitMQ) was affected: https://status.cloudamqp.com/

Caused us some connection issues that required a restart of both our clients and the RabbitMQ cluster.


Yes, a bunch of older clusters were affected by this. They included an intermediate of USERTrust that was signed by AddTrust; clients that didn't check for alternate chains would then fail. We pushed the new chain (which now only includes the server cert and the Sectigo RSA cert) and dynamically reloaded the TLS listener in RabbitMQ; that should have solved it for most people. Email support@cloudamqp.com if it didn't for you. We're sorry we didn't push this earlier. We were aware that AddTrust would expire during the lifetime of the server certificate, but we assumed that all TLS clients would find the valid chain regardless; that assumption was obviously wrong.


This just hit me via Debian's 'apt-get update': I'm using jitsi's package repository, which is hosted via HTTPS and seems to rely on the expired root CA. Certificate checks started failing for everybody a few hours ago [1].

That's quite bad, as I tried to do a clean re-install of jitsi-meet, and now I have no installation at all any more.

[1] https://github.com/jitsi/jitsi-meet/issues/6918


A bit of an aside, but

While Android 2.3 Gingerbread does not have the modern roots installed and relies on AddTrust, it also does not support TLS 1.2 or 1.3, and is unsupported and labelled obsolete by the vendor.

If the platform doesn’t support modern algorithms (SHA-2, for example) then you will need to speak to that system vendor about updates.

I find things like that really, really irritating. Crypto is basically maths, and a very pure form at that, so it should be one of the most portable types of software in existence. Computers have been doing maths since before they were machines. Instead, the forced obsolescence bandwagon has made companies take this very pure and portable technology and tie it to their platform's versions, using the "security" argument to bait and coerce users into taking other unwanted changes, and possibly replacing hardware that is otherwise functional (and, as mentioned earlier, is perfectly capable of executing the relevant code), along with all the ecological impact that has. Adding new root certificates, at least for PCs, is rather easy due to their extreme portability, but I wish the same could be said of crypto algorithms/libraries.


You're mad at the wrong people. The security argument is legitimate, so there's no need for your scare quotes. The weaknesses in TLS older than 1.2 are real. You should instead be upset at device vendors for deciding to drop support for devices so quickly. If they'd just keep supplying updates, or even open-source everything so the community could, then this wouldn't be an issue.


You could ship better crypto (and updated CAs) with your app for Android -- then you could get support for whatever you like on all versions. But it might not use hardware acceleration if available, and hardware running Gingerbread really needs crypto acceleration where it's available. TLS 1.3 isn't all that much code if you can use the system X.509 and system ciphers, or maybe pick one or two ciphers to ship if they're not there; I'd guess TLS 1.2 isn't that much code either. The complexity comes from trying to support lots of versions -- and from X.509, which has a lot of stuff to process.

I think Chrome for Android did include TLS 1.2 at least, when it was shipping for Gingerbread.


These days, Android 2.3 Gingerbread devices are essentially obsolete even from a strictly hardware point of view. Most of those were actually very well supported by the old CyanogenMod releases, but few people would even bother trying to bring up something reasonably modern like pmOS by building on that work, the specs are just that bad.


Thankfully our uptime services spotted this earlier in the week. I'm terrible with certs, so I have no idea why a cert we bought this year is even using this root CA. To be honest, things like Let's Encrypt, or cloud services which manage SSL, are a great help.


It uses two root CAs, one old and one new. Your web server must be serving the intermediate certificate signed by the older CA.


Fairly certain this affected Kroger. My sister called me this morning asking to troubleshoot why her laptop was warning of an unsecured connection.

Perhaps a coincidence, but also likely that their cert expired.


We had our CI systems fail today because of this. They were running Ubuntu 16.04. Check the thread below; they say an OpenSSL bug is also a contributing factor. Removing the expired root CA fixed the issue for me. (edit: removed from the clients)

https://www.reddit.com/r/linux/comments/gshh70/sectigo_root_...
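For reference, the widely circulated client-side workaround for Debian/Ubuntu (a sketch: it blacklists the expired root in the system store and rebuilds the bundle):

    # blacklist the expired AddTrust root, then regenerate /etc/ssl/certs
    sudo sed -i 's|^mozilla/AddTrust_External_Root.crt|!mozilla/AddTrust_External_Root.crt|' /etc/ca-certificates.conf
    sudo update-ca-certificates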


Everything is fine with PKI and SSL certificates themselves. It was a bug in OpenSSL 1.0.1 / 1.0.2 in dealing with a twice cross-signed root CA. It is fixed in 1.1.1, but these older versions are still the default on RHEL6/RHEL7/CentOS6/CentOS7 and even Ubuntu 16.04.

I think a large portion of online communications have been affected today.


It's really ironic that only "stable" distros were affected by this, and that distros with software closer to bleeding-edge worked fine through it.


Shouldn't it be fairly simple to monitor expiries that affect a lot of sites using the censys.io dataset?


We had to get an entirely new certificate to resolve this. We had recently migrated our Docker images to be based on Amazon Linux 2, and lo and behold, we found no easy way to upgrade to the required version of OpenSSL on Amazon Linux 2. It was easier to just replace our certificates.


This workaround fixed the problem on our servers: https://forums.aws.amazon.com/thread.jspa?messageID=945042&t...


You didn't need to do that. You could have kept your leaf certificate and just swapped out the expired intermediate certificate.


I've maintained some high-level notes on this event, problems and fixes here: https://gist.github.com/minaguib/c8db186af450bceaaa7c452b76a...


ip-api.com was also affected by this. After our first alert at 10:49 (cert expired at 10:48:38) and a minute of being puzzled as to why our certificate expired, we realized the root we bundled is the issue. We finished updating our primary API servers at 10:55.


The number of times I told the CA that this would be an issue is a lot. And every single time, they replied saying there would be no issue. Damn, I hate CAs like Comodo.


That explains why some Integromat automations failed; they rely on Sectigo, I found when I checked this morning.


How would this affect a code signing certificate issued by Sectigo last month?


Surely the current CA paradigm shouldn't continue to be accepted by the people who keep infrastructure running anymore?

We need to do something.


At least for many web apps, the future is likely automatically created and managed domain-validated certificates. Amazon and Azure provide these free of charge, and then you have Let's Encrypt.

This does not change the CA paradigm, but removes many operational issues.



