
CA Root expired on 30 May 2020 - gionn
https://support.sectigo.com/articles/Knowledge/Sectigo-AddTrust-External-CA-Root-Expiring-May-30-2020
======
jeffbee
Quick reminder from your friendly local SRE: never ever issue certificates
that expire on weekends. Make certs expire in the middle of the afternoon on a
business day wherever your operators live and work. The cert in question
expires at May 30 10:48:38 2020 GMT, which smells suspiciously like a fixed
time after the cert was generated, rather than at a well-chosen point in time.
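
A minimal sketch of that advice in Python, assuming the issuer may shorten a certificate's lifetime but never lengthen it. The function name, the 15:00 default, and the use of the `issued_at` time zone as "where your operators live" are illustrative assumptions, not anything a CA actually runs.

```python
from datetime import datetime, timedelta, timezone


def business_hours_expiry(issued_at, lifetime_days, hour=15):
    """Pick an expiry on a weekday afternoon, no later than the requested lifetime.

    Hypothetical helper: `hour` is interpreted in the time zone of `issued_at`.
    """
    limit = issued_at + timedelta(days=lifetime_days)
    candidate = limit.replace(hour=hour, minute=0, second=0, microsecond=0)
    # Never exceed the requested lifetime: step back a day if needed.
    if candidate > limit:
        candidate -= timedelta(days=1)
    # weekday() returns 5 for Saturday and 6 for Sunday; walk back to Friday.
    while candidate.weekday() >= 5:
        candidate -= timedelta(days=1)
    return candidate


if __name__ == "__main__":
    issued = datetime(2020, 5, 20, 10, 48, 38, tzinfo=timezone.utc)
    print(business_hours_expiry(issued, 10))
```

Public holidays are not handled here; a real policy would need a calendar for that.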

~~~
lmilcin
All my applications use a component that watches the configured certs
(everything in the cert and trust stores) and returns a warning in the
application's telemetry if any certificate is less than a week from
expiration. This is checked periodically while the application runs.

This not only makes sure we don't miss expiration but also ensures we don't
forget to configure any of the applications.

We had a situation where the cert was replaced but the file was placed in
the wrong path and was not actually used by the app. Having the app report on
what is actually in use is the best way to prevent this from ever
happening.
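
A rough Python sketch of such a check, not the actual component. It assumes the application can hand over a mapping from wherever each certificate was loaded to its `notAfter` string, in the format Python's `ssl` module reports. The function name and the one-week default are illustrative.

```python
import ssl
from datetime import datetime, timedelta, timezone


def expiring_soon(loaded_certs, now, threshold=timedelta(days=7)):
    """Return (source, time_remaining) for every cert inside the warning window.

    `loaded_certs` maps a label (e.g. the path the app really loaded) to a
    notAfter string such as "May 30 10:48:38 2020 GMT".
    """
    warnings = []
    for source, not_after in loaded_certs.items():
        # ssl.cert_time_to_seconds parses the notAfter format into epoch seconds.
        expires = datetime.fromtimestamp(
            ssl.cert_time_to_seconds(not_after), tz=timezone.utc
        )
        remaining = expires - now
        if remaining < threshold:  # already-expired certs are reported too
            warnings.append((source, remaining))
    return warnings
```

Calling this on a timer with `datetime.now(timezone.utc)` and pushing the result into telemetry would give the behaviour described above, provided the mapping is built from what the process loaded rather than from what the config file says.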

~~~
kl4m
Good old "cert replaced but apache/nginx failed to reload" has bitten me more
than once...

~~~
Sebb767
Me too! Especially with the short expiration times of LetsEncrypt. But I
really don't want to put `nginx -s reload` in the Cron, in case I'm tinkering
with the configs and they're suddenly live (which only really happens at
staging or at home of course, but still).

~~~
emilburzo
You can use `nginx -t && nginx -s reload` for that.

It will first check the configs/paths, and only then, if successful, signal
nginx to reload.
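
The same check-then-reload sequence can be sketched in Python for use from a renewal hook. The function name and the injectable `run` parameter are assumptions made so the logic can be exercised without nginx installed; only `nginx -t` and `nginx -s reload` come from the command above.

```python
import subprocess


def reload_nginx_if_config_ok(run=subprocess.run):
    """Run `nginx -t`; only if it succeeds, signal nginx to reload.

    Returns True when the reload command was issued and exited with status 0.
    """
    if run(["nginx", "-t"]).returncode != 0:
        return False  # config test failed, leave the running server alone
    return run(["nginx", "-s", "reload"]).returncode == 0
```

This still reloads whatever is on disk, so a config that is syntactically valid but unfinished would go live.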

~~~
Sebb767
That's what I usually do. My problem is that I might be adding a location and
nginx reloads between that and adding access restrictions (e.g. because I took
a break to google).

------
LeonM
This one bit me today and abruptly ended my day at the beach.

The certificate reseller advised my customer that it was okay to include the
cross-signing cert in the chain, because browsers will automatically ignore it
once it expires, and use the Comodo CA root instead.

And that was true for browsers I guess. But my customer also has about 100
machines in the field that use cURL to access their HTTPS API endpoint. cURL
will throw an error if one of the certs in the chain has expired (may be
dependent on the order, don't know).

Anyway, 100 machines went down and I had a stressed out customer on the phone.

~~~
lucaspiller
HTTP clients in programming languages are not as smart as web browsers when it
comes to verifying SSL certificate chains. For example, if the chain presented
by the server is missing intermediate certificates, modern web browsers are
able to fetch those intermediate certificates without issue. Most HTTP clients
do not do that though, and instead will throw a cryptic error, something along
the lines of "unable to get local issuer certificate". This is known as an
'incomplete chain' error.

Earlier this year I added SSL verification to a website uptime monitoring
service I run ([https://www.watchsumo.com/docs/ssl-tls-
monitoring](https://www.watchsumo.com/docs/ssl-tls-monitoring)) and it wasn't
anywhere near as simple as I thought it would be. There are so many edge cases
regarding verification, and languages usually don't expose the full errors in
exceptions, then you have errors like this which only affect a subset of
clients.

~~~
tass
Hi Luca, Just tried this out. I added a URL which has an expired root cert,
but it passed your test.

Let me know if I can help with more info.

------
admax88q
Honestly, certificates should never expire or should expire daily. If
certificate revocation works then it's pointless to have expiring certs. It's
just a mechanism for CAs to seek rent.

If certificate revocation doesn't work then certs need to expire super
frequently to limit potential damage if compromised.

A certificate that expires in 20 years does absolutely nothing for security
compared to a certificate that never expires. Odds are that in 20 years the
crypto will need to be updated anyways, effectively revoking the certificate.

~~~
josephcsible
Exactly. Certificate expiration has never really been about security. It's
purely for practicality, so that CRLs won't grow without bound.

This is especially true now that we have OCSP stapling. From a security
perspective, a short-lived certificate is exactly equivalent to a long-lived
certificate with mandatory OCSP stapling and a short-lived OCSP response, but
the latter is much more complicated.

And in this case, since it's a root, it goes even further than that. Root CAs
can't be revoked anyway, so if one is compromised, a software update to
distrust it is required. There's really not a good reason for them to expire
at all.

~~~
sleevi
It’s not true that expiration is not about security. Dan Geer’s talk in 1998,
noted at
[https://cseweb.ucsd.edu/~goguen/courses/275f00/geer.html](https://cseweb.ucsd.edu/~goguen/courses/275f00/geer.html)
, is just as relevant today in the design of key management systems.

Expiration is not “just” about cryptographic risk either; there are plenty of
operational risks. If you’re putting your server on the Internet, and exposing
a service, you should be worried about key compromise, whether by hacker or by
Heartbleed. Lifetimes are a way of expressing, and managing, that risk,
especially in a world where revocation has a host of failure modes
(operational, legal/political, interoperability) that may not be desirable.

As for Root expiration, it’s definitely more complicated than being black and
white. It’s a question about whether software should fail-secure (fail-closed)
or fail-insecure (fail-open). The decision to trust a CA, by a software
vendor, is in theory backed by a variety of evidence, such as the CA’s
policies and practices, as well as additional evidence such as audits. On
expiration, under today’s model, all of those requirements largely disappear;
the CA is free to do whatever they want with the key. Rejecting expired roots
is, in part, a statement that what is secure now can’t be guaranteed as secure
in 5 years, or 10 years, or 6 months, whatever the vendor decides. They can
choose to let legacy software continue to work, but insecurely, potentially
laying the accidental groundwork for the botnets of tomorrow, or they can
choose to have legacy software stop working then, on the assumption that if
they were receiving software updates, they would have received an update to
keep things working / extend the timer.

Ultimately, this is what software engineering is: balancing these tradeoffs,
both locally and in the broader ecosystem, to try and find the right balance.

~~~
josephcsible
I don't see anything about expiration in that talk.

If you don't have a strong revocation system, then your host is vulnerable
whether or not you have expiration, since attackers aren't going to wait until
the day before your key expires to try to steal it.

In general, when a CA's root certificate expires, it creates a new one and
gives it to browser and OS vendors. What's the difference between the CA
continuing to guard their old private key, and starting to guard the new
private key?

~~~
sleevi
Search for “Needham & Schroeder”

It’s not either/or expiration vs revocation; they are the same thing.
Expiration is natural revocation and a ceiling function to the overall cost.

The statement “when a CA’s root certificate expires, it creates a new one” is
not a general statement. That’s the exception, rather than the rule, as
evidenced by just watching the changes to root stores over the past 30 years.
More CAs have left the CA business / folded / been acquired than have carried
on. A classic example of this is the AOL root, for which the long-standing
scuttlebutt is that no one knows what happened to the key after AOL exited the
CA business. The reason it’s scuttlebutt, as opposed to being a Sky is falling
DigiNotar, is that the certificate expired. Or, for that matter, look at how
many CAs have been distrusted. Expiration fits as a natural bound for legacy
software that doesn’t receive updates, failing-secure rather than failing
insecurely.

~~~
josephcsible
When I search for that, all of my hits are about a key-transport protocol that
doesn't seem related to certificates at all.

Expiration and revocation are far from the same thing. If my site's private
key gets stolen, I want clients to stop trusting it today, not next year.

Expiring roots means that if a device stops getting updates from its vendor,
it will gradually become a brick even if no CAs do anything wrong.

------
sleevi
Andrew Ayer has a write-up about this at
[https://www.agwa.name/blog/post/fixing_the_addtrust_root_exp...](https://www.agwa.name/blog/post/fixing_the_addtrust_root_expiration)

At the core, this is not a problem with the server, or the CA, but with the
clients. However, servers have to deal with broken clients, so it’s easy to
point at the server and say it was broken, or to point at the server and say
it’s fixed, but that’s not quite the case.

I discussed this some in
[https://twitter.com/sleevi_/status/1266647545675210753](https://twitter.com/sleevi_/status/1266647545675210753)
, as clients need to be prepared to discover and explore alternative
certificate paths. Almost every major CA relies on cross-certificates, some
even with circular loops (e.g. DigiCert), and clients need to be capable of
exploring those certificates and finding what they like. There’s not a single
canonical “correct” certificate chain, because of course different clients
trust different CAs.
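
To make "exploring alternative paths" concrete, here is a toy Python sketch of path discovery. It is a simplification under stated assumptions: certificates are matched by name only, with no signature, key-identifier, or constraint checks, and `Cert` and `find_valid_path` are hypothetical names. A real verifier does much more.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    not_after: datetime


def find_valid_path(cert, pool, trusted, now, _seen=frozenset()):
    """Depth-first search for an unexpired path from `cert` to a trust anchor.

    `pool` holds certs sent by the server (or cached); `trusted` holds the
    local root store. Returns the path as a list, or None if there is none.
    """
    if cert.not_after <= now or cert in _seen:
        return None  # expired, or a loop between cross-certificates
    if cert in trusted:
        return [cert]
    seen = _seen | {cert}
    # Several certs can share a subject (cross-certificates), so try each one
    # instead of giving up when the first candidate leads to a dead end.
    for candidate in [*trusted, *pool]:
        if candidate.subject == cert.issuer:
            path = find_valid_path(candidate, pool, trusted, now, seen)
            if path is not None:
                return [cert, *path]
    return None
```

With a pool containing an intermediate plus an expired cross-certificate for its issuer, and a root store containing an unexpired self-signed certificate with that same issuer name, the search skips the expired branch and returns the path through the trusted root. A client that only follows the chain exactly as presented would fail on the same input.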

Regardless of your CA, you can still do things to reduce the risk. Using tools
like mkbundle in CFSSL (with
[https://github.com/cloudflare/cfssl_trust](https://github.com/cloudflare/cfssl_trust)
) or [https://whatsmychaincert.com/](https://whatsmychaincert.com/) help
configure a chain that will maximize interoperability, even with dumb and old
clients.

Of course, using shorter lived certificates, and automating them, also helps
prepare your servers, by removing the toil from configuring changes and making
sure you pick up updates (to the certificate path) in a timely fashion.

Tools like Censys can be used to explore the certificate graph and visualize
the nodes and edges. You’ll see plenty of sites rely on this, and that means
clients need to not be lazy in how they verify certificates. Or,
alternatively, that root stores should impose more rules on how CAs sign such
cross-certificates, to reduce the risk posed to the ecosystem by these events.

~~~
mehrdadn
Given you mention OpenSSL is _currently_ terrible at verifying "real"
certificates: why doesn't e.g. Google just throw a bit of money at them and
fix their bugs when they're clearly so well-known? It seems like such an
obvious thing to do for a company whose entire business is built on the web.
Is there really too little benefit to justify the cost of the engineer(s) it
would take even for big companies? Or are the projects somehow blocking help?

~~~
sleevi
Google has, in the past. Look at the ChangeLog for 1.0.0 - the massive
improvements made (around PKITS) were sponsored by Google.

Google has a healthy Patch Rewards program (
[https://www.google.com/about/appsecurity/patch-
rewards/](https://www.google.com/about/appsecurity/patch-rewards/) ) that
rewards patches to a variety of Open Source Projects.

Google also funds a variety of projects through the Core Infrastructure
Initiative (
[https://www.coreinfrastructure.org/](https://www.coreinfrastructure.org/) ),
which OpenSSL is part of
[https://www.coreinfrastructure.org/announcements/the-
linux-f...](https://www.coreinfrastructure.org/announcements/the-linux-
foundations-core-infrastructure-initiative-announces-new-backers-first-
projects-to-receive-support-and-advisory-board-members/)

------
elithrar
Great thread by Ryan Sleevi tracking the many (and growing) reports of issues
caused by this root expiring:
[https://twitter.com/sleevi_/status/1266647545675210753](https://twitter.com/sleevi_/status/1266647545675210753)

Top offender so far seems to be GnuTLS.

------
Mojah
This issue is largely caused by people still stuffing old root certificates in
their certificate chains, and serving that to their users.

As a general rule of thumb:

1) You don't need to add root certificates to your certificate chain

2) You especially don't need to add expired root certificates to the chain
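
As a sketch of those two rules in Python: the `ChainEntry` type and `review_chain` function are hypothetical, and the chain is assumed to be already parsed into subject, issuer, and expiry (parsing real PEM files needs an X.509 library and is not shown). A self-signed entry is treated as a root.

```python
from datetime import datetime
from typing import NamedTuple


class ChainEntry(NamedTuple):
    subject: str
    issuer: str
    not_after: datetime


def review_chain(chain, now):
    """Drop root certificates from a chain and list anything left that is expired.

    Returns (entries_to_serve, subjects_of_expired_entries).
    """
    # Rule 1 and 2: a self-signed (subject == issuer) cert is a root, so omit it.
    to_serve = [entry for entry in chain if entry.subject != entry.issuer]
    # Anything still in the chain that has expired needs attention as well.
    expired = [entry.subject for entry in to_serve if entry.not_after <= now]
    return to_serve, expired
```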

For additional context and the ability to check using `openssl` what
certificates you should modify in your chain, I found this post useful:
[https://ohdear.app/blog/resolving-the-addtrust-external-
ca-r...](https://ohdear.app/blog/resolving-the-addtrust-external-ca-root-
certificate-expiration)

~~~
encoderer
Any guess at what percentage is this versus the case where these certs are
cross-signed with a newer root but older clients with outdated bundles do not
trust the newer root?

(At Cronitor, we saw about a 10% drop in traffic, presumably from those with
outdated bundles)

~~~
Mojah
Hard to say, as we don't have any insights into the client-side. But we can
say that only ~2% of our clients had expiring root certificates in their chain
in the last few weeks, so it's definitely a minority.

Since you don't control the clients in any way, it might be that there are
clients that haven't updated their local certificate stores in ages and don't
yet trust the new root certificates.

------
MobileVet
This appears to have caused our Heroku managed apps to go offline for 70+
minutes.

[https://status.heroku.com/incidents/2034](https://status.heroku.com/incidents/2034)

Anyone that was already connected was able to continue accessing the sites but
new connections failed. This mostly affected web users.

Our main app server continued to crank along thankfully (also on Heroku) and
that kept the mobile traffic going which is 90% of our users.

Edit: adding Heroku ticket link

------
encoderer
I have never really wanted to go "serverless" until today.

TIL that I can buy a cert that expires in a year that is signed by a root
certificate that expires sooner. Still not sure WHY this is the case, but this
is definitely the case.

~~~
Sphax
As far as I understand your certificate is still valid but you need to remove
the intermediate certificate from your bundle. That was the case for me
anyway.

~~~
encoderer
If your traffic comes from a browser you are fine with this but if you're
coming from e.g. Curl you will find that you need to include an intermediate
chain.

(The reason for the difference being that browsers stay up to date; many old
client systems do not.)

We ended up getting a new cert from a different provider.

------
snapetom
Yep. Got woken up early today for this. We renewed our cert about a month and
two days ago. Namecheap, the vendor, sent us the bad AddTrust cert in the
bundle. They weren't updating the bundles until two days after we renewed the
cert.

~~~
Cymen
Same exact thing happened to me (Namecheap, PositiveSSL, renewed roughly a
month ago). I went the reissue route on Namecheap and that fixed it (and I
ended up with a certificate chain that is one certificate shorter).

------
seibelj
DataDog failed this morning because of root CA issue.[0] Was a fun Saturday
morning with 5000 alarms blowing up my phone.

[0]
[https://status.datadoghq.com/incidents/6bqpd511nj4h](https://status.datadoghq.com/incidents/6bqpd511nj4h)

~~~
aarmenaa
Datadog has shit the bed for us multiple times in the last six months.
Unannounced breaking API changes, unaddressed bugs, and now their embedded
cert expired.

Our org is currently divided over further commitment to the service, or
leaving them entirely. They've made it very hard to argue in their favor.

~~~
SteveNuts
Their pricing doesn't scale well either, IMO. We have several hundred hosts
running and for some of the smaller instance types it costs just as much to
monitor as it does to run the entire machine.

------
luckylion
Any predictions how much the usage of _CURLOPT_SSL_VERIFYPEER, false_ will
increase in the next 7 days?

~~~
josephcsible
IMO, there's a bit of a design flaw with curl here. There should be an easy
flag to say "trust the particular certificate with this hash, no matter what's
wrong with it", but there isn't, so people instead use the one that says
"trust whatever certificate you get, no matter what's wrong with it".

~~~
user5994461
Trusting a specific hash would blow up when the service rotates its self-signed
certificate, defeating the point of ignoring certificate errors.

~~~
josephcsible
If you're rotating a self-signed certificate, then how do you suppose that
clients securely trust it? Or if you just mean replacing it when it expires,
then this could instead be tied to the underlying public key alone, which can
be reused.
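
The pinning idea in this subthread can be sketched in Python as a plain fingerprint comparison. The function names are illustrative, and this version hashes the whole DER-encoded certificate; pinning only the public key would hash the SubjectPublicKeyInfo instead, which needs an X.509 parser and is not shown.

```python
import hashlib
import hmac


def fingerprint(der_bytes):
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_bytes).hexdigest()


def matches_pin(der_bytes, pinned_hex):
    """True if the certificate's fingerprint equals the pinned value.

    Accepts pins written with colons or uppercase hex, e.g. "AB:CD:...".
    """
    normalized = pinned_hex.replace(":", "").lower()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(fingerprint(der_bytes), normalized)
```

In Python's standard library, the peer's DER bytes for such a check can be obtained with `SSLSocket.getpeercert(binary_form=True)`.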

------
compumike
Stripe Webhooks are currently failing "for some users"
[https://twitter.com/stripestatus/status/1266756286734938116](https://twitter.com/stripestatus/status/1266756286734938116)
-- some chance that's related.

Edit: for [https://www.circuitlab.com/](https://www.circuitlab.com/) we saw
all Stripe webhooks failing from 4:08am through 12:04pm PDT today with "TLS
error". Since 12:04pm (5 minutes ago), some webhooks are succeeding and others
are still failing.

Edit 2: since 12:17pm all webhooks are succeeding again. Thanks Stripe!

~~~
compumike
For backwards compatibility, I updated our intermediate certificates to
provide the AAA Certificate Services signing
[https://censys.io/certificates/68b9c761219a5b1f0131784474665...](https://censys.io/certificates/68b9c761219a5b1f0131784474665db61bbdb109e00f05ca9f74244ee5f5f52b)
to replace the expired 2nd intermediate certificate. (Modifying the
"GandiStandardSSLCA2.pem" file in my case.)

------
zouhair
I was wondering why Lynx started spouting some nonsense:

    
    
        $ lynx -dump https://wiki.factorio.com/Version_history
        
        Looking up wiki.factorio.com
        Making HTTPS connection to wiki.factorio.com
        SSL callback:certificate has expired, preverify_ok=0, ssl_okay=0
        Retrying connection without TLS.
        Looking up wiki.factorio.com
        Making HTTPS connection to wiki.factorio.com
        SSL callback:certificate has expired, preverify_ok=0, ssl_okay=0
        Alert!: Unable to make secure connection to remote host.
        
        lynx: Can't access startfile https://wiki.factorio.com/Version_history

------
0x0
Are we going to experience the same bug next year for all LetsEncrypt
certificates when the DST Root CA X3 expires? I guess modern devices could
deal with LetsEncrypt issuing directly from their own modern ISRG Root X1, but
would that leave legacy clients completely stranded (iOS <10, older versions
of Windows and Android...?)

~~~
tialaramex
You'll get two different but related bugs but yes, assuredly something will
break and somebody will be angry about it.

The first thing that'll happen is Let's Encrypt's systems will tell systems by
default to present certificate chains which don't mention DST Root CA X3. Lots
of systems will, as a result, automatically switch to such a chain when
renewing and you'll see a gentle trickle of weird bugs over ~90 days starting
this summer unless Let's Encrypt moves the date.

Those bugs will be from clients that somehow in 2020 both didn't trust the
ISRG root and couldn't imagine their way to using a different trust path not
presented by the server. Somebody more expert in crap certificate verification
software can probably tell you exactly which programs will fail and how.

Then there will be months of peace in which seemingly everything is now fine.

Then in September 2021 the other shoe drops. Clients that didn't trust ISRG
but had managed to cobble together their own trust path to DST Root CA X3 now
notice it has expired on services which present a modern chain or no chain at
all.

Those sites which deliberately used the legacy DST Root CA X3 chain to buy a
few more months of compatibility likewise see errors, but hopefully they at
least knew this was coming and are expecting it.

But there are also sites using crappy ACME clients that didn't obey the spec.
They've hard coded DST Root CA X3 not because they wanted compatibility at all
costs and are prepared for it to end in September, but because they just
pasted together whatever seemed to work without obeying the ACME spec and so
even though Let's Encrypt's servers have told them not to use that old
certificate chain they aren't listening. Those services now mysteriously break
too, even in some relatively modern clients that would trust ISRG, because the
service is presenting a chain that insists on DST Root CA X3 and they aren't
smart enough to ignore that.

On the upside, lots of Let's Encrypt certs are just to make somebody's web
site work, and an ordinary modern web browser has been battle-tested against
this crap for years, so it will soldier on.

------
halukakin
Site24x7's SSL monitor caught this for us yesterday. And I thought they were
wrong as we purchased this last certificate just a few months back.

------
fragsworth
Some users on Safari (probably old versions) appear to be getting bad cert
warnings for [https://www.playsaurus.com](https://www.playsaurus.com). REALLY
glad I found this post here, it was driving me nuts.

~~~
CaveTech
FWIW I'm getting cert errors on that site on the latest chrome.

~~~
foepys
I don't. Isn't Chrome using the system's CA store and encryption infrastructure
if possible? At least on Windows it's using Windows' built-in certs.

------
ric2b
CloudAMQP (managed RabbitMQ) was affected:
[https://status.cloudamqp.com/](https://status.cloudamqp.com/)

Caused us some connection issues that required a restart of both our clients
and the RabbitMQ cluster.

~~~
CarlHoerberg
Yes, a bunch of older clusters were affected by this. They included a
USERTrust intermediate that was signed by AddTrust; clients that didn't
check for alternate chains would then fail. We pushed the new chain (which now
only includes the server cert and the Sectigo RSA cert) and dynamically
reloaded the TLS listener in RabbitMQ. That should have solved it for most
people; email support@cloudamqp.com if it didn't for you. We're sorry we didn't
push this earlier. We were aware that the AddTrust root would expire during the
lifetime of the server certificate, but we assumed that all TLS clients would
find the valid chain regardless. That assumption was obviously wrong.

------
dvdkhlng
This just hit me via Debian's 'apt-get update': I'm using jitsi's package
repository which is hosted via HTTPS and seems to rely on the expired root-CA.
Certificate checks started failing for everybody a few hours ago [1].

That's quite bad, as I tried to do a clean re-install of jitsi-meet, and now I
have no installation at all any more.

[1] [https://github.com/jitsi/jitsi-
meet/issues/6918](https://github.com/jitsi/jitsi-meet/issues/6918)

------
userbinator
A bit of an aside, but

 _While Android 2.3 Gingerbread does not have the modern roots installed and
relies on AddTrust, it also does not support TLS 1.2 or 1.3, and is
unsupported and labelled obsolete by the vendor._

 _If the platform doesn’t support modern algorithms (SHA-2, for example) then
you will need to speak to that system vendor about updates._

I find things like that really really irritating. Crypto is basically maths,
and a very pure form at that, so should be one of the most portable types of
software in existence. Computers have been doing maths since before they were
machines. Instead, the forced obsolescence bandwagon has made companies take
this very pure and portable technology and tied it to their platform's
versions, using the "security" argument to bait and coerce users into taking
other unwanted changes, and possibly replacing hardware that is otherwise
functional (and, as mentioned earlier, is perfectly capable of executing the
relevant code) along with all the ecological impact that has. Adding new root
certificates at least for PCs is rather easy due to their extreme portability,
but I wish the same could be said of crypto algorithms/libraries.

~~~
josephcsible
You're mad at the wrong people. The security argument is legitimate, so
there's no need for your scare quotes. The weaknesses in TLS older than 1.2
are real. You should instead be upset at device vendors for deciding to drop
support for devices so quickly. If they'd just keep supplying updates, or even
open-source everything so the community could, then this wouldn't be an issue.

------
niffydroid
Thankfully our uptime services spotted this earlier in the week. I'm terrible
with certs, so no idea why a cert we bought this year is even using this root
CA. To be honest, things like Let's Encrypt or cloud services which manage SSL
are a great help.

~~~
abiogenesis
It uses two root CAs, one old and one new. Your web server must be serving the
intermediate certificate signed by the older CA.

------
bransonf
Fairly certain this affected Kroger. My sister called me this morning asking
to troubleshoot why her laptop was warning of an unsecured connection.

Perhaps a coincidence, but also likely that their cert expired.

------
taypo
We had our CI systems fail today because of this. They were running Ubuntu
16.04. Check the below thread, they say an openssl bug is also a contributing
factor. Removing the expired root CA fixed the issue for me. (edit: removed
from the clients)

[https://www.reddit.com/r/linux/comments/gshh70/sectigo_root_...](https://www.reddit.com/r/linux/comments/gshh70/sectigo_root_ca_expiring_may_not_be_handled_well/)

------
pixmin
Everything is fine with PKI and SSL certificates. It was a bug in OpenSSL
1.0.1 / 1.0.2 in dealing with a twice cross-signed root CA. It is fixed in
1.1.1, but these older versions are still default on
RHEL6/RHEL7/Centos6/Centos7 and even Ubuntu16.04.

I think a large portion of online communications have been affected today.

~~~
josephcsible
It's really ironic that only "stable" distros were affected by this, and that
distros with software closer to bleeding-edge worked fine through it.

------
badrabbit
Shouldn't it be fairly simple to monitor expiries that affect a lot of sites
using the censys.io dataset?

------
aarbor989
We had to get an entirely new certificate to resolve this. We had recently
migrated our Docker images to be based on Amazon Linux 2, and lo and behold,
there was no easy way we found to upgrade to the required version of OpenSSL
on Amazon Linux 2. It was easier to just upgrade our certificates.

~~~
spleen
This workaround fixed the problem on our servers:
[https://forums.aws.amazon.com/thread.jspa?messageID=945042&t...](https://forums.aws.amazon.com/thread.jspa?messageID=945042&tstart=0)

------
minaguib
I've maintained some high-level notes on this event, problems and fixes here:
[https://gist.github.com/minaguib/c8db186af450bceaaa7c452b76a...](https://gist.github.com/minaguib/c8db186af450bceaaa7c452b76a9901b)

------
vld
ip-api.com was also affected by this. After our first alert at 10:49 (cert
expired at 10:48:38) and a minute of being puzzled as to why our certificate
expired, we realized the root we bundled is the issue. We finished updating
our primary API servers at 10:55.

------
PixelPaul
The number of times I told the CA that this would be an issue is a lot. And the
number of times they replied saying there would be no issue is every single
time. Damn, I hate CAs like Comodo.

------
m-p-3
That explains why some Integromat automations failed; they rely on Sectigo,
as I saw when I checked this morning.

------
antaviana
How would this affect a code signing certificate issued by Sectigo last month?

------
ta17711771
Surely the current CA paradigm shouldn't continue to be accepted by the people
who keep infrastructure running anymore?

We need to do something.

~~~
jpalomaki
At least for many web apps the future is likely automatically created and
managed domain-validated certificates. Amazon and Azure provide these free of
charge and then you have Let’s Encrypt.

This does not change the CA paradigm, but removes many operational issues.

