
Hitless TLS Certificate Rotation in Go - diogomonicapt
https://diogomonica.com/2017/01/11/hitless-tls-certificate-rotation-in-go/
======
colmmacc
This is a good write up, and it's awesome to see on-line rotation of
certificates.

But (there was always a but coming) ... the word "rotation" is over-used here
and very dangerous, because it doesn't emphasize what's important. To many it
means "deploying a new credential". That's not that important at all; at best
it's a means to an end, at worst it's make-work. What's important is that
credentials are revoked. It's exactly like the important part of backup
systems being that we can restore (and we should really call them "restore"
systems).

When a credential becomes compromised, what you want to do is revoke it and
make sure it stays revoked, otherwise the attacker's goal is complete. So
think of it as a "Revocation" system, and call it that.

Viewed in that context, it becomes more apparent that the write-up doesn't
mention, or test or check, that the credential actually is revoked and doesn't
work any more. But that's the most critical step. Even if you're relying only
on expiration times (which seems unsafe!) it's important to check for broken
checks (like fail-open configurations that let everything in), broken clocks,
etc ...
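
A minimal sketch of the kind of check described above, covering only the expiration case (not CRL or OCSP revocation): it creates one throwaway certificate that has already expired and one that is still valid, then confirms the verifier rejects the first for the right reason and accepts the second. The helper names `selfSigned`, `rejectedAsExpired` and `expiryCheckResults` are hypothetical and not from the write-up.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// selfSigned creates a throwaway self-signed certificate that is valid
// only between notBefore and notAfter. (Hypothetical test helper.)
func selfSigned(notBefore, notAfter time.Time) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "rotation-test"},
		NotBefore:             notBefore,
		NotAfter:              notAfter,
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// rejectedAsExpired reports whether verifying cert at time now fails
// specifically because the certificate is outside its validity window.
func rejectedAsExpired(cert *x509.Certificate, now time.Time) bool {
	roots := x509.NewCertPool()
	roots.AddCert(cert) // trust the cert itself so only expiry can fail
	_, err := cert.Verify(x509.VerifyOptions{Roots: roots, CurrentTime: now})
	var invalid x509.CertificateInvalidError
	return errors.As(err, &invalid) && invalid.Reason == x509.Expired
}

// expiryCheckResults runs the check against an expired and a current cert.
func expiryCheckResults() (expiredRejected, currentRejected bool, err error) {
	now := time.Now()
	expired, err := selfSigned(now.Add(-48*time.Hour), now.Add(-24*time.Hour))
	if err != nil {
		return false, false, err
	}
	current, err := selfSigned(now.Add(-time.Hour), now.Add(24*time.Hour))
	if err != nil {
		return false, false, err
	}
	return rejectedAsExpired(expired, now), rejectedAsExpired(current, now), nil
}

func main() {
	expiredRejected, currentRejected, err := expiryCheckResults()
	if err != nil {
		panic(err)
	}
	fmt.Println("expired cert rejected:", expiredRejected)
	fmt.Println("current cert rejected:", currentRejected)
}
```

Testing both directions matters: a fail-open verifier would accept the expired certificate, while a broken clock could reject the current one.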

~~~
cestith
I think the more common case than a revocation is replacing an expiring
certificate. I don't have hard data. It sure seems to me that short-lived
certificates tend to rotate out far more often than they need to be revoked.

~~~
viraptor
> It sure seems to me that short-lived certificates tend to rotate out far
> more often than they need to be revoked.

For the large majority of companies, would they even spot that their keys have
been stolen? That's a few steps before revocation itself.

------
zalmoxes
Nice writeup!

I wish there was a CA out there that could let you request new certs more
frequently.

Yes there's Let's Encrypt, which is amazing and works great but the
ratelimits[1] really kill you if you're not careful. I've had a few issues
where I've triggered the LE ratelimit with a production domain and got locked
out of making new certs for a whole week. I would gladly pay for an ACME CA
which does not enforce these ratelimits.

[1] [https://letsencrypt.org/docs/rate-limits/](https://letsencrypt.org/docs/rate-limits/)

~~~
stanleydrew
I'm actively considering what it would take to set up a for-profit ACME CA,
and pricing based on rate limits might be the key business model insight I
needed. Thanks!

~~~
jvehent
LE needs $2MM/year to run and was bootstrapped under an existing CA, so there's
your starting point ;)

It might be easier to resell someone else's certificates.

------
amenghra
[https://github.com/square/ghostunnel/](https://github.com/square/ghostunnel/)
is written in Go and does hitless cert reloading too.

------
kyrra
So I'm not 100% certain on this, but this flow seems like it would be a
good candidate for atomic.Value[0]? Then the mutexes could be removed entirely.
That way you don't need to get a lock on every config read.

[0]
[https://golang.org/pkg/sync/atomic/#example_Value_config](https://golang.org/pkg/sync/atomic/#example_Value_config)
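
A rough sketch of what that could look like, assuming the certificate is served through the `GetCertificate` hook on `tls.Config`. The `certStore` type and the `dummyCert` helper are hypothetical names used for illustration, not code from the article; a real reloader would call `Set` with the result of `tls.LoadX509KeyPair`.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"sync/atomic"
)

// certStore holds the certificate currently being served.
// The atomic.Value must only ever store *tls.Certificate.
type certStore struct {
	v atomic.Value
}

// Set swaps in a new certificate without blocking readers.
func (s *certStore) Set(c *tls.Certificate) { s.v.Store(c) }

// GetCertificate matches the signature of tls.Config.GetCertificate,
// so each new handshake reads the latest certificate without a lock.
func (s *certStore) GetCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
	c, ok := s.v.Load().(*tls.Certificate)
	if !ok || c == nil {
		return nil, errors.New("no certificate loaded")
	}
	return c, nil
}

// dummyCert builds a placeholder value for the demo; it is not a usable
// certificate and only carries a label so the swap is visible.
func dummyCert(label string) *tls.Certificate {
	return &tls.Certificate{Certificate: [][]byte{[]byte(label)}}
}

func main() {
	store := &certStore{}
	store.Set(dummyCert("old"))

	cfg := &tls.Config{GetCertificate: store.GetCertificate}

	before, _ := cfg.GetCertificate(nil)
	store.Set(dummyCert("new")) // rotation: a single atomic pointer swap
	after, _ := cfg.GetCertificate(nil)

	fmt.Println(string(before.Certificate[0]), string(after.Certificate[0]))
}
```

Connections that already completed a handshake keep the certificate they negotiated with; only new handshakes see the swapped value.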

~~~
diogomonicapt
Author here, I actually added a footnote exactly because of that fact:
[https://diogomonica.com/2017/01/11/hitless-tls-certificate-r...](https://diogomonica.com/2017/01/11/hitless-tls-certificate-rotation-in-go/#fn:3)

