
If you’re not using SSH certificates you’re doing SSH wrong - mmalone
https://smallstep.com/blog/use-ssh-certificates
======
Fradow
The author keeps using the word "easy". I don't think we have the same
definition of that word.

A cursory read of the article is enough to gather a main point, which should
be a disclaimer: it's aimed at medium/big companies that have a significant
security threat model and a dedicated sysadmin team.

If you have neither, this is adding operational complexity for no real gain.
When your threat model is small, potential attackers are not sophisticated
enough to target SSH access. And configuring this and properly using it is a
big waste of time. You might even lock yourself out of your servers because of
a mistake or lost keys (that's my biggest fear which prevents me from ever
switching to certificates unless I get a dedicated sysadmin).

Yeah there are benefits, but they are just too abstract to bother for small
companies.

~~~
mmalone
Am author.

> medium/big companies

My goal (or our goal at smallstep) is to build tools to make this easy enough
that it makes sense for small teams. If you have one client and one server,
pubkey authn will probably always be easier. And as long as it's the default,
certificate authentication will always require some amount of configuration.
But I think good tooling with baked in defaults to encourage best practices
can make this easy and quick enough to deploy to be a good idea for most
non-trivial SSH deployments.

However, much of this is aspirational at the moment. So your point remains.
The point of the post was to raise awareness and start a discussion about
what's needed to make this feasible for pragmatic people.

> dedicated sysadmin team

I don't think a dedicated sysadmin team should be necessary. On the contrary,
with the right tooling, and unless you're just completely punting on any
rational management of SSH keys, I think certificates are easier to
administer.

> potential attackers are not sophisticated enough to target SSH access

The more common attack vector is a lost/stolen/compromised laptop or a private
key that's accidentally committed to a repository. This is a _very real_
vector. It happens in real life on a regular basis. Since certificates expire
a lost/stolen machine or committed private key will become worthless for
accessing internal infra very quickly. So it reduces the attack window, which
is a real and significant improvement. For a compromised machine, SSH
certificates aren't a cure-all, but they're at least as good as pubkey. They
make it easier to keep private keys off disk, providing better cover. And they
make it harder for attackers to exfil private keys, cover their tracks on the
compromised laptop, and do bad things with them later.
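
To put rough numbers on the attack-window point, here is a small Python sketch. The function name and the 16-hour lifetime are made up for illustration; nothing in it is part of OpenSSH or `step`.

```python
from datetime import datetime, timedelta
from typing import Optional


def remaining_exposure(leaked_at: datetime,
                       valid_before: Optional[datetime]) -> Optional[timedelta]:
    """How long a leaked credential stays usable after the leak.

    valid_before is the certificate's expiry time. None models a plain key
    pair, which never expires on its own.
    """
    if valid_before is None:
        return None  # usable until someone notices and removes the key
    return max(valid_before - leaked_at, timedelta(0))


issued = datetime(2019, 9, 11, 9, 0)
cert_expiry = issued + timedelta(hours=16)     # assumed lifetime, pick your own
laptop_stolen = datetime(2019, 9, 11, 18, 30)

print(remaining_exposure(laptop_stolen, cert_expiry))  # 6:30:00
print(remaining_exposure(laptop_stolen, None))         # None, i.e. unbounded
```

With a plain key the window only closes when someone removes the key from every host; with a cert it closes by itself.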

> You might even lock yourself out of your servers

You could do this with pubkeys too. If this happens, cloud VMs typically have
a virtual serial interface you can use to remediate. You can leave pubkey
authn enabled and still use certificates day-to-day, so you could keep a
backup pubkey available for a few select admin users that you trust.
Alternatively, you could issue a long-lived certificate in a secure place for
use in an emergency. If this is the thing that's preventing you from trying it
out I hope these ideas get you over the hump! If they don't, I'd love to hear
why!

> they are just too abstract to bother for small companies

This is a very common problem with security stuff :(. That's why I think the
headline feature here is "SSO for SSH". It's an operational improvement and
better UX. The security benefits come for free.

------
exabrial
Distributing user keys makes audits near impossible. Not a fan of
~/.ssh/authorized_keys.

We decided to keep user ssh keys in ldap and pin the key of the ldap cluster.
If LDAP were to go down, we have local-only accounts that let us access the
box via the console.

~~~
wglb
Can these keys be revoked?

~~~
exabrial
Absolutely. Since the keys are checked in LDAP at login, it happens
instantaneously.

------
aloknnikhil
It's also worth mentioning that compromised certificates can be explicitly
revoked.

[https://serverfault.com/questions/264515/how-to-revoke-an-ss...](https://serverfault.com/questions/264515/how-to-revoke-an-ssh-certificate-not-ssh-identity-file)

------
gnufx
This seems to be aimed at a single security domain like a company. That's not
what you have in many cases, and supplying a public key to a remote site for
login, revision control, etc. is actually tractable. There is actually
longstanding experience with a "global" PKI in the research community with
Globus, specifically as a hacked-up GSSAPI mechanism with ssh. It was widely
detested as I saw a few years ago. (They could have used Kerberos without
inventing and messily implementing all that.)

------
adontz
Actually, there is a third option, AuthorizedKeysCommand, which I like the most
for small to medium businesses.

A program, script or any executable, which outputs (prints to stdout)
authorized_keys style formatted data. This data can be pulled from any source.

Thus, if public keys are stored in LDAP and the script pulls keys for active
users only, a public key can be easily revoked by disabling the corresponding
user account.
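
A minimal sketch of such a script in Python, assuming a hypothetical in-memory `DIRECTORY` in place of a real LDAP query (the users and key strings are placeholders, not real keys):

```python
#!/usr/bin/env python3
"""Print authorized_keys lines for one user, for sshd's AuthorizedKeysCommand."""
import sys

# Hypothetical stand-in for an LDAP lookup:
# username -> (account active?, list of public keys).
DIRECTORY = {
    "alice": (True, ["ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIexamplekeyalice alice@laptop"]),
    "bob": (False, ["ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIexamplekeybob bob@laptop"]),
}


def keys_for(username, directory=DIRECTORY):
    """Return the user's keys, or nothing if the account is unknown or disabled."""
    active, keys = directory.get(username, (False, []))
    return list(keys) if active else []


if __name__ == "__main__":
    # sshd passes the login name as the first argument when the command is
    # configured without explicit arguments (or with %u).
    user = sys.argv[1] if len(sys.argv) > 1 else ""
    for line in keys_for(user):
        print(line)
```

In `sshd_config` this would be wired up with something like `AuthorizedKeysCommand /usr/local/bin/ldap-keys %u` plus an `AuthorizedKeysCommandUser` line naming an unprivileged account; the script path is an assumption.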

~~~
mmalone
Hey that's interesting!

I still think it's harder to do this than it is to use certs, since you'll
need to build some tooling to securely distribute keys. But that would vary by
environment, for sure, and it's conceptually simpler for folks who don't know
how certificates work I guess.

Do you know when AuthorizedKeysCommand was added to sshd? Is it available in
most distros?

It's too bad there isn't a RevokedKeysCommand! That would make certificate
revocation a lot more flexible. I wonder how hard it would be to add.

------
yabadabadoes
Interesting reading, but it kind of glosses over the important problems for
me. For example:

Do all OS distributions have new enough openssh, is it ever built with missing
support by default?

what extensions are necessary in the certs to be trusted for ssh or the CA
cert (is your average test CA cert going to be valid for ssh without
tweaking)?

Do any (or all?) cloud server products regularly accept cert based
configuration just as they accept public key lists?

How will this interact with other features like using a pkcs mechanism for a
token key?

If revoking does work, that means the system will deny cert based auth when it
has internet connectivity problems? If it won't, isn't building more general
pools of certs more dangerous than restricting systems to specific lists of
public keys?

Also questions like what does an eavesdropper see in the clear and how does
it differ from what they see with raw public key?

(I find it insane that ssh ultimately ends up with a weaker trust model than a
pretty average website in a typical install due to less certainty and clarity
about what a normal certificate based configuration should look like.)

~~~
mmalone
Am author.

Yea, fair... there's probably enough to cover for a whole follow-up on nuts &
bolts and these sorts of considerations. This post was more about raising
awareness and it was already super hard to keep it to 3000 words :P.

I think I can at least partially address all of these points though.

> Do all OS distributions have new enough openssh, is it ever built with
> missing support by default?

I haven't done an analysis but certificate authentication was added in OpenSSH
5.4 in March, 2010. So it's almost _ten years_ old. So yea, I think all
reasonably current distros have support at this point.

> what extensions are necessary in the certs to be trusted for ssh or the CA
> cert (is your average test CA cert going to be valid for ssh without
> tweaking)?

To clarify, SSH doesn't use X.509 like TLS/HTTPS. OpenSSH invented its own
certificate format.

The docs for `ssh-keygen` have pretty good info on this.

* [https://man.openbsd.org/ssh-keygen.1#CERTIFICATES](https://man.openbsd.org/ssh-keygen.1#CERTIFICATES) has basic cert info
* [https://man.openbsd.org/ssh-keygen.1#O](https://man.openbsd.org/ssh-keygen.1#O) has info on some cert options / extensions

If you have experience administering *nix boxes & SSH the options / extensions
should look pretty familiar. Basically, it's a lot of the same stuff you'd
ordinarily put in config files on each host, but instead you're baking it into
certificates.
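
As a rough illustration of "baking it into certificates", this Python sketch assembles an `ssh-keygen` signing command without running it. The `-s`, `-I`, `-n`, `-V` and `-O` flags are the ones documented in the man page above; the helper name, file names, principal and 16-hour validity are assumptions.

```python
import shlex


def user_cert_command(ca_key, pubkey, key_id, principals,
                      validity="+16h", options=()):
    """Build an `ssh-keygen` argv that signs `pubkey` as a user certificate."""
    argv = [
        "ssh-keygen",
        "-s", ca_key,                # CA private key used for signing
        "-I", key_id,                # key identity, shows up in sshd logs
        "-n", ",".join(principals),  # names the cert is valid for
        "-V", validity,              # validity interval, relative to now
    ]
    for opt in options:
        argv += ["-O", opt]          # restrictions baked into the cert
    argv.append(pubkey)
    return argv


cmd = user_cert_command(
    "user_ca", "id_ed25519.pub", "mike@example.com", ["mike"],
    options=["no-port-forwarding", "source-address=10.0.0.0/8"],
)
print(shlex.join(cmd))
```

Adding `-h` to the same command would produce a host certificate instead of a user certificate.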

> Do any (or all?) cloud server products regularly accept cert based
> configuration just as they accept public key lists?

As far as I know, none of the cloud providers work with SSH certificates out
of the box. It's kind of weird, actually, because it seems like super
low-hanging fruit and would be a better user experience. Instead, they use public
key authentication and deliver pubkeys to an authorized keys file for you (via
their metadata APIs).

That said, it's super easy to set this up yourself on a cloud VM. You could
easily bake the necessary bootstrapping into an AMI. Alternatively, you could
use a startup script. One of the things we're working on is streamlining this
experience. For now, here's a gist demonstrating the concept (it assumes you
already have `step-ca` running):

[https://gist.github.com/mmalone/a5980799c9c6ec9d6530372d5b60...](https://gist.github.com/mmalone/a5980799c9c6ec9d6530372d5b609e7a)

Most of the work there applies to other tools but obviously you'd use a
different client instead of `step` to get the certs.
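
Independent of which client fetches the certs, the host-side configuration boils down to a few `sshd_config` lines. The sketch below is not the gist's contents; it just renders those lines in Python, and the file paths are assumptions.

```python
def sshd_cert_config(user_ca_pub, host_key, host_cert):
    """Render the sshd_config lines that switch a host to certificate auth."""
    return "\n".join([
        f"TrustedUserCAKeys {user_ca_pub}",  # accept user certs signed by this CA
        f"HostKey {host_key}",               # private host key the cert was issued for
        f"HostCertificate {host_cert}",      # presented to clients instead of a bare key
    ]) + "\n"


print(sshd_cert_config(
    "/etc/ssh/user_ca.pub",
    "/etc/ssh/ssh_host_ed25519_key",
    "/etc/ssh/ssh_host_ed25519_key-cert.pub",
), end="")
```

On the client side, a single `@cert-authority` line in `known_hosts` carrying the host CA's public key replaces the per-host entries.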

> How will this interact with other features like using a pkcs mechanism for a
> token key?

Oh hrm. This is a good question but I'm not sure, mostly because I haven't put
SSH keys in a PKCS module myself. I know it's possible. You can implement the
`ssh-agent` API pretty easily and back it by whatever you want. This is on our
roadmap but we haven't done much investigation. It's possible that it _just
works_ already with standard `ssh` and `ssh-agent`s.

For MFA, you can still use PAM authentication with something like Duo if you
want. Certificate authentication basically works exactly like public key
authentication, but it manages public key distribution differently (using
certificates).

That said, personally, I think the right place for MFA is during certificate
issuance rather than on each SSH connection. That way you're not constantly
asking people to MFA. This decision is somewhat subjective and depends on your
risk profile / threat model.

> revoking

Generally the best practice for certificates is to issue them for a short
enough duration that you're comfortable not doing revocation. I think a work
day is pretty safe and reasonable, but some people may not be ok with that and
choose something shorter. Of course you'd have to reauthn more frequently then
to get a new certificate. A thought to ponder: for any reasonably sized
infrastructure public key removal won't be instantaneous either.

That said, SSH does have cert revocation capabilities. You create a "revoke
file" (a key revocation list) and configure it using a `RevokedKeys <filename>` directive in your
`sshd_config`. So it's managed on each host, which means there are no
connectivity issues, but also means you have more of a management problem (on
par with managing authorized_keys :/). For that reason, if you really need
revocation, my recommendation is to do revocation checks in a bastion host.
Then you only need to maintain the revocation list in one place.
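
For the revocation list itself, `ssh-keygen -k` can build a KRL from a small text spec that names certificates by serial number or key ID. A sketch of generating that spec in Python; the helper name and the sample serial and key ID are made up.

```python
def krl_spec(serials=(), key_ids=()):
    """Render a KRL specification file for `ssh-keygen -k`."""
    lines = [f"serial: {s}" for s in serials]  # revoke by certificate serial
    lines += [f"id: {k}" for k in key_ids]     # revoke by key identity
    return "\n".join(lines) + "\n"


spec = krl_spec(serials=[1042], key_ids=["mike@example.com"])
print(spec, end="")
```

Saved as `spec.txt`, this could be compiled with something like `ssh-keygen -k -f /etc/ssh/revoked.krl -s user_ca.pub spec.txt` (paths assumed), and the resulting file is what the revocation directive in `sshd_config` (`RevokedKeys` in current OpenSSH) points at.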

> Also questions like what does an eavesdropper see in the clear and how does
> it differ from what they see with raw public key?

Great question. I should know the answer to this, but I don't. I just spent
like 20 minutes perusing the specs at:

* [https://www.openssh.com/specs.html](https://www.openssh.com/specs.html)

and it's not obvious just from skimming. Also couldn't find a definitive
answer from the google machine. I'll need to read the specs more closely.

That said, it _looks like_ host pubkeys/certificates are exchanged during
Diffie-Hellman key exchange, before a symmetric key has been negotiated. The
implication is that this information is sent in cleartext. TLS 1.2 does the
same thing, but 1.3 keeps certificates confidential. So there's _possibly_ an
infrastructure enumeration vector here -- a passive attacker could see
whatever info is in your host certs (hostnames, public keys, key IDs). User
authentication seems to happen later, over the encrypted transport, so user
certs (usernames, permissions) shouldn't be visible on the wire, same as with
normal public key authentication. Again, I'd have to look at the spec more
closely. I think for most people this is not a big risk and the profile looks
pretty similar to pubkey authn, but I'm saying that without a complete
understanding so caveat emptor.

> I find it insane that ssh ultimately ends up with a weaker trust model than
> a pretty average website in a typical install due to less certainty and
> clarity about what a normal certificate based configuration should look
> like.

Lol yes. Me too.

To be fair though the trust on first use model with raw pubkeys made SSH way
easier to deploy, which was one of the things that allowed SSH to displace
telnet and, from a security perspective, it was obviously a _huge_ improvement
over telnet. So I don't fault the SSH folks for this. But it's 2019 now so we
can/should do better!

~~~
yabadabadoes
Thank you very much for your answers and thoughts about my many questions! I
hope you will go on to write more about the subject. Most of my questions
relate in some way to my last time setting up a new network about 2 years ago.
At the time I found that each question relating to certs seemed to lead me to
two more questions or some ambiguity and I kind of forgot how all the problems
ended up somewhat interlinked and related to this non-x509 cert format and/or
selecting x509 or gpg models of token use.

It looks like there is some support in both keygen and ssh-agent at least for
combining a cert file with a pkcs11 key and maybe I can work out how to get
custom format in and out of pkcs15 storage. The situation with gpg card applet
and gpg-agent ssh support looks like it may be similar..

My general view is that a smartcard with a non-extractable key, corresponding
x509 cert, and pin manages the minimal something you have, know, and identity
in the only way that everyone at least intended to support at some point. I.e.
ssh actually works well with it using the raw public key, as client certs
pkcs11 should work with the browsers, some of the vpn implementations have
pkcs11 support.. I think even kerberos should be able to bootstrap identity
from it with pk-init. All the other applets combined into commercial keys for
MFA are all very interesting but tend to each be piecemeal in protocols they
could be used for.

------
cbanek
Not that I disagree that putting public keys everywhere is somewhat annoying,
I feel like the same arguments could be made about Kerberos and ssh. And I
think Kerberos handles the lifetime a lot better than short lived
certificates.

~~~
mmalone
Am author.

Yea Kerberos is another option. If you have all the necessary pieces, that
is... you’d need managed devices, an LDAP/AD setup, DNSSEC & SSHFP, PAM &
various agents on servers.

Certificates offer all the same benefits, and a few more, with less work &
fewer moving pieces. They’re also more flexible since you can control the auth
flow to _get_ a cert.

~~~
AndyMcConachie
What do SSH X.509 certs provide that DNSSEC and SSHFP RRs don't? SSH X.509
certs remind me of MTA-STS for SMTP, a shoehorning of the broken web PKI into
another protocol that doesn't need it.

Also, I'm not convinced that TOFU for SSH is really all that terrible. Maybe
it's just my use case and my situation, but I SSH to the same servers over and
over again. And if I need to give access to someone new I give them the public
key first.

SSH X.509 looks like a complex solution to a not very big problem that can be
more easily solved with DNSSEC and SSHFP.

~~~
tptacek
What do SSH x.509 certs do that DNSSEC doesn't? Sure, I'll bite:

* They put authentication to your SSH servers entirely under your own control, without escrowing keys to the operators of the TLDs.

* They're deployable independently, without enrolling all of the rest of your infrastructure into DNSSEC and all its unreliability and complexity.

* They integrate with IDPs, so that users can get short-duration authenticators based on MFA logins, without storing long-lived secrets on their machines.

SSH X.509 certificates have nothing to do with the web PKI, other than that
they use the same encoding. Engineering orgs that use SSH CAs run their own
CAs. That's the point.

Happy to have helped!

~~~
gnufx
I'd expect a policy dictating using ephemeral certificates like Kerberos keys
would actually have them for web services too, not just ssh (as with Globus).

------
rob-olmos
Is there a project for being the SSH CA that also includes RBAC for users
without costing $hundreds/$thousand per month?

I've seen projects that'll do the SSH CA part, but unless I overlooked it I
didn't see any RBAC or mapping ability to restrict users to specific servers.

~~~
mmalone
Yea the RBAC part is tricky. I think you need PAM or some sort of agent on the
hosts to do that if you need individual user accounts (vs "principal" accounts
that map to server groups like "frontend", "backend", "database"). The latter
isn't a terrible option when you use certs, fwiw, since you can still get good
audit by encoding the actual user in the certificate, which gets logged to
`auth.log`. Still, not ideal for everyone.
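
A simplified Python model of how that principal matching plays out. The mapping dict stands in for what an `AuthorizedPrincipalsFile` would hold on each host; the account and group names are hypothetical, and real sshd checks more than this (signature, validity, options).

```python
def may_log_in(account, cert_principals, authorized_principals=None):
    """Rough model of how sshd matches a user cert to a local account.

    authorized_principals maps account -> allowed principal names. If no
    mapping is configured at all, the cert must list the account name itself.
    """
    if authorized_principals is None:
        allowed = [account]
    else:
        allowed = authorized_principals.get(account, [])
    return any(p in allowed for p in cert_principals)


groups = {"deploy": ["frontend", "backend"]}       # hypothetical mapping

print(may_log_in("deploy", ["frontend"], groups))  # True
print(may_log_in("deploy", ["database"], groups))  # False
print(may_log_in("mike", ["mike"]))                # True
```

The person behind a "frontend" cert is still identifiable, because the key ID in the certificate ends up in the auth log.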

This is one of a handful of problems we're working to solve / streamline for
an SSH product. We might even open source this bit, but not sure yet. If we
don't, it still probably wouldn't be $hundreds/$thousands per month unless you
have a huge org. If you're interested at all I'd love to talk more about this
-- what your requirements are, whether you want hosted/not hosted, whether you
want to pay at all / how much you'd be willing to pay, etc. Easiest thing to
do to stay in the loop is send us your email at
[https://smallstep.com/sso-ssh/](https://smallstep.com/sso-ssh/) by "requesting early access" and we'll
reach out to schedule a convo & keep you updated as we make progress. Or just
watch our blog! :)

~~~
rob-olmos
Thanks for your reply and offer. I'll look into the product a bit more when I
get some free time.

I think I was a bit vague about RBAC. I'm thinking that the user account
creation isn't vital in this use case since the audit log holds who
generated/accessed. It's more about controlling who can generate a cert for
what.

Eg, I think Teleport's model has the RBAC config at or near the CA part, so
the cert generation either happens if authorized or denied if not.

~~~
mmalone
Oh, our open source stuff has basic controls around that... using OAuth OIDC
you can only get a certificate for yourself (right now it's just a direct
mapping so it goes from, e.g., mike@example.com to just `mike` as the
principal in the cert). For hosts our instance identity document stuff for
cloud VMs can be configured to only issue certificates for the VM's hostname.
Or at least it should be able to do that. I think there's a bug we're
currently working on.
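
The direct email-to-principal mapping is simple enough to sketch in Python. This is an illustration, not smallstep's code; the function name and the domain check are assumptions.

```python
def principal_from_email(email, allowed_domain="example.com"):
    """Map a verified OIDC email claim to an SSH principal (the local part)."""
    local, sep, domain = email.partition("@")
    # The domain check is an assumed extra guard, so that identities from
    # other domains never map onto local principal names.
    if not sep or not local or domain.lower() != allowed_domain:
        raise ValueError(f"refusing to issue a cert for {email!r}")
    return local


print(principal_from_email("mike@example.com"))  # mike
```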

We also have a one-time-token mechanism that you can have some trusted infra
like Puppet issue to hosts as they come up. The token includes the specific
name that you want bound in the cert. It can only be exchanged for a cert with
that subject.
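
The idea of a token that is bound to one name and usable once can be sketched with an HMAC in Python. This is a toy, not step-ca's actual token format; the class, the token layout and the in-memory nonce set are all assumptions.

```python
import hashlib
import hmac
import secrets


class TokenIssuer:
    """Toy one-time tokens that bind a single subject name."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._unused = set()  # nonces issued but not yet redeemed

    def _mac(self, subject: str, nonce: str) -> str:
        msg = f"{subject}.{nonce}".encode()
        return hmac.new(self._secret, msg, hashlib.sha256).hexdigest()

    def issue(self, subject: str) -> str:
        nonce = secrets.token_hex(16)
        self._unused.add(nonce)
        return f"{subject}.{nonce}.{self._mac(subject, nonce)}"

    def redeem(self, token: str, requested_subject: str) -> bool:
        parts = token.rsplit(".", 2)
        if len(parts) != 3:
            return False
        subject, nonce, mac = parts
        ok = (
            hmac.compare_digest(mac, self._mac(subject, nonce))
            and subject == requested_subject  # only the bound name is allowed
            and nonce in self._unused
        )
        if ok:
            self._unused.discard(nonce)       # single use
        return ok


issuer = TokenIssuer(secrets.token_bytes(32))
token = issuer.issue("web-1.internal")
print(issuer.redeem(token, "db-1.internal"))   # False, wrong subject
print(issuer.redeem(token, "web-1.internal"))  # True
print(issuer.redeem(token, "web-1.internal"))  # False, already used
```

A real CA would also give the token an expiry and keep the used-nonce state somewhere durable.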

Super secretly: we also have a whole policy language and enforcement engine
that we'll eventually get around to doing something with and would address
this issue pretty comprehensively.

Edit:

After reading your comment again I think I still might be misunderstanding
your use case. Are you talking about having principals like "frontend" and
"backend" and then having RBAC that says "mike can get a cert for frontend"?

------
jiveturkey
So naive, so wrong. It's hard to know where to start.

> Operating SSH at scale is a disaster. Key approval & distribution is a silly
> waste of time. Host names can’t be reused. Homegrown tools scatter key
> material across your fleet that must be cleaned up later to off-board users.

Outright lie or incompetence at worst, or a "my salary depends on not
understanding this" case at best, being demonstrated here. Even if you don't
have chef/puppet/etc managing at least some level of fleet configuration (if
you don't, your org is hopeless anyway and ssh-at-scale is not a first problem
to solve), it's oh-so-easy to put just your pubkeys somewhere (S3, eg) and
manage them via, say, git.

Next problem is that openssh-style certs do not support cert chains. So you
cannot revoke an intermediate. A huge aspect of PKI is that you must be able
to recover from a compromise. The rest of this point is left as an exercise
for the reader.

You have to secure the CA! You have to properly authorize signing requests!
These things fall into the realm of hard.

~~~
mschuster91
> Outright lie or incompetence at worst, or a "my salary depends on not
> understanding this" case at best, being demonstrated here. Even if you don't
> have chef/puppet/etc managing at least some level of fleet configuration (if
> you don't, your org is hopeless anyway and ssh-at-scale is not a first
> problem to solve), it's oh-so-easy to put just your pubkeys somewhere (S3,
> eg) and manage them via, say, git.

Puppet's ssh key management only adds and removes public keys that you
specify, but _leaves other keys intact_. Which means that a hacker can place a
backdoor key in authorized_keys that is not going to be spotted by Puppet.
Cert-only auth prevents this class of attack from the beginning.

~~~
jiveturkey
Why would you use the puppet-esque style of management then?

Just generate the complete list of authorized_keys yourself, don't let puppet
modify it for you. But do let puppet distribute the file. It's beyond easy.

But this is a distraction. Anyone that can influence the insertion of a
permanent backdoor in authorized_keys is _almost_ certainly in deep enough to
not care about any specific mechanism. Being able to populate arbitrary,
unaudited data on your hosts is an intolerable situation to be in.

And anyway, you're wrong. The 'purge_ssh_keys' directive can be set to 'true',
disabling merge.

> Cert-only auth prevents this class of attack from the beginning.

Cert auth is a subset of pubkey auth, which includes both. You could set
AuthorizedKeysFile to 'none' but you'd have to explain how an attacker that
can influence authorized_keys cannot also influence sshd_config. Of course
these are different files likely populated via different mechanism, but the
threat model you are implying here seems overly specific. It might work for
some people but, given the tone of the article (unspecified 'you' are doing it
wrong), I don't buy it.

------
kerng
Why not just use Kerberos?

~~~
dikei
For Kerberos, you would need fallback authentication methods always active, in
case Kerberos servers went down. But fallback auths are potentially security
risks.

For client certificates, the fallback can be long-term certificates, locked up
in a safe, that should only be opened when all other ephemeral certificates
fail. I think this is less risky.

~~~
felipelemos
You can always have a long term certificate locked in a safe as a fallback to
Kerberos

~~~
dikei
The point is you don't need a server-side fallback if you don't use Kerberos,
since the server can validate login on its own.

The fallback certificate is only client-side, and can be kept
offline/air-gapped from the main credentials.

~~~
kerng
You have to provision the key for the fallback cert on server in both cases.
So not sure why one would be better than the other...

~~~
dikei
You don't need to provision a key for the fallback cert on the server, because
the server simply trusts all certificates that were signed by the CA.

You only need to get the CA to sign the fallback cert.

------
omani
if you are not using the right title for your articles, you are a bad author.

^ sounds just bad, doesn't it? be respectful to others and thou shall succeed.

~~~
mmalone
Well if you want my honest opinion that statement sounds accurate to me...

There’s a difference between being respectful and walking on egg shells. I can
respect you and still say, bluntly and to your face, that you’re wrong. In
fact, I think _not_ doing that would be disrespectful.

The title was tongue-in-cheek. It communicates a concept compactly. It’s
direct and draws attention. It’s hyperbolic, but hyperbole is a legitimate
literary tool. I am sorry if people are taking it too literally and somehow
making it about them and their insecurities and being offended. But I think
that’s as much on them as it is on the title.

This word police thing is sort of annoying and pedantic. If you actually read
the article it’s unboxed and explained carefully and fully. That’s what
matters.

------
eigenloss
"If you're not using < _thing that my company sells_ > you're doing it wrong"

~~~
mmalone
Author.

We don’t sell anything related to what’s in that post (actually we don’t sell
anything right now). It’s all open source. We believe everyone deserves good
PKI; that it’s an underutilized technology with bad tools. We have plans to
make money off other stuff once that’s in place.

~~~
otterley
Also, "you're doing it wrong" reeks of arrogance, and makes me not want to
do business with you.

~~~
mmalone
Eh, it’s just an attention getter. Marketing and messaging using unequivocal
statements works. SSH cert authn is super useful tech that deserves better
marketing. Damned if you do, damned if you don’t. Sorry.

~~~
kjaftaedi
The amount of people who understand terms like 'grok', but don't know how to
use SSH certificates is effectively zero.

You need to familiarize yourself better with your audience.

~~~
mmalone
> The amount of people who understand terms like 'grok', but don't know how to
> use SSH certificates is effectively zero.

I don’t want to be antagonistic but that’s just not true. I’ve talked to _a
lot_ of people about this. Maybe 10% of people I’ve talked to know how to use
ssh certs. These are technical people who are very smart and know what they
are doing, and know what “grok” means. That’s why I wrote the post.

If you already knew the info in the post then cool! Sorry to waste your time.

~~~
teh_klev
You got my attention. I've got ~25 years Unix/Linux experience under my belt
and didn't know about SSH certs and now I do. So appreciated.

~~~
mmalone
<3

------
chrisMyzel
Thanks for your article :thumbsup:

