Ask HN: How do you manage SSH keys and SSL certificates in your company?
220 points by johnmathew123 on May 12, 2017 | 107 comments
Do you use - Universal SSH Key Manager, ManageEngine Key Manager Plus, Cyberark Solution, or any other tool? Tips please.

It baffles me that nobody seems to have mentioned Hashicorp's Vault yet:

https://www.vaultproject.org https://github.com/hashicorp/vault

It comes with both a full-blown PKI (want a new cert? Use an authenticated REST endpoint!) and an SSH backend. On top of that you can use it to manage accounts for many other third-party applications as well (e.g. PostgreSQL, MySQL) while leveraging a multitude of authentication backends to delegate granular access for all of these features.

It's been an absolute eye-opener for somebody like me who's used to managing SSH keys/PKIs (what a pain) and I wouldn't want to use anything else right now.

So let's say you have a bunch of servers that your team has to access via ssh; how would Vault help add users' public keys to ~/.ssh/authorized_keys on each machine? I am familiar with Vault locking/unlocking secrets, but I'm not sure if Vault can help centralize and deploy those keys to individual machines.

A good solution to this problem is to use an SSH Certificate Authority - then you need only configure the CA certificate on each box, and you can either issue semi-long-lived certificates to each user who needs access, or use something like Vault to issue short-lived certificates intended for one-time use.

This model is described in an excellent post by Facebook from a while back [1].

(Disclaimer: I used to work at HashiCorp, and put this model into production there, though the Vault support for issuing short-lived certificates was added after I left)

[1]: https://code.facebook.com/posts/365787980419535/scalable-and...
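For anyone who hasn't seen the CA model in action, here's a minimal sketch using plain OpenSSH (all names, paths, and the 12-hour validity are made-up examples):

```shell
cd "$(mktemp -d)"   # work in a scratch dir

# Create the CA once, on a well-protected machine.
ssh-keygen -t ed25519 -f user_ca -N '' -C 'example user CA'

# A user generates a normal keypair and sends you the public half.
ssh-keygen -t ed25519 -f alice_key -N ''

# Sign it: key ID "alice", valid for 12 hours.
ssh-keygen -s user_ca -I alice -n alice -V +12h alice_key.pub

# On each server, trust the CA instead of per-user authorized_keys:
#   /etc/ssh/sshd_config:
#     TrustedUserCAKeys /etc/ssh/user_ca.pub

# Inspect the resulting certificate (key ID, principals, validity):
ssh-keygen -L -f alice_key-cert.pub
```

The signed `alice_key-cert.pub` sits next to the private key on the user's machine and is offered automatically by ssh.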

Using an SSH Certificate Authority is also my recommendation, but be aware that it's relatively new, so the tooling associated with it is not super mature yet. In particular, the user still needs some bits in order to log in, and whether they generate them themselves and send them off to get signed, or the bits are generated for them on the backend and the user simply needs to receive them, there's a management aspect to it that isn't a totally solved problem with open source tools.

It's not a difficult problem, mind you, but there was custom code written that runs on developer laptops (OS X and Ubuntu) to support this workflow.

(Despite being a very similar-looking string of bytes to more traditional public/private keys, a certificate is handled differently in the SSH agent protocol, so don't assume all ssh-agent-looking daemons support it.)

ScaleFT uses a certificate-based approach and issues short-lived certificates for SSH and RDP. It also has a policy engine to set additional access controls.

I tried to deploy them to grant developers limited time access to the hosts running their services. Didn't really work out and I killed the project.

One other handy feature that we're using is the ability for Vault to set user IDs in the signed key ID field which then get logged by sshd on the remote host. That way we can still audit who exactly logged in if using a single user on the remote host.

Agreed on an SSH Certificate Authority being a good solution. Another open source project (disclaimer: I'm affiliated with it) which provides approval controls is available here: https://github.com/cloudtools/ssh-cert-authority

As a follow up - I would now also take a _very_ serious look at Teleport [1] for managing SSH.

[1]: https://gravitational.com/teleport/

You have to build a new ssh CA for each class of server. There's no mechanism for granting access by host name. This becomes unmanageable fast.

You can represent that pretty easily with principals. Just make sure that each host has itself listed as a principal and generate keys that contain that same string. It's pretty common to have a set of increasingly general principals (i.e. hostname, cluster, tier, root@everywhere).
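To make that concrete, a single certificate can carry the whole ladder of principals (hostnames and principal strings below are placeholders):

```shell
cd "$(mktemp -d)"   # scratch dir for the demo
ssh-keygen -t ed25519 -f user_ca -N ''
ssh-keygen -t ed25519 -f ops_key -N ''

# One cert, four increasingly general principals:
# the specific host, its cluster, its tier, and a catch-all.
ssh-keygen -s user_ca -I ops-team \
  -n web01.example.com,web-cluster,prod,root@everywhere \
  ops_key.pub

# Server-side, an AuthorizedPrincipalsFile per account lists which
# principal strings are allowed to log in as that user.
ssh-keygen -L -f ops_key-cert.pub
```

A box then only needs its own name (and the names of the groups it belongs to) in its principals file; no per-class CA required.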

> https://www.vaultproject.org

That link doesn't work. The correct one is https://www.vaultproject.io

Absolutely, thank you. I'm on mobile right now and was too lazy to check the reference m(

I'm the co-founder of a startup that has developed a tool for this called Userify[1].

It creates and removes local accounts and manages the ssh keys and sudo permissions centrally, so you don't have to worry about not being able to get in if your LDAP/AD is down. (Our Enterprise edition, self-hosted in your VPC or in your DC, can optionally integrate with LDAP or AD for dashboard logins, MFA, etc.) We also have an AWS server edition in the AWS marketplace. When you remove someone in the dashboard, their account is removed across all of the servers they have access to, and their sessions are terminated, but their home directory is retained for archival or forensics.

Our primary foci are paranoid levels of security (hardened, all data encrypted at rest with X25519, in motion with TLS), flexibility, and speed: fast to deploy or integrate (w/ ansible, chef, puppet, terraform, cloudformation), and fast to install (milliseconds), with both SaaS/Cloud and self-hosted versions and open source python agent (we call it the 'shim').

We were founded in 2011 and serve thousands of servers and users in real time; I'm first.last at userify for any questions, and/or info at userify.

1. https://userify.com

  > paranoid levels of security

  > # paste this into your terminal
  > # and get started in seconds..
  > curl -# https://usrfy.io/signup | sudo -sE

This brings up a more complex debate.

In order to be effective, signatures have to come from a root of trust. Where are you forming a trust basis for a signature in the curl? The GPG sigs built into your package manager are signed by the packager and form a cascading tree of trust, and the TLS certificate authority system is also supposed to form a foundation for trust. (Sigs are on the file, but where would you get the packager's first signature in this case?) curl from https piped into sudo doesn't weaken the CA trust model, and a sig on the file wouldn't add anything, since the sig would be easily replaced by anyone with the wherewithal to modify the original file.

Instead, that sig would be security theater, much like an EV-TLS cert. (Not that security theater can't still be valuable from a marketing perspective, to people who think that checking sigs on an https site is still valuable.)

While there can be security value in sigs, it doesn't come into play in these circumstances without starting from a position of a signed sig from somewhere, and ultimately the smart and paranoid will still curl to a file and actually read the script. Actually, ultimately, if you are in a security sensitive situation, you are probably not in the cloud at all, or on dedicated instances, and then you should avoid SaaS and cloud software altogether and look at an on-premise solution like Userify Express/AWS/Enterprise, as then you will have that crucial foundation of trust in your initial purchase and you can tightly control that environment.

LDAP with caching via SSSD is a better solution. I just implemented it to replace using a configuration management tool. I use the configuration management tool to set up the LDAP server, and LDAP for authentication. But I would take the configuration management tool over your tool; your tool is too niche.

Foxpass has a nice hosted LDAP solution which includes SSH key management.

I thought about doing something like this in the past, did you find this to be relatively profitable? Was it something that a lot of people wanted?

No, it's a terrible space. Pick something less noisy and with fewer competitors. Like log management. ;)

> their account is removed

Are you actually removing the unix account? How do you manage uids? How do you prevent reuse? How about over NFS?

Excellent question.. yes, we actually remove the account and pkill any existing sessions owned by the user. The OS might reuse UID's if it wishes. This will cleanly work over NFS as well. (On the server side as well, as long as the NFS server respects POSIX file locking semantics.) The agent (shim) is only a few hundred lines of readable Python that just scripts standard Linux commands, so it plays nicely with other tools -- even other logging or user management tools, PAM modules, etc. Also, the shim won't touch any user accounts that it didn't create (tagged in the comment field), so existing system or backup accounts are safe and won't ever be touched.

Here's the source code: https://github.com/userify/shim/blob/master/shim.py#L161

> The OS might reuse UID's if it wishes.

If the software doesn't manage UIDs internally then this is a recipe for disaster because keeping UIDs in sync is a PITA. It's a waste of time to do manually, but an even bigger waste of time to have to fight with the OS to get it right. If you're using NFS then you're screwed.

Usernames correlate one-to-one with UIDs, so just don't allow users to personally own files outside of their /home (they shouldn't, according to FHS/LSB/POSIX/etc.).

Can it verify or enforce that ssh secret keys are password protected?

This wouldn't be possible unless we generated or held the private keys. You cannot tell if the key is password protected from the public key alone.

LDAP as a public key service and servers configured via PAM to use that as source for pub keys. Nothing to distribute. Delete key from LDAP and second later user can't log on any machine. We are analysing teleport ssh suite for possible migration direction. SSL is different story :)
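For anyone wanting to replicate this, the sshd side of the setup is typically just two lines (this assumes sssd is running with its ssh service enabled and the LDAP entries carry a public-key attribute such as sshPublicKey):

```
# /etc/ssh/sshd_config
AuthorizedKeysCommand /usr/bin/sss_ssh_authorizedkeys
AuthorizedKeysCommandUser nobody
```

sshd then asks sssd for the user's public keys at login time, and sssd's cache covers short LDAP outages.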

Can anyone log into hosts if LDAP is down? Is that a concern? Is there an easy way to mitigate the concern if you wanted to? Interested in exploring this solution, but worried about the availability risk.

Apart from the caching stuff mentioned in other replies, this is why you have multiple ldap servers in a HA replication setup.

Yeah running multiple servers is easy and the daemons are very mature at this point. LDAP daemons aren't crash prone. I haven't had an LDAP outage in over 10 years.

tl;dr: Yes! There's some caching mechanism, _providing_ the user has already logged into the host. Also there seems to be some replication that could help.

in my homelab I'm running RedHat IdM (which is their downstream version of _freeipa_). It's some value-add on top of LDAP on the server side, and sssd on the client side. My IdM runs in a VM on a server that isn't always powered on, and I'm still able to login thanks to sssd being configured to cache.. something. Clearly I haven't played with it as much as I should :).

In a Windows environment you mitigate this with multiple Directory servers.

Users can log onto machines if the credentials are cached, I think. Unsure how PAM handles this on Linux.

Various PAM module handle it differently.

The hotness these days is `sssd`, which transparently tries multiple directory servers (be they AD, IPA, or straight LDAP).

We use Ansible to deploy/manage people's SSH keys on our servers. From their laptop or a jumpbox (within the management VLAN) with their personal key (and a passphrase!) they are able to log in to all those servers. So logins are personal (as opposed to shared accounts, which have to be updated when people leave). Now when new people arrive or when people leave, we just run an Ansible playbook and all our 400+ servers are updated.

We also use Ansible for managing SSH keys, though we only have around 20 servers to maintain. We are a small team and don't have a huge staff turnover, so managing keys is not a huge issue so far. As we grow, however, this procedure may need to be revisited.

This works pretty well - we manage ~300 servers in a similar manner. One wrinkle is that I ask the new user to add their keys to their GitHub account (i.e. github.com/<username>.keys), which gives them the ability to add/del keys and jira me to update their user.

How do you manage to "remove" SSH keys? Since Ansible is stateless, you probably run it once for removing and once for adding a new key.

There are two states that we're talking about.

* Ensure that the following key is present.

* Ensure that only the following keys are present.

You can accomplish the second one with the 'exclusive' flag.
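Something like this (task and file names are illustrative; note that `exclusive` is not loop-aware, so the keys have to be joined into a single newline-separated value rather than looped over):

```yaml
- name: Allow exactly these keys for deploy, removing any others
  authorized_key:
    user: deploy
    state: present
    exclusive: yes
    key: "{{ lookup('file', 'keys/alice.pub') }}\n{{ lookup('file', 'keys/bob.pub') }}"
```

Anyone removed from that list loses access on the next playbook run.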


There's a third state as well: absent. Move keys from one file to another, and they go from allowed to denied.

That said, I do like the "exclusive" tag too, I just tend to have a few one-off keys laying around as technical debt.

Not the OP, but say you were populating a directory of keys, you would simply remove it via file's state=absent and then repopulate.

You can have Ansible generate the authorized_keys file from a list of public key files, then when that list changes ansible will detect that the file has changed and will copy over the new file.

For a file, specify state=absent


Yes, we run it once/only when things (people) change.

Do you use Ansible for SSL as well? How do you store the keys?

Using ansible-vault, check out this gist:


Having a partly encrypted YAML file is also possible since version 2.3.0.
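That is, `ansible-vault encrypt_string` emits a `!vault`-tagged value that can sit right next to plaintext variables (the variable names below are placeholders and the ciphertext is truncated):

```yaml
# group_vars/prod.yml -- plaintext and encrypted values side by side
ssl_cert_file: /etc/ssl/certs/example.pem
ssl_key_material: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  62313365396662343061393464336163383764373764613633653634306231386433626436623361
  ...
```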

Key management and deployment are two different things. Installing people's public keys on the hosts doesn't involve any management; it's a standard sysadmin procedure.

Key management involves policies such as key rotation, algorithms to use, access restriction control, etc. It is way more daunting and complex. I don't see what Ansible has to do with this; vaultproject, on the other hand, has most of these features (and more) built in. Never used it, but I know about it through a podcast presentation of Vault in "Arrested DevOps", IIRC.

Ansible has a feature called Vault that encrypts whatever you tell it to with a password. You can then upload that to version control, and when you want to deploy, you just pass the password to Ansible (ideally via a file or an environment variable).

I'm using it to migrate us over from keeping our SSL cert in Dropbox and 1Password and it works well.

SSH: Daily key generation and rotation with a 2FA registration system. Keys come from this local daemon, once per ui session / 12 hours. Port knocking to get through bastions. All bastions and config done via enterprise managed setup. You're not on a managed machine, you're not on prod/vpn/ssh/etc.

SSL: AWS KMS style solution which predates it on internal, and new system built on KMS. These systems are merging as KMS takes a lot of the load off. Then it's down to building a key distribution system. All secrets actually stored in this type of system. Devs rarely access secrets directly. Lots of nice "client" wrappers which look like DynamoDBClient, or Mysql driver but actually fetch passwords on a 5 minute rotation from the secret store so a rotation means push new key, wait 5 minutes, pull old keys. Secrets preferably never hit disk.

Not a big fan of vault because it wants to connect in to your hosts to manage and rotate passwords.

Interesting. What kinds of tools are you using for your SSH setup?

In house. You may be able to figure out where I work from my history, which would explain why.

Regarding SSH keys, for anyone using Github organizations, we created a service called GitWarden[1] for automatic syncing of local user accounts/SSH keys with organization teams. This makes it very easy to manage users across your entire infrastructure directly through the Github UI, and have any team changes (add/remove members) reflected locally in near real-time. It also makes it incredibly easy for users to login, as they just use their Github username and any SSH key from their Github profile. There's a full demonstration in one of our blog posts here[2] for anyone interested.

This is a new service we recently launched, so any feedback would be greatly appreciated! My email is in my profile, if you would like to reach me directly.

[1] https://gitwarden.com [2] https://gitwarden.com/blog/2017/05/04/first-steps-with-gitwa...

Is there any logging of which ssh key is used? I don't think that I would be comfortable having access to work material from a personal machine, assuming the company is large enough to issue work desktops/laptops.

Not currently, but that's a great point that we haven't considered yet.

My company Foxpass (YC S15) has a product to manage SSH keys. It serves as access control too -- the keys are only available on the hosts where a user should have access.


We've been using Foxpass for more than a year now and can definitely recommend. We have Amazon Machine Images with the required packages installed and configured; we use the web interface to grant/revoke access to users and add SSH keys. So each user logs in as themselves.

Aren has been awesome with responding to emails and helping us set it up too.

How does it handle sudoers?

I didn't mention it in the original post, but Foxpass also has an LDAP endpoint so it can manage users and groups on your linux machines.

This means you can set up a linux group with sudo capabilities (sudo or wheel, usually) in /etc/sudoers. Then using Foxpass you can manage the membership of that group by adding users on a permanent or temporary basis.
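In other words, the sudoers side stays a one-liner, and group membership becomes the thing you manage centrally (file name below is just an example):

```
# /etc/sudoers.d/admins  (always edit via visudo -f)
%wheel ALL=(ALL:ALL) ALL
```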

For SSH:

I'm a fan of public key distribution via directory service (LDAP) with failover/caching via sssd. Private key storage on a smartcard (Yubikey in my case)

SSL: It depends on where it's being deployed -- AWS apps use ACM; situations where I need to access the secret from within a running system image (gce/ec2) generally use the platform's KMS along with their object store.

Can't help but notice this account only has two posts, both asking about commercial solutions folks use for secret management. I'm gonna take a stab in the dark and say this is market research?

I would prefer to not manage them at all. Let my cloud provider care about SSL (AWS already provides it).

I don't even want anyone (including me) to have access to our SSL certificates. There's no lock-in, I can always change my mind and buy new SSL certs somewhere else.

We were working on a more decentralized solution here: https://github.com/dedis/cothority/tree/master/cisc - unfortunately the update is still 1 month away...

This system uses a set of untrusted nodes that form a permissioned blockchain. Updating the chain requires a threshold of keys stored in the first block. The private keys are distributed over laptops/phones.

A person can have multiple devices that accept/deny new keys, while the servers check periodically for updates and can verify the new ssh-keys are legit by verifying the signatures.

I did a small demo at HotPETs 2016: https://www.securityweek2016.tu-darmstadt.de/fileadmin/user_...

I also hope to have it running again, soon. If anybody is interested, don't hesitate to contact us at linus.gasser@epfl.ch

Last time I did ssh key distribution on a global scale it was with cvs, cron, and makefiles. Self-generation of key with gated registration to central repository, auth'd via k5, distributed via cvsup (remember that?). Automatic expiry according to configurable policy. Two-person-rule for root trust and policy changes. We also modified sshd for better key usage auditing.

We used ScaleFT for Dreamhack (https://www.scaleft.com/blog/how-dreamhack-used-scaleft-to-s...) and it was very easy to work with. A very good product if your company simply wants something that works.

Another happy ScaleFT customer at Jungle Disk https://www.jungledisk.com/blog/2016/07/13/behind-the-scene-...

For self-hosted PKI, check out Lemur from Netflix:


We just released Kryptonite (https://krypt.co), an HSM for your SSH private key on your phone.

The private key never leaves your phone. Pair your phone with your computer to create a secure channel (channel is encrypted + signed with session keys only known to your computer and phone). Every time you SSH, the computer calls out to your phone over this channel and asks you to approve a signature.

SSH logins with simple push notification approvals. Code is public: https://github.com/kryptco.

We use CFengine 3 to manage personal accounts and ssh keys on servers. As CFengine continuously checks and converges configuration, if someone sets bad permissions on their ssh file or removes the authorization, it is automatically corrected before five minutes have elapsed.

At my previous company, we used Puppet to distribute both. SSH public keys were part of a developer's intranet profile, and SSL certificates were managed on a central repo.

Updates to either would be passed on to Puppet and distributed automatically across 1400 machines.

(Edit for typo)

The company I work for (Venafi), provides enterprise grade SSL certificate and SSH key management solutions. The on-premise enterprise solution provides certificate lifecycle management (monitoring for expiration, automatic renewal, and automatic installation). It provides a REST API so you can automate the process using your toolkit of choice. We've also recently released a new cloud service that offers free certificates for use in dev/test environments (and will have more functionality added soon!)

We use Yubikeys as GPG smartcards, and use them for gpg-agent as ssh keys.

Everyone puts their hsm keys on their github account (and removes all others).

We fetch the keys for each user from github on system init. e.g.


When we need to add/remove people, we just update the list of usernames in the script that fetches keys, and then kill off instances one at a time to force a redeploy.

>Everyone puts their hsm keys on their github account (and removes all others).

How do you ensure that nobody adds another key? I don't think github gives organizations visibility into key changes in user accounts.

A comment above, https://news.ycombinator.com/item?id=14321901, does a similar thing as you, except they use the Github API to fetch the list of usernames. I don't know your use case, but maybe this is useful to you too.


How do you do that? I mistook you for a Github employee at first.

Just add '.keys' after the profile URL. eg. for me this lists my ssh keys https://github.com/vtsingaras.keys

Appending .keys to your Github user profile URL returns SSH keys. Hugely useful in organisations that are using Github for code management already.
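A hedged sketch of the sync step (username, paths, and the `KEYS_BASE` override are placeholders of mine; you'd want to pin this down and handle failures before trusting it in production):

```shell
# Fetch a GitHub user's public keys and install them as authorized_keys.
# KEYS_BASE is overridable so the fetch can be exercised without network.
fetch_github_keys() {
  base="${KEYS_BASE:-https://github.com}"
  user="$1"; dest="$2"
  tmp="$(mktemp)"
  # Refuse to replace authorized_keys with an empty or failed download.
  if curl -fsS "${base}/${user}.keys" -o "$tmp" && [ -s "$tmp" ]; then
    install -m 0600 "$tmp" "$dest"
  fi
  rm -f "$tmp"
}

# e.g. from cron:
# fetch_github_keys alice /home/alice/.ssh/authorized_keys
```

The empty-download guard matters: a transient GitHub outage shouldn't lock everyone out.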

Is there any danger to adding keys to the Github account that are used elsewhere?

In 99% of cases, no, but longer answer:

Depends on the type and number of bits in the key.

If you create an ssh key that is already broken (say you managed to generate a... 512-bit RSA key), then an attacker would know what key he needs to generate before he attempts to authenticate with your server (or GitHub).

But in practice public keys are meant to be public... very public. Like GPG keys! Here's a debian signing key https://ftp-master.debian.org/keys/archive-key-7.0.asc .

We can even verify that it's an RSA key with 4096... exactly what you could (should?) use to generate SSH keys. Effectively posting your public ssh key in the wild is as safe as debian posting their public signing key :)

```
pub   rsa4096/0x8B48AD6246925553 2012-04-27 [SC] [expires: 2020-04-25]
      Key fingerprint = A1BD 8E9D 78F7 FE5C 3E65 D8AF 8B48 AD62 4692 5553
uid   [ unknown] Debian Archive Automatic Signing Key (7.0/wheezy) <ftpmaster@debian.org>
sub   rsa4096/0x85215E51ADD6B7E2 2012-04-27 [E] [revoked: 2014-03-17]
```

"It depends".

I dislike that Github doesn't explicitly mention that it publishes your public keys, because they can be used to figure out your identity across multiple services. I believe someone a while back posted a demo on HN, where you could SSH in and it would greet you with "hello $yourname", which it derived from your github keys.

My advice: if you use different (user)names for different services, you should probably consider using different SSH keys for them as well. That is, if you don't want those two identities to be tied together.

Wrong solution.

The main problem that Filippo pointed out with the service is that ssh by default offers all keys present in your keyring, or all keys named id_{rsa,dsa,ecdsa}. What should be done is to never present ALL your keys; turn on "IdentitiesOnly yes" in your SSH config.

Project: https://github.com/FiloSottile/whosthere
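For reference, the relevant stanza looks like this (the key filename is whatever you've chosen for that host):

```
# ~/.ssh/config -- offer only the named key to this host,
# instead of every key the agent holds
Host github.com
    IdentitiesOnly yes
    IdentityFile ~/.ssh/id_ed25519_github
```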

Yes, that's part of the solution indeed.

Sounds like you've gone to a lot of effort to use a HSM (good) but then put all your trust in github.

At least github supports two factor auth, but you're still placing trust in a third party.

We use the Github API to retrieve public keys and teams to manage membership. Using this app to provision users on servers automatically and serve public keys. Read more here: https://github.com/cloudposse/github-authorized-keys

It supports caching to etcd (in case the GitHub API is unavailable).

At Pivotal we're building CredHub for managing SSH keys, TLS certs, as well as passwords and arbitrary values.

It's specifically being built for particular systems to consume, and slowly generalised as it goes. The first consumer is BOSH. Typical BOSH deployments can have dozens or even hundreds of sensitive credentials to manage. Across a large company it quickly escalates into the thousands.

I am the author of a PAM module called Keeto that lets you reuse all key management processes of a PKI for X.509 certificates for OpenSSH. Furthermore a layer has been added to centrally manage access permissions for OpenSSH servers. For further information see: https://keeto.io

In our new stuff at aws we are storing public SSH keys in IAM and using a slightly modified version of https://github.com/widdix/aws-ec2-ssh to authenticate / synchronise users. So far it looks like it ticks all boxes for us.

The sysadmin just adds each employee's key to the servers. From reading this thread it sounds like bad practice?

Do you know how the sysadmin organizes himself? Is there a process where he gets notified of leaving engineers to remove their keys?

edit: typo

This works up to some point. It works if you have a small number of servers, a small number of people, and 'who can access what' is simple and obvious, but as soon as one of these conditions is exceeded, it becomes unmanageable.

We've been looking into deploying Conjur (https://www.conjur.com) for the SSH part of this question at least, as they provide a rotator which doesn't require changes to our server build (i.e. no special PAM modules or the like).

We don't.

As much as I hate to admit it, I belong to the small Linux minority in my otherwise Windows-only business.

We don't either, and we're mostly a Linux shop.

We only have ~10 employees. About half of them are technical staff with some access to certain systems (git, mostly). Only two users have global root access.

Vault (which was mentioned somewhere in these comments) looks interesting, but at our current scale it looks like it might be overkill.

keybox, quite handy.


We use Phabricator (installed locally) to manage most things in our company: all repositories, SSL certificates, issue tracking, etc. We have a 20-member team.


monkeysphere for personal ssh private keys stored in gpg

chef and hashicorp vault

Another neat thing to deploy into DNS is SSHFP records, so there are almost never SSH fingerprint verification prompts for deployed hosts. Alternatively, SSH host fingerprints can be deployed to LDAP.

> Another neat thing to deploy into DNS is SSHFP records

For those wondering, [1] provides a bit of a background on SSHFP records. You can only skip host-key checking entirely if it's served with DNSSEC, although that might be easier if you're running internal DNS.

How do you have your system working? It's something I've fiddled with briefly, but ultimately gave up on for now.

[1] https://matoski.com/article/sshfp-dns-records/
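The record-generation side is built into OpenSSH, for what it's worth (the hostname is a placeholder, and a throwaway key stands in for the real /etc/ssh host keys):

```shell
cd "$(mktemp -d)"
# Normally you'd point this at /etc/ssh/ssh_host_*_key.pub;
# generate a throwaway host key here for illustration.
ssh-keygen -t ed25519 -f ssh_host_ed25519_key -N ''

# Emit SSHFP resource records ready to paste into the zone file.
ssh-keygen -r web01.example.com -f ssh_host_ed25519_key.pub

# Clients then need, in ~/.ssh/config (only trusted without a
# prompt when the zone is DNSSEC-signed):
#   VerifyHostKeyDNS yes
```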

SSH keys - OpenPGP smartcards (E.G yubikeys)

SSL Certs - vault PKI (internal) AWS ACM (external)

This works for me: https://github.com/grahame/sedge. Not Vault, no, but better than a straight .ssh/config.

I have a similar problem. Keeping check of which certificates are about to expire. I built an MVP to help with that:
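(For quick local checks while something like this matures, openssl alone goes a long way; the self-signed cert below is just a stand-in for whatever cert file you're watching.)

```shell
cd "$(mktemp -d)"
# Throwaway self-signed cert for illustration.
openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem \
  -days 30 -nodes -subj '/CN=example.com'

# Exit status 0 if the cert is still valid 14 days from now;
# nonzero means it expires within the window, so alert.
openssl x509 -checkend "$((14 * 24 * 3600))" -in cert.pem && echo "still valid"
```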


Your "Account" link is failing, and also returning a verbose debugging output for Django...


Thank you for taking the time to look around and report back.

These two issues are known. The next step is notifying users via e-mail of expiring certs. Then the account button will be functional.

We're validating the value proposition. I've a real need for this application, but we're trying to see if others have the same need.

Would you like to be notified when the account and reminder functionality goes live?

Maybe we're old school, but we use autofs and nfs to mount homedirs and ssh keys on demand. Puppet handles host ssh keys across machine rebuilds. It works well, no need to replace it IMHO.

SSL certs are managed by AWS certificate manager and all VMs are deployed via elastic beanstalk or exist as Lambda functions. No SSH access is enabled across any of our infrastructure.

What has your experience been with AWS Certificate Manager?

Cert Manager is only for a subset of the AWS offerings, and I don't think it works for EC2 instances. I know it covers CloudFront, which means it covers static websites under https (CF-fronted S3) and API Gateway (which is CF under the covers for the most part), so also Lambda APIs. A separate but similar offering covers ELB. I'm not sure that it covers ALB yet.

That said for these services it works very well, except there are some issues with it lagging on sending confirmation emails that just make you retry.

It's been pretty seamless. We use it for our static website on S3 and for our elastic load balancer on elastic beanstalk. I'm guessing in a year all the other cloud providers will follow suit with similar offerings now that heroku and AWS both provide SSL certs out of the box.

we use the public keys from existing github accounts in our organisation. a little cronjob runs daily on all servers to fetch all keys. http://labs.earthpeople.se/2016/04/controlling-ssh-access-wi...

no idea how secure this is or what implications this could have but it's easy and works well for our use case.

shadowd - Secure Login Distribution service

shadowc - client of shadowd




Dropbox and 1Password. It's, uh, not great.
