Hacker News
Ask HN: In a microservice architecture, how do you handle managing secrets?
134 points by kkamperschroer 610 days ago | 57 comments
I'm evaluating solutions for secrets management in relation to a distributed microservice architecture and am curious to hear what everyone else out there does. Some options I've considered:

- Git-crypt and deploying secrets along with binaries

- Hashicorp Vault

- Square Keywhiz


- Lyft Confidant

- Roll your own

All seem to have pros and cons depending on use cases and how mission critical the service you are offering is.

So what do you do to solve this problem in your world?

You should check out Daniel Somerfield's talk at OWASP, "Turtles All the Way Down: Storing Secrets in the Cloud and the Data Center"


Fantastic. Thank you!

We use Kubernetes, which includes its own secrets API:


I can't remember which issue this was on, but it seemed like there was some discussion on their GitHub project about making pluggable secrets backends (HashiCorp's Vault was mentioned).

Kubernetes' secrets API is still very basic, but I think the fundamental concept is very sound and has a great foundation to continue building on.

Docker's commercial offering DUCP (Docker Universal Control Plane) offers this feature as well. Out in the wild, you can find Docker volume drivers for Keywhiz etc. [1] that make secrets available as files mounted into a container. I think Kubernetes does this, too.

If you are running in the cloud, you would probably want your cloud provider to give you service secrets and rotate them somehow. The AWS/Google Compute metadata services or Azure Key Vault are capable of doing this, but I don't think they map cleanly onto the microservices world, because ACLs are set on the VM instances, not on individual microservices.

[1] https://github.com/calavera/docker-volume-keywhiz

I want to chime in with my experience.

Our service is built on top of docker deployed on CoreOS with fleet and etcd. Most of our secrets & runtime configuration was stored in etcd, which was our attempt to store the config in the environment (http://12factor.net/config).

With Kubernetes, life is much simpler. Gone are the silly fleet configuration files, and the bootstrap scripts I used to configure etcd. Moreover, Kubernetes' secrets volume means I can have my configuration and secrets easily plugged in.

There are definitely other great solutions out there, but I'm sold on Kubernetes.

> I can't remember which issue this was on, but it seemed like there was some discussion on their GitHub project about making pluggable secrets backends (HashiCorp's Vault was mentioned).


Are the k8s Secrets still stored in plaintext in the etcd datastore? It seems that this feature is a bit half-baked right now, though I'd love to be mistaken on this point. The k8s docs mention shredding your apiserver hard drives once you're done with them; that's hardly feasible in a cloud environment.

Also on access control, any process with root on any node in your cluster can get access to all your secrets (since the kubelet needs to be able to do so). There are no user access controls either; any cluster admin can dump all the secrets.

This stuff is clearly documented, so it's not an indictment on k8s; I just get the feeling that the feature isn't really ready for production use yet.

afaik the only part of kubernetes accessing etcd is the master. Nodes don't need and can't access etcd directly.

That still leaves the secret in plain view on the nodes that run the pod that needs the service. It would be great to be able to umount the secret when not needed anymore.

Correct, the etcd instance is only accessed by the master, which uses etcd to back the apiserver. But any root process on the nodes can access the secrets through the apiserver (there's no access control at this point).

We have started work on exposing Hashicorp Vault secrets via FUSE and Docker volumes. The expectation is your containers will just mount secrets via a mount like /secret in the container.

The project is brand new and we'd love to hear your feedback: https://github.com/asteris-llc/vaultfs
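
For readers wondering what consuming a mounted secret looks like from inside the container, here is a minimal sketch. It assumes only what is said above, namely that secrets appear as files under a mount such as /secret; the file name `db_password` and the helper `read_secret` are made up for illustration and are not part of vaultfs or Keywhiz.

```python
from pathlib import Path


def read_secret(name: str, root: str = "/secret") -> str:
    """Return the contents of one mounted secret file.

    `root` defaults to the /secret mount point mentioned above; `name` is a
    hypothetical file name chosen by whoever mounts the volume.
    """
    root_path = Path(root).resolve()
    secret_path = (root_path / name).resolve()
    # Refuse names such as "../etc/passwd" that would escape the mount.
    if root_path not in secret_path.parents:
        raise ValueError(f"secret name escapes the mount: {name!r}")
    # Strip the trailing newline most tools add when writing the file.
    return secret_path.read_text(encoding="utf-8").rstrip("\n")
```

An application would call something like `read_secret("db_password")` at startup. Since the secret is just a file, the same code should work whether the mount comes from FUSE, a Docker volume driver, or a Kubernetes secrets volume.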

Nice! The FUSE mounting method for obtaining secrets is similar to how Keywhiz does this. Very cool and novel solution. Though, if you are unfortunate enough to still have Windows servers in your architecture, I think you're out of luck.

We run a PHP hosting platform and that tickled us as well. We were especially upset with the conventional wisdom that storing secrets in ENV vars is a good idea; in PHP those vars are easily exposed. See our blog post (http://blog.fortrabbit.com/how-to-keep-a-secret), where we suggested:

1. Create a secret key and store it with the code of your App.

2. Store the encrypted credentials in env vars.

Later on we even launched our own solution for our clients, an app_secrets.yml file, which can be edited via Dashboard. http://help.fortrabbit.com/secrets

The nice thing is that this file is managed partly by the platform, for its own credentials, and partly by the user.

That has been running for a while now. The adoption rate has been low so far. It turned out that not everything fits into that ONE vault: Blackfire.io and NewRelic run as PHP extensions, so their API keys are stored with the extension settings.

We have also discussed implementing an open source "Secret as a Service", but came to the conclusion that this can too easily turn into a SPOF.

I am amazed that this topic is getting discussed again and I have learned about many new concepts here.

I wrote some of my thoughts on the topic, and the primary motivation behind SOPS [1] (which uses PGP and KMS): https://jve.linuxwall.info/blog/index.php?post/2015/10/01/In...

The initial trust problem boils down to trusting the API that controls the provisioning of your infrastructure. Failing that, you have to ask a human to manually authorize new nodes to retrieve secrets (that's how puppet approves new agent certs).

[1] https://github.com/mozilla/sops

A simple solution if you are in AWS is S3 with instance profiles for access.

This is the solution I've come to use as well. Role-based access in AWS makes a lot of these types of things really nice. Too bad it's not enabled for everything. For example, their hosted ElasticSearch service doesn't yet work with VPCs, and using the role-based access is tough (though possible).

On the subject, I typically store a file containing env variable export statements on S3. When the box is provisioned, the file is downloaded to it. Since the box has role-based access, there is no point in downloading and deleting the file: any process on the box can download it again from S3 at any time. Basically, I trust that the EC2 instance will remain secure. Then the file is source'd in any context where my application code will run.

For applications outside of AWS, I just keep a local non-version-controlled copy of the secrets, and then upload them to the server when I provision it.
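
If the application cannot rely on a shell to source the file of export statements, the same file can be read directly. The sketch below is only an illustration of that idea: `parse_exports` and `load_into_environment` are hypothetical helpers, and the parser handles plain `export NAME=value` lines with shell-style quoting but no variable expansion or command substitution.

```python
import os
import shlex


def parse_exports(text: str) -> dict[str, str]:
    """Parse lines of the form `export NAME=value` into a dict."""
    result: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith("#"):
            continue
        # shlex handles shell-style quoting, e.g. export KEY="a b".
        tokens = shlex.split(line)
        if tokens and tokens[0] == "export":
            tokens = tokens[1:]
        for token in tokens:
            name, sep, value = token.partition("=")
            # Ignore anything that is not a NAME=value assignment.
            if sep and name.isidentifier():
                result[name] = value
    return result


def load_into_environment(text: str) -> None:
    """Copy the parsed variables into this process's environment."""
    os.environ.update(parse_exports(text))
```

As with sourcing the file, this adds no protection on the box itself; it only keeps the values out of the code repository.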

Yes, but is there a succinct howto on this?

The short version is:

1) Create an S3 bucket. Remove all permissions from it

2) Create an IAM role - give it explicit read permissions to just that bucket (there's a HOWTO at the bottom of this article: http://mikeferrier.com/2011/10/27/granting-access-to-a-singl...). When you start an ec2 instance, you can give it one (and only one) IAM instance role.

3) Put your secrets or configs in a file on that bucket. For example, config.json or whatever format you choose.

4) On your instance or container, use the aws-cli when your app starts to copy that file down from S3, then read it into memory in your application and delete it.

It's a bit of a hack but you can now easily restrict access to that secrets bucket, and only your running instances/containers can access it. The secrets only exist in running app memory. Now don't allow SSH access to those instances :)
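
A rough sketch of step 4, under a few assumptions: it uses the third-party boto3 library instead of the aws-cli, reads the object straight into memory so nothing has to be deleted from disk afterwards, and expects the file to be JSON as in the config.json example. The helper names and any bucket name passed in are placeholders.

```python
import json
from typing import Callable


def fetch_from_s3(bucket: str, key: str) -> bytes:
    """Download one object using the instance role's credentials.

    Requires the third-party boto3 package, imported lazily so the rest of
    this sketch runs without it.
    """
    import boto3

    response = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


def load_config(fetch: Callable[[], bytes]) -> dict:
    """Parse the downloaded bytes as JSON without writing them to disk."""
    return json.loads(fetch().decode("utf-8"))
```

At startup this could be called as `load_config(lambda: fetch_from_s3("my-secrets-bucket", "config.json"))`, where the bucket name is made up. Passing the fetch step in as a callable keeps the parsing testable without AWS access.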

I'm somewhat naive regarding S3. If data is in RAM, can you prevent it being swapped to disk and read by an unauthorised user?

(I guess "RAM" and "disk" are virtual entities, but hopefully the spirit of the question still applies.)

As the sibling comment to mine points out, the fact that the instance has access to S3 means it's not actually secure - they could just use the aws-cli to copy the file back down again. My comment about deleting the file from disk was a bit silly and doesn't add any true security.

Really, you need to just make sure that the instance is secure. The point of this whole setup is not to make secrets unobtainable if someone compromises your app server; it is to prevent you from checking in production database passwords and secrets to your code repository.

1) Have an EC2 instance with a role-specific IAM Role

2) Create a S3 bucket

3) Write a bucket policy that whitelists specific IAM Roles to specific key paths within the bucket.

4) Put secrets in that bucket (duh)
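
Step 3 can be sketched as a small generator for the bucket policy document. This is an illustration only: the role ARNs, bucket name, and prefixes are placeholders, and the policy only acts as a whitelist if nothing else (for example, a broad IAM policy attached to another role) grants access to the same bucket.

```python
import json


def bucket_policy(bucket: str, grants: dict[str, str]) -> str:
    """Build a bucket policy that maps each IAM role ARN to one key prefix."""
    statements = []
    for role_arn, prefix in sorted(grants.items()):
        statements.append(
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                # Read-only: roles can fetch secrets but not change them.
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix.strip('/')}/*",
            }
        )
    return json.dumps({"Version": "2012-10-17", "Statement": statements}, indent=2)
```

For example, `bucket_policy("example-secrets", {"arn:aws:iam::123456789012:role/web": "web/"})` yields a policy that lets the hypothetical `web` role read only keys under `web/`.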

We have an open source solution called Cryptex[0] to handle this. It's better explained by this blog post[1] that gives the thinking and configuration necessary for most scenarios.

[0]: https://github.com/TechnologyAdvice/Cryptex

[1]: http://technologyadvice.github.io/lock-up-your-customer-acco...

There's also blackbox by StackExchange https://github.com/StackExchange/blackbox

For AWS users, KMS's GenerateDataKey is a simple way to store secrets locally in a way that reuses your IAM policies. You can also use grants and EncryptionContext to restrict the ability to decrypt secrets in a very fine-grained manner. As a bonus, all decrypts are logged in CloudTrail. The KMS docs are awful but if you're on AWS then it is worth checking out!

Conjur (https://www.conjur.net/) has been working well for the users of our PaaS. It's a self-hosted commercial product.

Is there any detailed opinion on good and bad approaches to this problem? It is clearly a hot topic.

I'm sure each product listed conforms to one from a small set of design patterns. Has any credible analysis of these designs been published? Are competing offerings likely to evolve toward a stable converged solution? Or is there something in this problem that remains fundamentally unsolved?

I appreciate it's turtles all the way down, but I'm wondering if anyone has proven the merits of some approaches over others, or components within the approaches at least.

We use https://github.com/meltwater/secretary . The key difference with Secretary is that plaintext secrets are never stored to disk or otherwise made visible outside the container.

We ended up creating https://github.com/meltwater/secretary to allow storing encrypted secrets in config files checked into Git by devteams. The encrypted secrets are passed as env vars through the continuous delivery pipeline, Mesos/Marathon and into containers. They're then decrypted and injected into the app environment at runtime, safely inside the container.

At startup the container reaches out to the Secretary daemon that holds the master keys, using public key crypto to authenticate itself. The Secretary daemon uses Marathon to authenticate containers (checking their public keys stored in env vars) and to validate that they're authorized for the specific secret in question (checking that the encrypted secret is indeed part of the container's env vars).

Meaning that Marathon is the single source of trust of which container can access what secrets. The problem then becomes controlling who and how changes are made to the Git repo containing the CD config, which is something Github does well with roles, status/deployment API and pull requests.
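
The two checks described above can be reduced to plain lookups against the env vars Marathon holds for an app. The sketch below is not Secretary's actual code: the function name and the `SERVICE_PUBLIC_KEY` variable name are invented, and a real daemon would verify a signature proving the caller holds the matching private key rather than just comparing key strings.

```python
import hmac


def is_authorized(
    app_env: dict[str, str],
    presented_key: str,
    encrypted_secret: str,
    key_var: str = "SERVICE_PUBLIC_KEY",  # hypothetical env var name
) -> bool:
    """Decide whether a container may have one encrypted secret decrypted.

    `app_env` stands for the env vars that the scheduler reports for the app.
    """
    expected_key = app_env.get(key_var, "")
    # Check 1: the caller's public key is the one registered for the app.
    key_matches = bool(expected_key) and hmac.compare_digest(
        expected_key.encode(), presented_key.encode()
    )
    # Check 2: the encrypted secret really is part of the app's env vars.
    secret_is_listed = encrypted_secret in app_env.values()
    return key_matches and secret_is_listed
```

The point of the second check is that a container can only ask for secrets that were deployed with it, so control over the deployment config is what decides who can read what.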

We had a similar problem as some describe with the distribution of the initial secret (i.e. Vault token) and one time Vault tokens being cumbersome in dynamic scaling envs. We didn't want the cleartext token ending up in config files nor in the https://github.com/meltwater/lighter config we use to drive our continuous delivery pipelines that go into Mesos/Marathon. We also had some other aspects like

* Wanting to keep secrets, app config and code versions promoted together throughout our deployment pipelines. Seeing secrets as another type of app config, we wanted to track all config and versions for an app in the same way, in the same place, to avoid mismatches or deployment dependencies.

* Wanting to enable our very independent devteams to easily manage secrets for their services, the same way as they manage the app config, versions and rollout of their services, and to delegate management of what service is authorized for what secrets to devteams (with both automated checks for unencrypted secrets, and some gentle manual coaching post-commit).

* Versioning and rolling upgrades for secrets? E.g. how to roll out a new secret in a Marathon rolling upgrade? Creating and managing versioned keys in Vault seemed somewhat cumbersome.

Perhaps something like that could be used to solve your initial secret distribution problem or even handle the secrets themselves until Vault has solved the initial secret problem..?

Azure Key Vault! Disclosure: am dev in Azure, although not on this specific product.


Azure Key Vault is a great component, but it's a component, not a solution. For example, Key Vault's hardware "secrets check in but they don't check out" capability is awesome for preventing disclosure of secrets, but if you don't have a system for adequately managing who or what can use the contained key to sign messages, all you've done is add a complex and pricey piece of security theater. (As I mention elsewhere, our primary concern is making sure whatever secret management we use helps us defend against at least the early stages of a compromise of our infrastructure.)

Cool, thanks! Seems like a pretty direct competitor to AWS KMS. The pricing is identical, so I guess the choice between the two is quite obvious if you are hosted in Azure or AWS.

Because we are already using Consul we went with Vault. We are using it in a POC type setup now (only for a few services/scripts) but so far it's been pretty easy to work with. The API is fairly easy to use, apart from the fact that there is no search function (or wasn't last time I checked). The documentation could be better, but since it's a public project and I haven't submitted anything I'm not going to bash that :) The fact that it's a single binary is another thing we liked: just drop it somewhere and run it.

I'm pretty happy with this solution from Strongauth.


You can secure the root for it with TPM or HSM.

That's an interesting solution. I guess this would really only work though if you are self hosted, right?


We're using Ansible which means we use ansible-vault to store secrets. We store the encrypted files in S3 and decrypt them on deploy as needed.

So if you potentially need to roll a secret you would just run your deployment playbook limited to the secrets task?

When I am building stuff in AWS, most of the secrets are for entities that are access-controlled by AWS and can be passed in through server roles.

Another option for your list, which you'll have to evaluate for your use case: https://wiki.openstack.org/wiki/Barbican

I've been evaluating most of these same options for my use case, but haven't made any decisions yet.

Nice, thank you! I haven't heard of that one.

It also depends on how secret it needs to be. For most of our secrets (those used for configuration) we use Consul.

I think this can be sane when you don't have multiple privilege levels anywhere in the data center you're deploying in. It's less sane if you have less- and more- privileged machines anywhere in the environment, or less- and more- privileged applications.

You're putting a lot of faith in a very complex and not- well- tested codebase if you rely on Consul ACLs to protect secrets.

The poor state of its testing is the biggest red flag I have towards Consul. I'm much more positive about it in its way than I am about other Hashicorp tools like Packer and Terraform, if only because it seems like Consul is core enough to the way they want to make money that it's more important to them. But there doesn't seem to be a culture of correctness and strong testing around those tools; trusting my sensitive data to a tool that's as complex and complicated as Consul is worries me. (I feel like it should be normal to have something maintaining my cryptographic secrets to be at least as well-tested as my web framework...)

Of the tools listed in the OP, I feel really good about Square Keywhiz; I'm still rolling it out in my first environment, so I can't say for sure, but I appreciate the level of effort that's gone into only doing secret storage and making sure it is exhaustively tested to spec.

Do you take advantage of Consul's ACL system then for limiting access to secrets? Also, do you have any form of auditing when using Consul?

Thanks for your input!

Can't speak for the parent poster, but over here, yes, we use Consul's ACL. It's pretty solid and easy to use, and the GUI helps a whole lot. In terms of auditing, I've not dug too deeply into that, but there is really good logging.

I used Marathon and Mesos and rolled my own pub/priv encryption for our developers' JSON (encrypting the ENV parameters POSTed to Marathon): https://github.com/malnick/mantle

You can encrypt and deploy secrets using Distelli.


Disclaimer: I'm the founder at Distelli

This is not a trivial thing. Why should I trust you with my company's secrets?

How do you manage key storage securely? Can people at your company see my secrets? If somebody comes with a court order will you give them my secrets and not tell me? What encryption algorithms do you use? What experience do you have in reducing attack surfaces from internal and external threats? Is any of your software open source? Has your software been audited? Is it PCI (or any other standard) compliant?

All good questions. We do not store your secrets. You do not give us your secrets. Your secrets do not live on our servers. No one at our company can see your secrets or access them.

We provide you with an agent that you install on your own servers and that agent is marked as a key management server. That agent is contacted to do asymmetric key encryption.

Here is a more detailed blog post about this: https://www.distelli.com/blog/keeping-your-application-secre...

Also we use standard encryption algorithms and have not written our own crypto (and never will).

What is the tradeoff matrix for these kinds of services? They all seem pretty similar to me.

It's a huge pain point for us. We're a .NET shop rolling our own solution that mimics/overlays the app.config and web.config patterns for both dev and production usage. Our concern is less about how you get the secrets to the box (though that's obviously important) and more about how you keep an attacker who has started penetrating your infrastructure from gaining control of the infrastructure that holds your secrets.

Have you had a look at the new ASP.NET Configuration classes? [1]

I hate having to manage web.config but I get your point about keeping attackers at bay (and not providing pivot points).

[1]: http://docs.asp.net/en/latest/fundamentals/configuration.htm...

Thanks - it's clear MSFT is working hard to get to a place where secret management is a first class part of the dev process, and we're attempting to integrate with the classes you mention. As I understand it, though, they only work with ASP.NET 5, so you can't use them in console app based test harnesses, Windows services, etc. That means we end up needing a bunch of provider mechanisms, all essentially the same in principle but with different implementations for the different platform details. If it's not super easy for the dev to drop into a quick test app, they'll "just copy and paste the secrets for now", which is always the path to darkness.

It's 100% possible to use the new ConfigurationBuilder class outside of an ASP.NET site (for example, in a console application): http://stackoverflow.com/questions/31885912/how-to-read-valu...

Maybe a local (LAN?) NuGet feed with your preferred mechanism would help with the dev experience? You could even go as far as deploying custom templates. I don't think the new Configuration classes are limited to ASP.NET any more - and from the response I got on the issue tracker, if you find any hard dependencies report them as a bug.

That's very similar to the problem we are encountering. Getting the secrets to the machines at deploy time isn't too bad, but then they are available to a potential attacker.

Accessing secrets as needed at runtime instead requires some kind of extremely reliable service nearby. This is what I find most concerning about Vault since it can lock on you if the cluster goes down.

I'd choose the one that I am the most comfortable with and is less obtrusive to the rest of my stack.

Whatever you choose, make sure you are comfortable with it, it's easy to deploy and work with.

Thanks for the tip. I guess the easiest thing to do is use git-crypt with some encrypted file and have the secrets available at deploy time, but I'm worried about long term disadvantages to this approach. Rolling secrets would then require a deployment of at least that secrets file and restarting the services, or writing them in a way they read the file every time they need the secret.

Since our stack isn't on AWS, it kind of throws out AWS KMS and Lyft Confidant (since it is built on AWS). I'll keep digging into Vault and the other options put forward in this thread. Thanks again.
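
On the idea of having services read the secrets file whenever they need the secret, so that rolling a secret does not require a restart: one possible shape is sketched below. It is a variation on "read every time", since it checks the file's modification time on each call and only re-reads when it has changed. `ReloadingSecret` is a made-up name, and the file is assumed to be already decrypted at deploy time.

```python
import os


class ReloadingSecret:
    """Re-read a secret file whenever its modification time changes."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._mtime_ns = None  # no version of the file has been read yet
        self._value = ""

    def get(self) -> str:
        """Return the current secret, reloading it if the file changed."""
        mtime_ns = os.stat(self._path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            with open(self._path, encoding="utf-8") as handle:
                self._value = handle.read().strip()
            self._mtime_ns = mtime_ns
        return self._value
```

Callers would hold one `ReloadingSecret` per file and call `get()` at the point of use, for instance when opening a new database connection, instead of caching the value at startup.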
