Secret Management with Vault (seatgeek.com)
209 points by josegonzalez on Oct 28, 2016 | 59 comments

Vault has been excellent; it (and Consul) runs on Solaris/SmartOS and is available through pkgsrc. Vault has been one of the most set-it-and-forget-it pieces of my infrastructure: it "just works" to hold and expire secrets.

I use it to hold all my LetsEncrypt certs: my private keys are stored in Vault, and I never have to expose them, even to myself. I created the Node.js package ten-ply-crest [0] (Express middleware) to handle this for me [1], using indutny/bud [2] as the TLS terminator with SNI.

[0]: https://github.com/nextorigin/ten-ply-crest

[1]: https://github.com/nextorigin/ten-ply-crest/blob/master/src/...

[2]: https://github.com/indutny/bud

Interesting. We're currently mostly using it for application secrets, but I'll definitely be looking at what other secrets we can move to Vault from Blackbox, if only to consolidate on one secret management solution.

At work, we are using Vault and have nothing but good things to say. It is a pleasure to work with. Use Consul as the storage backend and it is instantly highly available. There is a bit of a learning curve, and common implementation strategies aren't very well spelled out. However, this is all fairly easily deduced from reading the documentation. A+, would recommend.

The current problem I'm facing in figuring out how to implement Vault is how to tie it into Chef, which is our de facto CFM. The initial trust phase is kind of hard to figure out.

This is a common question; a good solution is the "cubbyhole" technique.


I prefer the "pull" model; it can be done with a few lines of code in the CD process: first from the deployer to authorize a new instance, and second on the app/instance to request the token.

I have read the cubbyhole doc, and I understand there are two authentication mechanisms, as the doc mentions: machine-oriented and user-oriented. Can you explain how this helps integrate with config management such as Chef, Puppet, etc., as the OP above was asking? For some reason I'm just not getting it from reading the HashiCorp doc. Thanks.

Have the temp token generated as part of the deployment process. This temp token can be shown in plaintext to the developer/CD machine/Chef/logs because it is use- and time-limited. If anything besides the new app uses the temp token, the deployment will fail, and the token usage can be easily traced to the offender.
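To make the temp-token idea concrete, here's a toy Python simulation of a use- and time-limited token (this models only the concept; real Vault response wrapping goes through the cubbyhole backend, and all names below are made up):

```python
import secrets
import time

class OneTimeTokenStore:
    """Toy model of a wrapping token: valid for a single use
    within a short TTL. Not Vault's actual API."""

    def __init__(self):
        self._tokens = {}  # token -> (payload, expiry)

    def issue(self, payload, ttl_seconds=60):
        token = secrets.token_hex(16)
        self._tokens[token] = (payload, time.monotonic() + ttl_seconds)
        return token

    def redeem(self, token):
        # pop() makes the token single-use: a second redeem fails
        payload, expiry = self._tokens.pop(token, (None, 0.0))
        if payload is None or time.monotonic() > expiry:
            raise KeyError("token already used, unknown, or expired")
        return payload

store = OneTimeTokenStore()
t = store.issue({"role": "web-app"})
first = store.redeem(t)   # the new app redeems it exactly once
# any later redemption raises, so a leaked token in a log is
# harmless after deployment and a misuse is immediately visible
```

If anything else redeems the token first, the app's own redemption fails, the deployment halts, and the log trail points at the offender, which is the property described above.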

Regarding machine- or user-oriented: it just depends on whether you trust the deployer (user) or the deployment machine to authorize a new temp token.

A client of mine has a number of windows machines, and a well-organized Active Directory setup. The small group of users who need to be able to create deployment tokens are all in an AD group, and it was pretty straightforward to use the LDAP auth backend with vault to allow them to create those one-time use tokens using their normal network logins. Everyone's super impressed so far.

I'll look into this more. I'm a simple guy so hopefully my initial forays produce fruit easily. Maybe I'll luck out and there will be a blog post from someone on how to implement this specifically with AWS and Chef.

There's an AWS-based auth method built on IAM roles and AMI IDs. It takes a few minutes to set up.


I see, that makes sense. Thanks for the good explanation.

Where are your services hosted?

AWS. I thought about using the AWS EC2 auth provider to grant a temporary lease that is fed into consul template so that template can pull down additional information.


But it's all in my head and I haven't gotten around to planning this out. It looks like once the initial trust is granted, hooking up consul-template and Chef is pretty straightforward. At least, that's what I got from Seth Vargo's post on using Chef and Vault.

"As Vault is run within our internal network (and for other reasons), TLS is disabled. "

No. No. No. No. No. No. This should NEVER be an option. You should not allow data to pass unencrypted over the wire, period.

SG Operations member here.

We're extremely aware that this isn't ideal, and it's more or less the first thing we're working on fixing. It's why we have this listed both under "Causes for Concern" and "Strategic Improvements". The issues in those lists aren't ordered by importance :)

Cf. the NSA tapping Google's and Yahoo's internal networks because the data passed in plain text. [0]

[0] https://www.washingtonpost.com/world/national-security/nsa-i...

Theory: ALWAYS encrypt. No exceptions. Don't let software go into production if it's not.

Practice: Always encrypt. Ah wait... what do you mean, "TLS is not supported"? Are you saying that half of our applications have been running on bare HTTP for years? Mehhhh. Well, all our publicly accessible websites are running on HTTPS, right? Right. Guess it's only half a disaster after all :(

You don't know anything about their internal network.

TLS is not the only game in town.

Thank you for doing this before I had a chance to!

At Dollar Shave Club we use Vault in a very similar fashion to what's described in this article. It's a really great product, and well worth the upfront investment in terms of migration, workflow adjustment, etc. We may be open sourcing some tooling we've created around AppID authentication (now deprecated, sadly) in the near future.

SG Operations member here:

That mirrors our thoughts exactly - also would be great to hear about some of your tooling! - though at the moment there didn't seem to be a pressing need to move to the AppRole workflow. Our usage would be fairly similar if we did move, so I assume that we'll be revisiting this once it becomes clearer as to how we can better take advantage of AppRole.

AppRole should work the same as AppID.

TLS really shouldn't be disabled, especially for SECRETS. I'd fix that right away!

I've been very pleased with Vault. Our biggest hurdle has been working out the TLS fun: Vault TLS certs from the PKI backend expire in ~32 days, but Vault is a long-running service, and you don't really want it restarting and having to be re-unsealed all the time. So we have our own CA out on disk, and Vault's PKI holds an intermediate cert. This gives us the ability to generate long-term certs for things like Vault itself while still using Vault's PKI infrastructure for most TLS certs, and we only have to keep track of one CA cert internally.

SG Operations member here:

Yep, that's more or less what we're planning on doing. The documentation released was the result of the initial deploy. We prototyped integration with a few different areas of our infra, and we're at the point where we'll be making strategic improvements as we roll out the integration everywhere.

In the interest of being fully transparent, the engineer in charge of the project listed as many issues - small or large - with their initial implementation so we could figure out how to prioritize each one before moving forward.

Awesome, good luck. We had Vault deployed (but unused) for a few months while trying to figure out how to use the Vault PKI backend to secure Vault itself. We finally gave up and did the external-CA solution, with Vault holding an intermediate. We actually have two intermediate keys: one that Vault uses, and one that long-running services (like Vault) use for their TLS keys.

We used https://jamielinux.com/docs/openssl-certificate-authority as a guide to get that all going. Good luck! Otherwise, awesome write-up. I wish we could share more of what we do internally, but it's complicated... getting permission! :)

That will be useful, and I'll certainly pass it along to the engineer in charge of this initiative. Hopefully we can have a follow-up soon about how we implemented TLS (and maybe some code).

Good luck on getting permission! I'll be on the lookout for when you can post this stuff :)

We are considering this at work, but there seems to be a problem: the secrets only last for a fixed amount of time.

How do people handle refreshing secrets on servers which maintain a connection to a database once their creds expire?

You could rotate the key (make the new secret available) before expiring the old one. During that window (which can even be a few hours) you typically reload your application to use the new credential.
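A sketch of that overlap scheduling in Python (the function and window names here are my own, not anything from Vault):

```python
import datetime

def needs_rotation(issued_at, ttl, overlap, now):
    """True once a credential enters its overlap window: the period
    before expiry during which the replacement secret should already
    be issued and the app reloaded onto it."""
    expires_at = issued_at + ttl
    return now >= expires_at - overlap

issued = datetime.datetime(2016, 10, 28, 12, 0)
ttl = datetime.timedelta(hours=24)       # lease length on the secret
overlap = datetime.timedelta(hours=2)    # both secrets valid here

# 21 hours in: still safely on the old credential
early = needs_rotation(issued, ttl, overlap, issued + datetime.timedelta(hours=21))
# 23 hours in: inside the 2-hour overlap, rotate and reload now
late = needs_rotation(issued, ttl, overlap, issued + datetime.timedelta(hours=23))
```

The key point is that the old credential keeps working for the whole overlap window, so the reload never races the expiry.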

If you are able to change your application's code, you can integrate with Vault's API directly, which is the cleanest solution. If not, you can use [consul-template](https://github.com/hashicorp/consul-template) or [envconsul](https://github.com/hashicorp/envconsul) to securely introduce your secret, which entails reloading/restarting your application.

I think rotating with a little short-term redundancy is the simplest solution. That's what I've used myself too.

Sounds like it might be database or implementation-specific. From quick Googling I found https://groups.google.com/d/msg/vault-tool/6jhwtqkfpA8/IK2_H... and http://stackoverflow.com/questions/38579168/spring-boot-jdbc... -- see also, periodic tokens: https://github.com/hashicorp/vault/blob/master/website/sourc...

I look forward to other answers though, if anyone's currently set this up. I haven't used it yet myself, but I've been considering it.

The idea is that Vault creates/deletes/rotates the credentials that are used, so the timeframe for exposure is limited. If your application can't re-connect to a database and drain the prior connection pool (or remove itself from a load balancer while restarting; whatever works for you), then the credentials have to be long-lived.

In that situation I would check out periodic tokens[0] since they live as long as they're renewed within the TTL.

0. https://www.vaultproject.io/docs/concepts/tokens.html#period...
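A toy model of how a periodic token behaves (this simulates only the renewal semantics; it is not Vault's API):

```python
import time

class PeriodicToken:
    """Toy model of a Vault periodic token: each successful renewal
    resets the TTL to the fixed period, so the token stays alive for
    as long as something keeps renewing it within the period."""

    def __init__(self, period_seconds):
        self.period = period_seconds
        self.expires_at = time.monotonic() + period_seconds

    def renew(self):
        if time.monotonic() > self.expires_at:
            raise RuntimeError("token expired; renewal came too late")
        # renewal resets the clock to a full period, every time
        self.expires_at = time.monotonic() + self.period

token = PeriodicToken(period_seconds=3600)
token.renew()   # renewing within the period extends it...
token.renew()   # ...indefinitely, with no maximum lifetime
```

This is what makes periodic tokens a fit for long-lived services: the service only has to keep a renewal loop running, rather than re-authenticate when a hard max-TTL is hit.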

If you mean the generic backend (a place to store secrets), they do not last a fixed amount of time; they are there forever. There is a TTL, but Vault does not erase anything that outlives its TTL. Locally we just touch the TTL whenever we access a secret, in case someday TTLs are enforced and secrets get erased.

Tokens, however, do get killed at the end of their TTL. But you can renew tokens forever if you allow for that.


I considered Vault for a recent project but ultimately chose not to use it. I'd like to hear more discussion around the alternatives to Vault that you considered (there are many for different use cases) and why you ultimately chose Vault.

We initially looked around, and here are a few alternatives:

    - Azure Key Vault: https://azure.microsoft.com/en-us/services/key-vault/
    - Blackbox: https://github.com/StackExchange/blackbox
    - CredStash: https://github.com/fugue/credstash
    - Lyft Confidant: https://github.com/lyft/confidant
    - Trousseau: https://github.com/oleiade/trousseau
    - Sneaker: https://github.com/codahale/sneaker
Some of these didn't exist when we were first investigating proper secret management, while others didn't fit as well with our deployment strategy. We're using what works when it makes sense - Blackbox is used in some repos, for instance - but I'll make sure that the next time we have an internal doc like this, we cover why we didn't choose a different set of tools.

One thing we liked about Vault is that it builds upon our usage of Consul, and its HTTP interface meant it was simple to plug into how we build and deploy services. While I'm sure some of these other tools would have had workflows just as good, our experience administrating Consul made it easier for us to have confidence in Vault.

I recently helped build a secret store system for our infrastructure, and we decided to not use Vault.

A big reason was that Vault's AWS authentication backend is not based on AWS infrastructure like IAM/KMS, but uses a somewhat backhanded method (https://www.vaultproject.io/docs/auth/aws-ec2.html) to verify an EC2 instance. We use ECS, and Vault doesn't play well with it - see https://github.com/hashicorp/vault/issues/1298

Instead, we would have had to fall back to the App ID method, which requires separate configuration and is "Trust On First Use", so it doesn't offer as strong security guarantees, in my opinion.

Also, the only HashiCorp-supported backends are file (non-HA) and Consul.

If you're all-AWS, I'd recommend checking out Confidant/Knox (run as a separate service) or Credstash/Biscuit (run directly against AWS infra).

We ended up just rolling something with AWS KMS and DynamoDB. Vault looks great though, it's just overkill for our needs ATM. One day maybe.

Why did you decide not to use Vault? Did it fail to meet a requirement? Overkill? Learning curve? Other?

Yes, mostly overkill and team learning curve. I just needed a simple, higher abstraction for storing API keys and secrets.

Integration with a CI bot is another consideration.

My issue with Vault is that it has no history. If someone goes in and changes a password from 'foo' to 'bar', I have no way to know it used to be 'foo'. In a production environment where the password might be stored in an internal user database of an application (MySQL, RabbitMQ, etc.), not having history is a no-go.

From https://www.vaultproject.io/docs/audit/

  Because every operation with Vault is an API request/response,
  the audit log contains every interaction with Vault, 
  including errors.

  The data...will be hashed with a salt using HMAC-SHA256.

  The purpose of the hash is so that secrets aren't in 
  plaintext within your audit logs. However, you're still
  able to check the value of secrets ...by using the 
  /sys/audit-hash API endpoint
Does this not cover your use-case?

Is the audit log insufficient? https://www.vaultproject.io/docs/audit/index.html

For security reasons it doesn't include raw secrets, but the hash is enough to tell you if it matches some known value.
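A sketch of how that check works, assuming plain salted HMAC-SHA256 as the docs describe (the exact output framing of real Vault audit entries differs; this is illustrative):

```python
import hashlib
import hmac

def audit_hash(salt: bytes, value: str) -> str:
    """Salted HMAC-SHA256, the scheme Vault's audit log uses so that
    secret values never appear in plaintext in the log."""
    return hmac.new(salt, value.encode(), hashlib.sha256).hexdigest()

# hypothetical per-audit-backend salt and a logged entry
salt = b"per-audit-backend-salt"
logged_entry = audit_hash(salt, "bar")

# To answer "did this secret used to be 'foo' or 'bar'?", hash each
# candidate the same way (that is what /sys/audit-hash does for you)
# and compare against the audit log:
matches_bar = audit_hash(salt, "bar") == logged_entry
matches_foo = audit_hash(salt, "foo") == logged_entry
```

So the log can't leak the plaintext, but anyone who already knows (or guesses) a candidate value can confirm whether it appears in the history.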

"pass" [1] is a git-based command-line secrets manager and has many third-party GUI, web and mobile app interfaces around.

[1] https://www.passwordstore.org/

Have you integrated it with Vault? Are there any detail on doing so?

I've not used it with Vault (I don't know much about Vault) nor in a group, but it is basically a tool for managing a bunch of gpg-encrypted files in a given directory tree; it uses git for version control and database distribution/synchronisation, and it allows encrypting for multiple GPG recipients. See the man page [1], especially the FILES section.

[1] https://git.zx2c4.com/password-store/about/

Haven't done it myself, but I would probably have a git post-receive hook trigger a service (Jenkins?) to pull down and decrypt the secret repo and update Vault.

I've been using SO's Blackbox, which feels like a poor man's hack, but it solves the problem really easily until a "proper" solution is decided on and implemented.


SG Operations member here:

Funny enough, so do we. We actually use both Vault and Blackbox for building out our BaseAMI, though for different purposes depending upon the nature of what is being encrypted.

One thing I like about Vault is that it allows us to rotate keys without needing to rekey all encrypted secrets. Being able to expire certain secrets - like database credentials - means we can rotate these at will without needing an extra git commit.

I highly recommend using something here, and I commend you for at least encrypting with blackbox :)

Why do you think blackbox feels like a hack? Just because it uses a repo at the end of the day?

I love it; I think it's both solid and elegant, and quite portable (needs bash + gpg), but your average enterprise architect will call it a hack and start comparing it to a multi-k$ solution that needs its own server or three, has a fancy GUI, etc...

In general, is Vault well suited for things like PII data for millions of users (OAuth2 tokens, Facebook IDs, etc.)?

I'm not a security expert, so I'm wondering whether Vault encapsulates the security best practices needed to store lots of sensitive data in production.

For anyone who cares, Azure Key Vault is a turnkey solution for this problem.

Works well for me.

SG Operations member here:

Definitely worth looking at solutions from the hosting platform your infrastructure is on. In our case, we don't use Azure, so going across the net for secrets was less than ideal.

Oh certainly. But it's a once-per-app-startup operation, so it's not a huge deal in terms of performance. It is somewhat less secure, of course, since the secrets have to travel across multiple network boundaries, but they're encrypted over the wire, so it may not be a huge deal in that realm either.

Performance isn't an issue for us, since the existing processes stay alive until we've healthchecked new processes. That it's encrypted alleviates some concerns, but given the number of deploys we do a day - on over 50 services and counting - I'm hesitant to have secrets go over the network so often.

That said, if you're in Azure, it's certainly the easiest way to go, and definitely not a bad approach.

At my employer we considered Vault. It's easily the best-in-class right now for free vaulting of credentials but it's missing a number of features and I disagree with a few implementation details...

Firstly, it doesn't actually manage credentials for you (other than for AWS, a bunch of databases, and a few other things). It's not going out and logging into your tens of thousands of Linux hosts, making sure all the root passwords are vaulted, unique, and must be checked out before use. This is where products like CyberArk's come in (we use CyberArk, and it's crap, to be honest; it doesn't scale, its API is broken--REALLY broken--and its architecture is some of the worst junk I've seen in years, and that's just scratching the surface of how bad it is!).

Another problem with Vault is that it exposes too much information about the secrets via their URL/path. This is a very minor concern of mine and probably shouldn't impact your decision to use the product, but here's what I'm talking about: the article author says that they're going to use this model for their paths:

Here's my problem with this: if any configuration information is ever exposed, an attacker will know that this application is part of ENVIRONMENT and has access to the keys for APP. If the host or container running this application were compromised, this isn't really a concern, since the attacker will probably be able to figure these things out anyway; my concern has more to do with configuration management systems. It's easy to imagine mapping out who-has-access-to-what in the vault just by examining their configurations.

It's the difference between a config file with this:

    "secret_url": "http://<vault>/v1/secret/ENVIRONMENT/APP/KEY",
    "certificate": "/path/to/vault/client_cert.pem"
...and this, my preferred way (I split the vault_url and secret below just to save space since indented lines don't get wrapped on HN):

    "secret": "2a9ac85d-f6f1-4ef5-8d36-65331708623a",
    "vault_url": "https://<vault>/api/v1/secret/{secret}",
    "credential": "whatever" // I'd support multiple auth options
The latter doesn't give an attacker much info at all about what that credential is. Even if they're able to retrieve it, all they'll get back is "<the secret>". "Great, now what do I do with this?"
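A toy illustration of that opaque-identifier model (all names and the in-memory index here are hypothetical, not any real vault's API):

```python
import uuid

# Only the vault's internal index knows which environment/app/key an
# identifier maps to; configs carry just the opaque UUID.
vault_index = {}

def register_secret(environment, app, key, value):
    secret_id = str(uuid.uuid4())
    vault_index[secret_id] = {"env": environment, "app": app,
                              "key": key, "value": value}
    return secret_id

secret_id = register_secret("prod", "billing", "db_password", "hunter2")

# What a leaked config file reveals: an opaque ID and nothing else.
config = {
    "secret": secret_id,
    "vault_url": f"https://vault.example/api/v1/secret/{secret_id}",
}
```

Compare that with a leaked `secret/prod/billing/db_password` path, which tells the attacker exactly what the credential is for before they've retrieved anything.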

How I would do things different:

* Secrets may only be accessed via their unique (e.g. UUID) object identifier. This means that if you want the credential for ENVIRONMENT and APP, you must have the object ID. It cannot be inferred from the path.

* Access to credentials will be controlled via attributes (extra keys/values attached to the same record as the secrets). This is mostly standard attribute-based access control (ABAC) with one exception...

* Attributes can exist in namespaces that are controlled by independent groups. So you can imagine an internal team controlling namespace FOO and attaching `{"foo.admin": ['userx']}` to the secret record for X. If the FOO team wants to grant admin access to their app, they can add a user (aka a subject) to their little "foo.admin" attribute on that particular secret. The point is that the FOO team doesn't need to open a ticket with the one true attribute authority (i.e. traditional ABAC); they can make the change themselves, so the whole thing is entirely self-service. There are zillions of ways to handle namespaced attributes like this, but I personally prefer to stick them right in the secret record itself and control access at the application layer.
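A minimal sketch of that namespaced-attribute idea (all record and function names here are invented for illustration):

```python
# A secret record carries its own access-control attributes, keyed by
# namespace, e.g. {"foo.admin": ["userx"]} for the team owning "foo".
secret_record = {
    "value": "s3cret",
    "attributes": {"foo.admin": ["userx"]},
}

def can_admin(record, namespace, user):
    """ABAC check: is the user listed in <namespace>.admin?"""
    return user in record["attributes"].get(f"{namespace}.admin", [])

def grant_admin(record, namespace, user, acting_user):
    """Self-service grant: only an existing admin of the namespace
    may extend it; no central attribute authority involved."""
    if not can_admin(record, namespace, acting_user):
        raise PermissionError(f"{acting_user} does not control {namespace}")
    record["attributes"].setdefault(f"{namespace}.admin", []).append(user)

grant_admin(secret_record, "foo", "usery", acting_user="userx")
```

The enforcement lives at the application layer, reading attributes straight off the secret record, which is the design preference stated above.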

At my current employer I've been tasked with writing a vaulting and management application from scratch because nothing in the market will suit our needs. We (me specifically) tried really hard to avoid this situation but it seems that nothing that exists currently can do everything. Most notably:

    1. Sane architecture (i.e. something that doesn't send all its
       logs to the same database as the secrets and doesn't run on
       Windows, gahh!  Among other things)
    2. Secure vaulting of secrets (aka "dumb vault")
    3. Can *manage* credentials for various platforms
    4. Provides a user interface for "checking out" secrets
Hashicorp Vault does a great job with #1 and #2 but it doesn't do #3 or #4. We had initially planned to use it as a "dumb back end" for a custom system but it didn't work out for reasons that have nothing to do with faults in the product. That just isn't really what it was meant for.

For reference, here's some of the problems we had with Vault that most won't run into:

    * Uses its own Certificate Authority (my employer only allows
      *one true authority*, sigh).
    * Written in Go which isn't approved.
There were others as well but they were mostly bureaucratic.

Note: In theory Hashicorp's Vault can do full credential management for anything but at this point in time it does not support writing your own custom back-ends. That feature is essential for managing zillions of disparate and possibly proprietary systems. For example, say you wanted to manage admin/root credentials for all Unix-like OSes at your company. All you'd need is a generic SSH back-end that can be scripted (e.g. expect or state machine scripts like CyberArk). Why they haven't added this is beyond me.

The vault we're writing at my employer is actually pretty damned awesome and innovative. I wish I could share more about it but I can't. I'm pushing to make it open source but that probably won't happen.

I mostly agree with you, but the PKI backend in Vault will let you import certificates to use (specifically an intermediate). That's what we did: we have an internal CA, and the Vault PKI backend controls one intermediate certificate signed by that CA, so we can use the Vault PKI backend to create certs on demand while keeping the "One True Authority" CA outside of Vault.

As for #3 and #4, you can write tooling around Vault, using the generic backend to store the data, to accomplish both of those things.

To make credentials more secret in Vault, we'd need an alternative mechanism to somehow hash what we want into the UUID in Vault. Do you have a way to do that which is secure and wouldn't be broken if a box were compromised? The way I see it, if they have admin credentials to Vault, whether or not it's a UUID doesn't matter, as a bad actor can figure that out later.

In our setup, credentials are stored in namespaces, which is equivalent to adding an attribute to the credential and having that attribute be namespaced.

I'm not quite sure how this helps anything except adding an extra level of indirection for both operators and users, but I'm definitely interested in learning more.

Thanks for the nice write-up.

[nit] Token Management
