Why We Need Dynamic Secrets (hashicorp.com)
86 points by mafro 67 days ago | 58 comments



Secret management can be a really infuriating problem. No matter what you do, there never seems to be a good way to balance employees who know passwords leaving, automated processes needing credentials, and other business processes. It's one of the toughest problems I have ever seen without a good solution, at least not one that the business people will accept without complaining that it's too expensive or burdensome.


One of the benefits of dynamic secrets is that access to, say, databases carries a short TTL. Vault manages the lifecycle of these credentials and automatically revokes them once the defined lifetime has expired. To gain access to credentials, a user authenticates to Vault with, say, LDAP; this access can be controlled centrally, with a policy defining access to secrets at the individual user or group level.

Should an individual leave the organisation, the credentials they obtained from Vault to access a datastore would expire automatically; the normal process would apply to remove them from LDAP and disable their ability to acquire further credentials.

There is always a process problem with managing secrets, but dynamic secrets in Vault eliminate long-lived secrets and reduce unofficial password sharing.
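
For anyone who wants to see the shape of that flow, here's a minimal sketch using Vault's official Go client (github.com/hashicorp/vault/api). The LDAP user, password, and the "readonly" database role are hypothetical:

    package main

    import (
        "fmt"
        "log"

        vault "github.com/hashicorp/vault/api"
    )

    func main() {
        // The client picks up VAULT_ADDR etc. from the environment.
        client, err := vault.NewClient(vault.DefaultConfig())
        if err != nil {
            log.Fatal(err)
        }

        // Authenticate via the LDAP auth method; the returned token carries
        // whatever policies were centrally assigned to this user or group.
        login, err := client.Logical().Write("auth/ldap/login/alice",
            map[string]interface{}{"password": "example-password"})
        if err != nil {
            log.Fatal(err)
        }
        client.SetToken(login.Auth.ClientToken)

        // Ask for a dynamic database credential. Vault creates the DB user
        // on the fly and attaches a lease with a short TTL.
        creds, err := client.Logical().Read("database/creds/readonly")
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println("user:", creds.Data["username"])
        fmt.Println("lease seconds:", creds.LeaseDuration)
        // Once the lease expires, Vault revokes the DB user automatically.
    }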


Secret management tends to suffer from the logical holders of secrets (managers) not actually having the time or expertise, in a lot of cases, to deal with implementing (or even understanding) the process.


This. Key custodians today lack focus and continuous training, so they struggle to understand the different threats and keep up to date with implementations; that's a gaping hole in this particular type of model. It's never a core competency, which is why the industry needs a different paradigm.


As in, "Job title: The Holder Of Keys"?


My solution has always been to treat such keys and secrets as financial instruments, to be used only through direct processing - i.e. build/deploy environments have to explicitly allow for the finance guy to come in and type the codes needed to distribute to the general public. But this has to be a managed expectation of dev tooling in general.


It's hard to explain the need for this to business people unless your engineering management supports it very strongly.


We collect dongles as employees leave.


What if the employee has "lost" their dongles?


I suppose you can deactivate it.


This is all cool, but... once you get to a setup like this (which has many moving parts and config rewriting), why isn't the solution compared to Kerberos? You get temporary credentials giving you similar possibilities.


Kerberos was a major inspiration for us!

The goal of Vault was to be a modern Kerberos, but invert the integration model. I think that is the Achilles heel of Kerberos, since it has a complex API and only works if the endpoint systems are tightly integrated.

Vault operates in much the same way, and could be viewed as a "KDC". However, instead of requiring the Authentication Service (AS) to be Vault aware, Vault uses authentication plugins to do the integration in the other direction. Similarly, instead of network services being Vault aware, Vault uses secret plugins to do the integration with endpoint systems.

This lets Vault easily be extended to support new authentication systems and endpoint systems, without needing those systems to be modified. Otherwise, it's conceptually very similar to Kerberos!
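
To make the inversion concrete, here's roughly what it looks like operationally with the Go client (a sketch; `client` is a configured *vault.Client as in the sketch upthread, and the mount paths are just the defaults):

    // Auth plugin: Vault does the integration with the authentication
    // system (LDAP here); the AS needs no Vault awareness.
    if err := client.Sys().EnableAuthWithOptions("ldap",
        &vault.EnableAuthOptions{Type: "ldap"}); err != nil {
        log.Fatal(err)
    }

    // Secret plugin: Vault does the integration with the endpoint system
    // (a database here); the network service needs no Vault awareness either.
    if err := client.Sys().Mount("database",
        &vault.MountInput{Type: "database"}); err != nil {
        log.Fatal(err)
    }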

Edit: I'm a co-founder of HashiCorp, and one of the early authors on Vault.


This is a great first step toward building the right kind of infrastructure to stop applications from exposing secrets. Leases on secrets with dynamic policy enforcement are, IMHO, the future of application security. Happy to see companies like HashiCorp discussing more of the vision of how Vault can be used to secure the applications that we depend on for infrastructure.

I would caveat, however, that encryption is just one tool in the toolbox. A correct security posture is about having layers of defense. The next layer is absolutely the application, and moving secret management responsibility from the app layer to the infrastructure layer is the right strategy, IMO.


I'm still stuck on "Yeah, but I need a credential to authenticate with Vault to get the other credential that I want".

Very chicken and egg.


Yes, establishing that “secret 0” (the credential which grants access to all the others) is a tough problem.

In some scenarios it’s possible to at least avoid introducing another vault-specific credential. For example if your workload is running in a managed environment like AWS or Kubernetes, the platform provides a credential automatically (IAM role or service account). If the vault can recognize this platform credential and map it to a role defined in the vault, then you are using the platform credential as the “secret 0”.

It’s still not completely idiot-proof, because you have to define the mapping between platform roles and vault roles, and define permissions between vault roles and secrets.

And of course the detailed workflows for doing this vary between vaults.
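
A rough sketch of the Kubernetes variant with Vault's Go client (assumes a configured *vault.Client as in the sketch upthread; the "my-app" role is a made-up mapping you'd define in the vault):

    // Read the JWT the platform injects into every pod.
    jwt, err := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/token")
    if err != nil {
        log.Fatal(err)
    }

    // Exchange the platform credential for a vault token; "my-app" is the
    // role mapped to this service account.
    login, err := client.Logical().Write("auth/kubernetes/login",
        map[string]interface{}{"jwt": string(jwt), "role": "my-app"})
    if err != nil {
        log.Fatal(err)
    }
    // "Secret 0" obtained without introducing a vault-specific credential.
    client.SetToken(login.Auth.ClientToken)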


In one situation where I've deployed Vault, it was configured to issue the Vault credential using LDAP (Active Directory, in that case) for the "user needs a credential for a particular system" case. We didn't do fine-grained permissions for that, but it provided the audit trail we needed.

In the automated case, HashiCorp Nomad can issue tokens to individual applications. I was working on that but left the company before getting it into production. That does require trusting your Nomad infrastructure with long-term tokens, but that's still quite a bit better than, e.g., committing tokens into Git :)


There are some good suggestions in the replies here, but I'd recommend looking at our secure introduction guide: https://www.vaultproject.io/guides/identity/secure-intro.htm...

Effectively, there are only two possible approaches: platform-specific integrations, such as AWS IAM, Kubernetes, Nomad, etc., or trusted orchestrators, which help inject the initial "secret zero".


Hi Armon,

Under the 'trusted orchestrator' model that the article you linked describes, it states: "you have an orchestrator which is already authenticated against Vault with privileged permissions".

https://www.vaultproject.io/guides/identity/secure-intro.htm...

In the case of the AppRole auth method and where the orchestrator is an automated app, would it be fair to suggest that you'd have an AppRole for which secret IDs are generated that do not expire based on time and/or max uses?

From the Vault documentation here: https://www.vaultproject.io/api/auth/approle/index.html#crea...

Is it possible to set 'secret_id_ttl' to something like '0' so that it never expires?

The reason behind my question is that if the automated app is initially seeded with a role and secret ID, it can log in to Vault and get a token, which can be renewed going forward (based on settings for the AppRole). However, if the token is not renewed in time, or the service has to be restarted, you would need to regenerate the secret ID and reseed the application with it, which would be a manual process.

Thanks for any advice :)


Hey peteski22,

Exactly what you suggested would work! Having an AppRole that never expires would allow the trusted orchestrator to authenticate on each run, and then generate and inject ephemeral credentials.
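
For anyone following along, a sketch of that setup with the Go client (role name and TTLs arbitrary; assumes a client whose token is allowed to manage AppRoles):

    // Create an AppRole whose secret IDs expire neither by time nor by use count.
    _, err := client.Logical().Write("auth/approle/role/my-orchestrator",
        map[string]interface{}{
            "secret_id_ttl":      "0", // 0 = issued secret IDs never expire
            "secret_id_num_uses": 0,   // 0 = unlimited uses
            "token_ttl":          "20m",
        })
    if err != nil {
        log.Fatal(err)
    }

    // On each run, the orchestrator (or seeded app) logs in again:
    roleID, secretID := os.Getenv("ROLE_ID"), os.Getenv("SECRET_ID") // distributed out of band
    login, err := client.Logical().Write("auth/approle/login",
        map[string]interface{}{"role_id": roleID, "secret_id": secretID})
    if err != nil {
        log.Fatal(err)
    }
    client.SetToken(login.Auth.ClientToken)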


Vault doesn't try to solve the human trust problem, just the application trust problem. It assumes you can trust the humans involved in the system. Secrets simply don't live with applications.

Solving the human trust problem is far more difficult and outside the realm of a single software application.


But if the human "trust" problem is the weakest link, improving security in other ways is just theater. It's like adding a deadbolt to a door adjacent to an open window.


The human trust problem isn't the weakest link; the application is. Way more applications get hacked than humans.


That was my first thought, too.

"But... there's a root account... creating another account... which is typically a privileged action. What's protecting that? Is that root account being rotated, too?"


The master key is split into shards. See under "Why" here: https://www.vaultproject.io/docs/concepts/seal.html


Sure, the master key is split into shards...

I'm talking about the account creating the database user. Let's take MSSQL, for example. The equivalent to a root account there is `sa`. So, Vault will have control of the `sa` account in order to create leased database users.

If I'm a malicious actor inside the environment, what's stopping me from compromising the `sa` account and mimicking dynamic secrets? I'd need to be comparing Vault logs with database logs constantly to ensure it was legit.

It's just madness, if you ask me.


In some cases, Vault can change the password of the root account on the service, so that only Vault is known to have the password. But you are correct, there is a problem here: there is nothing stopping you from attacking the DB or service directly.

You can do a lot to minimize that: many DB auth systems let you specify where connections for users are allowed to come from, allow only a single concurrent connection, etc. Plus you can encrypt and verify the connection with TLS, build up firewall rules, and so on.

Security is never "perfect", but to say dynamic secrets are wrong and crazy just because of this one problem isn't very well thought out. Your WordPress application will be hacked a LOT more easily than your MSSQL sa account.

So dynamic-secrets people are saying: we all know our apps leak and get compromised all the time, so minimize the leakage with a short-lived, one-time-use secret that can only be used from machine X, instead of a long-lived, static secret that can be used anywhere.

For sure, for high-value targets you'd best be auditing everything you care about, not just Vault. But with a Vault audit log you can automate a lot of that comparison between MSSQL and Vault, which was much harder to automate before, since the logs from every application using MSSQL are different. And if you are in a high-value org like this, you are almost certainly already auditing your MSSQL account creations; this doesn't change that.


I think the solution to this problem is to not run your apps with the root account in the database. Then the app DB passwords can be automatically rotated by Vault, and DB admins can request temporary root access from Vault if necessary (and those credentials should be short-lived).
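
Sketching that arrangement against the database secrets engine with the Go client (connection details, role names, and SQL are all made up): Vault holds the privileged credential, rotates it so only Vault knows it, and mints short-lived non-root users for the app.

    // must aborts on error; fine for a one-shot setup sketch.
    must := func(_ *vault.Secret, err error) {
        if err != nil {
            log.Fatal(err)
        }
    }

    // Configure the connection; Vault keeps the privileged credential.
    must(client.Logical().Write("database/config/app-db", map[string]interface{}{
        "plugin_name":    "postgresql-database-plugin",
        "connection_url": "postgresql://{{username}}:{{password}}@db.internal:5432/app",
        "username":       "vault-admin",      // hypothetical privileged account
        "password":       "initial-password", // superseded by the rotation below
        "allowed_roles":  "app-readwrite",
    }))

    // Rotate the privileged password so only Vault knows it from now on.
    must(client.Logical().Write("database/rotate-root/app-db", nil))

    // Define a role that mints short-lived, non-root users for the app.
    must(client.Logical().Write("database/roles/app-readwrite", map[string]interface{}{
        "db_name":             "app-db",
        "creation_statements": "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO \"{{name}}\";",
        "default_ttl":         "1h",
        "max_ttl":             "24h",
    }))

Apps then read from database/creds/app-readwrite, and admins who genuinely need root can be given a separate, short-TTL role.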


That's protected by Shamir's Secret Sharing of the root encryption key, which adds a fairly robust layer of security. It means that (unless you store the key parts in an HSM/cloud KMS) there's a trade-off: a manual step to get the unencrypted key into Vault's memory.


You can also set up vault to encrypt each shard of the root key with the GPG pubkey for each administrator before dumping them to output. This way, plaintext shards of the root key never even touch the disk or otherwise come into view of anybody other than the intended recipient.
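
Concretely, that's an option at initialization time. A sketch with the Go client (share counts arbitrary; the key variables stand in for base64-encoded PGP public keys, and the `vault operator init -pgp-keys=...` CLI flag does the same and also accepts keybase: identities):

    // Initialize Vault so each unseal-key shard is encrypted to one
    // administrator's PGP public key before it is ever returned.
    resp, err := client.Sys().Init(&vault.InitRequest{
        SecretShares:    3,
        SecretThreshold: 2,
        // Hypothetical variables holding base64-encoded PGP public keys.
        PGPKeys: []string{alicePubKey, bobPubKey, carolPubKey},
    })
    if err != nil {
        log.Fatal(err)
    }
    // resp.Keys now holds PGP-encrypted shards; each plaintext shard is
    // recoverable only by the matching private-key holder.
    fmt.Println(len(resp.Keys), "encrypted shards issued")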


This is awesome. Somewhat reminds me of Short-Lived SSL Certificates (e.g., https://medium.facilelogin.com/short-lived-certificates-netf...)


Vault has always appeared to me to be a great technology for a larger tech org looking to implement more granular access control and increased auditability.

At a smaller scale, we've been satisfied with encrypting secrets using KMS and then placing the resultant ciphertext into environment variables that we commit with our Terraform scripts. Our system does not allow granular access control, but it is relatively simple to implement using a couple of Bash scripts, and it allows committing all of the config, including secrets, at once so that deploys are fully reproducible.

To allow a bit more granularity in isolating different environments, we use different encryption keys depending on whether we have a dev/uat/prod deployment. We grant the target application role access to particular secrets based on which KMS keys it can use.

The other trick is that we decrypt all of the secrets in our environment at the entry point of our application code, so that the environment is fully decrypted by the time the service runs. This means there is no trace of secret management in our application code.
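
A minimal sketch of that entry-point trick, assuming AWS KMS and the aws-sdk-go client; the "ENC:" prefix convention for marking encrypted values is made up:

    package main

    import (
        "encoding/base64"
        "log"
        "os"
        "strings"

        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/kms"
    )

    // decryptEnv rewrites every ENC:<base64-ciphertext> environment value
    // to its KMS-decrypted plaintext before the service proper starts.
    func decryptEnv() {
        svc := kms.New(session.Must(session.NewSession()))
        for _, kv := range os.Environ() {
            name, value, ok := strings.Cut(kv, "=")
            if !ok || !strings.HasPrefix(value, "ENC:") {
                continue
            }
            blob, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(value, "ENC:"))
            if err != nil {
                log.Fatal(err)
            }
            // The KMS key is identified by the ciphertext itself; access is
            // governed by the key policy / IAM role.
            out, err := svc.Decrypt(&kms.DecryptInput{CiphertextBlob: blob})
            if err != nil {
                log.Fatal(err)
            }
            os.Setenv(name, string(out.Plaintext))
        }
    }

    func main() {
        decryptEnv()
        // ... start the actual service; it sees only plaintext env vars.
    }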

There are intermediate steps of this scheme where we could revoke access to KMS keys for some developers, so that users could deploy services without necessarily being able to access the encrypted secrets.

Regardless, I always appreciate knowing these kinds of technologies exist because they seem to me to solve very challenging problems at scale.


So this is like the authentication tokens that pretty much any developer who's had to use a web API already knows? It's definitely welcome in a more "traditional" setup.


IME, dynamic secrets don't scale well. First, you're going to have a tough challenge getting a large org to stop sharing secrets amongst apps/hosts. Tougher still is justifying why host1 for Salesforce needs a credential distinct from host2 for Salesforce when both are doing the same thing.

Second, preventing privilege creep becomes nightmarish when you consider an environment that has 500k secrets today and you have to audit the authorizations an ID was granted in multiple backend authentication systems after that ID has been deleted.

Finally, generating dynamic IDs for an environment like this is a staggering amount of compute workload. Take that same 500k environment, break those out into uniquely, dynamically generated credentials, and you're into 7-figure numbers pretty quickly. Even if we ignore the complexity of such an arrangement, just consider the performance demands of generating an ID, assigning it privileges and a password, providing it to a client, and then cleaning the same ID up later. We'll ignore what this looks like in an environment where you have to replicate the identity that was created across multiple authenticating backends (think Active Directory domain replication).

In short, I think dynamic secrets will increase the threat surface dramatically while simultaneously increasing complexity. Complex systems are expensive to maintain, hard to secure, and notoriously fragile. There is a limited-scope use case where dynamic secrets may be a good fit, but I for one wouldn't base a purchasing decision on whether or not my secret store can do dynamic secrets. Instead, I'd want a secret store with some provision for secrets rotation that I can tailor to the needs of my specific use cases.


We work with many Fortune 2000 customers, and having 500K secrets is on the extreme side; it most certainly puts you in an infrastructure with 50K-100K+ machines under management.

In terms of the "compute cost", for an infrastructure of that size this is a negligible amount of overhead. For dynamic secrets that live 30 days, rotating 500K secrets works out to 1 secret every 5 seconds.

The advantage would be avoiding an incredible number of static credentials sprawled across a very large estate, plus having a unique audit trail that lets you identify points of compromise. Treating those credentials as dynamic will also reduce the human overhead of managing so many credentials, instead focusing on roles and high level intents.

I question whether there is a non-disclosed bias, given the anonymous user account created just in advance of the comment.


How does a dynamic secret with a TTL of 30 days avoid static credentials sprawled across the estate? It's static for 30 days, and the only difference I see between rotating the password of an ID every 30 days and a dynamic secret with a TTL of 30 days is that when I cycle the dynamic secret, I have to reassign entitlements to the new user ID and correlate that new user ID with my monitoring systems and Vault audit data. I disagree that the compute cost of doing all of that is a negligible amount of overhead.

Like I said, I think there is a use case here for dynamic secrets, but I have questions about what it looks like when it comes to trying to do them at scale. If you have solutions to the worries I outlined, I'd love to hear them.


This sounds an awful lot like... Kerberos?


Here's what I see happen with Vault in actual real-world usage: applications use the token auth method with extremely long-lived tokens, and then the auth tokens are stored in source or in config files or whatever. It just moves the problem and makes it look better because they aren't called "passwords".


This is absolutely not how WE use it, but I'm sure lots of people do stupid stuff like that.

We actually deploy Vault tokens to every user (this happens at login time, since login authentication happens against Vault). So we can deploy secrets directly to end users using Vault policy.

Applications that are not run/started directly by the user find other means of getting their secret 0 (VAULT_TOKEN), usually through Nomad. In the rare case of a standalone application not run under Nomad, we seed the startup with a token by hand and then have a process handle renewals, etc.


This is a very real problem. In my mind, the problem with security today is that there is no convention over configuration. The current state of the art among vendors is very much "here's a bigger gun, figure out how to defend yourself." Think back to 2005 and Java's XML configuration vs. Ruby on Rails -- convention over configuration was very much a big change in how we, as software developers, were thinking. We need something similar in the security space.


Funny how HashiCorp promotes security products on one hand and fails to implement proper security in its remaining products on the other.

For instance: Vagrant images/boxes hosted by HashiCorp (Vagrant Cloud) are neither signed, nor is any author information available (unless explicitly provided by the uploader).


Also, Terraform stores "all settings, including usernames, passwords, port numbers and literally everything else" in the tfstate file [1].

I believe that using Dynamic Secrets is HashiCorp's proposal for how to mitigate this; leave the secrets in the state files, but make sure they expire in a timely manner.

[1] https://tosbourn.com/hiding-secrets-terraform/


Yeah, I wish they had thought about sensitive stuff in the state files from the beginning.


How does an application (say, a RESTful API server) deal with dynamic secrets? I suppose it could be simply restarted when a secret changes (assuming it is requesting the secret from Vault at startup). Is that how it is typically handled?


That is exactly right. At the bottom of the blog post we touch on this, but if you are using consul-template to provide secrets via a configuration file, it can either restart or reload (signal) the application to pick up the changes. Alternatively, an application could be Vault aware and use the SDK programmatically.
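
For the Vault-aware route, a sketch of the renewal loop with the Go client (assumes creds was read from a dynamic-secrets endpoint as in the sketches upthread; intervals arbitrary):

    // Keep extending the credential's lease so the app can run without a
    // restart; on failure, fetch fresh credentials and reconnect.
    for {
        renewed, err := client.Sys().Renew(creds.LeaseID, 3600) // ask for another hour
        if err != nil {
            // Lease hit its max TTL or was revoked: get a new credential.
            if creds, err = client.Logical().Read("database/creds/readonly"); err != nil {
                log.Fatal(err)
            }
            // ... rebuild the DB connection pool with the new username/password.
            continue
        }
        time.Sleep(time.Duration(renewed.LeaseDuration/2) * time.Second)
    }

In practice the client library's renewer/lifetime-watcher helper runs this loop for you, but the mechanics are the same.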


My concern is how you can tell me whether a database user was created by Vault and not by a malicious actor with knowledge of Vault's Dynamic Secrets method mimicking it.

How the heck can I know whether it was a legitimately created database user or not?


When Vault connects to the endpoint system to create a dynamic user, it presents a set of credentials only known to it. You have to authorize Vault to create dynamic users, so a malicious actor would need to somehow obtain a similar level of privilege.

Vault typically prefixes something to the username as well (e.g. "vault-...") and also audits the creation of dynamic users, so you can either look for the prefix or cross-check the audit logs.
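
For reference, enabling an audit device is a one-line call against the sys API; a sketch with the Go client (file path arbitrary):

    // Every request and response, including dynamic-user creation, is
    // logged here, so DB accounts can be cross-checked against Vault.
    err := client.Sys().EnableAuditWithOptions("file", &vault.EnableAuditOptions{
        Type:    "file",
        Options: map[string]string{"file_path": "/var/log/vault_audit.log"},
    })
    if err != nil {
        log.Fatal(err)
    }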


I assume you can check Vault's audit logs and see if the user was created there or not.


I have 500k IDs today. Now you want me to make those IDs dynamic and correlate how much Splunk data with Vault audit data? And what is my pattern-matching regex going to look like when the IDs I'm trying to match on are randomly generated? And how do I pick out anomalous behavior from the noise I just intentionally created because Terraform can't stop leaking my secrets? And what about the performance? How does Vault scale to generate that many identities? And how do I audit my authorizations, since in Vault I'd just see what groups the IDs were added to but not the groups those groups belonged to? What about replicating my authentication backends? Active Directory replication takes minutes to replicate a password in some environments; it's going to take longer to replicate a new identity and its group memberships. And while I can revoke an identity after some time, that doesn't mean existing authenticated sessions are terminated, it just means subsequent authentication with the same secret will fail.


Maybe this feature isn't a good fit for your organization?

But based on what you've just asked, I'd definitely never create anything like what you're working with because it sounds like ANY addition to your infrastructure is problematic.


THIS! EVERYTHING THIS!


Or if you want to automate it, you could use the Vault API to see which tokens have been created.


> Vault associates each dynamic secret with a lease and automatically destroys the credentials when the lease expires

So whatever is accepting the secret must be checking with Vault to verify the secret.


How does this differ from SSO (a single-sign-on server)?


SSO is for users, and credential lifetime is usually on the order of months. Vault is for service-to-service authentication keys/secrets, which are traditionally static because changing them is a complex process (you need to make sure you don't break services by forgetting one or by updating credentials in the wrong order). Vault automates credential creation and rotation.


"Why you need to buy our product."


Vault is free and open source. We do have an enterprise product, but all the dynamic secrets capabilities exist in the open source!


There are also excellent free and open source frontend interfaces for Vault out there, such as vault-ui and goldfish. We have used both at my org, since the HashiCorp enterprise pricing is unfortunately not within our budget.




