It's pretty rough but gets the job done for separating free and open source requirements.
Using >= doesn't guarantee security, but it does reduce stability, and availability is part of security.
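For illustration, the difference between the two specifier styles in a requirements.txt (using Django as an example):

```
Django>=1.8     # floats: picks up whatever version is newest at install time
Django==1.8.5   # pinned: reproducible installs, but you must bump it yourself
```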
You can also now list outdated packages with pip if you want to upgrade them yourself or test compatibility of new versions. Example from a side project I have lying around:
$ pip list --outdated
Django (Current: 1.8.5 Latest: 1.8.6)
$ pip install --upgrade django
I fully agree that deployment should be reproducible and stick to tested versions. But requirements.txt is how you build the software, not how you deploy it. (Unless it is, but then no sympathy from me ;) )
A real lesson: authentication failed and we had no clue why the app couldn't start. A developer spent almost two days before realizing a new gem version had been released and the new version was not compatible with the current release. Quoting the developer: "fool me once, shame on you; fool me twice, shame on me."
Imagine auto-scaling (without a baked image), or installing packages during Docker container initialization, in production: the word "fuck" will fill up your mailbox/IRC/Slack/HipChat/chatbot, courtesy of your dear SRE / DevOps folks, because of the lack of regression testing.
The same argument goes for automatic system updates in Ubuntu. To receive critical updates safely, you'd better have a process to review them and warn about them. One change in a system API can screw up your entire system.
If you want the fail-hard model, which is really useful, the best way is to have two requirements.txt files: one for random Saturday testing (just launch a Docker instance running the tests with the latest versions of all packages installed), and one for development all the way through to production.
Zope originated much of Python packaging through necessity (said tooling has since continued to evolve). Plone still (AFAIK) has massive lists of "known good sets" of dependencies, often down to the minor version, because sometimes what is a bugfix for most consumers of a library triggers latent bugs in other (sets of) libraries.
Yes, you want to fix all the bugs. But sometimes you can't - and then you might need to be explicit in upgrading from 3.2.1 to 3.2.1-bugfix9 rather than making the "leap" to 3.2.2.
Actually, it goes both ways, and if your production is down, it is your problem. You can't expect everyone to be nice to you with any guarantee. It is a shared responsibility.
On second thought, I guess you are suggesting specifying the major version with an ambiguous minor? But doesn't that require that no package ever introduces a breaking change in a minor version? Not sure you can rely on that assumption.
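pip does have a specifier aimed at exactly this tension: `~=`, the PEP 440 "compatible release" operator, e.g. `Django~=1.8.5`, which allows patch upgrades but not minor/major bumps. A minimal sketch of its semantics (not pip's actual implementation; the pin 1.8.5 is just an example):

```python
def compatible(version: str, pin: str = "1.8.5") -> bool:
    """Rough semantics of `pkg~=1.8.5`: same major.minor, patch >= pin."""
    v = tuple(int(x) for x in version.split("."))
    p = tuple(int(x) for x in pin.split("."))
    return v[:2] == p[:2] and v >= p

print(compatible("1.8.6"))  # True: patch bump is allowed
print(compatible("1.9.0"))  # False: minor bump is excluded
```

Of course, this still relies on maintainers actually following that versioning discipline, which is exactly the assumption being questioned.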
Looks like Confidant is tied to AWS whereas Vault can use various backends..?
Unsurprisingly it buys completely into the Hashicorp ecosystem (Consul, weirdo HCL configs, etc) which is a plus or minus depending upon your perspective I guess. Vault servers are also stateful, which complicates deployment (you can't just stick N Vault servers behind a load balancer).
If I had infinite time I'd consider creating a "SimpleVault" fork consisting of exactly one secret and storage backend, one way to auth, one curated workflow and make it run in a stateless manner. I'd probably also remove all the clever secret generation stuff, since 90% of applications just require a secure place to store secret blobs of data, as opposed to creating dynamic time-limited IAM credentials on the fly or whatever.
"Buys into the HashiCorp ecosystem": Consul is completely optional. We also support ZooKeeper, SQL, etcd, and more. There is zero forced tie to Consul. HCL is correct though!
"you can't just stick N Vault servers behind an LB": Actually, that is exactly what we recommend! https://vaultproject.io/docs/concepts/ha.html Vault servers are stateful in that there can only be one leader, but we use leader election and all data is stored in the storage backends, so it can die without issue.
"90% of applications just require a secure place to store secret blobs": Just use Vault out of the box. You'll have to configure only storage, after that you can use the token you get from initialization (one form of auth) and the `/secret` path to store arbitrary blobs that is preconfigured (one form of backend).
My team and I were investigating possible configuration options and secret management, and we couldn't figure out a good reason as to why the tool doesn't use S3 as one of the backends, which would essentially eliminate the need to maintain our own etcd/consul/sql whatever - plus, since the entire thing is path based, it seems incredibly well suited for that backend.
I couldn't think of one, but I might be missing something.
We're planning on switching to Vault or something similar in the near future & will be testing out both of these. I definitely agree with the "backend" confusion surrounding Vault but I'm interested in where Confidant may be lacking. We host services on AWS and Heroku, so in some places AWS' KMS may not be the best option.
Regardless, I do have to say it's so refreshing to see security tools with decent documentation & a clean user experience. I'm almost stunned to see a swiftype search box, one page threat model description & preconfigured turn-key container all in one place. Props to Lyft for opening up a great tool to the masses!
"The main difference between Confidant and the others is that Confidant is purposely not cloud agnostic, choosing to use AWS’s features to deliver a more integrated experience. By leveraging AWS’s KMS service, Confidant is able to ensure the master encryption key can’t be stolen, that authentication credentials for Confidant don’t need to be distributed to clients, and that authentication credentials for Confidant clients don’t need to generated and trusted through quasi-trustable metadata."
Confidant is purposely tied to AWS so that it can rely on KMS for its master key and for authentication.
Ex. You store your AWS Master key in a config file, and you have Microservice A that reads that key from the file. Microservice A is compromised (or its VM is compromised). How does having a secret store help you here? Couldn't the attacker just inspect the code of Microservice A and see that you are just reading from disk/reading from Vault?
In short, what do services like this protect me from (other than accidentally checking my code into a public repo)?
Also, in this particular case, thanks to KMS you can keep the secrets stored encrypted at rest on microservice A, and decrypt them only in memory.
Note that you don't store a master key in a config file, but instead you use the KMS master key to encrypt/decrypt things. You never get direct access to the master key.
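To make the pattern concrete, here's a toy sketch of envelope encryption, the scheme KMS uses: the master key never leaves the "KMS" side; it only wraps and unwraps per-object data keys. The XOR keystream "cipher" below is a stand-in for AES purely to show the key flow; do not use it for real data.

```python
import hashlib
import os

def keystream(key: bytes, n: int) -> bytes:
    """Derive n bytes of keystream from a key (toy construction)."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a key-derived keystream; applying it twice decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

master_key = os.urandom(32)  # lives only inside "KMS", never handed out

def kms_generate_data_key():
    """Return (plaintext data key, data key wrapped under the master key)."""
    plaintext_key = os.urandom(32)
    return plaintext_key, toy_cipher(master_key, plaintext_key)

def kms_decrypt(wrapped_key: bytes) -> bytes:
    """Unwrap a data key; only "KMS" can do this."""
    return toy_cipher(master_key, wrapped_key)

# Service side: encrypt a secret, persist only ciphertext + wrapped key.
data_key, wrapped = kms_generate_data_key()
ciphertext = toy_cipher(data_key, b"db-password")
del data_key  # the plaintext data key is held in memory only, then dropped

# Later: ask "KMS" to unwrap the data key, then decrypt locally.
recovered = toy_cipher(kms_decrypt(wrapped), ciphertext)
print(recovered)  # b'db-password'
```

The point of the indirection: what sits on disk (ciphertext plus wrapped key) is useless without a call to KMS, and that call is gated by IAM, so there is no master key in any config file to steal.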
One selling point of Confidant is using IAM roles to bootstrap authentication to the secret store. You can also do that with S3, put each secret into an individual text file and give each IAM role permission to access the secrets it needs. Set the S3 bucket to encrypt the data at rest, it uses KMS behind the scenes and automatically rotates encryption keys.
Rotation of the secrets themselves could be scripted or manual, that part would be basically the same process as using Confidant or any other tool. And I believe S3 access can even be auditable with CloudWatch logs.
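A hypothetical per-role IAM policy for that setup might look like this (bucket name and key prefix are made up):

```
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-secrets-bucket/example1-production/*"
  }]
}
```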
Also, S3 now offers either eventual consistency or read-after-write consistency. EDIT: actually, it looks like new-object PUTs can be read-after-write consistent, but updates are not. So this could be a downside: if you rotate a key, getting the new one is eventually consistent. In practice this might not be a big deal, though; there's already going to be a gap between when you activate the new key and when your app gets reconfigured to start using it.
I'm very curious what the downsides might be of doing this. For all the various secret management tools that have been released in the past year or two, I'm kind of surprised I've never heard anyone talk about using raw S3.
Confidant provides access to credentials by IAM role. The key here is that you can have credentials A, B, and C, and IAM roles example1-production, example2-production, and example3-production, then map credentials A and B to example1-production, A and C to example2-production, and A to example3-production. Confidant just stores the mapping info for this. If you update A, you don't need to fan out an update to all three services. Race conditions can come in when A and B or C are updated at the same time: S3 is very eventually consistent and there's no locking, and read-after-write consistency is only available in one region at this point.
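A hypothetical sketch of the mapping described above (the role and credential names are just the ones from the example): credentials are stored once and roles reference them, so rotating a shared credential touches a single record instead of fanning out to every service.

```python
# One copy of each credential...
credentials = {"A": "secret-a", "B": "secret-b", "C": "secret-c"}

# ...and a mapping from IAM role to the credentials it may read.
role_map = {
    "example1-production": ["A", "B"],
    "example2-production": ["A", "C"],
    "example3-production": ["A"],
}

def secrets_for(role: str) -> dict:
    """Resolve a role's credentials through the mapping at read time."""
    return {name: credentials[name] for name in role_map[role]}

# Rotating A is one update; every role sees the new value on its next read.
credentials["A"] = "rotated-secret-a"
print(secrets_for("example3-production"))  # {'A': 'rotated-secret-a'}
```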
Of course, this is all based on your assumptions and how you design your system. Based on our needs KMS + DynamoDB made more sense for us. Something like sneaker (https://github.com/codahale/sneaker) may fit your needs better.
Another interesting bit about Confidant is that it manages the KMS grants on the auth KMS key, which lets you use the key from any service to any service to pass secrets, or to do general service to service auth.
Also, there's a bit more coming down the road for automatic management of other common secrets that are annoying to manage in AWS, like SSH host keys for autoscale groups: https://github.com/lyft/confidant/issues/1
Both of those are really great projects and I recommend them if they fit your use-case.
One of the reasons we originally avoided S3 was because it can occasionally be very eventually consistent, which we wanted to avoid. We also wanted to avoid fanning out secrets to services when the secret needed to be shared between multiple services.
We also wanted to have a UI that was simple and easy to use for everyone.
Otherwise the general design of credstash is very similar to Confidant.
I wish I were wrong, because my heart always bleeds when I see DB passwords in configuration files! But as long as there is a hypervisor you do not control access to, you must trust the owner of the bare metal to (1) honor your privacy and (2) be competent enough to secure their system. Trust is nice, but it is not security.
Granted, Confidant and KMS seem like a better solution than most. I will look into it in more detail. Thanks for open-sourcing it and moving the solution forward.
There's immense value in defending against the kind of attacks where an attacker gets partial access, even if an attacker with omnipotence can compromise you.
mysqldump has a CLI argument to provide the password! As far as I know, it's visible to anyone with some access to the system; the documentation warns about that and suggests creating a config file to store the password.
Security is a spectrum, but when it comes to password storage, modern web applications live on the lower end of it.
I am increasingly frustrated by that. If I raise concerns, many admins stick to binary security: "storing stuff unencrypted on disk is okay, because the attacker is inside already", followed by "you have to store the key somewhere". It's wrong and it's true at the same time. It's also sad, because not only the users but also the people paying us trust us to keep them safe, and we don't. :( We can say "it's a spectrum" and be truthful, but we just can't keep them safe. I think it's important to recognize that simple fact.
...or is it me who is just meticulous?
But still, some rhetorical questions: Is KMS based on this concept? Is its implementation open sourced? Is it verified to run unchanged in the instances I pay for? Who is verifying it? Who is paying the one verifying it? Can I meet him or her for a coffee and build trust between us?
Please do not misunderstand me. I have more trust in the Amazon techs than in my own ability to admin a complex Linux system on bare metal. But while all of that is great progress toward security, we still need a leap to find a real solution, don't we?
It's theoretically possible, but right now it's too slow to be practical in most situations. Plain storage and single operations are maybe possible, but, for example, homomorphic AES takes multiple seconds per block, and that's still AES-128.
(Granted, in this context, having a decrypted disk at runtime under a VM would be considered insecure, but it's still better (for some use cases) than many alternatives.)
So in practice, probably that's the case - you could use homomorphic encryption for very infrequent operations on KEKs for example.
Doesn't Amazon KMS have access to the master key? And therefore, it can be stolen from them?
"AWS KMS is designed so that no one has access to your master keys."
Translation: We promise to not look at your keys.
Granted, this is most likely better than nothing, especially if you trust AWS, but ideally, you want only the client to have the key material.
There surely is some amount of trust there, though.
Not to diminish your point -- the how/where/when really does matter. Locking your house key in your car is still better than leaving it on your front step, but also not as good as in your pocket.
That being said, it seems like an interesting project to keep an eye on.
By sticking with a single cloud provider you get a lot of benefits. One of the more major benefits is cost, as you can do things like reserved instances in EC2. Another major benefit of sticking with a single provider is that you get to use solutions that only that provider has, which in this case is KMS, DynamoDB, and IAM.
The biggest downside of being cloud agnostic is that you're stuck with the smallest combination of features that all of the clouds provide.
As you said, until tools exist that offer operability between clouds, users will be stuck with the smallest combination of tools. I do however believe such a day is coming.
Not saying it's a good or bad thing, but attend AWS re:Invent or something and it's pretty easy to perceive huge uptake.
If you adopt extra management stacks running on or against your cloud you lose a lot of the benefits.
Sure, Heat is a comparable version of CloudFormation, but you can't take what you did in AWS and translate it to Heat. OpenStack doesn't have all the services that AWS has. Not every product (and not every API within a given AWS product) works 100% with CloudFormation. And if you are running an old release of OpenStack, you are even more fucked.
We are now moving our cloud infrastructure entirely to AWS, so no more "it works in this environment" or "they sort of work but are sort of different because tools X and Y aren't comparable."
The right attitude should be "do one thing well, and improve." Architect "right" and then unlock yourself whenever possible. The same goes for people migrating to Docker or any container technology. A Dockerfile is a lock-in; you still have to move back to a regular shell script later if you drop Docker.
Not really a big problem there, since Dockerfiles are pretty much shell scripts :)
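For example, a made-up Dockerfile and its rough shell equivalent, side by side (package names are illustrative):

```
# Dockerfile
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y nginx
COPY app /srv/app

# roughly equivalent shell script, run on a plain host
apt-get update && apt-get install -y nginx
cp -r app /srv/app
```

The instructions that don't translate directly (FROM, EXPOSE, ENTRYPOINT) are mostly environment declarations rather than build logic, which is why the lock-in is shallow.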