This is the core problem that Docker solves, and in such a way that developers can do most of the dependency wrangling for me. I don't even mind Java anymore because the CLASSPATHs can be figured out once, documented in the Dockerfile in a repeatable, programmatic fashion, and then ignored.
In my opinion the rest of it is gravy. Nice tasty gravy, but I don't care so much about the rest at the moment.
Edit: As danesparz points out, nobody has mentioned immutable architecture. This is what we do at Clarify.io. See also: https://news.ycombinator.com/item?id=9845255
I don't really see the point of lightweight virtualization. It provides an illusion of isolation which will likely come crashing down at some probably very inconvenient point (e.g. when you discover a bug caused by a different version of glibc or a different kernel).
Packer is not quite an apt comparison, but it would be a better one than Vagrant.
The advantage is you do the steps that could possibly fail at build time. The downside is you need to learn to get away from doing runtime configuration for upgrades.
I wrote Ansible, and I wouldn't even want to use it in a Docker context to build or deploy VMs if I could just write a Dockerfile, assuming I probably won't need to template anything in it. I would still use Ansible to set up my "under cloud", as it were, and I might possibly use it to control upgrades (container version swapping) until software grows to control this better (it's getting there).
However, if you were developing in an environment that also wanted to target containers, using a playbook might be a good way to have something portable between a Dockerfile and Vagrant if a simple shell script and the Vagrant shell provisioner wouldn't do.
I'd venture in many cases it would.
I don't care about the isolation for isolation's sake; I care about it for the artifact's sake.
What is this rule to only build once? I can see not wanting to create multiple artifacts of your codebase, but with machines it is possible to continually update them and sometimes desirable as well. In the "cloud" world, you can arguably rebuild a server every time it needs updates, but at the physical level you don't always have capacity to absorb the hit of rebuilding multiple boxes at once. The physical servers need to get updated and managed post-install.
Unless you're snapshotting that vagrant box and then deploying that to all your servers somehow, you are building multiple times.
> What is this rule to only build once?
I'd recommend reading the book Continuous Delivery. It is a fantastically helpful read.
I prefer not to update my machines, but that is because I follow immutable deployments. But even if I did update my machines, it is far cleaner (and easier to roll back!) to deploy an asset which has all its dependencies in the box than to push out code and maybe have to upgrade or install new packages. Gemfile.lock and friends make this a bit less of a problem, but you also get to lock things like the libxml version or ffmpeg or...
> In the "cloud" world, you can arguably rebuild a server every time it needs updates, but at the physical level you don't always have capacity to absorb the hit of rebuilding multiple boxes at once.
Totally true, and we don't do this. We build a machine image and do a rolling-deploy replacing existing servers with the new machine image.
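For anyone wondering what a rolling deploy looks like in the abstract, here is a rough Python sketch. It is not our actual tooling: the `replace` callback is a hypothetical stand-in for whatever launches a server from the new machine image, waits for it to become healthy, and retires the old one.

```python
from typing import Callable, List


def rolling_deploy(
    servers: List[str],
    new_image: str,
    replace: Callable[[str, str], str],
    batch_size: int = 1,
) -> List[str]:
    """Replace every server with one built from new_image, a batch at a time.

    `replace(old_server, image)` is a hypothetical callback: it should launch
    a server from `image`, wait until it is healthy, retire `old_server`,
    and return the new server's name.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    result = list(servers)
    for start in range(0, len(result), batch_size):
        # Only the servers in this batch are swapped right now;
        # everything outside the batch keeps serving traffic.
        for i in range(start, min(start + batch_size, len(result))):
            result[i] = replace(result[i], new_image)
    return result
```

Keeping `batch_size` small is what lets you avoid needing spare capacity for the whole fleet at once.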
> The physical servers need to get updated and managed post-install.
One of the reasons I try not to work with hardware. Physical hardware is hard, and avoiding it makes my life much simpler. I love it.
You're also configuring many things in many different potentially complex ways.
The docker method of using environment variables as a configuration hack to get around this is pretty horrible, IMHO. Especially compared to ansible's YAML/jinja2 configuration.
YAML/jinja2 is just terrible. When you have to introduce a templating system to programmatically generate your YAML configuration files, what you have really needed the whole time is an actual programming language.
Taking out the Turing completeness restricts the mess you can make and lets non-programmers do more, while still being customizable.
I think the point is not to conflate configuration that is equivalent to code (which, sure, put it in version control) with configuration that is specific to how code is deployed (which your deployment tool should just tell you, via env vars).
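To make the env var side of that split concrete, here is a minimal Python sketch of an app reading only its deployment-specific settings from the environment. The variable names (`APP_DATABASE_URL`, `APP_PORT`, `APP_DEBUG`) and the `DeployConfig` type are made up for illustration; use whatever your deployment tool actually injects.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DeployConfig:
    database_url: str
    listen_port: int
    debug: bool


def load_config(env: Mapping[str, str] = os.environ) -> DeployConfig:
    """Build deployment-specific settings from environment variables.

    All variable names here are hypothetical examples.
    """
    if "APP_DATABASE_URL" not in env:
        # Required value with no sensible default: fail loudly at startup.
        raise RuntimeError("APP_DATABASE_URL must be set by the deployment tool")
    return DeployConfig(
        database_url=env["APP_DATABASE_URL"],
        listen_port=int(env.get("APP_PORT", "8080")),
        debug=env.get("APP_DEBUG", "0").lower() in ("1", "true", "yes"),
    )
```

Everything that is really code (routing rules, feature logic) stays in version control; only values that differ per deployment go through `load_config`.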
Which one? The one by Humble and Farley (Addison-Wesley) is from 2010, is it still relevant?
The book is excellent. I was surprised that it wasn't older.
If your cycle is "build, test, build, deploy", then you are not deploying those artifacts that you tested.
Any number of factors (different dependency version, toolchain difference, environment differences, non-reproducible builds) could lead to the second build being different from the first one, and then you deploy an untested artifact.
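One cheap way to enforce "deploy what you tested" is to record a digest of the artifact when tests pass and refuse to deploy anything that doesn't match. A rough Python sketch follows; the function names are mine and not from any particular CI tool.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest identifying an artifact's exact bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_before_deploy(artifact: bytes, tested_digest: str) -> None:
    """Refuse to deploy anything other than the artifact that was tested."""
    actual = artifact_digest(artifact)
    if actual != tested_digest:
        raise RuntimeError(
            f"artifact digest {actual} does not match tested digest {tested_digest}"
        )
```

With "build, test, build, deploy", the second build would have to be byte-for-byte reproducible to pass this check, which is exactly the property most builds don't have.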
Not to mention that a rebuild can be resource intensive.
I care about tracking down issues before they reach production. Meaning that I want an environment that mirrors production as closely as possible. Meaning heavyweight not lightweight virtualization.
Our build scripts get tested a dozen times a day, and we cannot tolerate half-assed, broken build scripts.
Our deployment pipeline (after verifying the image is good enough to be deployed) packs the Docker image into a machine image along with several other containers. The machine image is then deployed to staging. If the machine image passes staging, it goes to production. If there is an issue which has hit production exclusively (it has happened only a handful of times), it is simply a matter of rolling back to the previous machine image.
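The promote-then-roll-back flow above can be sketched in a few lines of Python. This is a toy model, not our pipeline: `ImageHistory`, `promote`, and the `staging_checks_pass` callback are hypothetical names, and the image IDs are just strings.

```python
from typing import Callable, List, Optional


class ImageHistory:
    """Tracks which machine images have been released to one environment."""

    def __init__(self) -> None:
        self._released: List[str] = []

    @property
    def current(self) -> Optional[str]:
        return self._released[-1] if self._released else None

    def release(self, image_id: str) -> None:
        self._released.append(image_id)

    def rollback(self) -> str:
        """Drop the current image and return the one before it."""
        if len(self._released) < 2:
            raise RuntimeError("no previous image to roll back to")
        self._released.pop()
        return self._released[-1]


def promote(
    image_id: str,
    staging_checks_pass: Callable[[str], bool],
    production: ImageHistory,
) -> bool:
    """Release image_id to production only if it passed staging."""
    if not staging_checks_pass(image_id):
        return False
    production.release(image_id)
    return True
```

Because each image is immutable, rolling back is just pointing production at the previous image again; nothing has to be "un-upgraded" in place.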
Well, it gets you a step closer to accurately mimicking production.
> It also doesn't "cover up" broken build scripts.
That seems to be what that 'build once' rule is for. If your building process isn't risky, why the need to prohibit running it twice?
Just because I can install the operating system doesn't mean I want to do this on every deploy of an application.
Agreed, and I'd add that Python's paths (I forget what they're called), as well as Java CLASSPATHs, have been a problem for me on occasion too, which means Docker would probably help with all these types of path issues.
Also, how do you manage your containers in production?
I'm not terribly fond of using environment variables for configuration, personally. That method requires either a startup shim or development work to make your program aware of the variables, and your container manager has to have access to all configuration values for the services it starts up.
Lots of people write their own scripts to do this; I wrote Tiller (http://github.com/markround/tiller) to standardise this usage pattern. While I have written a plugin to support storing values in a Zookeeper cluster, I tend to prefer to keep things in YAML files inside the container - at least, for small environments that don't change often. It just reduces a dependency on an external service, but I can see how useful it would be in larger, dynamic environments.
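For readers who haven't seen the pattern, a startup shim of this kind boils down to "fill in a config template, then start the real process". Here is a minimal Python sketch using only the standard library. It is not how Tiller is implemented; `render_config` and the placeholder names are hypothetical, and the defaults stand in for values kept in a file inside the container.

```python
import os
from string import Template
from typing import Mapping


def render_config(
    template_text: str,
    defaults: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Fill $PLACEHOLDERS in a config template.

    Values come from `defaults` (for example, a file baked into the image);
    an environment variable with the same name overrides the default.
    """
    values = dict(defaults)
    for key in defaults:
        if key in env:
            values[key] = env[key]
    # substitute() raises KeyError if the template names a value we lack,
    # so a bad config fails at container startup rather than at first use.
    return Template(template_text).substitute(values)
```

A real shim would write the rendered text to the service's config path and then exec the service; that part is left out here.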
Management: We create AMIs using Packer. Packer runs a provisioning tool which downloads the container and the configuration and sets up the process monitoring. It then builds a machine image, and then we launch new servers.