I'm very excited about Docker as both a development environment and deployment solution. However, from my early experiments, it seems there's an important difference between LXC (which is what Docker manages for you) and a full VM, namely that the model revolves around running one process at a time: you can install MySQL on your Docker image, but once it's up, it's running MySQL -- you can't then ssh into it as you would a VM to poke around, modify config files, etc.
There are trivial ways to solve this, obviously. You can stop the image, restart it running bash, use that to modify config files, and then restart it again. But it requires a change of mindset: these things are much more than background processes, but they are less than a full VM. As the piece mentions, configuration management for newly-started images seems to be a missing piece of the puzzle right now, and debugging running Docker images can be... strange.  Not necessarily difficult, but different from what you're used to, and learning curves are barriers to adoption.
As this tech matures I think these things will be quickly solved, and I'm looking forward to the results.
Plus VirtualBox, started by Vagrant. See mitchellh's comment.
 Unless, of course, I'm missing something. Docker-people: how do you configure vanilla server images to work in your environment?
Well, the canonical way of running a container is as you mention.
However! There are a couple of options if you do not have a config you're completely happy with yet.
One is to run a process manager like supervisord as your container process, and start up any number of services you wish (like ssh). It's my understanding that in the future Docker will allow you to call `init` directly, so it becomes more VM-like.
The other, assuming a sufficiently modern kernel (I believe 3.8+, which is the minimum supported for Docker), is to use the lxc userland tools, specifically `lxc-attach -n <containerid>`. This will allow you to create a shell in the running container and poke about as needed.
My experiments with lxc-attach always failed; presumably my kernel was wrong in some way (I followed instructions to get to 3.8, but I am sufficiently clueless that I wasn't sure it had worked, or that it was the right flavor of 3.8).
But that's only the ad-hoc case: the bigger question is, if you have an image with instructions "RUN apt-get install mysql", you're not even halfway to having a copy of MySQL you can run in production: at a minimum you'll need to install a custom my.cnf to suit your application's operational parameters, but really you'll want it to be slightly different every time -- new bind addresses, potentially new master-slave relationship grants, etc. The way Docker images interact with configuration management in a grown-up production environment is still really hazy to me.
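To make the "slightly different every time" problem concrete, here is a minimal sketch of rendering a per-instance my.cnf from a template. This is not something Docker provides; the template, the function name `render_my_cnf`, and the example values are all assumptions for illustration. Only the option names (`bind-address`, `server-id`, `innodb_buffer_pool_size`) are real MySQL settings.

```python
from string import Template

# Hypothetical template: the option names are real MySQL settings, but the
# selection of options and the values passed in are illustrative only.
MY_CNF_TEMPLATE = Template(
    "[mysqld]\n"
    "bind-address = $bind_address\n"
    "server-id = $server_id\n"
    "innodb_buffer_pool_size = $buffer_pool\n"
)


def render_my_cnf(bind_address: str, server_id: int, buffer_pool: str) -> str:
    """Return the text of a my.cnf for one specific MySQL instance."""
    # substitute() raises KeyError if a placeholder is left unfilled, so a
    # half-configured file is never produced silently.
    return MY_CNF_TEMPLATE.substitute(
        bind_address=bind_address,
        server_id=server_id,
        buffer_pool=buffer_pool,
    )
```

Calling `render_my_cnf("10.0.0.5", 1, "1G")` and `render_my_cnf("10.0.0.6", 2, "1G")` would give two files that differ only in the per-instance lines. How such a file gets into a freshly started container is exactly the open question here.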
 We are all agreed that running default my.cnf in production is laughable, yes? That information has filtered into the mainstream from the DBA crowd?
How I would personally tackle that specific problem is the following:
1) Create a Dockerfile which installs the dependencies of my image as a base (maybe in this case all it is is RUN apt-get install mysql)
2) Tag the image as mysql-base.
3) Shell in to mysql-base, and iterate over the changes as needed until it's 'production ready.'
4) Once it's suitably 'production ready', `docker diff` the version to see which files changed.
5) Here comes the fork in the road. Either go back and instrument my original Dockerfile to modify the files that were updated to make my image production ready, OR, `docker commit` that image. There are benefits to both sides, but ultimately it will be up to you in terms of maintainability. The definition of 'production ready' will differ from org to org.
With step #1 ... apt-get install mysql ... what happens when the network repos go down? Like when you have to rebuild the same system four years later? You might wind up with an epic fail. That's not very stable as a packaging format then, is it? But this is just an example challenge from a much larger set... all of which derive from the fact that state is being allowed to seep in from random places. It's not clean.
This is essentially one of the core complaints I have with some of these tools. In my own as-yet-unfinished tool's architecture that tackles similar domains, network access is disallowed at deployment time. If a package cannot be installed without network access, then it is not considered a valid package.
Right, my tooling generally builds everything from source (mostly gentoo is the target platform, though also ubuntu) and generates the deps automagically.
This is achieved by viewing 'build' and 'install' for the cloud-capable service package as two separate steps, ie. build is the 'gather all requisite goodies' step, and then a version is applied. 'Install' is where an instance is actually created on top of a target OS platform image (also versioned).
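A rough sketch of that two-step split, to show the shape of the idea rather than the actual tool (which is unfinished and not shown here). Every name below (`BuiltPackage`, `build`, `install`, the `fetch` callback) is hypothetical. The point is only that `build` is the one place allowed to reach out for inputs, while `install` receives nothing it could use to touch the network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class BuiltPackage:
    """A versioned bundle holding everything needed at install time."""
    name: str
    version: str
    files: Dict[str, bytes]


def build(name: str, version: str, wanted: List[str],
          fetch: Callable[[str], bytes]) -> BuiltPackage:
    """Gather all requisite inputs once, then stamp a version on the result.

    `fetch` is the only network-capable piece (hypothetical callback).
    """
    files = {path: fetch(path) for path in wanted}
    return BuiltPackage(name=name, version=version, files=files)


def install(package: BuiltPackage, platform_image: str) -> Dict[str, object]:
    """Create an instance description on top of a versioned platform image.

    No fetch function is passed in, so this step cannot pull anything new.
    """
    return {
        "package": f"{package.name}-{package.version}",
        "platform": platform_image,
        "files": sorted(package.files),
    }
```

With this split, a rebuild years later only needs the stored `BuiltPackage` and a platform image, not a live package repository.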
Apparently what I consider fundamental architectural issues you see as pathological cases. Your call! :)
Take for instance multiple cloud providers. Those guys are notorious for giving you a slightly different version of any OS as a stock image, and running slightly different configurations. Some of them even insert their own distro-specific repos/mirrors. In that case, you are going to see entire classes of weird and subtle bugs appear where you either:
(A) are not using the same cloud provider for test/staging/production environment. (People tend to lean on local virt for the former).
(B) try to migrate (e.g. due to cloud provider failure, hacks, bandwidth or scaling issues, regulatory ingress, etc.) to another provider.
> This is achieved by viewing 'build' and 'install' for the cloud-capable service package as two separate steps, ie. build is the 'gather all requisite goodies' step, and then a version is applied. 'Install' is where an instance is actually created on top of a target OS platform image (also versioned).
This doesn't apply to Docker. You can use the exact same process.
> Take for instance multiple cloud providers. Those guys are notorious for giving you a slightly different version of any OS as a stock image, and running slightly different configurations. Some of them even insert their own distro-specific repos/mirrors. In that case, you are going to see entire classes of weird and subtle bugs appear where you either
These are not issues with Docker. The Dockerfile specifically states its environment, so it matters not what the cloud providers use on their host image.
Yes, my point was that the state is iffy... the architecture isn't clean. The output itself isn't versioned, only the script being input. The product is assumed-equivalent (with inputs from the wider world suggesting it's not always going to be), and not known-same. That's a bug at the level of architecture, and it's real.
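The difference between assumed-equivalent and known-same can be sketched with a content digest: record a hash of the output when it is built, then compare against it later instead of trusting that the same script gave the same result. This is a minimal illustration, not a description of what Docker does internally; the function names and the payloads are made up.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a build output."""
    return hashlib.sha256(data).hexdigest()


def is_known_same(data: bytes, recorded_digest: str) -> bool:
    """True only if these bytes match the digest recorded at build time."""
    return artifact_digest(data) == recorded_digest


# Made-up payloads: the same install script run on two different days, where
# the package repository served a newer version the second time.
first_build = b"mysql-server 5.5.31"
later_build = b"mysql-server 5.5.32"

# Versioning the output means storing this digest next to the script.
recorded = artifact_digest(first_build)
```

Here `is_known_same(first_build, recorded)` holds while `is_known_same(later_build, recorded)` does not, even though both came from the "same" input script.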
> The Dockerfile specifically states its environment
Well, I wasn't talking about docker. I was talking about the reality of different cloud providers. But in my direct experience if docker makes the assumption that, say, 'ubuntu-12.04' on 5 cloud providers is equivalent, then sooner or later it's going to encounter problems.
> if docker makes the assumption that, say, 'ubuntu-12.04' on 5 cloud providers is equivalent, then sooner or later it's going to encounter problems.
You misunderstand how docker works. 'ubuntu:12.04' refers to a very particular image on the docker registry (https://index.docker.io/_/ubuntu/). That image is in fact identical byte for byte on all servers which download it. So any application built from it will, in fact, yield reproducible results on all cloud providers.
My bad. That sounds logical, though a bit SPOFfy. FYI on our system instead of providing an image (since the format is hard to fix if we want to support arbitrary OSs and arbitrary cloud providers) we first provide a script that can assemble (or acquire) an image (after which it is versioned), and also specify a linked test suite.
That way, a particular build of a platform (ubuntu-12.04-20130808) that we create on a cloud provider could be used, or alternatively a particular cloud provider's stock image (someprovider-ubuntu-12.04-xyz) or existing bare metal machine matching the general OS class in a conventional hosting environment could also be used.
The idea is that where bugs are found (defined as "application installs fine on our images, but not on <some other existing platform instead>") new tests can be added to the platform integrity test suite to detect such issues, and/or workarounds can be added to facilitate their support.
That way, when an application developer says "app-3.1 targets ubuntu" we can potentially test against many different Ubuntu official release versions on many different images on many different cloud providers or physical hosts. (Possibly determining that it only works on ubuntu above a certain version number.) Similarly, the app could target a particular version of ubuntu, or a particular range of build numbers of a particular version of ubuntu.
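As a sketch of how such a target could be expanded into a test matrix, assuming a simple (major, minor) version tuple and made-up provider names. The `Platform` class and both functions are hypothetical and not taken from our tooling.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional, Tuple

Version = Tuple[int, int]  # e.g. (12, 4) stands for 12.04


@dataclass(frozen=True)
class Platform:
    os_name: str
    version: Version
    provider: str  # a cloud provider's stock image, our own build, or bare metal


def candidate_platforms(os_name: str, versions: List[Version],
                        providers: List[str]) -> List[Platform]:
    """Every version of an OS on every provider we are able to test against."""
    return [Platform(os_name, v, p) for v, p in product(versions, providers)]


def matching(platforms: List[Platform], os_name: str,
             min_version: Optional[Version] = None,
             max_version: Optional[Version] = None) -> List[Platform]:
    """Platforms satisfying an app's declared target (bounds are inclusive)."""
    selected = []
    for p in platforms:
        if p.os_name != os_name:
            continue
        if min_version is not None and p.version < min_version:
            continue
        if max_version is not None and p.version > max_version:
            continue
        selected.append(p)
    return selected
```

An app that just "targets ubuntu" would match every entry, while one pinned with `min_version=(12, 4)` and `max_version=(12, 4)` would only be tested on the 12.04 images of each provider.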
It's sort of a mid-way point offering a compromise of flexibility versus pain between the chef/puppet approach (which I intensely disagree with for deployment purposes in this era of virt) and the docker approach (which makes sense but could be viewed as a bit restrictive when attempting to target arbitrary platforms or use random existing or bare metal infrastructure).
Also, would you consider the architectural concern I outlined valid? I mean, in the case you are pulling down network-provided packages or doing other arbitrary network things when installing... it seems to me like there is a serious risk of configuration drift or outright failure.