1) Create a Dockerfile which installs the dependencies of my image as a base (maybe in this case all it is is RUN apt-get install mysql)
2) Tag the image as mysql-base.
3) Shell in to mysql-base, and iterate over the changes as needed until its 'production ready.'
4) Once it's suitably 'production ready', `docker diff` the version to see which files changed.
5) Here comes the fork in the road. Either go back and instrument my original Dockerfile to modify the files that were updated to make my image production ready, OR, `docker commit` that image. There are benefits to both sides, but ultimately it will be up to you in terms of maintainability. The definition of 'production ready' will differ from org to org.
6) Push the final image to a private registry.
This is essentially one of the core complaints I have with some of these tools. In my own as-yet-unfinished tool's architecture that tackles similar domains, network access is disallowed at deployment time. If a package cannot be installed without network access, then it is not considered a valid package.
If you expect apt-get install mysql to fail in the future, there are plenty of mitigating factors (storing the build/deps on your local repo, building from source..)
My point is, you can always find pathological cases. Discussing them is great as a straw man for improvement, but not really useful beyond it.
This is achieved by viewing 'build' and 'install' for the cloud-capable service package as two separate steps, ie. build is the 'gather all requisite goodies' step, and then a version is applied. 'Install' is where an instance is actually created on top of a target OS platform image (also versioned).
Apparently what I consider fundamental architectural issues you see as pathological cases. Your call! :)
Take for instance multiple cloud providers. Those guys are notorious for giving you a slightly different version of any OS as a stock image, and running slightly different configurations. Some of them even insert their own distro-specific repos/mirrors. In that case, you are going to see entire classes of weird and subtle bugs appear where you either:
(A) are not using the same cloud provider for test/staging/production environment. (People tend to lean on local virt for the former).
(B) try to migrate (eg. due to cloud provider failure, hacks, bandwidth or scaling issues, regulatory ingress, etc.) to another provider
That's not unrealistic, IMHO.
This doesn't apply to Docker. You can use the exact same process.
> Take for instance multiple cloud providers. Those guys are notorious for giving you a slightly different version of any OS as a stock image, and running slightly different configurations. Some of them even insert their own distro-specific repos/mirrors. In that case, you are going to see entire classes of weird and subtle bugs appear where you either
These are not issues with Docker. The Dockerfile specifically states its environment, so it matters not what the cloud providers use on their host image.
Yes, my point was that the state is iffy... the architecture isn't clean. The output itself isn't versioned, only the script being input. The product is assumed-equivalent (with inputs from the wider world suggesting it's not always going to be), and not known-same. That's a bug at the level of architecture, and it's real.
The Dockerfile specifically states its environment
Well, I wasn't talking about docker. I was talking about the reality of different cloud providers. But in my direct experience if docker makes the assumption that, say, 'ubuntu-12.04' on 5 cloud providers is equivalent, then sooner or later it's going to encounter problems.
You misunderstand how docker works. 'ubuntu:12.04' refers to a very particular image on the docker registry (https://index.docker.io/_/ubuntu/). That image is in fact identical byte for byte on all servers which download it. So any application built from it will, in fact, yield reproducible results on all cloud providers.
That way, a particular build of a platform (ubuntu-12.04-20130808) that we create on a cloud provider could be used, or alternatively a particular cloud provider's stock image (someprovider-ubuntu-12.04-xyz) or existing bare metal machine matching the general OS class in a conventional hosting environment could also be used.
The idea is that where bugs are found (defined as "application installs fine on our images, but not on <some other existing platform instead>") new tests can be added to the platform integrity test suite to detect such issues, and/or workarounds can be added to facilitate their support.
That way, when an application developer says "app-3.1 targets ubuntu" we can potentially test against many different Ubuntu official release versions on many different images on many different cloud providers or physical hosts. (Possibly determining that it only works on ubuntu above a certain version number.) Similarly, the app could target a particular version of ubuntu, or a particular range of build numbers of a particular version of ubuntu.
It's sort of a mid-way point offering a compromise of flexibility versus pain between the chef/puppet approach (which I intensely disagree with for deployment purposes in this era of virt) and the docker approach (which makes sense but could be viewed as a bit restrictive when attempting to target arbitrary platforms or use random existing or bare metal infrastructure).
Also, would you consider the architectural concern I outlined valid? I mean, in the case you are pulling down network-provided packages or doing other arbitrary network things when installing... it seems to be like there is a serious risk of configuration drift or outright failure.