
Docker: Git for deployment - itsderek23
http://blog.scoutapp.com/articles/2013/08/28/docker-git-for-deployment
======
mitchellh
Disclaimer: Vagrant creator/maintainer guy.

It is unfortunate that so many people compare Vagrant and Docker. While there
is overlap, Docker is mostly not viable as a dev environment tool alone, so it
isn't a fair comparison. The main reason is that you have to be using Linux
(and a recent Linux) as your main dev system, and in practice this is very
rare. Move beyond indie developers and, for all intents and purposes, Linux
desktops are non-existent (Vagrant is in use by companies like BBC, Expedia,
Intuit, etc., and I can tell you most devs don't know how to use Linux, let
alone run it as their primary dev platform).

BUT, I agree that putting your dev environment in a Docker container is
absolutely _amazing_, and there is a KILLER Vagrant/Docker combo.

The killer combo is actually using Vagrant to spin up Docker-ready VMs, then
using Docker inside them to develop. This lets people use Docker on Windows,
Mac, and Linux. You get the fast iteration time because all your state is
actually in a container, so you just `docker kill` and `docker run` as usual.
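A rough sketch of that workflow (box and image names hypothetical):

    $ vagrant up     # boot a Docker-ready VM from a VirtualBox/VMware box
    $ vagrant ssh    # shell into the VM
    vm$ docker run -i -t ubuntu /bin/bash   # start a throwaway dev container
    vm$ docker kill <container_id>          # tear it down; state lives in the container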

In fact, an upcoming version of Vagrant is adding Docker as a provisioner, so
you can `docker pull` containers down as part of `vagrant up`.

And I published Packer[1] templates to build Docker-ready
VirtualBox/VMware/AWS images that are Vagrant-ready:
[https://github.com/mitchellh/packer-ubuntu-12.04-docker](https://github.com/mitchellh/packer-ubuntu-12.04-docker)

[1]: www.packer.io

~~~
dpritchett
I'm really curious what these "don't know linux" devs are doing with Vagrant
in their day to day jobs. Do they get one team member who 'knows linux' to set
up a Vagrantfile and then force them all to use it?

~~~
skrebbel
First off, it's not so black and white. Many developers know _some_ Linux but
wouldn't trust themselves setting up a secure production server. I consider
myself part of this group.

Using Vagrant, we can still:

* Write software that depends on POSIX-only applications, such as Redis and,
yes, Docker

* Share development environments-that-look-like-production with other
developers, thus avoiding "works on my machine" bugs.

You need very little Linux knowledge to do this. Just apt-get, a text editor
and the occasional HOWTO/blog post gets you very far.

Additionally, with Docker-on-Vagrant, we can easily:

* Simulate a multi-server environment locally without hogging resources

* Do effective _versioning_ on dev environment configuration before sharing
stuff with colleagues (see the sketch below)

* As a result, _learn Linux administration_ with easy rollbacks after fuckups.
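For instance, a minimal dev-environment Dockerfile checked into the project
repo is itself the versioned configuration (a hypothetical sketch):

    # Hypothetical Dockerfile, versioned in git alongside the app
    FROM ubuntu:12.04
    RUN apt-get update && apt-get install -y redis-server
    EXPOSE 6379
    CMD ["redis-server"]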

All this without an on-team Linux guru.

Of course, once you go live, you'll need a decent sysadmin/devop type to un-
suck the installations. And backport that to the dev setups. Or, just go to
some PaaS and have the security/efficiency part done for you. But that's not
my point: my point is that even without running your own hardcore-linux-guru-
devopsed production environment, and without anything more than basic Linux
skills, you can get a lot of value from Vagrant+Docker.

------
peterwwillis
I don't see the parallels to Git at all. If you just looked at the
command-line options without understanding what's going on, it might look
similar.

> a tool like Puppet or Chef is needed when you have long-running VMs that
> could become inconsistent over time.

Uh, no, Puppet and Chef are designed for configuration management. To manage
your configs. They are not designed to replace customization, and they don't
address package management or service maintenance. (They have options to munge
these things, but you still need a human to make them useful and coordinate
changes with your environment.) Neither does Docker. Docker also doesn't do
configuration management. All different things.

> You'll be 100% sure what runs locally will run on production.

Incorrect; you're using union filesystems to fill in the missing dependency
gaps, so there's no guarantee what was on your testing container's host is on
your production container's host. Your devs also might be running with
containers A and B, but in production you're using containers A and C. Not to
mention the kernels may be different, providing different functionality. All
this assuming A, B and C don't need instances of different configuration.
There are no guarantees.

You know what else is crazy fast and easy to manage? Packages. There's this
new idea where you can have an operating system, run the command 'apt-get
install foobar', and BAM all of a sudden foobar is installed. If you need a
dependency, it just installs it. And it only downloads what it needs. Also
does rollback, transactions, auditing, is multi-platform, extensible, does
pre- and post-configuration, etc. Sound a lot like docker? That's because it's
a simpler and more effective version of docker [without running everything as
a virtualized service].

Deploy using your package manager. Except for slotted services which (AFAIK)
no open-source package manager supports, it will do everything you need. And
what it doesn't do, you can hack in.
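For instance, on a Debian-ish system (package name and version hypothetical):

    $ apt-get install foobar          # installs foobar plus its dependencies
    $ apt-get install foobar=1.2-1    # pin or roll back to a specific version
    $ dpkg -l foobar                  # audit exactly what is installed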

~~~
jaytaylor
You may be interested in checking out ShipBuilder - it is an open-source
Heroku-clone PaaS.

ShipBuilder is written in Go, and uses Git, LXC and HAProxy to deliver a
complete application platform-as-a-service solution.

[http://shipbuilder.io](http://shipbuilder.io)

[https://github.com/sendhub/shipbuilder](https://github.com/sendhub/shipbuilder)

~~~
peterwwillis
Your documentation seems limited to basic functions, with nothing really
explaining what this does or why I would need it. I've just spent like 20
minutes looking all over it and I have no idea what to do with it.

~~~
jaytaylor
Thanks for your feedback; this kind in particular is very helpful and
valuable.

We have a video explanation and walk-through of ShipBuilder in the works,
which should help communicate more clearly what ShipBuilder is and what it
can do for you.

I have a few additional questions, if you wouldn't mind helping me improve
this aspect of ShipBuilder:

    1. Have you used Heroku before?

    2. Are you confused about the purpose of ShipBuilder?
       (i.e. "what does ShipBuilder do?")

    3. Are you confused about how to set up the ShipBuilder
       server/nodes/load-balancer?

    4. Are you confused about how the client works?

Finally, please feel free to contact me personally; I'd love the opportunity
to answer questions or help you (or anyone) get started with ShipBuilder.

contact info: #shipbuilder on irc.freenode.net or jay [at) jaytaylor do.t com

~~~
minikomi
Look at how easy Heroku makes your first few steps[1]. I'd advise doing
something similar but with, for example, instructions for setting up a quick
ShipBuilder server on Amazon EC2 or the like. Bonus: it will show you where
the pain points are for ShipBuilder at the moment, because you'll have to
write too many small gotchas & workarounds into the tutorial!

[1]
[https://devcenter.heroku.com/articles/git](https://devcenter.heroku.com/articles/git)

~~~
jaytaylor
Thanks for pointing this out, minikomi; I will have to put together an EC2
quickstart guide. I am thinking of a more succinct form of what is currently
the server install documentation [1].

[1]
[https://github.com/Sendhub/shipbuilder/blob/master/SERVER.md...](https://github.com/Sendhub/shipbuilder/blob/master/SERVER.md#installation)

------
seldo
I'm very excited about Docker[1] as both a development environment and
deployment solution. However, from my early experiments, it seems there's an
important difference between LXC (which is what Docker manages for you) and a
full VM, namely that the model revolves around running one process at a time:
you can install MySQL on your docker image, but once it's up, it's running
MySQL -- you can't then ssh into it as you would a VM to poke around, modify
config files, etc.

There are trivial ways to solve this, obviously. You can stop the container,
restart it running bash, use _that_ to modify config files, and then restart
it again. But it requires a change of mindset: these things are much more than
background processes, but they are less than a full VM. As the piece mentions,
configuration management for newly-started images seems to be a missing piece
of the puzzle right now, and debugging running Docker images can be...
strange. [2] Not necessarily difficult, but different from what you're used
to, and learning curves are barriers to adoption.

As this tech matures I think these things will be quickly solved, and I'm
looking forward to the results.

[1] Plus VirtualBox, started by Vagrant. See mitchellh's comment.

[2] Unless, of course, I'm missing something. Docker-people: how do _you_
configure vanilla server images to work in your environment?

~~~
nickstinemates
Well, the canonical way of running a container is as you mention.

However! There are a couple of options if you do not have a config you're
completely happy with yet.

One is to run a process manager like supervisord as your container process,
and start up any arbitrary number of services you wish (like ssh). It's my
understanding that in the future Docker will allow you to call `init`
directly, so it becomes more VM-like.
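A hypothetical sketch of that approach (file names assumed):

    # Dockerfile: supervisord as the container's single process
    FROM ubuntu
    RUN apt-get update && apt-get install -y supervisor openssh-server
    RUN mkdir -p /var/run/sshd
    ADD supervisord.conf /etc/supervisor/conf.d/app.conf
    CMD ["/usr/bin/supervisord", "-n"]   # -n keeps supervisord in the foreground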

The other, assuming a sufficiently modern kernel (I believe 3.8+, which is the
minimum supported for Docker), is to use the lxc userland tools, specifically
`lxc-attach -n <full_container_id>`. This will allow you to create a shell in
the running container and poke about as needed.

~~~
seldo
My experiments with lxc-attach always failed; presumably my kernel was wrong
in some way (I followed instructions to get to 3.8, but I am sufficiently
clueless that I wasn't sure it had worked, or that it was the right flavor of
3.8).

But that's only the ad-hoc case: the bigger question is, if you have an image
with the instruction "RUN apt-get install mysql", you're not even halfway to
having a copy of MySQL you can run in production: at a minimum you'll need to
install a custom my.cnf to suit your application's operational parameters[1],
but really you'll want it to be slightly different every time -- new bind
addresses, potentially new master-slave relationship grants, etc. The way
docker images interact with configuration management in a grown-up production
environment is still really hazy to me.

[1] We are all agreed that running default my.cnf in production is laughable,
yes? That information has filtered into the mainstream from the DBA crowd?

~~~
nickstinemates
How I would personally tackle that specific problem is the following:

1) Create a Dockerfile which installs the dependencies of my image as a base
(maybe in this case it is just `RUN apt-get install mysql`)

2) Tag the image as mysql-base.

3) Shell in to mysql-base, and iterate over the changes as needed until it's
'production ready.'

4) Once it's suitably 'production ready', `docker diff` the version to see
which files changed.

5) Here comes the fork in the road. Either go back and instrument my original
Dockerfile to modify the files that were updated to make my image production
ready, OR, `docker commit` that image. There are benefits to both sides, but
ultimately it will be up to you in terms of maintainability. The definition of
'production ready' will differ from org to org.

6) Push the final image to a private registry.
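Roughly, the commands involved (ids and registry names hypothetical):

    $ docker build -t mysql-base .              # steps 1-2: build and tag the base
    $ docker run -i -t mysql-base /bin/bash     # step 3: shell in and iterate
    $ docker diff <container_id>                # step 4: list changed files
    $ docker commit <container_id> mysql-prod   # step 5: snapshot as a new image
    $ docker tag mysql-prod registry.example.com/mysql-prod
    $ docker push registry.example.com/mysql-prod   # step 6: push to a private registry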

~~~
contingencies
With step #1 ... _apt-get install mysql_ ... what happens when the network
repos go down? Like when you have to rebuild the same system four years later?
You might wind up with an epic fail. That's not very stable as a packaging
format then, is it? But this is just an example challenge from a much larger
set... all of which derive from the fact that state is being allowed to seep
in from random places. It's not clean.

This is essentially one of the core complaints I have with some of these
tools. In my own as-yet-unfinished tool's architecture that tackles similar
domains, network access is disallowed at deployment time. If a package cannot
be installed without network access, then it is not considered a valid
package.

~~~
nickstinemates
It all depends on your tolerance threshold and the trade-offs involved in
making an acceptable decision.

If you expect apt-get install mysql to fail in the future, there are plenty of
mitigating options (storing the build/deps in your local repo, building from
source...)

My point is, you can always find pathological cases. Discussing them is great
as a straw man for improvement, but not really useful beyond that.

~~~
contingencies
Right, my tooling generally builds everything from source (mostly gentoo is
the target platform, though also ubuntu) and generates the deps automagically.

This is achieved by viewing 'build' and 'install' for the cloud-capable
service package as two separate steps, i.e. build is the 'gather all requisite
goodies' step, and then a version is applied. 'Install' is where an instance
is actually created on top of a target OS platform image (also versioned).

Apparently what I consider fundamental architectural issues you see as
pathological cases. Your call! :)

Take for instance multiple cloud providers. Those guys are notorious for
giving you a _slightly_ different version of any OS as a stock image, and
running _slightly_ different configurations. Some of them even insert their
own distro-specific repos/mirrors. In that case, you are going to see entire
classes of weird and subtle bugs appear where you either:

(A) are not using the same cloud provider for test/staging/production
environment. (People tend to lean on local virt for the former).

(B) try to migrate (eg. due to cloud provider failure, hacks, bandwidth or
scaling issues, regulatory ingress, etc.) to another provider

That's not unrealistic, IMHO.

~~~
nickstinemates
> This is achieved by viewing 'build' and 'install' for the cloud-capable
> service package as two separate steps, ie. build is the 'gather all
> requisite goodies' step, and then a version is applied. 'Install' is where
> an instance is actually created on top of a target OS platform image (also
> versioned).

This doesn't apply to Docker. You can use the exact same process.

> Take for instance multiple cloud providers. Those guys are notorious for
> giving you a slightly different version of any OS as a stock image, and
> running slightly different configurations. Some of them even insert their
> own distro-specific repos/mirrors. In that case, you are going to see entire
> classes of weird and subtle bugs appear where you either

These are not issues with Docker. The Dockerfile specifically states its
environment, so it matters not what the cloud providers use on their host
image.

~~~
contingencies
_You can use the exact same process._

Yes, my point was that the state is iffy... the architecture isn't clean. The
output itself isn't versioned, only the script being input. The product is
assumed-equivalent (with inputs from the wider world suggesting it's not
always going to be), and not known-same. That's a bug at the level of
architecture, and it's real.

 _The Dockerfile specifically states its environment_

Well, I wasn't talking about docker. I was talking about the reality of
different cloud providers. But in my direct experience if docker makes the
assumption that, say, 'ubuntu-12.04' on 5 cloud providers is equivalent, then
sooner or later it's going to encounter problems.

~~~
shykes
> _if docker makes the assumption that, say, 'ubuntu-12.04' on 5 cloud
> providers is equivalent, then sooner or later it's going to encounter
> problems._

You misunderstand how docker works. 'ubuntu:12.04' refers to a very particular
image on the docker registry
([https://index.docker.io/_/ubuntu/](https://index.docker.io/_/ubuntu/)). That
image _is_ in fact identical byte for byte on all servers which download it.
So any application built from it will, in fact, yield reproducible results on
all cloud providers.
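To illustrate (output elided):

    $ docker pull ubuntu     # fetch the official ubuntu repository from the index
    $ docker images ubuntu   # the 12.04 tag maps to one IMAGE ID, identical everywhere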

~~~
contingencies
My bad. That sounds logical, though a bit SPOFfy. FYI, on our system, instead
of providing an image (since the format is hard to fix if we want to support
arbitrary OSs and arbitrary cloud providers) we first provide a script that
can assemble (or acquire) an image (after which it is versioned), and we also
specify a linked test suite.

That way, a particular build of a platform ( _ubuntu-12.04-20130808_ ) that we
create on a cloud provider could be used, or alternatively a particular cloud
provider's stock image ( _someprovider-ubuntu-12.04-xyz_ ) or existing bare
metal machine matching the general OS class in a conventional hosting
environment could also be used.

The idea is that where bugs are found (defined as "application installs fine
on our images, but not on _<some other existing platform instead>_") new
tests can be added to the platform integrity test suite to detect such issues,
and/or workarounds can be added to facilitate their support.

That way, when an application developer says "app-3.1 targets _ubuntu_ " we
can potentially test against many different Ubuntu official release versions
on many different images on many different cloud providers or physical hosts.
(Possibly determining that it only works on _ubuntu_ above a certain version
number.) Similarly, the app could target a particular version of _ubuntu_ , or
a particular range of build numbers of a particular version of _ubuntu_.

It's sort of a mid-way point offering a compromise of flexibility versus pain
between the _chef_ / _puppet_ approach (which I intensely disagree with for
deployment purposes in this era of virt) and the _docker_ approach (which
makes sense but could be viewed as a bit restrictive when attempting to target
arbitrary platforms or use random existing or bare metal infrastructure).

Also, would you consider the architectural concern I outlined valid? I mean,
in the case where you are pulling down network-provided packages or doing
other arbitrary network things when installing... it seems to me like there
is a serious risk of configuration drift or outright failure.

------
BHSPitMonkey
I spent the entire article wondering how deployment via git was at all
relevant, until I read the very last heading.

(The title is supposed to be read as "Docker is as powerful for deployment as
git is powerful for SCM!". There is no mention of git-based deployment
strategies like Heroku's.)

------
patrickaljord
If you're around Paris, we're doing a Docker meetup in October
[http://www.meetup.com/Docker-Paris/events/136924002/](http://www.meetup.com/Docker-Paris/events/136924002/)

------
bryanlarsen
Did you look at vagrant-lxc? If you already have a vagrant setup it's very
easy to switch to and works very well.

------
txutxu
Hello,

Is there a recommended way to run this on platforms other than Ubuntu?

I would love to see it running smoothly on Debian and CentOS; that would turn
me into a converted user immediately.

Nice work, looks impressive.

~~~
nadaviv
They have official images for Ubuntu, CentOS and BusyBox, and unofficial ones
for many more (Gentoo, Arch, Debian, OpenSUSE, and others).

See [https://github.com/dotcloud/docker/wiki/Public-docker-images](https://github.com/dotcloud/docker/wiki/Public-docker-images)
and [https://index.docker.io/](https://index.docker.io/)

~~~
cwoac
Note that that is for containers (i.e. guests). For the host side, only Ubuntu
is officially supported; many people are using Arch quite well, and a few have
got it running under Fedora and SUSE that I know of (I'm going to assume
Gentoo has it as well, knowing them). Both of the latter require custom
kernels, however.

------
passfree
I am still not sure how useful Docker is, but we have created a simple example
of how to run Docker on Vortex, which is sort of a Vagrant alternative.
[https://github.com/websecurify/node-vortex/tree/master/doc/examples/docker-helloworld](https://github.com/websecurify/node-vortex/tree/master/doc/examples/docker-helloworld)

------
based2
Another one in progress:

[https://github.com/arnoo/git-deliver](https://github.com/arnoo/git-deliver)

[http://i.reddit.com/r/programming/comments/1j9n8v/gitdeliver...](http://i.reddit.com/r/programming/comments/1j9n8v/gitdeliver_better_software_deployment_with_git/)

------
ivoras
OS containers can be very cool. Want to see 1000 instances started on a small
machine? [http://ivoras.sharanet.org/blog/tree/2009-10-20.the-night-of-1000-jails.html](http://ivoras.sharanet.org/blog/tree/2009-10-20.the-night-of-1000-jails.html)

------
gtani
[https://news.ycombinator.com/item?id=6252182](https://news.ycombinator.com/item?id=6252182)

(I would give credibility to the docker maintainer)

------
j_s
Get back in touch with me when Docker can help with Windows deployments. (Is
there an issue I can 'star' or otherwise track?)

Edit: To say Docker is analogous to Git for deployment fails outside of the
Linux server realm (lxc). I'm trying to think of an OS-specific source control
system to fix the analogy but can't come up with anything only-Linux... 'TFS
for deployment'? :)

I plan to check out
[http://ulteo.com/home/en/download/sourcecode](http://ulteo.com/home/en/download/sourcecode)
as an alternative.

~~~
seldo
Docker is a pretty thin usability wrapper around LXC, which has Linux right in
the name, so very unlikely to become useful on Windows (or even OS X, which is
BSD) anytime soon.

~~~
nickstinemates
Not sure I would call it thin; today Docker uses LXC as a container provider,
but saying so misses the point of Docker. Keep in mind also, this is all
related to the daemon. Full client capabilities are available on any platform
today.

In a previous blog post, the Docker team outlined how LXC will become an
(albeit native) plugin, just like AUFS. Running Docker on BSD (using Jails as
the container provider) is certainly a goal. If you're on OS X, you could use
chroot instead of full namespacing capabilities.

To be fair, this is not available today. But I don't think it's fair to say
that Docker will not be useful anytime soon on those environments.

------
herpy
The title should read: "Docker _is_ Git for deployment"

Currently, it implies that Docker is _using_ Git for deployment.

------
megaman821
How does Docker mesh with something like continuous deployment? How many
layers until the AUFS falls down?

~~~
shykes
> _How does Docker mesh with something like continuous deployment?_

Very well. Because any source repository with a Dockerfile can be built into a
container with no other out-of-band information, it is very easy to compose
your ideal delivery pipeline with docker as the "lego" brick.

Docker+Jenkins and Docker+buildbot are popular combinations.
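For example, a hypothetical Jenkins build step needs nothing but the repo
(URL and tag assumed):

    $ git clone git://example.com/myapp.git && cd myapp
    $ docker build -t myapp .   # the Dockerfile carries all build instructions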

> _How many layers until the AUFS falls down?_

The hardcoded limit is 42 layers. But future versions of docker will store a
container's history separately from the aufs layers, so you can have an
arbitrary number of build steps.

------
amjith
I don't think there is a way to roll back an image to the previous commit. Is
there?

~~~
nickstinemates
Absolutely.

Provided you haven't deleted the old image, it's just a matter of using
`docker tag <previous commit id> amjith/imagetorollback`.

Problem solved.
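In full, a sketch (ids hypothetical):

    $ docker history amjith/imagetorollback            # list prior image ids
    $ docker tag <previous_id> amjith/imagetorollback  # point the tag back at one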

------
cynusx
docker is pretty awesome. At my company we share auxiliary servers around
with docker (message queue system) and it's pretty RAD. I can't wait until
they make it rock-solid enough to run systems in production with it.

------
coherentpony
Would it be possible to deploy an email server inside a docker container?

~~~
bacongobbler
Possible? Yes. The better question to ask yourself would be "does the use
case make sense?"

Linux containers are ephemeral, which means that you will lose all data from
your email server on a container restart. If you're just setting up a postfix
SMTP server in a container, and forwarding the guest's port 25 to the host, I
don't see why not. You probably won't have any app scaling support or the
like, though. I could be wrong about that. I've never attempted to set up a
clustered SMTP server configuration before.

~~~
nickstinemates
Docker containers are not ephemeral. Persistent storage can be easily shared
between containers using a VOLUME or bind mount.
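A sketch of both options (image name hypothetical):

    $ docker run -v /var/mail postfix-image            # VOLUME managed by docker, survives restarts
    $ docker run -v /srv/mail:/var/mail postfix-image  # bind-mount a host dir into the container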

~~~
bacongobbler
TIL. Thanks for the heads up.

------
infocollector
My top wishlist for docker: Native windows support.

~~~
Ixiaus
From my limited knowledge of Windows: that's highly unlikely. You're much more
likely to see Docker start supporting FreeBSD jails with ZFS than you are
Windows.

Then again I know nothing of Windows and any containerization technology it
may have...

~~~
emmelaich
[http://www.sandboxie.com/](http://www.sandboxie.com/) ?

Or perhaps build something off the Chromium sandboxing.

[http://www.chromium.org/developers/design-documents/sandbox#TOC-Sandbox-windows-architecture](http://www.chromium.org/developers/design-documents/sandbox#TOC-Sandbox-windows-architecture)

