

Docker Development Patterns - ingve
http://www.hokstad.com/docker/patterns

======
cpuguy83
To be honest, I'm very much against this approach to containerized
development.

The project should have a single Dockerfile from which
dev/test/staging/production all work.

There are certainly things you may want to change in these envs (particularly
dev and test), but these can be runtime configurations (through env vars or
changing the command that gets run in the container).

When I do dev, if I need to do so interactively I build from my main
Dockerfile and do `docker run -it -v $(pwd):/opt/myapp myapp bash`, which
mounts my live code into the container, shadowing the code that was built in.
I'll mess around with some things, exit, and adjust my Dockerfile if needed
based on those changes.
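
The Dockerfile behind that is nothing special - roughly something like the
sketch below (the /opt/myapp path is the one from the commands above; the base
image and commands are just placeholders):

    FROM ruby:2.1
    # Bake the code into the image at the same path the bind mount will cover
    COPY . /opt/myapp
    WORKDIR /opt/myapp
    RUN bundle install
    CMD ["bundle", "exec", "rails", "server", "-b", "0.0.0.0"]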

To run tests (as an example): `docker run -e RAILS_ENV=test -v
$(pwd):/opt/myapp myapp rake test`. If all is well and I'm ready to push my
code up, I go ahead and rebuild the image with the changed code, run the tests
again, etc.

Sometimes it is necessary to have tools like gdb, strace, etc. available
within the container. For this you can start up the container as you normally
would, then (with Docker 1.3) `docker exec -it <container> /bin/bash`, install
the tools you need to debug something, and get out.

In general it is just going to be bad practice to have a dev environment
significantly different from the production env, and this is indeed why dev
handoff to production is so often fraught with trouble.

~~~
vidarh
> The project should have a single Dockerfile from which
> dev/test/staging/production all work.

I largely disagree with this, and I think that's because we actually
substantially agree about the need to test regularly, during development, in
an environment identical to staging/production. I _DO_ agree that there should
be a single Dockerfile to be used as the _basis_, and I _DO_ agree that you
should regularly run containers built from that Dockerfile, unmodified, in
dev/test/staging/production.

But part of the point is that there is no reason to limit yourself to a single
dev instance of your app. I appreciate that this might have been somewhat
obscured by some of my examples, which use as their base a separate dev
container setup I rely on for some projects, including as the base for the
production container for my blog.

So to reiterate, I absolutely agree with having a Dockerfile that is identical
to your live environment to bring up an instance with during development.

But that is _not_ a reason to not _also_ have a container that brings the app
up with a code reloader, or a container that lets you test in different
environments. And it certainly is not a reason to avoid isolating dev/build
dependencies into yet other containers.

I actually much prefer that to the "Rails way" of having separate environments
"built into" the app, and that's probably part of the difference. Instead of
doing that, I use "FROM" in Dockerfiles to "inherit" from bases to provide me
with different "lenses" to work on the project from different viewpoints. Not
just dev vs. live, but also with debugging tools, or architecture differences
etc.
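
As a rough sketch of what I mean (all names hypothetical; "myapp" here stands
for the image built from the unmodified production Dockerfile), the dev "lens"
is just a tiny Dockerfile layered on top:

    # dev/Dockerfile - brings the same app up under a code reloader
    FROM myapp
    RUN gem install rerun
    CMD ["rerun", "bundle exec rails server -b 0.0.0.0"]

Build it from its own directory (`docker build -t myapp-dev dev/`) and the
production image underneath stays untouched.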

> Sometimes it is necessary to have tools like gdb, strace, etc. available
> within the container. For this you can start up the container as you
> normally would, then (with Docker 1.3) `docker exec -it <container>
> /bin/bash`, install the tools you need to debug something, and get out.

But there's no reason to keep redoing these steps when they can be
encapsulated in a Dockerfile that inherits from your basic container. And you
can easily run it in parallel with your "pristine" containers, or bring it up
in seconds as needed.
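
That is, a trivial Dockerfile along these lines (assuming a Debian-based base
image; the names and the package list are just an example):

    # debug/Dockerfile - the same app image plus debugging tools
    FROM myapp
    RUN apt-get update && apt-get install -y gdb strace ltrace \
        && apt-get clean

Then `docker build -t myapp-debug debug/` once, and the tools are there every
time instead of being reinstalled by hand.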

> In general it is just going to be bad practice to have a dev environment
> significantly different from the production env, and this is indeed why dev
> handoff to production is so often fraught with trouble.

We fully agree on this. It's a decade of frustration doing devops that has
made me enjoy the opportunities Docker brings to layer situation-specific
dependencies on top of a pristine base: it takes away any excuse for letting
the basic setup diverge, or for being unable to run full tests on an
environment identical to live.

~~~
msane
The only difference between dev/stage/prod/whathaveyou with docker is the
topology of the servers. For the (local) dev case, the topology is one box; an
"all-in-one" system. This does not require you to have different Dockerfiles
per environment; it just means you have to write Dockerfiles that don't care
about the topology!
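
For example (hypothetical names), the same image can be pointed at its
dependencies purely through runtime flags, so the topology is decided at
`docker run` time rather than at build time:

    # all-in-one dev box: the DB is just a linked container named "db"
    docker run --link db:db -e DATABASE_HOST=db myapp
    # production: the DB lives on another machine entirely
    docker run -e DATABASE_HOST=db.internal.example.com myapp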

------
Oculus
From a sideline viewer's perspective, persistence in production for Docker
containers is a big problem that I've yet to see a good solution for. You end
up having to keep DB servers and such outside of your container pool.

~~~
vidarh
You don't. That's the point of volumes. You do need to be careful to ensure
you mount volumes for everything that needs to persist, but in practice that's
not a very onerous limitation.

Since volumes are bind-mounted from the outside, you can put the volumes on
whatever storage pool you want that you can bring up on the host (as long as
it meets your app's requirements, e.g. POSIX semantics, locking, etc).

E.g. at work we have a private Docker repository that runs on top of GlusterFS
volumes that are mounted on the host and then bind-mounted in as a volume in
the container.

I also run my postgres instances inside Docker, with the data on persistent
volumes.
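
That can be as simple as something like this (paths and image tag are just an
example):

    docker run -d --name pg \
        -v /srv/postgres/data:/var/lib/postgresql/data \
        postgres:9.3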

We could certainly use better tools to manage it, though.

Another pattern I ought to have mentioned (that I've used myself) is to set up
"empty" containers whose only purpose is to act as a storage volume for
another container. I don't like that as much, mostly since I've not had as
much long-term experience with how the layering Docker uses would impact it.
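
For reference, that pattern looks roughly like this (names made up):

    # a container whose only job is to own the volume
    docker run -v /var/lib/postgresql/data --name pgdata busybox true
    # the actual service mounts that container's volumes
    docker run -d --volumes-from pgdata postgres:9.3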

~~~
peteridah
We decided not to use docker containers for our postgres DB in production;
volume mounts just don't make me sleep easy. We use an S3-backed private
docker registry to store our images/repositories.

~~~
vidarh
What is your issue with volumes? Bind mounts have been battle-tested over many
years. For example, I have production Gluster volumes bind-mounted into LXC
containers that have been running uninterrupted for 5+ years.

~~~
klochner
With docker specifically, I've had frustrations with the user permissions on
the volume between {pg container, data container, host OS}, with extra trouble
when osx/boot2docker is added to the stack.

Also, docker doesn't add as much value for something like postgres that likely
lives on its own machine.

~~~
vidarh
> I've had frustrations with the user permissions on the volume between {pg
> container, data container, host OS}

Then don't use data containers. I don't see much benefit from that either. The
stuff we put on data volumes is stuff we want to manage the availability of
very carefully, so I prefer more direct control.

And so when I use volumes it's always bind mounts from the host. Some of them
are local disk, some of them are network filesystems.

We have some Gluster volumes that are exported from Docker containers that
import the raw storage via bind mounts from their respective hosts, and which
are then mounted on other hosts and bind-mounted into our other containers,
just to make things convoluted - it works great for high availability. (I'm
_not_ recommending Gluster for Postgres, btw; it "should" work with the right
config, but I'd not dare without very, very extensive testing; nothing
specific to Gluster, I'm just generally terrified of databases on distributed
filesystems.)

> for something like postgres that likely lives on its own machine.

We usually colocate all our postgres instances with other stuff. There's
generally a huge discrepancy between the most cost-effective amount of
storage/iops, RAM and processing power if you're aiming for high density
colocation, so it's far cheaper for us that way.

------
antocv
We are doing something similar at work: we specify the exact few dependencies
needed on the host and their versions (mostly docker, tar, etc.), then make a
chroot environment using debootstrap from a known few packages and versions,
build a docker image from that (tar cf - chroot | docker import), and use it
as the base for the build environment. We could use pacstrap or whatever too;
the initial selection of packages is what's important.
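
Roughly like this (distribution, mirror and image name are just an example):

    debootstrap --variant=minbase wheezy chroot http://ftp.debian.org/debian/
    tar -C chroot -cf - . | docker import - build-base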

The packages we need to build - we have many, with dependencies between them -
are specified in a Makefile, because make is good at dependency tracking. When
make needs to (re)build a package, it does so using the docker
build-container-base image, and whatever that package's build does is
committed for later analysis under the name of the package built plus a
timestamp, for traceability.
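
A stripped-down sketch of the idea (all names hypothetical; the real Makefile
obviously does a lot more, and recipe lines are tab-indented in the real
file):

    STAMP := $(shell date +%Y%m%d%H%M%S)

    # build the package inside the base image, then commit the container
    # under the package name plus a timestamp for later analysis
    foo.built: foo/ bar.built
            docker run --name build-foo -v $(CURDIR)/foo:/src build-base \
                /bin/sh -c 'cd /src && ./build.sh'
            docker commit build-foo builds/foo:$(STAMP)
            docker rm build-foo
            touch $@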

For every package built, the resulting docker image - named after the build
environment and the package plus a timestamp - is then run again, this time
trying to install the package, and that image is committed as well with an
'install' tag - so we basically also verify the install step.

Btw, this is for openstack, many components, packages and dependencies all
over the place.

------
atbell
I'm extremely surprised that Packer (http://www.packer.io) hasn't received
more attention in the Docker arena. Its Docker builder, coupled with any of
its provisioners, makes it a pretty attractive alternative to the simplistic
layout and limitations of Dockerfiles.
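
For anyone who hasn't tried it, a minimal template is something like this
(the base image and provisioning steps are just placeholders), run with
`packer build template.json`:

    {
      "builders": [{
        "type": "docker",
        "image": "ubuntu:14.04",
        "commit": true
      }],
      "provisioners": [{
        "type": "shell",
        "inline": ["apt-get update", "apt-get install -y ruby"]
      }]
    }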

~~~
mtalantikite
Packer currently doesn't snapshot the build at each step, so you don't get a
history that you can check out or push from. It's a feature that's on the way,
but it's a pretty big one that some teams need.

We use Packer for a lot of other things, though, and it's great.

