

LXC, Docker, and the future of software delivery (LinuxCon) - julien421
http://www.slideshare.net/dotCloud/linuxcon-docker

======
peterwwillis
I still don't buy the idea of the Linux container as a universal way to do
anything. It depends on your kernel [opposite of what the slides claim], it
depends on your apps, it depends on your dependencies, it depends on your
architecture, etc.

The one phrase that's correct is _"it's chroot on steroids"_. That is in fact
exactly what it is. The catch is that it's even _less_ portable than a plain
chroot environment. Docker adds extra features on top of the chroot, but
that's basically its core functionality.
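
To make the analogy concrete: strip away the image format and a container is
roughly a chroot plus a few kernel namespaces. A minimal sketch (my own
illustration, assuming Linux, root, and Python 3.12+ for os.unshare; ROOTFS is
a hypothetical directory holding an extracted root filesystem):

    import os

    # "chroot on steroids": a plain chroot plus a couple of Linux namespaces.
    # Assumes Linux, root, and Python 3.12+ (for os.unshare); ROOTFS is a
    # hypothetical directory containing an extracted root filesystem.
    ROOTFS = "/var/lib/containers/rootfs"

    # New mount and PID namespaces are the "steroids" part.
    os.unshare(os.CLONE_NEWNS | os.CLONE_NEWPID)

    pid = os.fork()          # the child becomes PID 1 in the new PID namespace
    if pid == 0:
        os.chroot(ROOTFS)    # the plain old chroot part
        os.chdir("/")
        os.execv("/bin/sh", ["/bin/sh"])
    else:
        os.waitpid(pid, 0)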

So the first thing you have to ask yourself is: does my software need to be
run in a chroot environment or a VM isolated from all other applications? If
no, you very well may not need this _at all_ for your software deployment. If
anything, Docker images create a bigger burden on your deployment as you have
these large images to distribute, modify, manage. Of course they built in some
fancy network transmission magic to make it only copy changed parts of an
image, but this is still wildly less efficient than traditional means, and you
still have to fuck around with the image to make it incorporate your changes
before you push it.
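
For the record, that "magic" amounts to content-addressed layers: an image is
a stack of layers identified by digest, and a push only copies the layers the
other side doesn't already have. A conceptual sketch of the idea, not Docker's
actual wire protocol (the names here are made up):

    import hashlib

    # Conceptual sketch of "only copy changed parts": an image is a stack of
    # content-addressed layers; a push uploads only layers the remote lacks.
    def digest(layer: bytes) -> str:
        return hashlib.sha256(layer).hexdigest()

    def push(image_layers: list[bytes], remote_has: set[str]) -> list[str]:
        """Return the digests that actually had to be uploaded."""
        uploaded = []
        for layer in image_layers:
            d = digest(layer)
            if d not in remote_has:
                remote_has.add(d)   # stand-in for uploading the blob
                uploaded.append(d)
        return uploaded

    # Re-pushing an image that differs only in its top layer moves one blob.
    base = [b"os packages", b"runtime"]
    remote: set[str] = set()
    push(base + [b"app v1"], remote)                    # first push: 3 layers
    assert len(push(base + [b"app v2"], remote)) == 1   # second push: 1 layer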

If the big selling point is "commoditization", keep in mind that basically
everyone rolls their own environment and customizes their deployment. It's the
natural order of having your own architecture that fits your application. The
one thing that's never going to happen is you taking Docker images from the
internet and never modifying them. This universal container system goes out
the window the first minute you have to start modifying everything to fit edge
cases, which is always going to happen.

~~~
shykes
> _Docker images create a bigger burden on your deployment as you have these
> large images to distribute, modify, manage_

Compared to what, though? Of course, if your environment is homogeneous enough
that you can express your application as, say, a Jar or a gem, then you are
part of the lucky few and may indeed not need docker, because you share enough
context with your target infrastructure that a lot of the bits are already
implicitly deployed. (In other words: someone else had to move a big-ass
system image around so that you don't have to).

But the typical application stack is not like that. It is heterogeneous and
custom, and the only practical way to ship it reliably is to ship the entire
system with it, because you've run your tests on a particular libc, postgres
and ruby, built by a particular gcc, etc. In that case, your options are
limited: 1) ship a VM or 2) ship system packages.

And if your current options are indeed to either ship a VM or system packages
- then Docker suddenly doesn't seem that heavyweight after all :)

~~~
peterwwillis
Compared to, for example, using separate systems to manage the development,
change management, approval, deployment, bug tracking, revision control, and
other aspects of making a change to production.

The Docker image may include several of those subsystems, or cut across
several of them, and so it now needs to be flexible enough to change one or
more of those parts before it can be pushed to production to fix a bug. And
what production environment will it be applied to? And is it possible that you
will end up with several images that are almost the same, except for key parts
that can't easily be handled by yet another overlay? (How many overlays will
you have? Will you eventually stop adding overlays and redo the image to
include the fix? What else will that affect?)

Complex systems require complex interaction, and Docker images do not allow
for that; they are monolithic, all-or-nothing changes which can only be
"modified" by either remaking the entire image (expensive), or adding another
layer of overhead on top, which I don't think anyone has ever investigated to
find bottlenecks or overhead problems.

To put it in simpler terms: Cfengine delivering a single change on a single
file to a dynamically-assigned set of nodes is a lot faster, lower-overhead,
and more direct than deploying a Docker change.

~~~
shykes
> _Complex systems require complex interaction, and Docker images do not allow
> for that_

Unless you build a complex system by composing multiple containers, which is
the whole point of containers.
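
As a rough sketch of what I mean, using the Python Docker SDK and made-up
image and container names (an illustration only, not anything from the
slides): a database and an app server run as two containers on a shared
network, each one small and replaceable on its own.

    import docker  # Python Docker SDK (docker-py); illustration only

    client = docker.from_env()

    # A private network so the two containers can reach each other by name.
    client.networks.create("app-net", driver="bridge")

    # One container for the database...
    client.containers.run(
        "postgres:latest", detach=True, name="db", network="app-net",
        environment={"POSTGRES_PASSWORD": "example"},
    )

    # ...and one for the application, which knows the database only by hostname.
    client.containers.run(
        "myorg/webapp:latest",  # hypothetical image name
        detach=True, name="web", network="app-net",
        environment={"DATABASE_URL": "postgres://postgres:example@db:5432/app"},
        ports={"8000/tcp": 8000},
    )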

> _they are monolithic, all-or-nothing changes_

Compiling a binary is also a monolithic, all-or-nothing change. If you dig
deep enough there is always such a change. In the operation of distributed
systems (which you claim to be an expert of) that is a desirable property.

> _which can only be "modified" by either remaking the entire image
> (expensive)_

Docker caches build steps, which means it only rebuilds the layers that need
to be rebuilt.

Typically this means the application code is rebuilt when the developer pushes
a new version, while the underlying layers are left untouched. You know... the
same "rebuilding" your ghetto deployment script currently does.

> _or adding another layer of overhead on top, which I don't think anyone has
> ever investigated to find bottlenecks or overhead problems._

We have been using aufs layers as the build mechanism for lxc containers at
dotCloud for roughly 3 years. In that time we've probably deployed half a
million containers, and served a few hundred million uniques (and I'm being
conservative). These containers included app servers, databases, and
everything in between.

So, yeah, they've been "investigated" for bottlenecks and overhead problems.

~~~
peterwwillis
1. You don't understand what I mean by a complex system, and I'm talking about
changes to a single file in a container. Running a new container does not
change the old container.

2. Compiling a binary is not a 'monolithic change' (?); it is a static
configuration with dynamic elements. And please quote the line where I called
myself an expert of anything.

3. Uh, I don't have a "ghetto deployment script", but thank you for the kind
words. Like I was saying before, your only option is to keep adding more union
layers every time you change a file, which _I bet_ will lead to performance
degradation, if not just general application headaches in the future. If
somehow Docker also avoids the need to stop and restart the container when
adding a new layer (which should be possible with a union filesystem), that's
great too! It still leaves a world of cruft behind in the form of old
filesystem layers, and a maintenance hassle.

Awesome, you have real-world performance numbers! So how many layers can you
add to a container before it goes down or performance starts to degrade? How
do added layers affect memory or disk space? Does aufs have an upper bound on
the number of layers, or any other hard limits? Would love to see some numbers
on this.
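
To illustrate the kind of overhead I'm asking about: a union mount resolves a
path by checking each layer top-down until it finds the file, so worst-case
lookups grow with the number of layers. A toy model (my own sketch, not aufs's
actual code):

    # Toy model of a union mount: layers[0] is the newest overlay; a lookup
    # scans layers in order, so worst-case cost is O(number of layers).
    WHITEOUT = object()  # marker for a file deleted in an upper layer

    def lookup(path, layers):
        for layer in layers:
            if path in layer:
                entry = layer[path]
                return None if entry is WHITEOUT else entry
        return None

    layers = [
        {"/etc/app.conf": "v2"},                   # newest overlay
        {"/etc/app.conf": "v1", "/bin/sh": "sh"},  # older layer
        {"/lib/libc.so": "libc"},                  # base image
    ]
    assert lookup("/etc/app.conf", layers) == "v2"   # found in the top layer
    assert lookup("/lib/libc.so", layers) == "libc"  # falls through to the base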

------
contingencies
Slide #40: _Typical Workflow_ is, very much like some other aspects of docker
(use of specific filesystems, use of entire filesystems within containers,
etc.), a false general case that is in fact unsuited to many people's
requirements.

Slide #44: the _Docker roadmap_ towards _1.0_ seems to dodge the question of
significant differences in function with regard to the apparent plan to adopt
a variety of storage backends with different capabilities, use of different
virtualization environments as targets, etc.

I support docker as a project, but I still really think you guys need to stop
and ponder your architecture and goals before charging along too far. For
projects to survive long term and be useful, separating concerns is sometimes
necessary, and I would suggest that's perhaps not being done well at present,
with some one-size-fits-all assumptions that run pretty contrary to the Unix
philosophy (_do one thing and do it well_). What is the one thing? Is that
really a general need? In all cases? What does a user lose with this
abstraction? Rather than increasing scope, what would happen if you tried
lopping those bits off entirely?

