

A first look at docker.io  - gklein
http://www.alexhudson.com/2013/05/28/a-first-look-at-docker-io/

======
derefr
> docker is just slightly further along the spectrum than Xen: instead of
> using a special guest kernel, you use the host kernel. Instead of
> paravirtualisation ops, you use a combination of cgroups and lxc containers.
> Without the direct virtualisation of hardware devices, you don’t need the
> various special drivers to get performance, but there are also fewer
> security guarantees.

No. People seem to be really confused about this. Docker is a _container
standard_, not a virtualization system. The thing you can download on the
Docker website, besides being a toolchain for creating containers, is a
_reference target_ for containers to deploy to, which just _happens_ to
(currently!) use cgroups and lxc and aufs overlays.

The point of Docker is to create one thing (basically a container file format
+ some metadata stored in a "container registry") that you _can_ deploy to all
sorts of different places, without changing what's inside it. In fact, one of
the goals of Docker is precisely that the reference target (cgroups+lxc+aufs)
could be entirely swapped out for something else in the next version of the
Docker toolchain, and none of your containers would have to change.

Docker containers _will_ be deployable as Vagrant boxes, Xen VMs, AWS
instances, whatever. Targets that have an overlay filesystem will use
overlays; targets that don't will build a flattened filesystem image from the
container stack. Targets that have paravirtualization will rely on their host
kernel; targets that don't will rely on the kernel in the container's base
image. And so forth. It isn't possible _yet_, but that's the entire goal
here.
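
To make that concrete, here's a rough sketch of the dual-target idea (the exact CLI may differ between early versions, so treat the commands as illustrative):

    # build a layered image once, for overlay-capable targets
    docker build -t myapp .
    # for a target without an overlay filesystem, flatten the
    # container stack into a plain tarball: docker export dumps
    # a container's whole filesystem as a single tar stream
    CID=$(docker run -d myapp /bin/true)
    docker export $CID > myapp-flat.tar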

~~~
ealexhudson
(Disclaimer; I wrote the blog post a little while ago, and my thoughts on this
have changed a little bit - surprised to see it pop up here...)

I think the confusion here is two-fold; sure, I think many don't really grasp
what the point of the project is in its entirety, but I also think the aim of
the project is a bit confused.

The comment I made that you're referring to was specifically about the
runtime: that's the bit of the paragraph at the start you chopped out. Yes,
the runtime could change in the future, but the comment was about the existing
runtime.

The 'container standard' thing I think is potentially interesting, but
actually, I don't think it buys much. As a set of tools, it is substantially
weaker than the existing system development/spinning tools. And sure, getting
rid of overlays might be possible - but then, what's the point? If you're
going to flatten out trees, you may as well build the image properly in the
first place.

~~~
derefr
The point of the layered containers has nothing to do with how they're run; it
has to do with how they're developed. By splitting the OS from the runtime
from the application, each one can be updated when the layer above it needs it
to be, and is otherwise fixed to a binary-identical image. Then, new releases
of higher-level things (apps) can target the _old_ versions of lower-level
things (runtimes, OSes), knowing they will be _literally_, bit-for-bit, the
same thing their other releases are using.

This is the guarantee Heroku makes, for example: updates to their servers will
never break your app, because although the packages making up their
_infrastructure_ might update, the packages making up _your container's base
image_ are frozen in time unless _you_ switch out your container's "stack"
(base image) for a new one.

Having a frozen base OS image, and then a frozen runtime on top of that,
allows for perfect _repeatability_ in your deploy process. Once you've got a
tested-and-working runtime image, that references a tested-and-working OS
image, you just stop touching them altogether; you keep the container-ID of
that runtime image fixed in production, and deploy your new app releases on
top of it.

One neat side-effect of this: if containers have parent-images in common,
those parent-images can be _cached_ at the target. If all the containers
running on some server use the same common base-image, that base image only
needs to be downloaded to the server once. The second-through-Nth time, the
container only grabs what's different--the tiny "app" part of the image--and
then composes it with everything into a running container.
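
Concretely, the layering might look like two Dockerfiles, each pinned to the image below it (the names here are made up for illustration):

    # Dockerfile for the runtime image, on a frozen OS base
    FROM ubuntu:12.04
    RUN apt-get update && apt-get install -y python2.7

    # Dockerfile for the app image, on the frozen runtime
    # image above (built and tagged as mycompany/runtime)
    FROM mycompany/runtime
    ADD . /app
    CMD ["python2.7", "/app/server.py"]

Rebuilding the app layer leaves the two images below it byte-identical, which is exactly what makes the caching at the target work.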

~~~
ealexhudson
I disagree entirely, tbh, the only advantage of layering I see is precisely in
the runtime: it makes the layers that change most very slim and easily
deployable.

Repeatability is great, but who doesn't have that already? Are people really
building new OS images for every deploy? I don't believe that for a second,
and I can't think of many tools off the top of my head that don't have that
baked in right from the start. That's the whole point of package management.

~~~
derefr
Package management is entirely _non_-repeatable in almost every incarnation.
DEB, RPM, even Slackware tarballs, all run scripts on install. There are
shunts of existing files and directories, interactive prompts, EULAs to agree
to (try installing mscorefonts on Ubuntu), packages that immediately start
services that generate files on first run that the service expects to be there
from then on, packages that expect other packages to _not_ have been installed
(samba3 doesn't like samba4 much)--and so forth. Installing a package _can
fail_, or produce widely-varying results from machine to machine.
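
(The mscorefonts case illustrates it well: the install halts on an interactive debconf EULA prompt unless you pre-seed the answer first, something like this on Ubuntu:)

    # accept the EULA ahead of time so apt-get doesn't block
    echo ttf-mscorefonts-installer msttcorefonts/accepted-mscorefonts-eula select true \
        | sudo debconf-set-selections
    sudo apt-get install -y ttf-mscorefonts-installer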

"Configuration management", ala Chef/Puppet, is no better--it tries to
manipulate a system from an _unknown_ state into a _known_ state, without
having an awareness of the full set of interactions that the _unknown_ state
can have on the _known_ state. (For example, deploy Apache with default setup
via Chef, on a server that already had Nginx installed manually. What do you
get? Who knows!)

You'd think that, say, running the OS install media from scratch on each
"container-up", and then running a script to install a preset list of packages
with hardcoded versions, might be enough--but nope, OS install media is
absolutely non-deterministic, just like everything else. The installer could
decide from the container's IP address that it should talk to a different
package server (oh hey I'm in Canada let's use ca.archive.ubuntu.com!) and
then find itself unable to get past the deploy-infrastructure's firewall-
whitelist.

In short--anything that relies upon, or can make a decision based upon,
information that could be different from container to container (like the
container's IP address, for example) _isn't guaranteed_ to produce binary-
identical results at the target. You _only_ get that by running through
whatever imperative process spits out all these files _once_--and then
freezing the results into a _declarative_ container.

So, what does work? Xen snapshots. Amazon AMIs. Vagrant images. All of these
are declarative. And all of these _are target formats_ for Docker. Docker is a
vendor-neutral thing which you will _turn into these_ , along with turning it
into its current LXC+aufs form.

And note, by design, the running images of the "final product" will be leaf-
nodes; you won't touch them or modify them or SSH into them, you won't base
new containers on them; you'll just spin them up and then down again, like on-
demand EC2 instances. Docker is not _for_ doing fancy things with the running
final products in their target format. Docker, by itself, is _not for
production at all!_ Once you've deployed a Docker container as a Xen snapshot
or an AMI or whatever, it's done; some other infrastructure takes care of the
_running the target-format containers_ part.

So what's all that junk in the Docker toolchain? Docker, as a standard, is an
_intermediary format_ that makes it easy for _developers_ to build these
vendor-neutral images. The reason you can start containers, stop them, freeze
them, and then fork new containers from them, is entirely to do with
_developing new container images_ , and not at all to do with deploying
containers in production. It's about re-using common parent images, by
reference, as easily as we currently stick a Github repo in our Gemfiles.
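
That development loop is, roughly (exact commands vary by version; the names are hypothetical):

    # start an interactive container from a parent image
    docker run -i -t ubuntu /bin/bash
    # ...install your runtime inside the shell, then exit...
    # freeze the result as a new image that other containers
    # can reference by name ($CONTAINER_ID from `docker ps -a`)
    docker commit $CONTAINER_ID mycompany/runtime
    # fork a new container from the frozen image
    docker run -i -t mycompany/runtime /bin/bash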

~~~
emilisto
> Docker is not for doing fancy things with the running final products in
> their target format. Docker, by itself, is not for production at all! Once
> you've deployed a Docker container as a Xen snapshot or an AMI or whatever,
> it's done; some other infrastructure takes care of the running the target-
> format containers part.

This is something very interesting and under-communicated. I've always assumed
docker is run on the production server and is used to pull updated images and
spawn containers. But you're suggesting one uses a custom toolchain to make
an image out of the container filesystem and the LXC template, and then
deploys this container?

~~~
shykes
Both are possible. And since docker itself is not yet production-ready,
exporting docker containers to "inert" server images (for example AMIs) is a
good stopgap which allows you to use docker for dev and test.

But that is not the ideal workflow. The ideal workflow is to run docker on all
your machines, from development to production, and move containers across
them. If you don't do that, you will miss out on a big part of the value of
docker. To name a few: identical images in dev and prod; lightweight
deployment; and a toolchain that is less dependent on a particular
infrastructure.
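
In (illustrative) command terms, that ideal flow would be something like:

    # build once on the dev box, push to a registry
    docker build -t mycompany/myapp .
    docker push mycompany/myapp
    # on every other machine, dev through prod: pull the
    # identical image and run it
    docker pull mycompany/myapp
    docker run -d mycompany/myapp /app/run.sh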

------
pilif
Ever since I first read about docker I wondered: How are you dealing with
application data and updates of containers? Let's say you package everything
you need for your web application together as one container: Web Server,
Database-Server and Application Server: Where would the data go that the users
of that web application generate?

In the container? That would mean that the data goes away as the container
gets updated (if you just replace the old container with a new one). Or do you
just never replace the whole container but just update what's inside? Or would
you mount a device of the host? Does the container get access to device nodes
of the parent? I would assume not. Or can you provide a container with
something it can mount? Or are you stuck with some network based storage
solution? That would rule out running databases in containers once the load
rises.

Yes. I could just read up on all of this, but I have a feeling that other
people have the same question, so by asking here, I might help them too.

~~~
Joeboy
I think this issue somewhat compromises Dockerfiles as a means of distributing
applications. With a VM, you can supply something in a minimally configured,
plug-and-play state, and then leave it to the user to set it up. With Docker,
it's a little bit too painful to configure things after running an image. I'm
not sure what the answer to this is.

~~~
vidarh
Which is it: Minimally configured, or plug-and-play?

Either you'll be setting up your containers to prepare them for something
like Chef or Puppet to run against them, or you'll be setting up your
containers to be fully configured apart from connection details to other
components.

That last bit is down to service orchestration, which is outside the scope of
Docker as far as I understand it. You can "roll your own" easily enough: Write
a "/usr/local/bin/add-relation" script that takes a function and a list of
arguments to configure that function.

Write a script that reads a config file that defines the relationships between
container types (e.g. your MySQL container and your Web container), and
triggers execution of those add-relation scripts. E.g. your Web container gets
called with "add-relation Mysql --ip [foo] --username [bar] --password [baz]"
and the version of the script in that container knows how to add users to
MySQL.

That is _very roughly_ the approach that Juju (Ubuntu's orchestration system)
takes.
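
A minimal sketch of such a script, as it might live in the Web container (the paths, config format, and argument values are invented for illustration):

    #!/bin/sh
    # /usr/local/bin/add-relation -- invoked by the orchestrator as:
    #   add-relation Mysql --ip 10.0.0.5 --username app --password s3cret
    relation="$1"; shift
    case "$relation" in
      Mysql)
        # parse the connection details handed in by the orchestrator
        while [ $# -gt 0 ]; do
          case "$1" in
            --ip)       DB_HOST="$2"; shift 2 ;;
            --username) DB_USER="$2"; shift 2 ;;
            --password) DB_PASS="$2"; shift 2 ;;
            *)          shift ;;
          esac
        done
        # write the app's database config from those values
        printf '[mysql]\nhost=%s\nuser=%s\npassword=%s\n' \
            "$DB_HOST" "$DB_USER" "$DB_PASS" > /app/config/db.ini
        ;;
      *)
        echo "unknown relation: $relation" >&2; exit 1 ;;
    esac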

The point is to split all configuration details into three classes: those
that can be set statically at build time; those that can be decided on first
boot (e.g. re-generating an ssh host key, getting an IP address, etc.); and
finally those that need to be set dynamically based on what other containers
are spun up.

The latter have no place in the Dockerfiles or images, but scripts or tools
to set these config details based on input from some external source do.

It may sound painful, but only if you do it for a handful of VMs that rarely
change. The moment you need to manage dozens, or need to spin VMs up and down
regularly, taking the effort to set up proper orchestration, whichever method
you prefer, wins hands down over having a user set it up (I can guarantee
with near 100% certainty that said user will not set up different VMs
consistently).

------
susi22
I'm quite excited to see the open source community putting so much work into
Linux containers. For the most part they're a lot better than true
virtualization. We've seen in the past what you guys can do with Puppet as a
configuration system.

I just wish lxc was as secure as Solaris zones. Since containers are not
secure at all, they definitely won't be used for shared hosting. The team
seems to be working on it, but it will probably take a few years to get it
secure enough:

[https://wiki.ubuntu.com/LxcSecurity](https://wiki.ubuntu.com/LxcSecurity)

~~~
yebyen
> Since containers are not secure at all

I feel like you're just spreading FUD here. This was definitely true a few
years ago, but "Citation Needed" applies. The worst thing I found in the
article you linked is that guests use the same kernel as the host, so if the
host kernel is vulnerable, it will still be vulnerable from the guest...

Then it goes on to say "we have seccomp2 to lower the dimensions of attack
surface." It does not sound to me like your citation agrees with what you said
at all.

I just hope you've read the top-rated comment, where it's explained that
containers are not virtualization, and they solve different problems.

~~~
susi22
IF you have root on the container, you have root on the host. This is a HUGE
difference and probably one of the main reasons LXC isn't in huge use for VPS.

This is, btw, different than Solaris Zones, which give you a complete new user
management for each container. They're very isolated. Zones have had some
exploits to get out of the Zone, but they're pretty secure. LXC has started
moving towards a more secure design but it will take years (IMO) to get LXC
actually in production for _shared_ hosting.

See:

[https://www.suse.com/documentation/sles11/singlehtml/lxc_qui...](https://www.suse.com/documentation/sles11/singlehtml/lxc_quickstart/lxc_quickstart.html)

_Security depends on the host system. LXC is not secure. If you need a secure
system, use KVM._

[http://www.funtoo.org/Linux_Containers](http://www.funtoo.org/Linux_Containers)

_As of Linux kernel 3.1.5, LXC is usable for isolating your own private
workloads from one another. It is not yet ready to isolate potentially
malicious users from one another or the host system. For a more mature
containers solution that is appropriate for hosting environments, see OpenVZ._

[http://lwn.net/Articles/515034/](http://lwn.net/Articles/515034/)

 _" Containers are not for security", he said, because root inside the
container can always escape, so the container gets wrapped in SELinux to
restrict it_ ... _A number of steps have been taken to try to prevent root
from breaking out of the container, but there is more to be done. Both mount
and mknod will fail inside the container for example. These containers are not
as secure as full virtualization, Walsh said, but they are much easier to
manage than handling the multiple full operating systems that virtualization
requires. For many use cases, secure containers may be the right fit._

~~~
jpetazzo
Sorry, but this is FUD:

> IF you have root on the container, you have root on the host.

This is only true on badly configured systems. If you run some kind of public
shared hosting (like Heroku, dotCloud, etc.) you probably slap some extra
security on top of it. For instance, dotCloud uses GRSEC, limits root access,
and uses kernel capabilities.
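
(For example, an LXC container config can strip kernel capabilities from the
container's root, so uid 0 inside the container loses them; an illustrative
subset:)

    # in the container's lxc config file
    lxc.cap.drop = sys_module sys_rawio sys_time mknod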

It won't take years to get LXC in production for shared hosting: it has been
in production for shared hosting for years -- but by people who (more or less)
knew what they were doing.

Agreed, "out-of-the-box LXC" is probably not that secure; which is probably
why many people won't deploy it. And I can't blame them. Any technology
generally starts being usable (or usable safely) only for expert users, then
progressively gets more industrialized and ready to use for a broader
audience. It doesn't mean that the technology is not mature.

Also, the user separation that you mention has been implemented in the Linux
kernel for a while[1]; it's called "user namespace", and even if the default
LXC userland tools do not make use of it at this point, it's here.

[1]
[https://wiki.ubuntu.com/UserNamespace](https://wiki.ubuntu.com/UserNamespace)

~~~
susi22
I do know those things. It doesn't change the fact that you can _easily_ break
out of a Container and compromise neighboring Containers, no matter how much
hardening you implement.

Who is using LXC in a _shared_ hosting environment?

------
jokull
What makes me excited about Docker’s diff-based filesystem stack is that it
potentially allows a build of code to go through a CI chain untouched, all the
way into production really quickly. I'm not a virtual machine expert, but I
believe this is slower with traditional virtualization tools like Vagrant,
for example.
Am I far off? At work we’re using layered AMI build tools to roll new services
and app builds into production, and the process is not quick.

~~~
shykes
You hit the nail on the head. That is a major promise of Docker, and the
reason we are emphasizing Dockerfiles so much. It makes the act of building
your code 100% automated and discoverable, and the resulting build should be
usable, unchanged, all the way from development to production.

------
magickevin
> given the claims on the website I assumed there was something slightly more
> clever going on, but the only “special sauce” is the use of aufs to layer
> one file system upon another.

There's also the network stuff? From docker's website, a simple "docker run
ubuntu /bin/echo hello world" does the following:

    It downloaded the base image from the docker index
    It created a new LXC container
    It allocated a filesystem for it
    Mounted a read-write layer
    Allocated a network interface
    Set up an IP for it, with network address translation
    And then executed a process in there
    Captured its output and printed it to you

------
megaman821
What is the recommended workflow for using docker?

Say I have:

* a base Ubuntu container with a few utilities installed like htop and my user account
* a base Python web container with Nginx and libxml, libjpeg, etc.
* a specific Python web app

If I update my app should I save the container state and push it to all my web
servers or use something like fabric to update the app on all my app servers?

If Nginx has a security release and I update my base Python web container, do
I now need to rebuild all my Python web containers or is there some way to
merge?

When the next LTS release of Ubuntu comes out and I upgrade, do I have to
manually apply that to every container or just the base?

------
Keyframe
Serious question. I've been looking at docker today and I can't seem to grasp
why I would use docker for dev and deployment over something like ansible,
also both for dev (on a VM) and deployment. I must be missing something. I now
understand somewhat that vagrant alleviates the manual hurdle of starting and
running VMs, but not enough for me to use it.. yet I don't see where and how
docker fits in. Would I be wrong if I thought of it more as a
virtualenv(wrapper), but not only for Python?

~~~
borplk
Read this comment:
[https://news.ycombinator.com/item?id=6178526](https://news.ycombinator.com/item?id=6178526)

------
nickstinemates
Thanks for the article. While I disagree with your conclusion, getting a clear
view of the misconceptions that exist allows for more pointed documentation
and focus.

------
ferus
[http://www.readability.com/read?url=http%3A//www.alexhudson....](http://www.readability.com/read?url=http%3A//www.alexhudson.com/2013/05/28/a-first-look-at-docker-io/)

Just a better way to read the article.

~~~
ealexhudson
I'll update my theme, thanks :-P

~~~
ferus
Anyway, is there any option to grant external FTP access to docker instances?
I was wondering if I could move my multiplayer game instances to docker and
still allow my users to connect via FTP.

