Docker - the Linux container runtime

nicolast · on March 20, 2013

I recently changed our Jenkins CI infrastructure to use something like this: all jobs run in their own LXC container, using BTRFS and its snapshots/clones instead of AUFS. Works like a charm!

The script is available at https://github.com/Incubaid/jenkins-lxc (no docs for now, and several improvements are possible). It expects CI job scripts to be contained in the job repository, e.g. https://github.com/Incubaid/arakoon/tree/1.6/jenkins
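
A minimal Python sketch of the per-job snapshot idea, for readers who have not used BTRFS this way. It is not taken from the jenkins-lxc script: the `job_commands` helper, the paths and the job naming are all assumptions made for illustration. Only the `btrfs subvolume snapshot` and `btrfs subvolume delete` command forms are standard.

```python
import shlex


def job_commands(base_subvolume, job_id, workdir="/var/lib/ci-jobs"):
    """Return (setup, teardown) argv lists for one CI job.

    base_subvolume, workdir and the "job-<id>" naming are hypothetical.
    Nothing is executed here; the caller decides how to run the commands.
    """
    snapshot = f"{workdir}/job-{job_id}"
    # A snapshot is a cheap copy-on-write clone of the base root filesystem.
    setup = ["btrfs", "subvolume", "snapshot", base_subvolume, snapshot]
    # Deleting the snapshot discards everything the job changed.
    teardown = ["btrfs", "subvolume", "delete", snapshot]
    return setup, teardown


setup, teardown = job_commands("/var/lib/lxc/base/rootfs", 42)
print(shlex.join(setup))
print(shlex.join(teardown))
```

Because each job works on its own snapshot, the base image is never modified and cleanup is a single delete.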

niall_ohiggins · on March 20, 2013

That's pretty awesome.

At FrozenRidge.co, we work on our own CI server called Strider and have actually integrated Docker directly:

You can read about it at: http://blog.frozenridge.co/next-generation-continuous-integr...

jordanthoms · on March 21, 2013

This is great, I've been wanting to do the same thing for our jenkins setup.

shykes · on March 20, 2013

Wow! Did not expect this to show up on HN before actual release! (I work at dotCloud).

We're still polishing a few rough edges. If you want early access add your github ID to this thread and we'll add you right away!

georgebashi · on March 20, 2013

I've also been building something similar to docker by plugging into Puppet, Jenkins and ActiveMQ. Would love to take a look at Docker. I'm http://github.com/georgebashi

shykes · on March 20, 2013

Also: there's an irc channel: #docker on freenode. Come hang out!

laumars · on March 20, 2013

After seeing all the GitHub IDs posted, it reminds me how much I'd like to see a PM feature added to HN.

vtwoods · on March 20, 2013

https://github.com/VTWoods

zimbatm · on March 20, 2013

github/twitter: zimbatm

I've built a similar tool using Go. Wondering how you guys get around the lack of clone(2) in the stdlib :)

yebyen · on March 20, 2013

I want to hear more about your go version of Docker. Link?

You have a lot of projects in your github repository and I was not able to identify it from scanning the list.

zimbatm · on March 21, 2013

It's not open source (yet?) but it looks more like systemd's nspawn. I just had a quick look at Docker and our solution is much less complicated (and also less powerful).

Send me an email if you want to discuss: jonas@pandastream.com

davidwahler · on March 20, 2013

I'd love to have access, thanks. My github username is dwahler.

thoward37 · on March 20, 2013

github: thoward

How does this compare with warden? https://github.com/thoward/vagrant-warden

thoward37 · on March 20, 2013

BTW: If you're doing beta access, I could use this immediately for http://riakon.com .. Currently using something I hacked together w/ warden, but Docker looks like a more elegant solution (and I'd rather be part of a community than using my own one-off hack).

daxelrod · on March 21, 2013

I'm daxelrod on GitHub, and I would be stoked to get an invite. I've been in the early design stages of something similar to this. Thank you!

lsaferite · on March 21, 2013

Looking forward to testing this out. https://github.com/LeeSaferite

efnx · on March 20, 2013

Yes please and thank you! Github: schell

da_n · on March 21, 2013

Looks very interesting, look forward to testing

https://github.com/da-n

wereHamster · on March 20, 2013

wereHamster (don't be scared, I won't bite and infect you)

GuiA · on March 20, 2013

There was a lightning talk about Docker at Pycon; I'd assume that's where OP got the info from :)

ARothfusz · on March 20, 2013

I've posted a video of the lightning talk here: https://plus.google.com/photos/115695491015706412558/albums/...

It should be public so you can view it even without a G+ account (I think)

shykes · on March 20, 2013

That's possible. When we submitted the lightning talk, we pictured a poorly lit backroom with maybe 15 fellow kernel geeks. Instead, we presented in the main room to at least 500 people... So much for discreetly asking for feedback :)

pbogdan · on March 20, 2013

Would love to have a look - @pbogdan

hchinchilla · on March 20, 2013

http://github.com/hugochinchilla

bellbind · on March 21, 2013

We've been working on a minimal LXC manager for a week now, but docker seems to be exactly what we need. Can't wait to check this out. https://github.com/cpra-lcoffe

janneand · on March 20, 2013

Please add me, github.com/janne

stormbrew · on March 20, 2013

github id: stormbrew

I've been wondering when something like this would come around and if I'd have to try to write it myself. I've made smaller, less isolated, scale versions of this idea before but this looks snazzy.

noplay · on March 20, 2013

Github: noplay

Thanks, that sounds interesting

reeze_xia · on March 21, 2013

Wow, really looking forward to such a solution. github.com/reeze please :)

BlackLagoon · on March 21, 2013

I am working with cpra-lcoffe, and looking forward to testing this out. https://github.com/cpr-mbelarbi

dedene · on March 20, 2013

Awesome! Could you add me too? Github id 'dedene'. Thanks!!

natejenkins · on March 21, 2013

At first glance I thought, 'I already have vagrant'. After watching the lightning talk my only thought is, 'Want!'

Looks awesome and crazy fast. Great work.

github id: natejenkins

jwhitlark · on March 20, 2013

Github: jwhitlark

Looks really useful.

fsniper · on March 20, 2013

github/fsniper Docker seems to be making old technologies reappear with new implementations.

Is this linux-vserver or openvz reimplemented with lxc and cgroups?

bbrunner · on March 20, 2013

this looks awesome. would love to check it out.

github: brianbrunner

JeanSebTr · on March 20, 2013

Absolutely brilliant. GH: @JeanSebTr thanks! :)

philjr · on March 20, 2013

github: pjr

looked at the pycon demo ... looks awesome!

js4all · on March 20, 2013

Sounds promising. Github: dvbportal

js4all · on March 21, 2013

Thanks for the early access. From what I've seen so far, Docker is topnotch.

shykes · on March 21, 2013

Thanks! Glad you like it. We have a few cool features coming up... Can't wait to show them.

luisbebop · on March 20, 2013

amazing! congratz! github: luisbebop

themckman · on March 20, 2013

Would REALLY love to take a look at this.

Github: 198d

Tobu · on March 20, 2013

Oooh! Tobu. edit: building, thanks!

jnthn · on March 20, 2013

GH: joonathan

madisp · on March 20, 2013

been waiting for something like this,

github: madisp

themgt · on March 20, 2013

Very interesting, guys! GH: themgt

buster · on March 20, 2013

Yay, here too! Github: buster

rafiss · on March 21, 2013

github: rafiss

Thanks! This might help a lot for an idea that my friends and I are working on.

jaryd · on March 21, 2013

Cool! My github id is 'malbin' -- would love to take it for a test drive.

jzawodn · on March 20, 2013

github: jzawodn

kurt_ · on March 20, 2013

github: garnieretienne

wkharold · on March 21, 2013

Looks awesome. I love containers/lxc. github id: wkharold

chrisfarms · on March 20, 2013

great! ... chrisfarms

jackinloadup · on March 20, 2013

github: jackinloadup

kolektiv · on March 20, 2013

kolektiv on GH too.

tomjohnson3 · on March 20, 2013

github: tomjohnson3

hijinks · on March 21, 2013

I'd love early access

github: mzupan

sudorandom · on March 20, 2013

github: sudorandom

jmsduran · on March 21, 2013

Hey hope I'm not late, my github ID is: jmsduran

wiredfool · on March 20, 2013

github: wiredfool

warf · on March 26, 2013

github: warf

Built a similar system in-house at my workplace. Would definitely be interested in migrating/contributing to a wider effort!

cespare · on March 21, 2013

I'd love to check it out. Github id: cespare

zrail · on March 21, 2013

github.com/peterkeen please and thank you :)

robertfw · on March 20, 2013

github: robertfw

radimm · on March 20, 2013

github.com/radim

trotsky · on March 20, 2013

thanks - trotsky

jruhsmith · on March 20, 2013

github: jrsmith

alexchamberlain · on March 20, 2013

alexchamberlain

yebyen · on March 20, 2013

github: yebyen

contrahax · on March 20, 2013

github: Contra

avidal · on March 20, 2013

github: avidal

fgrehm · on March 20, 2013

github: fgrehm

scjr · on March 21, 2013

hey - https://github.com/scjr

leourbina · on March 27, 2013

This is awesome. GH: leourbina

nwg · on March 20, 2013

github: nwg

mrbill · on March 20, 2013

github: billbradford (thanks)

mrud · on March 20, 2013

ottbot · on March 20, 2013

ottbot

thanks!

nnutter · on March 20, 2013

nnutter

endlessvoid94 · on March 20, 2013

dpaola2

deepakprakash · on March 21, 2013

Github: deepakprakash

visualphoenix · on March 21, 2013

github: visualphoenix

notdonspaulding · on March 21, 2013

github: donspaulding

seletz · on March 20, 2013

wow. github: seletz

gonzo · on March 20, 2013

github/gonzopancho

gfunk911 · on March 20, 2013

github: mharris717

tagx · on March 20, 2013

github/tageorgiou

SingAlong · on March 20, 2013

github: HashNuke

tdmackey · on March 20, 2013

github: tdmackey

pnathan · on March 21, 2013

github: pnathan

Bjoern · on March 20, 2013

github: rennhak

baran1 · on March 20, 2013

github: baransn

cookrn · on March 21, 2013

github: cookrn

silasb · on March 20, 2013

github: silasb

ayosec · on March 20, 2013

github: ayosec

kami8845 · on March 20, 2013

github: doda

ptio · on March 20, 2013

github: pau

soldier · on March 21, 2013

huski

andykram · on March 21, 2013

andykram

Thanks!

secretagent · on March 20, 2013

tsabat

swdunlop · on March 20, 2013

swdunlop

178 · on March 21, 2013

oh yeah :) Github: eins78

phaedrus · on March 22, 2013

github: dennisferron

danellis · on March 21, 2013

danellis on GitHub.

xetorthio · on March 21, 2013

github/twitter: xetorthio

thanks!

naelyn · on March 23, 2013

github: naelyn

pepijndevos · on March 24, 2013

pepijndevos

stevvooe · on March 26, 2013

stevvooe

AaronFriel · on March 23, 2013

aaronfriel

asadjb · on March 20, 2013

I'm not familiar with any of the technologies used in this. Anybody care to comment on how strong the isolations would be security wise, compared to normal virtualization?

If the security is almost at par and the isolation is good enough that one bad process can't bring the whole system down, might this be a good alternative to virtualization? I imagine it would definitely use fewer resources.

trotsky · on March 20, 2013

Container based virtualization can provide an impressive amount of isolation while improving density dramatically on light duty loads over virtualization. Solaris zones are very well regarded and are used for multi-tenant by Joyent, and many many linux hosts provide multi-tenant solutions based on virtuozzo which predates linux containers by a good number of years.

The main theoretical difference between hypervisor isolation and container isolation is one sits above the kernel, so a kernel level exploit only applies to a single virtual machine. With containers you're relying on the kernel to provide the isolation so you are still subject to (some) kernel level exploits.

Practically, linux containers (the mainline implementation) have only provided full isolation in recent patches and probably shouldn't be considered fully shaken out for something like full in-the-wild root-level multi-tenant access.

They are super for application isolation for delivery of multiple single tenant workloads on one machine though - something people use hypervisors for quite a bit. The resources used can be a small fraction of what you're committing to with a hypervisor.

bcantrill · on March 20, 2013

As trotsky mentions, we at Joyent are fervent believers in OS-based virtualization -- to the point that in SmartOS, we run hardware virtualization within an OS container. There are many reasons to favor OS-based virtualization over hardware-based virtualization, but first among these (in my opinion) is DRAM utilization: with OS-based virtualization, all unused DRAM is available to the system at large, and in the SmartOS case is used as adaptive replacement cache (ARC) that benefits all tenants. Given that few tenants consume every byte of their allocated DRAM, this alone leads to huge efficiencies from both the perspective of the cloud operator and the cloud user -- a higher-performing, higher-margin service. By contrast, for hardware-based virtualization, unused DRAM remains with the guest and is simply wasted (kludges like kernel samepage mapping and memory ballooning notwithstanding).

DRAM isn't the only win, of course: for every other resource in the system (CPU, network, disk), OS-based virtualization offers tremendous (and insurmountable) efficiency advantages over hardware-based virtualization -- and it's great to see others make the same realization!

For more details on the relative performance of OS-based virtualization, hardware-based virtualization and para-virtualization, see my colleague Brendan Gregg's excellent blog post on the subject[1].

[1] http://dtrace.org/blogs/brendan/2013/01/11/virtualization-pe...

zobzu · on March 20, 2013

Solaris zones use similar concepts to LXC/namespaces, but are actually providing secure isolation.

Recent patches DO NOT provide "full isolation" and never did. What they add is usermode containers. Those are broken weekly since the release. Seriously. Have a look at http://blog.gmane.org/gmane.comp.security.oss.general

price · on March 20, 2013

> Those are broken weekly since the release. Seriously. Have a look at http://blog.gmane.org/gmane.comp.security.oss.general

Funny you should say that. The latest virtualization-related CVEs there are actually in KVM -- a trio including two host memory corruptions, which usually enables completely owning the host. http://permalink.gmane.org/gmane.comp.security.oss.general/9...

And on the other hand, I don't see any container-related CVEs at all from 2013 in the CVE database: http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel (The KVM issues I mentioned don't show up yet either, because they're from today.) What vulnerabilities are you referring to?

Maybe you mean kernel vulnerabilities in general, some of which could be usable by a user inside a container. Everyone should stay on top of kernel updates in any event. If you hate the rebooting, Ksplice is free for Ubuntu (and Fedora.)

teraflop · on March 20, 2013

The Linux namespace stuff is evolving pretty fast, and I personally wouldn't trust it as the main line of defense for anything important.

With virtualization, a buggy or malicious guest is still limited to its sandbox unless there's a flaw in the hypervisor itself. With containers/namespaces, the host and guest are just different sets of processes that see different "views" of the same kernel, so bugs are much more likely to be exploitable. Plus, if you enable user namespaces, some code paths (like on-demand filesystem module loading) that used to require root are now available to unprivileged users.

There's already been at least one local root exploit that almost made it into 3.9: https://lkml.org/lkml/2013/3/13/361

derefr · on March 20, 2013

> The Linux namespace stuff is evolving pretty fast, and I personally wouldn't trust it as the main line of defense for anything important.

If I recall, Heroku uses cgroups (EDIT: and namespaces) exclusively for multitenant isolation (and by the looks of this, dotCloud does too), so that's two big votes in the "if it's good enough for them" category.

teraflop · on March 20, 2013

Sure, but cgroups and namespaces are kind-of-orthogonal features that both happen to be useful for making container-like things. cgroups are for limiting resource usage; namespaces are for providing the illusion of root access while actually being in a sandboxed environment.

And as far as I'm aware (speaking as an interested non-expert, so please correct me if I'm wrong) cgroups have no effect on permissions, whereas UID namespaces required a lot of very invasive changes to the kernel.

jpetazzo · on March 20, 2013

That's correct: cgroups have no effect on permissions. They only enforce resource usage limits.

Shameless plug: I work at dotCloud, and I wrote 4 blog posts explaining namespaces, cgroups, AUFS, GRSEC, and how they are relevant to "lightweight virtualization" and the particular case of PAAS. The articles have been grouped in a PDF that you can get here if you want a good technical read for your next plane/train/whatever travel ;-) http://blog.dotcloud.com/paas-under-the-hood-ebook

menage · on March 20, 2013

Fundamentally, the cgroups framework is just a way of creating some arbitrary kernel state and associating a set of processes with that state. For most cgroup subsystems, the kernel state is something to do with resource usage, but it can be used for anything that the cgroup subsystem creator wants. At least one subsystem (the devices cgroup) provides security (by controlling which device ids processes in that cgroup can access) rather than resource usage limiting.
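
To make the "set of processes associated with some kernel state" idea concrete, here is a small sketch that parses the text of `/proc/<pid>/cgroup`, where each line has the form `hierarchy-id:controller-list:path`. The sample text and the `parse_proc_cgroup` name are assumptions for illustration; on a real host you would read the file instead.

```python
def parse_proc_cgroup(text):
    """Map each cgroup controller to the group path a process belongs to.

    Expects the contents of /proc/<pid>/cgroup, one
    "hierarchy-id:controller-list:path" entry per line.
    """
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # The path may itself contain ':' so split at most twice.
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            membership[controller] = path
    return membership


# Hypothetical sample for a process running inside a container named "web1".
sample = (
    "4:memory:/lxc/web1\n"
    "3:cpu,cpuacct:/lxc/web1\n"
    "2:devices:/lxc/web1\n"
)
for controller, path in sorted(parse_proc_cgroup(sample).items()):
    print(controller, path)
```

Each controller (memory, cpu, devices, and so on) is an independent subsystem attached to the same grouping mechanism, which matches the description above.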

lukeschlather · on March 21, 2013

Personally I think the biggest value these days with para-virtualization like this is in development. I can be running twenty or so different applications on the same physical machine, and for the most part (as long as they're idle since I'm only working with one) I don't even notice that they're running.

shykes · on March 20, 2013

Yes, you probably don't want to run untrusted code with root privileges inside a container if anything valuable is running on the same host.

However if that code is trusted, or if you're running it as an unprivileged user, or if nothing else of importance is sharing the same host, then I would not hesitate to use them.

Containers are awesome because they represent a logical component in your software stack. You can also use them as a unit of hardware resource allocation, but you don't have to: you can map a container 1-to-1 to a physical box, for example. But the logical unit remains the same regardless of the underlying hardware, which is truly awesome.

zdw · on March 20, 2013

Barring kernel bugs, it should protect against the mentioned resource monopolization issues. Normal virtualization is pretty resource wasteful, especially if the guests are not hypervisor aware.

Getting away from huge per-VM block devices is a step in the right direction.

bradrydzewski · on March 20, 2013

This is still technically a virtualization technique, known as "operating system-level virtualization". http://en.wikipedia.org/wiki/Operating_system-level_virtuali...

Here are some of the technologies explained:

cgroups: Linux kernel feature that allows resource limiting and metering, as well as process isolation. The process isolation, also called namespaces, is important because it prevents a process from seeing or terminating other running processes.

lxc: this is a utility that glues together cgroups and chroots to provide virtualization. It helps you easily set up a guest OS by downloading your favorite distro and unpacking it (kind of like debootstrap). It can then "boot" the guest OS by starting its "init" process. The init process runs in its own namespace, inside a chroot. This is why they call LXC a chroot on steroids. It does everything that chroot does, with full process isolation and metering.

aufs: this is sometimes called a "stacked" file system. It allows you to mount one file system on top of another. Why is this important? Because if you are managing a large number of virtual machines, each one with 1GB+ OS, it uses a lot of disk space. Also, the slowest part of creating a new container is copying the distro (can take up to 30 seconds). Using something like AUFS gives you much better performance.

So what about security? Well, like every (relatively) new technology LXC has its issues. If you use Ubuntu 12.04 they provide a set of Apparmor scripts to mitigate known security risks (like disabling reboot or shutdown commands inside containers, and write access to the /sys filesystem).
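
The stacked file system idea described above can be sketched as a toy model in Python: reads fall through a writable top layer to shared read-only layers, and writes only ever touch the top layer. This is a conceptual illustration, not how AUFS is implemented; the `UnionFS` class and its dictionary-based "layers" are assumptions made for the example.

```python
class UnionFS:
    """Toy stacked filesystem: read-only lower layers plus one writable top layer."""

    _WHITEOUT = object()  # marks a path as deleted in the top layer

    def __init__(self, *lower_layers):
        self.lower = list(lower_layers)  # shared, never modified
        self.top = {}                    # private to this container

    def read(self, path):
        # Search the top layer first, then each lower layer in order.
        for layer in [self.top, *self.lower]:
            if path in layer:
                value = layer[path]
                if value is self._WHITEOUT:
                    break
                return value
        raise FileNotFoundError(path)

    def write(self, path, data):
        # Writes never reach the shared layers.
        self.top[path] = data

    def delete(self, path):
        # Hide the path without touching the lower layers.
        self.top[path] = self._WHITEOUT


base = {"/etc/issue": "Ubuntu 12.04", "/bin/sh": "<binary>"}
a, b = UnionFS(base), UnionFS(base)
a.write("/etc/issue", "customized")
print(a.read("/etc/issue"))  # sees its own change
print(b.read("/etc/issue"))  # still sees the shared base
```

Two containers share one copy of the distro files, which is where the disk-space and start-up savings come from: creating a new container only requires a new empty top layer.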

wslh · on March 20, 2013

I am familiar with Microsoft App-V, VMWare ThinApp, and Symantec Workspace Virtualization. They can help you as a security sandbox but not as a full protection. A virtual machine will be much more secure (and theoretically very strong), although there are security bugs there that enable you to escape it.

Those products work at two levels: using filtering drivers for registry and the filesystem, and hooking into the Windows operating system API.

laumars · on March 20, 2013

Virtual machines are not more secure. In fact there have been more documented attacks where root access on a guest VM has gained shell access on the host than there have been against containers.

This doesn't mean that containers are more secure than VMs either. Attacking VMs attracts more security researchers from what I've seen (but I may be wrong on that point). However, whether you're running a container or a virtual machine, you still need some shared processes (eg the 'ticks' of a system clock), and any sufficiently complicated code WILL have bugs that can potentially be exploited.

However the crux of the matter is regardless of whether you're running containers or full blown virtual machines, you cannot escape out of the sandbox without having elevated privileges on the guest to begin with. And if an attacker has that, then you've already lost - regardless of whether the attacker can or cannot escape the sandbox.

Lastly, I'm not sure if you're aware of this or not, but this is a Linux solution and has nothing to do with Windows (I only say this because your post seemed tailored towards Windows-hosted virtualisation)

wslh · on March 20, 2013

Are you saying that both approaches have the same level of security or probable insecurities? or that you can't currently estimate the difference?

Even being aware that this is a Linux solution I mentioned the Windows technologies that I know technically.

laumars · on March 20, 2013

> Are you saying that both approaches have the same level of security or probable insecurities? or that you can't currently estimate the difference?

A bit of both, but mostly the former. In practical terms, they both have the same level of security. But -as with any software- something could be published tomorrow exposing some massive flaw that totally blows one or the other out of the water. However, neither offers any technical advantage over the other from a security standpoint, and from a practical perspective the real question of security is whether your guest OSs are locked down to begin with (eg it's no good arguing which home security system is the most effective if you leave the front door open to begin with).

> Even being aware that this is a Linux solution I mentioned the Windows technologies that I know technically.

That's fair enough and I had suspected that was the case. I just wanted to make sure that we were both talking about the same thing :)

scarmig · on March 20, 2013

Back in January I got a new laptop, installed Arch on it, got it all nicely set up. And I decided it was high time to start playing with LXC because container virtualization seems extremely promising to me. Created an Ubuntu container, seemed to work fine, and then used lxc-destroy, which took some time.

It destroyed my entire file system. I have no clue how the hell it happened--it floors me that something like that would be possible--and I suspect it's probably simply the result of a newbie like myself somehow misusing userspace tools. But it was enough to turn me off of it for the time being.

wereHamster · on March 20, 2013

I'm interested to see how that standardization works across linux distributions. I mean, different distros use different versions of various libraries (openssl for example). So if your app links with libssl.so.X but the host only provides libssl.so.Y, then your app won't work.

Of course, you could bundle libssl with your app. But then the standardization is at the level of kernel/libc ABI. In which case the container is basically a full LXC guest.

But then why standardize an image format if you can create a small script which builds the image with lxc-create + installs whatever else necessary for your app. That script will be much smaller than the full image, even a barebones ubuntu lxc guest (debootstrap quantal) is ~400MB.

wmf · on March 20, 2013

Their base layer needs to specify exact versions of libraries.

Deploying from images should be much faster and less fragile (oops, is Github down?) than from scripts.

shykes · on March 20, 2013

Absolutely. Think of a Docker container as a universal build artifact. Your build step can be an arbitrary sequence of unix commands (download dependencies, build libssl, run shell scripts etc.). Docker can freeze the result of that build and guarantee that it will run in a repeatable and self-contained way, no matter where you run it.

So you get clean separation of build and run, which is a hugely important part of reliable deployment.

wereHamster · on March 20, 2013

So docker container ABI is basically the kernel? You just create a filesystem image and docker starts that as LXC guest. You could build a statically compiled binary and put that as /bin/init into the container image, right?

shykes · on March 20, 2013

Yes! Exactly :)

You wouldn't need to save it as /sbin/init. You would just type:

    $ docker run MYIMAGE /path/to/my/static/binary

Here's the smallest image I've personally used to run a docker container: http://get.docker.io/images/busybox
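
For scripting the same invocation, a minimal Python sketch is shown here. It assumes only the `docker run IMAGE COMMAND` form from the comment above; the helper names `docker_run_argv` and `run_in_container` are hypothetical, and no other flags are implied.

```python
import subprocess


def docker_run_argv(image, *command):
    """Build the argv for `docker run IMAGE COMMAND [ARGS...]`."""
    return ["docker", "run", image, *command]


def run_in_container(image, *command):
    """Run the command in a new container and raise if it exits non-zero.

    Requires a working `docker` binary on PATH.
    """
    return subprocess.run(docker_run_argv(image, *command), check=True)


print(docker_run_argv("MYIMAGE", "/path/to/my/static/binary"))
```

Passing the command as a list avoids shell quoting problems when the binary path or its arguments contain spaces.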

zdw · on March 20, 2013

Are the aufs filesystems easily navigable from the host OS, ala Solaris's zones feature?

zimbatm · on March 20, 2013

The aufs mount is (probably) only mounted on the guest's VFS so I'm not sure how you would access it from the host.

jpetazzo · on March 20, 2013

The AUFS mountpoint is also reachable from the host. However, each container uses its own `mnt` namespace, so further mounts (done within the container) will not automatically be visible.
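
One way to check whether two processes share a mount namespace, sketched in Python: on kernels that expose `/proc/<pid>/ns/mnt`, the symlink target looks like `mnt:[4026531840]`, and two processes are in the same namespace when the numbers match. The helper names are assumptions for illustration, and reading another process's link may require sufficient privileges.

```python
import os
import re


def parse_ns_link(link):
    """Split a namespace symlink target such as 'mnt:[4026531840]'."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", link)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link!r}")
    return match.group(1), int(match.group(2))


def mnt_namespace(pid="self"):
    """Return the mount namespace identifier of a process (Linux only)."""
    return parse_ns_link(os.readlink(f"/proc/{pid}/ns/mnt"))[1]


def same_mnt_namespace(pid_a, pid_b):
    """True when both processes see the same set of mounts."""
    return mnt_namespace(pid_a) == mnt_namespace(pid_b)


print(parse_ns_link("mnt:[4026531840]"))
```

A process on the host and a process inside a container would be expected to report different identifiers, which is why mounts made inside the container do not show up on the host.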

shykes · on March 20, 2013

Docker lets you visualize changes on any container's filesystem live, as they happen. It also lets you snapshot changes from any container into a new image, and immediately start running new containers from that image. No manual post-processing of the image, no configuration files to templatize. The whole thing is 2 commands and maybe half a second.
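
The "see what changed, then snapshot it" idea can be illustrated with a small Python sketch that compares two views of a filesystem. It says nothing about how Docker computes changes internally; the `diff_snapshots` function, the digest values and the A/C/D labels are assumptions chosen for the example.

```python
def diff_snapshots(before, after):
    """Compare two {path: content_digest} snapshots.

    Returns (kind, path) pairs sorted by path, where kind is
    "A" (added), "C" (changed) or "D" (deleted).
    """
    changes = []
    for path in after.keys() - before.keys():
        changes.append(("A", path))
    for path in before.keys() - after.keys():
        changes.append(("D", path))
    for path in before.keys() & after.keys():
        if before[path] != after[path]:
            changes.append(("C", path))
    return sorted(changes, key=lambda change: change[1])


# Hypothetical digests for an image and a container derived from it.
image = {"/etc/hosts": "a1", "/tmp/scratch": "b2"}
container = {"/etc/hosts": "a9", "/srv/app.py": "c3"}
for kind, path in diff_snapshots(image, container):
    print(kind, path)
```

With a layered filesystem the set of changes is simply the contents of the container's writable layer, so saving it as a new image does not require copying the base.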

buster · on March 20, 2013

Wow, i need this! What would be the requirements for docker? I suppose it wouldn't run on linux 2.6.x? (CentOS 5.x)? Any other requirements?

shykes · on March 20, 2013

The main requirement is a modern kernel with the aufs module. We do most of our testing on Ubuntu 12.04 and 12.10. But any modern distro should be fine. There are a few people testing it on CentOS as I write this.

zobzu · on March 20, 2013

Note that LXC DOES NOT PROVIDE SECURITY. It provides resource separation (to a point) and so on.

Breaking out of a filesystem container is as easy as creating a root block device. Breaking out of a network container is as easy as creating a network device

And in all cases, you can just inject memory, load lkms, etc. That's without mentioning the amount of weekly CVEs for Linux namespaces.

price · on March 20, 2013

> LXC DOES NOT PROVIDE SECURITY

This is out of date. As of Linux 3.8, or with out-of-tree patches in older kernels, LXC puts each container in its own user namespace, so that root in the container has no privileges outside. LXC also uses network namespaces, so the user inside the container can only do on the network what the admin allows them to do.

Because root inside a user namespace is unprivileged outside it, it can't scribble on memory or load modules, etc., either.

See https://wiki.ubuntu.com/LxcSecurity for a decent summary of the situation in Ubuntu's releases. Several Ubuntu contributors are also among the main drivers of LXC upstream.

It's true that user namespaces and other kernel features LXC relies on are beginning to get much more use than they used to, and probably still have flaws, though I think you exaggerate how many CVEs are actually being found. Ubuntu's LXC support also uses apparmor and seccomp to provide further isolation. Conservative users will probably wait a while more to see what bugs get shaken out.
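
The user namespace mapping described above can be made concrete with a short sketch. Each line of `/proc/<pid>/uid_map` holds three numbers: the first ID inside the namespace, the first ID outside it, and the length of the range. The sample mapping and the function names below are assumptions for illustration.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map contents into (inside, outside, length) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(field) for field in fields)
        ranges.append((inside, outside, length))
    return ranges


def to_host_id(container_id, ranges):
    """Translate an ID seen inside the container to the ID the host sees.

    Returns None when the ID is not mapped at all.
    """
    for inside, outside, length in ranges:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None


# Hypothetical mapping: container IDs 0-65535 become host IDs 100000-165535.
ranges = parse_id_map("0 100000 65536\n")
print(to_host_id(0, ranges))     # root inside the container
print(to_host_id(1000, ranges))  # an ordinary user inside the container
```

With a mapping like this, UID 0 inside the container is an ordinary unprivileged UID on the host, which is the property the comment relies on.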

throwaway1979 · on March 20, 2013

Thanks for your comment! I've been afflicted by FUD related to security in LXC for over a year. This really helps.

zobzu · on March 20, 2013

And apparmor is not LXC. Indeed, if you use apparmor to further restrict LXC you can get some kind of security (as long as there isn't a new CVE every week that is).

shykes · on March 20, 2013

(Copying my answer to a similar question)

Yes, you probably don't want to run untrusted code with root privileges inside a container if anything valuable is running on the same host.

However if that code is trusted, or if you're running it as an unprivileged user, or if nothing else of importance is sharing the same host, then I would not hesitate to use them.

Containers are awesome because they represent a logical component in your software stack. You can also use them as a unit of hardware resource allocation and multi-tenancy, but you don't have to: you can map a container 1-to-1 to a physical box, for example. But the logical unit remains the same regardless of the underlying hardware and multi-tenancy setup, which is truly awesome.

EDIT: details on multi-tenancy.

zobzu · on March 20, 2013

you're trying to justify the use of lxc for security, IMO. Your webpage does state "strong guarantees of isolation"

if you're sharing nothing of importance on the host, then, you don't really need LXC, unless you don't know how to setup mysql with more than one database, nginx with more than one virtual host, yada yada.

Here's the trick: you CAN use LXC and SUPPLEMENT it by something providing security such as SELinux.

jpetazzo · on March 20, 2013

dotCloud engineer here.

LXC lets you use cgroups, i.e. set up memory/cpu/IO limits per container. If you set up MySQL with more than one database, you can't do that.

Also, we DO use LXC and SUPPLEMENT it by something providing security such as GRSEC (in the current version in production at dotCloud) and AppArmor (with docker) :-)
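
A hedged sketch of what per-container limits look like at the cgroup level, in Python. It assumes a cgroup v1 layout mounted at `/sys/fs/cgroup` with containers grouped under `lxc/<name>`; both the mount point and the group naming vary between systems, and the `limit_writes` and `apply_writes` helpers are hypothetical. The `memory.limit_in_bytes` and `cpu.shares` control files are standard cgroup v1 names.

```python
from pathlib import PurePosixPath

CGROUP_ROOT = PurePosixPath("/sys/fs/cgroup")  # assumed cgroup v1 mount point


def limit_writes(container, memory_bytes, cpu_shares):
    """Return the (file, value) pairs that would set limits for one container."""
    group = PurePosixPath("lxc") / container  # assumed group layout
    return [
        (str(CGROUP_ROOT / "memory" / group / "memory.limit_in_bytes"), str(memory_bytes)),
        (str(CGROUP_ROOT / "cpu" / group / "cpu.shares"), str(cpu_shares)),
    ]


def apply_writes(writes):
    """Write each value to its control file (needs root on a real host)."""
    for path, value in writes:
        with open(path, "w") as handle:
            handle.write(value)


for path, value in limit_writes("db1", 512 * 1024 * 1024, 512):
    print(path, value)
```

Keeping the planning step separate from the writes makes it possible to inspect the limits without root, and it shows why a single shared MySQL process cannot be limited per database: the limits attach to groups of processes.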

zobzu · on March 20, 2013

you can actually use cgroups without lxc, btw.

if you do use apparmor and grsec (as in RBAC's part of grsec in particular) it's probably acceptable, but I haven't seen it mentioned on the website - and people figure, they'll just use lxc "and be safe".

duked · on March 20, 2013

or use OpenVZ which has been designed with security in mind ;)

jpetazzo · on March 20, 2013

LXC is actually the work of the OpenVZ team.

When they were tired of seeing their patches rejected from the mainstream kernel, they decided to try a different approach, and that approach is LXC. In other words, LXC is a reimplementation of OpenVZ concepts by almost the same team.

LXC is actually more secure than OpenVZ, if only because it went through more scrutiny than OpenVZ.

Tobu · on March 20, 2013

Are you sure? I think LXC was developed by IBM. Team OpenVZ is still sticking with very old kernel releases (2.6.32 at most), they may adopt LXC for some of their features but they aren't very keen on upstream work.

pshc · on March 20, 2013

Will there be any possibility of running dev instances in OS X? Perhaps we'll be able to do brew install docker-compat at some point in the future and get a best-effort emulation layer even though the Linux APIs are missing. I hate messing around with virtualization.

niall_ohiggins · on March 20, 2013

I run a Docker host fine on OS X w/ Vagrant. The client runs on OS X, but Linux containers do not.

epynonymous · on March 21, 2013

fyi, aufs, which is the union mount filesystem required by docker (and cloud foundry warden), is not well supported on rhel, centos, and other linux distros; support seems to be mainly for ubuntu.

kalmar · on March 21, 2013

More than this, Ubuntu are trying to remove it from their distribution in favour of overlayfs [0]. As mentioned there, the reason is that aufs is not and will not be part of the mainline kernel. Precise 12.04 was going to be without AUFS, but some issues cropped up, keeping it around.

It now looks like overlayfs will make it into the 3.10 mainline kernel [1], so it may be a better choice in the future. I think that the "Under the hood" stuff on docker's page is implementation detail that can change, so a switch to overlayfs when that becomes more suitable could be possible. (Confirmation from the dotClouders present would be appreciated.)

[0] https://lists.ubuntu.com/archives/ubuntu-devel/2012-February...

[1] http://lwn.net/Articles/542707/

theatrus2 · on March 20, 2013

What's the interaction with systems based on Mesos, which can (and many do) use containers? Is this really designed for more multi-tenancy with lower trust over same-org clouds?

shykes · on March 20, 2013

Docker solves the problem of running things in a repeatable and infrastructure-agnostic way. Mesos solves the problem of telling many nodes what to run and when.

In other words, Docker + Mesos is a killer combo. There is already experimentation underway to use Docker as an execution engine for Mesos.

duked · on March 20, 2013

That sounds like a definite improvement over LXC, especially the isolation properties, but I'm not sure what the added value of Docker is compared to OpenVZ?

Any ideas ?

zimbatm · on March 20, 2013

My guess is that the container specification is orthogonal to the sandboxing feature. In fact they're using LXC.

shykes · on March 20, 2013

Correct, Docker is currently based on lxc, but that is an implementation detail. In theory it could be ported to any process-level isolation tech with similar features: OpenVZ, Solaris Zones - you could also try using BSD jails although I don't know if they have all the required features.

To answer the original question: Docker extends LXC with a higher-level API which operates at the process level. OpenVZ helps you create "mini-servers". Docker lets you forget about servers and manage processes.

zimbatm · on March 20, 2013

Nice. Can't wait to try it out. I've built a similar tool in Go for PandaStream to isolate our encoding processes.

nwg · on March 20, 2013

Does anyone know, or can anyone speculate on exactly how the "heterogenous payloads" are specified?

shykes · on March 20, 2013

The payload is anything which can be recorded on the filesystem by running a unix command. For example:

     # Run command which adds your payload
     $ docker run base apt-get install curl
     5b4a1ee8

     # Commit the result to a new image
     $ docker commit 5b4a1ee8 nwg/base-with-curl

     # Run a command from the new image. Your payload is available!
     $ docker run nwg/base-with-curl curl http://www.google.com

Docker doesn't care how the payload was added. apt-get, shell script, pip install, gem bundle... All the same. In the end it's just a change to the filesystem.
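
To make "just a change to the filesystem" concrete, here is a rough sketch that uses plain directories instead of Docker itself. The `base` and `container` directory names and the `payload.txt` file are made up for illustration, and this says nothing about how Docker actually stores things:

```shell
#!/bin/sh
# Illustration only: plain directories stand in for an image and a container.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# A tiny "base image".
mkdir -p "$work/base/etc"
echo "hello" > "$work/base/etc/motd"

# "Run" a container: start from a copy of the base image.
cp -R "$work/base" "$work/container"

# Any command can add the payload; here a simple redirect does it.
echo "my payload" > "$work/container/payload.txt"

# What a commit would have to record: the paths that differ from the base.
# diff exits non-zero when it finds differences, hence the "|| true".
changes=$(cd "$work" && diff -rq base container || true)
echo "$changes"   # with GNU diff: Only in container: payload.txt
```

Whatever tool wrote `payload.txt`, the recorded difference looks the same.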

pushcx · on March 20, 2013

This sounds like a reimplementation of virtual machines at the os layer instead of hardware layer.

voidlogic · on March 20, 2013

It sounds like they are just building on cgroups etc, which are already part of the Linux kernel.

I would argue that virtual machines at a hypervisor/hardware level were just a hack for OSs not living up to their isolation promises/obligations. Strong OS level isolation implementations (cgroups, namespaces etc) allow people to put isolation back where it belongs, the OS.

The job of the OS is to control the hardware; wrapping the OS in software to emulate hardware is ridiculous, and VMs generally have much more performance overhead than isolation containers.

laumars · on March 20, 2013

Containers have existed as long as virtual machines have. FreeBSD implemented "Jails" back in the late 90s / yr2000. Linux also has OpenVZ and Solaris has Zones.

If you couple a container with a CoW file system that supports snapshotting (eg ZFS or BtrFS), then you can have most of the features you'd expect from virtualisation but without as heavy a footprint.

Containers are an underrated and often forgotten solution in my opinion.

batgaijin · on March 20, 2013

lxc leverages hvm I think... someone correct me?

edit: it's too early, sorry this has nothing to do with your post... but I hope someone does correct me about hvm.

rlpb · on March 20, 2013

How is this different from what you can already do with lxc on, for example, Ubuntu Server?

derefr · on March 20, 2013

It's not that it's any different, it's that it's standardized. The idea is that a Docker container would be portable between different PaaS hosts (and from your own staging environment to those hosts!) without rebuilding, because they'd all be using the "Docker standard for deployment."

A PaaS host saying they supported Docker would imply that they'd be using, for example, SquashFS for container format, AuFS instead of OverlayFS for union-mounts, LXC instead of OpenVZ/Xen/KVM for isolation, and any other set of things your container might subtly rely upon.

The culmination of this, I imagine, would be a PaaS host allowing you to specify the "stuff" you want to run just by the URL of the container-image.

0xbadcafebee · on March 20, 2013

Doesn't a standard involve, you know, standards? AFAIK a product name is not a standard.

What if the namespace changes? What if AuFS changes? What if LXC changes? Independently or all together? ABI changes? Version changes? Feature changes? Are all the licenses compatible? Will it ever support platforms other than just certain versions of Linux? Or languages other than Go?

I don't see a standard. I see marketing for a product and a mailing list to collect potential customers. But maybe I'm missing something.

shykes · on March 20, 2013

Hi Peter, this website was only meant to be seen once Docker is actually open-source, which will be the case very soon.

I do think there is a need for a standard way to package and share software at the filesystem and process level - we don't pretend to define that standard, but hopefully we can contribute to it by open-sourcing a real-world implementation.

0xbadcafebee · on March 20, 2013

I guess I read too far into it when I saw the word "standard" everywhere and got excited - sorry about that. Do you plan on adding to your implementation the ability to differentiate between compatible versions/platforms, so one could use this on several cloud instances that aren't built the same?

shykes · on March 20, 2013

Yes, that is definitely something we would like to add. And we will gladly accept pull requests :)

derefr · on March 20, 2013

I'm just presuming, but:

1. every one of those attributes would be fixed against a given version of the (coming) Docker spec, and a given host would specify what version(s) of the spec they were compatible with.

2. Go is, I think, just the language the glue code is written in; not the language your own things-deployed-using-Docker must be written in.

3. It might support other Linux distros (Fedora, probably), but it won't support other OSes as hosts--because the whole point is to run things that need a POSIX-alike as their "outer runtime" (i.e. not Windows programs, etc.) The way to run these containers on another host will be to run Linux in a VM on that host, and run the containers in the VM--just like the way to play a Super Nintendo game "container" on your computer is to run them in a Super Nintendo VM. [Actually, come to think of it, game ROMs are a great analogy for precompiled SquashFS containers. I would adopt it if I were them :)]

0xbadcafebee · on March 20, 2013

The trick here is that their xmame (Docker) may not be the same build on all hosts, so it may not play the ROMs all in the same way or support all ROMs. A standard works to improve interoperability between different builds/hosts/etc as well as provide an expected set of operations and their results. If all they provide is just one version of one product and call that standardized, that's like releasing a new version of Internet Explorer and calling it a web standard.

derefr · on March 20, 2013

Well, this is a good first step in the "free market" standardization process, though: get a public implementation out of what you would imagine standard-conformance to look like. Then, let the other guys (e.g. Heroku) get out their competing implementations. Then, find the similarities, resolve the differences, and write it down. Now you've got a standard.

0xbadcafebee · on March 20, 2013

In practice that does not work. Things get broken, people end up having to support 20 edge cases to use this "universal", "standardized" thing. Depends on the implementation, though.

derefr · on March 20, 2013

"HTML 1.0" was the particular standard I had in mind. I guess I'm too used to coding multiplatform Javascript, but "end[ing] up having to support 20 edge cases to use this 'universal', 'standardized' thing" sounds like success in my books--in that you now have a (painfully) interoperating ecosystem, where before you had none. And it all gradually gets smoothed out as the spec evolves over the years, until you can't really tell the difference from a BDUF spec.

regularfry · on March 21, 2013

The best standards come from ratifying practice, not diktat.

shykes · on March 20, 2013

Correct. Docker is the direct result of dotCloud's experience running hundreds of thousands of containers in production over the last 2 years. We tried very hard to put it in a form factor which makes it useful beyond the traditional PaaS.

We think Docker's API is a fundamental building block for running any process on the server.

batgaijin · on March 20, 2013

What about virtsandbox from fedora? how does this project overlap with that? can this handle certain situations better?

rlpb · on March 20, 2013

So why not use VM images? Why not a VMware vagrant box, for example?

derefr · on March 20, 2013

This might work if you only have to run three or four VMs on a box, and run several applications in each container. Full PC virtual machines are much too heavyweight, though, for isolating thousands of individual processes per box, especially when most of them might just sit there doing nothing most of the time.

Though! If you want to, you can think of this standard as specifying an "ABI format" for high-level, lightweight VMs that happens to run on a "Linux machine" instead of, say, an "IA32 machine."

niall_ohiggins · on March 20, 2013

One major benefit over VM images is Linux Containers have very fast spin-up time. This can be especially useful for PaaS providers and CI servers.

emidln · on March 20, 2013

If your images use kvm+qcow2 you can just spin up new vms as a delta image using qcow2's support for a backing store.

I want a new instance of WEBSERVER.qcow2?

    qemu-img create -f qcow2 -b WEBSERVER.qcow2 -F qcow2 "WEBSERVER-$SERIALNUMBER.qcow2"

If you're doing this as a PaaS or for CI, you do this as part of your new image creation and then pass in the new qcow2 to your vm (maybe via libvirt). If you aren't doing this or something very similar, you're spinning your wheels and wasting time/resources.

shykes · on March 20, 2013

Other benefits: docker images are basically tarballs, which means they are much smaller.

And, importantly, Docker maintains a filesystem-level diff between versions of an image, and only needs to transmit each diff once. So you get tremendous bandwidth savings when transmitting multiple images created from the same base.
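
A toy sketch of why that saves bandwidth, assuming each image is simply an ordered list of layers. The layer names and the `push` helper are hypothetical, and this is not Docker's actual transfer protocol:

```shell
#!/bin/sh
# Illustration only: layer names are hypothetical, not real Docker IDs.
set -eu

# Each image is an ordered list of layers, base first.
image_a="base curl-layer"
image_b="base python-layer"

# Layers the remote host already has (starts empty).
remote=""

# push IMAGE_LAYERS: send only the layers the remote is missing,
# then remember them as present on the remote.
push() {
    sent=""
    for layer in $1; do
        case " $remote " in
            *" $layer "*) ;;    # already on the remote, skip it
            *) sent="$sent $layer"; remote="$remote $layer" ;;
        esac
    done
    echo "sent:${sent:- nothing}"
}

push "$image_a"   # prints: sent: base curl-layer
push "$image_b"   # prints: sent: python-layer
```

The second push skips `base` because the remote already holds it; only the new diff goes over the wire.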

natefinch · on March 20, 2013

Not to be a stick in the mud, but you're using a copyrighted image as your logo without any attribution or acknowledgement of the original owner of the copyright (the Lego Group). You should probably fix that.

shykes · on March 20, 2013

You're right! We put it up there when it was an internal project, it's probably time to take care of that. Taking it down until we find a correct way to do it.

I'm a copyright noob: would simply acknowledging the copyright owner be enough and fall under fair use, or should we not use it unless we get written permission?

natefinch · on March 20, 2013

I am not a copyright expert, however I am familiar with the Lego Group (the company that produces Lego toys) ... the set is pretty old (1986), which means they're not worried about making money off it. If you just stick a disclaimer after it that it's copyright 1986 the Lego Group, that's probably fine, especially since this looks to be an open source project, so you're not going to be charging people for something with their picture on it.

This page has what is very likely the original image: http://www.peeron.com/scans/7823-1/ Peeron.com has special permission directly from Lego to display the images, so if you wanted to be extra careful you could email dan@peeron.com and ask for permission to deep link to their picture (they'd probably say yes, the admin is a linux geek too). But honestly, a simple copyright disclaimer is probably fine. Lego won't reach out and swat you even if they do decide they don't like it, they'll just ask you to take it down.

binarycrusader · on March 20, 2013

I would avoid using anyone else's images without their explicit permission. It's just safer that way.

allerratio · on March 20, 2013

Why should I use this when systemd offers the same (minus the AUFS root fs)?

shykes · on March 20, 2013

We get that question a lot... Until people start playing with it. Then they never ask it again :)

iand675 · on March 20, 2013

I'm not entirely sure that I understand what this does. Is it some sort of hybrid between provisioning automation and deployment automation?

derefr · on March 20, 2013

It's a white-labelling of dotCloud's implementation of Heroku's "slug" concept (https://devcenter.heroku.com/articles/slug-compiler): basically, a SquashFS image with a known SHA, storing a precompiled runtime+libraries+code artifact that will never change, able to be union-mounted atop a "base image" (a chroot filesystem, possibly also a known-SHA SquashFS image), then spun up as an ephemeral LXC container. I actually use the idea in my own ad-hoc deployment process; they're very convenient for ensuring repeatability.
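
Roughly, the union-mount part behaves like a top-down lookup across layers. A sketch that emulates it with plain directories, assuming a read-only `slug` layer stacked on a `base` layer (the directory names, file names, and `resolve` helper are invented for illustration; a real union filesystem also handles writes and deletions, which this ignores):

```shell
#!/bin/sh
# Illustration only: emulates union-mount lookup with plain directories.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/base/app" "$work/slug/app"
echo "base config"  > "$work/base/app/config"
echo "base runtime" > "$work/base/app/runtime"
echo "slug config"  > "$work/slug/app/config"   # shadows the base copy

# resolve PATH: search the layers top-down and print the first match.
resolve() {
    for layer in "$work/slug" "$work/base"; do
        if [ -e "$layer/$1" ]; then
            cat "$layer/$1"
            return 0
        fi
    done
    return 1
}

resolve app/config    # prints: slug config
resolve app/runtime   # prints: base runtime
```

Files in the upper layer win, and everything else falls through to the base, which is why the base image can be shared unchanged between many slugs.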

(As a side-note, this is an example of an interesting bit of game theory: in a niche, the Majority Player will tend to keep their tech proprietary to stay ahead, while the Second String will tend to release everything OSS in order to remove the Majority Player's advantages. This one is dotCloud taking a stab at Heroku, but you can also think of, for example, Atlassian--who runs Github-competitor Bitbucket--poking at Github by releasing a generic Git GUI client, whereas Github released a Github client.)

shykes · on March 20, 2013

This is mostly accurate :)

I will add that our implementation predates Heroku's. Using a generic container layer early on (first OpenVZ-based prototypes in 2009) is what allowed us to launch multi-language support a year before any other paas. It's also how we operate both application servers and databases with the same underlying codebase, and the same ops team.

laumars · on March 20, 2013

A very bad description would be: "containers are essentially what you get when you cross virtualisation with chroot".

The reason why I give that description to begin with is because containers solve a lot of problems that often lead people towards the virtualisation route. And inside the guest "OS", all the applications think they're running on a machine separate from the host. However containers differ from virtualisation in that it's just one OS (one shared kernel). This means you can only run one unique OS (though if you know what you're doing, you can run multiple different distros of Linux in different containers - but you couldn't run FreeBSD nor Windows inside a Linux container). Containers can also have their own resources and network interfaces (both virtual devices and dedicated hardware passed through).

Because you're not virtualising hardware with containers and because you're only running one OS, containers do have performance advantages over virtualisation while still being just as secure. So I personally think they're a massively underrated and under utilised solution.

If you're interested in investigating a little more into containers, Linux also has OpenVZ, and FreeBSD and Solaris have Jails and Zones (respectively). The wikipedia articles on each of them also offer some good details (despite the stigma attached to wikipedia entries).

I've not used Docker specifically, but I have used other containers in Linux and Solaris, so I'm happy to answer any other questions on those.

bazzargh · on March 20, 2013

There've been quite a few articles on containers on LWN over the last few months; worth digging around there if you're interested. A couple of the more recent ones:

https://lwn.net/Articles/524952/ - Glauber Costa's talk on the state of containers at LinuxCon Europe 2012

https://lwn.net/Articles/536033/ - systemd containers; these follow on from a theme of containerizing whole distros rather than single apps

zdw · on March 20, 2013

Containerization has been around a long time. Probably the start would be chroot, which was developed further by BSD's jails, Solaris zones, and Linux containers.

Basically, these are all ways of running programs with a partitioned subset of access to the OS - in some cases, it may even appear to be an entire installation under that subset.

This (and many of the other mentioned solutions) adds to that by being able to capture and deploy an environment for running a specific app.

rolandtritsch · on March 22, 2013

GitHub: rolandtritsch

felixr · on March 21, 2013

curl get.docker.io | sudo sh

Has anybody tried this?

andyl · on March 20, 2013

Looks cool. When is it gonna be open sourced?