1. Shared kernel - in this model all of the containers share one kernel, so any activity or even tuning (e.g. the I/O scheduler) would impact all of the containers, right?
2. Patching - when you need to apply a kernel security patch, all containers would need to agree on a change window unless you were using something like KSplice?
3. User space / kernel space dependencies - if we imagine even 5 years down the road, will apps built for, say, Red Hat Enterprise Linux 5 be containerized but broken? I.e. the hosting provider will likely want to stay ahead of the curve and upgrade their kernel, but the app teams may not be as progressive, so when these upgrades occur the apps would break.
The first question is a valid point. But you also get benefits from not having the overhead of N different kernels running. This is easiest to see when looking at VPS providers - a 512MB OpenVZ vps means you have 512MB of memory for your application to use. Yeah, kernel overhead isn't that much, especially if you're running a few high-resource instances, but it can help if you have lots of low memory instances. There's lots of discussion online about OpenVZ vs. xen/kvm vps hosts if you're curious.
As for patching - OpenVZ at least makes it very easy to do live migration of instances between servers (barring some weirdness if you have NFS mounts in the guest), although it appears that lxc (and therefore docker) can't do that. In any event, it shouldn't be hard to shut down the guest, migrate, and restart the guest - especially if you're using shared storage of some sort.
As for your third point - backward compatibility with RHEL/Centos is generally quite good (since that's kind of the point of RHEL). At work we're currently on Centos 5, and our migration strategy to Centos 6 is probably going to be to install Centos 6 OpenVZ hosts, then move the guests and worry about upgrading the guests later. Forward-compatibility is an issue, but I don't think there's an easy solution to that.
The other is not so easy; very recent Linux environments don't always run on RHEL6.
However, those links are useful as the conversation is on topic, even if the source material is different.
Currently it recommends kernel version 3.8 or greater. This means that if you prefer Ubuntu then you need 13.04 or the ability to upgrade the kernel.
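For anyone who wants to check a box against that recommendation, here is a minimal sketch that compares a kernel release string to the 3.8 minimum. The function names are made up for illustration, and it only looks at the major and minor numbers, so a distro kernel with backported features would still be judged by its version alone.

```python
import platform
import re

MIN_KERNEL = (3, 8)  # the minimum recommended in the comment above


def parse_kernel_release(release):
    """Return (major, minor) from a release string such as '3.8.0-19-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_is_recent_enough(release, minimum=MIN_KERNEL):
    """Compare numerically, so that 3.10 counts as newer than 3.8."""
    return parse_kernel_release(release) >= minimum


def running_kernel_is_recent_enough():
    """Check the kernel this process is running on (same value as `uname -r`)."""
    return kernel_is_recent_enough(platform.release())
```

Comparing tuples of integers rather than strings matters here: as strings, "3.10" would sort before "3.8".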
It also currently requires usage of AUFS which means that you need the AUFS kernel module installed. So, if you have a supported kernel then you still might need the ability to modify the kernel. They are working on supporting alternate implementations such as BTRFS though.
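One rough way to see whether a running kernel has AUFS available is to look for it in /proc/filesystems. This is a sketch with made-up function names, and it comes with a caveat: that file only lists filesystems that are built in or whose module is currently loaded, so an installed but unloaded aufs module will not show up until it is loaded.

```python
def filesystem_registered(name, proc_filesystems_text):
    """Return True if `name` appears in the text of /proc/filesystems.

    Each line is either '<tab>ext4' or 'nodev<tab>sysfs'; the filesystem
    name is always the last whitespace-separated field.
    """
    for line in proc_filesystems_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == name:
            return True
    return False


def read_proc_filesystems(path="/proc/filesystems"):
    """Read the kernel's list of registered filesystems (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On a Linux host, `filesystem_registered("aufs", read_proc_filesystems())` answers the question for the running kernel; passing "btrfs" instead checks for the alternative mentioned above.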
EC2 is a great option right now and it's what I'm using.
I agree with another comment mentioning this is the future. However, I wonder how long it will be before something like "Erlang On Xen" becomes more widespread, cutting out the OS completely.
ETA: I love watching this project, it has really taken off and the maintainers have been making fast progress. It seems that as soon as I run into a show stopping problem, it's fixed the next day. It's a bit inspiring and makes me look at the progress I have made on my own projects. ;)
If there are any hosting providers that you would like to see directions on how to get Docker up and running, please submit an issue, and we will do our best to add it to the docs. I'm working my way down a list, picking the most popular ones first. If I have a lot of requests for a particular host, I'll do that one next.
The show stopping problems were early in development. The major one is that I ran into the kernel issues early on and then they added the recommendation to use only 3.8 or higher. So, that's not a fix but it addressed my problems. I was also having problems running Docker in stand-alone mode per their own docs, they have since removed this and daemon mode works great. I don't remember what the others were.
One nitpick — not a big fan of the recommended installation method (curl get.docker.io | sh -x). Is it really that hard to ask people to download and run the script themselves?
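For what it's worth, "download and run it yourself" can be as small as the sketch below: fetch the script, write it to disk without executing anything, and print a checksum so you can tell whether you later run the same bytes you reviewed. The URL is the one from the one-liner above; the function names and the destination filename are hypothetical.

```python
import hashlib
import urllib.request

INSTALL_URL = "http://get.docker.io"  # taken from the one-liner quoted above


def download(url=INSTALL_URL):
    """Fetch the script body without executing anything."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_for_review(script_bytes, dest):
    """Write the script to `dest` and return its SHA-256 hex digest."""
    with open(dest, "wb") as handle:
        handle.write(script_bytes)
    return hashlib.sha256(script_bytes).hexdigest()
```

Something like `print(save_for_review(download(), "install-docker.sh"))` leaves a file you can read in an editor first and then run by hand with `sh -x install-docker.sh`.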
However, I would like to discuss the docker design a little more in detail, on the basis of its ease of use. First of all, I too do not like to have random stuff piped into my shell, so I went looking for the Docker sources. It was darned easy to build from sources, and quick too. At the end, I had a single binary.
And the cool thing about this binary is that it's both the server and the client in the same package! So - the sysadmin of your Linux machine can (and should, manually, for now) build from sources, install in a local/ or ~/bin, and add the daemon to start up as needed.
Then, anyone else on the machine - not needing su rights - can run docker images, and so on.
This isolation, simplicity of install, and .. frankly .. rocking feature set .. is a beautiful thing. Can it be that golang is the secret sauce here? I say: yes.
For the record, another reason we chose Golang is because it was not python. Docker is a rewrite of a 2 year old python codebase which has seen a lot of action in production. We wanted the operational learnings but not the spaghetti code. So we made it impossible to cut corners by changing languages. No copy-pasting!
If there were a 'secret sauce' I'd say it's the kernel features it takes advantage of (cgroup, lxc, kernel namespaces, aufs, etc). :)
It seems to be on its way : http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=706060
More information available here: http://docs.docker.io/en/latest/installation/ubuntulinux.htm...
What dependency issues with Django apps have you been having a hard time solving? You should be able to solve everything with a decent configuration management system like puppet/chef/saltstack/ansible.
Why? Isn't that exactly what "curl get.docker.io | sh -x" does? It's the cli equivalent to clicking on a download link and then executing it. I think what you mean is "wouldn't it be better for the user to first read the script's code before executing it or run it in a VM?"
You just need to know what the limitations are, and make sure you build your system around LXC so that you are protected.
production ready doesn't mean much tho. you can use anything you like for prod. it doesn't make it better or worse.
a ton of things that are considered "production ready" today are crippled with bugs, design flaws, etc.
The major issue of linux namespaces (or containers or "lxc" if you will) is that they're generally used as a security feature but have not been designed primarily as a security feature.
(it wouldn't have entered the kernel if it had been designed as such anyways)
vm's provide a better level of isolation so far, even though they're not perfect either.
and for what it's worth, freebsd for example (among some others) provide a similar namespacing that is much better security wise. also openvz, vservers are doing similar things.
Oh and rsbac's jail too. (it might be the "strongest" of the list)
Also does it make sense using Chef/Puppet with Docker?
It lets you use a Linode or AWS instance as a bunch of NATed containers. This makes it way easier to install just one thing in one container and not mess up the other ones. This is where configuration management is going.
Plug: CloudVolumes is awesome for Win apps.
Now that I'm looking at it, I don't think this provides any kind of isolation. That's not what it's for. It's for distributing packaged programs.
I'd guess it's (at most) as secure as the underlying OS containerization support.
People have told me they use Docker as a "vmware alternative", a "make alternative", a "rpm alternative", a "vagrant alternative" and even a "tomcat alternative". But people also use Docker together with all of these tools!
In that way Docker reminds me of Redis: depending on what you want to do, you could use it as a replacement for memcache, rabbitmq, couchdb, mysql or even zeromq. But you could also use it together with all of these tools. Over time we're getting more comfortable with the fact that Redis might just be a tool of its own - useful in its own unique way, in combination with whatever other tool is best for the job at hand.
... but none of that matters if nobody uses it, and to get people to use software you need catchy titles like "better than VMs" :)
Why keep posting this every two months or so? It would be better if there were improvements to report, major sites using it to roll out, etc.
That said there have been a lot of improvements - see the changelog here https://github.com/dotcloud/docker/blob/master/CHANGELOG.md
VMs are a much safer bet (though not perfect either).
Having said that... If someone has enough determination, they will manage to compromise your system regardless of how 'secure' it has been made. :)
I generally like to use containers in addition to a virtual machine. I do find it a bit shocking when I see a company offering up containers as an alternative to a VM though. I suppose it's a compromise some companies are willing to make for the additional performance.
Some distance. But nowadays, when you can own the kernel, that distance shrinks to zero.
I just saw these a few seconds ago:
If you don't trust lxc to sandbox untrusted code, don't! Just deploy 1 production container per VM, or even per physical machine. But maybe you don't mind running multiple containers on your local dev vm or on your staging machine - I know I don't.
What containers give you is a constant payload which can be moved across very different hardware setups, from the most shared to the most isolated.
I always liked the idea of nicely integrating with the environment and utilizing the features of package managers, the file system, users, and all the rest of the niceties we have at our disposal. "I don't know how to organize a bunch of things together!" seems like a silly reason to containerize every component into a separate root fs.
But on the other hand, I can imagine this work flow does have some merit, and some folks save a lot of time and energy and potential headaches just popping things in containers.
One good reason to separate every component is that it facilitates moving them to separate machines down the road, or scaling them separately.
Another good reason is that it reduces the out-of-band requirements of the components. "all the niceties" you have at your disposal may very well be specific to your distro, or your site setup. By contrast, docker containers only require docker, which is a single binary file. A developer needs to know his component will run anywhere, not just on your particular setup.
I am certainly not disagreeing :). I do wonder how well the exploit would have worked if the system were also locked down with SELinux/Smack. Not that MACs are bulletproof, but again... more distance.
In the end though, the only secure system is one that's not powered. :)
> locked down with SELinux/Smack
If you're executing in kernel mode, then you can just disable these. It might be more interesting to point to the efficacy of various exploits on a Grsec/PaX kernel, however.
> In the end though, the only secure system is one that's not powered.
That's silly. Sure, airgap your sensitive stuff. But there's a reasonable level of security you can achieve that's far beyond what docker provides, while still retaining a flexible and reasonable computing environment. For example, Xen's xl tool makes making quick and cheap VM containers very simple.
Now you are just nitpicking :) "the" meaning, you gave me an example.
> If you're executing in kernel mode,.....Grsec/PaX kernel, however.
SELinux has a lot of overlap with grsec providing many of the same benefits while being more widely available. The flow of an exploit: 1) entry into a system (app compromise, shell access), 2) inserting/uploading of executable data, 3) execution of said data granting further access (Ex, exploiting a kernel bug, adding a backdoor, manipulating the host system in some way).
The goals of Grsec/SELinux, and marking data memory as non execute (NX bit, PaX, Exec Shield) are aimed at preventing #2 and #3. The idea is the prevention of access escalation in the first place.
On PaX, the kernel supports utilizing the NX bit on x86-64 and has for quite a while now. Not using a system supporting the NX bit or at least PaX/Exec Shield is pretty stupid.
> But there's a reasonable level of security you can achieve that's far beyond what docker provides
I had already agreed with you on VMs... no reason to argue this point. :) Since you mention VMs again, however, I will also note that VMs are not entirely isolated from the host system. A Xen exploit (since Xen was your example): http://lists.xen.org/archives/html/xen-announce/2012-06/msg0...
Ahh sorry. Well to continue along that example, evidently it breaks out of lots of things -- https://grsecurity.net/~spender/logs.txt
> On PaX, the kernel supports utilizing the NX bit on x86-64 and has for quite a while now. Not using a system supporting the NX bit or at least PaX/Exec Shield is pretty stupid.
Not going to try to parse this, but you appear to be very mistaken. Wikipedia PaX.
> VMs are not entirely isolated from the host system
Correct, hence the parenthetical in the OP. That sysret bug was a great one.
"The major feature of PaX is the executable space protection it offers. These protections take advantage of the NX bit on certain processors to prevent the execution of arbitrary code. This staves off attacks involving code injection or shellcode. On IA-32 CPUs where there is no NX bit, PaX can emulate the functionality of one in various ways."
And then on NX bit support on Linux:
"The Linux kernel currently supports the NX bit on x86-64 CPUs and on x86 processors that implement it, such as the current 64-bit CPUs of AMD, Intel, Transmeta and VIA.
The support for this feature in the 64-bit mode on x86-64 CPUs was added in 2004 by Andi Kleen, and later the same year, Ingo Molnar added support for it in 32-bit mode on 64-bit CPUs. These features have been in the stable Linux kernel since release 2.6.8 in August 2004."
PaX also provides a few other features but the big defining one has been the NX bit support. Not sure why you seem to think I am mistaken in what I said.
The Juju team is keeping a close eye on Docker and it'll likely (but I can't promise) be in 13.10.
I would love to be able to setup a workflow on my Windows machine that allows me to be able to do Rails dev on the Windows machine, in as close to a native way as possible - but with using much of the same workflow I use on my MBP.
If I set up a container on my Windows machine, I would still have to SSH into some virtual environment to be able to run my rails app, right?
Ideally, I would love to be able to just go to my localhost in my browser and see my app - will this help me be able to do that, rather than going to a browser within a VM or some 'contained environment'?
An option is Vagrant (Which Docker uses on the above OSes) + chef / puppet.
It uses VMs but works well enough for me and both configuration engines have widespread support. http://docs-v1.vagrantup.com/v1/docs/getting-started/
Another advantage is that docker on a vm is still docker: the container running on your local VM is 100% identical, byte-for-byte, to the container you will run on an octo-core box with half a tera of ram.
Rails already has some minor loading issues, adding a VM to that can be very frustrating - I imagine.
Am I mistaken, or will my Rails environment not sit in a VM in a Vagrant instance?
Vagrant creates a VirtualBox (default) VM, uses chef / puppet (preferably) to set the environment up then you link the host's rails directory to a suitable location on the VM.
So yes, the environment is a VM, you can use "vagrant ssh" to access that VM easily enough but for coding you use your usual code directory on the host with your usual editor.
The advantage is that you're not limiting yourself to a RedHat distro. But conceptually is it similar?
Thinstall Virtualization Suite
There are more...
Docker relies on cgroups for resource limits and accounting. So anything you can do with cgroups, you can do with docker.
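To see what that looks like from the kernel's side, here is a small sketch that reports which cgroup a process sits in for each controller, by parsing /proc/<pid>/cgroup. This reads the kernel's own interface and is not a Docker API; the function names are made up for illustration.

```python
def parse_proc_cgroup(text):
    """Map each controller name to its cgroup path.

    Each line of /proc/<pid>/cgroup has the form
    hierarchy-ID:controller-list:cgroup-path, where the controller list
    is comma-separated (for example 'cpuacct,cpu').
    """
    membership = {}
    for line in text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _hierarchy_id, controllers, path = parts
        for controller in controllers.split(","):
            if controller:  # an empty list means no named controller
                membership[controller] = path
    return membership


def current_cgroups(pid="self"):
    """Return the controller-to-path mapping for a process (Linux only)."""
    with open("/proc/%s/cgroup" % pid) as handle:
        return parse_proc_cgroup(handle.read())
```

Run inside and outside a container, `current_cgroups()` should show different paths for controllers such as memory and cpu. The limits and counters themselves live in the cgroup filesystem under those paths.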