Docker and Security [pdf] (ernw.de)
136 points by cujanovic on March 10, 2016 | 48 comments

"You are absolutely deluded, if not stupid, if you think that a worldwide collection of software engineers who can't write operating systems or applications without security holes, can then turn around and suddenly write virtualization layers without security holes."

-- Theo de Raadt (on the statement "Virtualization seems to have a lot of security benefits")

Theo is wrong with respect to Xen. You might enjoy section 3.1 and 3.2 from the Qubes Architecture document: https://www.qubes-os.org/attachment/wiki/QubesArchitecture/a...

The attack surface of Xen is much smaller than that of a traditional kernel like Linux. I'm guessing the same can be said for OpenBSD after looking at the number of system calls.

Xen has had its fair share of vulnerabilities, and the author of the document you linked also had this to say: http://lists.xen.org/archives/html/xen-devel/2015-11/msg0060... (follow the links)

Absolutely. It should also be mentioned that many of the Xen vulnerabilities are actually due to qemu and that emulation can mostly be avoided by using PV/PVH.

The Xen attack surface is then reduced to about 20 hypercalls (even less for ARM). http://xenbits.xen.org/docs/unstable/hypercall/index.html

The number of hypercalls isn't a good indicator of the attack surface area.

As examples:

There is a full x86 instruction decoder which is technically not part of the hypercall interface. x86 instruction encoding is surprisingly complicated. There was a vulnerability in that code: http://xenbits.xen.org/xsa/advisory-123.html

Handling x86 page tables and all of the various feature bits is also surprisingly complicated. This is part of the hypercall interface, but there was also a vulnerability in that code: http://xenbits.xen.org/xsa/advisory-148.html

You might enjoy the fact that my early critiques of Qubes told Joanna that Xen was a security problem she should've avoided in favor of separation kernels w/ user-mode Linux virtualization, user-mode drivers, and a trusted GUI subsystem. She vigorously argued against that on the mailing list. Years later, they now have some kind of trusted GUI scheme and she's griping at Xen for being insecure. I don't know if she reduced the privilege of drivers or has since learned why that's a good idea. The 180 was hilarious.

I'm sure their product will have plenty more vulnerabilities as she lacked foundational knowledge in building secure systems. Meanwhile, with little labor, GenodeOS was incorporating many well-designed components from the start such as L4 work, Muen, seL4, Nova microhypervisor, Nitpicker GUI, etc. They're still alpha quality due to clean-slate work needed in that approach but moving on steadily. Check them out.

Note: For appliances and embedded, JX Operating System is also worth Googling. Remember that JVM can be replaced with Ada, Go, etc runtime or Rust.

Do you have a link for that discussion? I've been using Qubes for a while now and the GUI has always been in Dom0 with no networking.

She censored my critiques from that point on. All I have is a cached copy of what I sent in response to her dismissing my comments on the mailing list. Fortunately, I often quote relevant points before countering them. You can easily see what each of us pushed. Her Mac OS X claim makes me laugh to this day.


"and the GUI has always been in DOM0 with no networking."

I'm clearly not a Qubes user. I'm basing my statement on what articles or comments here described as relatively recent work isolating the GUI stack further from attack. If that didn't happen, I retract it. The Xen, Dom0, and driver risks she ignored and countered directly so I was sure they'd be a problem.

EDIT to add: Btw, she later did a one-sided rebuttal countering various things here on her blog, while censoring my rebuttal of course. One real problem she identified was my use of "military-grade", commonly a buzzword indicating snake oil. I read lots of defense stuff back then but forgot to translate. I meant Type 1 certified by NSA for high-assurance use in the military. It's a rigorous process identifying and often preventing... many issues people found in proprietary and FOSS products designed without high-assurance practices. ;) So, I started saying NSA Type 1 certified from that point on to avoid confusion.

Thanks for taking the time to post that. Many of those statements are surprising given how well designed Qubes looks on the surface. I wish more people nitpicked projects like this. Please consider trying Qubes 3.1 out and posting a review. I'd love to read it.

I'm down to a few pieces of hardware and wary of doing anything heavyweight with them. I do want to try it out, especially its usability and such. What's resource usage like for it on single or dual core hardware?

The minimum requirements are pretty low: https://www.qubes-os.org/doc/system-requirements/

I've only used it on mid-high end systems though. You should easily be able to launch a few small VMs (each AppVM is tunable) on a dual core machine with 4 GB of RAM though.

Xen is worse than Linux in terms of quality, and therefore security. That Linux is much bigger doesn't make Xen any better.

What de Raadt means to say is, generally speaking, you can't build security on top of bad code. No amount of patching, sandboxing, or whatever will help. Security comes from quality and Xen (like Linux) is very lacking in quality.

TCB matters. If you have data to back that quality statement up, I'd very much like to see it.

Paul Karger helped invent information security, pentests, secure coding, high assurance, and so on. He and the other old guard paved the way for the types of systems that survived rigorous pentesting. One of his works was a virtualization product for OpenVMS done because securing full OS's was too hard to achieve.


If we're talking experience and lineage, then we should ignore Theo on this issue immediately as he's never constructed a provably secure system. They don't even address covert channels, much less tell me every successful and failed execution trace of their OS w/ an argument that it obeys the security policy. Of course, if you look at Karger's design and assurance sections, you'll know the minimum required to get close to secure, predictable operation disqualifies Xen, KVM, VMware, etc from the picture as well.

Only the separation kernels (eg INTEGRITY-178B or LynxSecure), Nizza architecture, GenodeOS, JX OS, CHERIBSD, etc are consistently applying the lessons of the high-assurance community. One of them is that microkernels and core VMM's are the easiest to secure with other stuff running on top mediated by a security policy. Many such systems survived NSA and private pentesting. Monoliths and UNIXen rarely do.

I'd say that it's a trade-off whether you think the enhanced isolation provided by containerization/virtualization is more of a security benefit than the risks posed by the increased attack surface of another layer in the stack...

I think that statement is factually wrong, but before I get into that, I don't see the relevance of a claim about x86 virtualization (which is what de Raadt's post was about) to the article here, which involves no x86 virtualization at all. There's some discussion of Linux container isolation, but the analysis of whether you can get that to be more secure than uncontained Linux is basically completely different to an analysis of hardware virtualization (leaving aside that it is generous to call a data-less opinion from a decade ago an "analysis"). Am I missing a reason why this quote is relevant?

Here's the context: https://marc.info/?l=openbsd-misc&m=119318909016582

I don't think the actual technology he was talking about is relevant.

The message is that people writing the Linux kernel make a lot of mistakes, and it results in security holes. Those same people (or the same kind of people) also write X (Docker, x86 virtualization, whatevs...), so mistakes and security holes are to be expected.

I do not necessarily agree with the message, but this is how I understand it.

I understand that message, but it's so little of a message that it doesn't seem useful. Why should we believe that it's equally easy to make mistakes and security holes when writing kernels, OS virtualization, and hardware virtualization? The security models of these things are very different (for instance, a kernel has multiple user accounts, hardware virtualization has just "admin" and "VM"), the underlying designs are very different in complexity, the requirements for legacy compatibility are different, etc. etc. etc.

I think we'll see more security incidents involving Docker in the future as it becomes more popular. I have worked with it for a few months now and I can already see a few attack vectors that could easily be exploited if the corresponding Docker features are not used properly.

For example, as Docker can by default mount anything that root can mount, running

    docker run -i --volume=/:/data -t ubuntu
will give you complete root access to your file system in the docker container (the talk mentions some Docker features that will mitigate this kind of attack, notably UID mapping). Of course no one would willingly do that, but if you mount user resources into your containers and the resource name contains something that an external user might control (e.g. his/her username), then injection attacks become possible. Even with UID mapping enabled this can leak sensitive information about your host system into the container. And since people often use containers to run untrusted code (e.g. for CI systems), this can be a large security threat in my opinion.
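The injection risk described above can be sketched in Python: a host path built from a user-controlled name is validated before it is turned into a `--volume` argument. This is only an illustration under stated assumptions. The base directory `/srv/userdata`, the helper name `safe_volume_arg`, and the allowed-character policy are all hypothetical and not part of Docker; only the `--volume=host:container:ro` syntax is Docker's.

```python
import posixpath
import re

# Hypothetical host directory that holds per-user data (an assumption).
BASE_DIR = "/srv/userdata"

# Assumed policy: short names made of letters, digits, '_' and '-' only.
_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,32}")


def safe_volume_arg(username, container_path="/data"):
    """Return a read-only --volume argument for this user, or raise ValueError."""
    # fullmatch rejects ':', '/', '..', newlines and anything else outside the policy.
    if not _NAME_RE.fullmatch(username):
        raise ValueError("unsafe username: %r" % (username,))
    host_path = posixpath.normpath(posixpath.join(BASE_DIR, username))
    # Defence in depth: the resolved path must stay below BASE_DIR.
    if not host_path.startswith(BASE_DIR + "/"):
        raise ValueError("path escapes base directory: %r" % (host_path,))
    return "--volume=%s:%s:ro" % (host_path, container_path)
```

Without a check like this, a name such as `alice:/:rw` or `../..` would change what gets mounted, since the name ends up inside the `host:container` mount specification.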

Personally I really like Docker and I think it (or similar technologies) will change many aspects of IT/Devops/Data Analysis in the future. I just think that maybe they should have some more sensible defaults for security-relevant settings, e.g. only grant network access to containers if you ask for it, restrict memory usage by default, limit the type of volumes you can mount, etc.

My problem with Docker is that although the advice was always not to run images as root, in practice all Docker tools default to doing precisely that and there are no plans to change the defaults.

At least these days they do provide options to restrict containers so secure usage is possible (--cap-drop=all is a good start), but the amount of effort needed to follow good security practices is still high.
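As a rough illustration of the effort involved, here is a Python sketch that assembles a more restrictive `docker run` command line. The helper name `hardened_run_args` and the default values are assumptions. The flags are spelled as in the current Docker CLI (`--network` was `--net` in older releases), and which of them you actually need depends on the workload.

```python
def hardened_run_args(image, command, uid=1000, gid=1000, memory="256m"):
    """Build an argv list for a locked-down `docker run` (illustrative defaults)."""
    return [
        "docker", "run", "--rm",
        "--cap-drop=ALL",             # drop every Linux capability
        "--user=%d:%d" % (uid, gid),  # do not run as root inside the container
        "--read-only",                # mount the container's root filesystem read-only
        "--memory=%s" % memory,       # cap memory usage
        "--network=none",             # no network access unless explicitly granted
        image,
    ] + list(command)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Docker.
    print(" ".join(hardened_run_args("ubuntu", ["id"])))
```

The list form is meant to be passed to something like `subprocess.run` without a shell, which avoids a second layer of quoting problems.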

Yes, this has been my experience as well. You either add your user to the docker group (which makes it equivalent to a root account) or run everything Docker-related with sudo. Your best bet is to run Docker in its own virtual machine, which is somewhat ironic.

With 1.10 you can just enable User namespaces, which allows for root in a container to map to a non-privileged user outside the container, that way it's a one-time (per instance) change.
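To make the mapping concrete, here is a small Python sketch of the arithmetic behind user namespace remapping: a subordinate UID range in the `name:start:count` format of `/etc/subuid` maps container UID `n` to host UID `start + n`. The sample line `dockremap:100000:65536` is an illustrative assumption, and the actual range on a given host may differ.

```python
def parse_subuid(line):
    """Parse one /etc/subuid-style line of the form name:start:count."""
    name, start, count = line.strip().split(":")
    return name, int(start), int(count)


def host_uid(container_uid, start, count):
    """Map a UID inside the user namespace to the UID seen on the host."""
    if not 0 <= container_uid < count:
        raise ValueError("uid %d is outside the mapped range" % container_uid)
    return start + container_uid


if __name__ == "__main__":
    # Illustrative entry; real values come from the host's /etc/subuid.
    _, start, count = parse_subuid("dockremap:100000:65536")
    print(host_uid(0, start, count))  # root in the container is an unprivileged UID on the host
```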

While this is a nice security boost once you're in the container, don't you still need to be root (docker group) in order to start the container? It honestly doesn't help me much if I have to give users root in order to start a container, even if they are wrapped inside the container.

Yep, at the moment, with raw docker engine, if a user has access to create containers, they're basically able to get root on the box, as the docker daemon runs as root and there isn't any authorization control by default, so it doesn't work well for that kind of scenario.

With that said there's a couple of ways this is getting addressed.

1) In 1.10, authorization plugins landed as a feature, so it's possible to add this functionality.

2) There are a number of services which run on top of Docker Engine (e.g. Docker Universal Control Plane) which have authentication/authorisation at that level.

This is what sudo is for. Give each user access to run their containers (and only their containers) as a member of the docker group in sudoers.

This doesn't work when users may need to run arbitrary (or user-defined) containers. You can only sudo so much... But, perhaps you could restrict it to "sudo docker run". I'll have to give that a try. But that would make it extraordinarily more difficult for a user to stop / rm / kill a container. Plus, it's not like docker has the concept of an "owner" for a container - does it?

Nonetheless, you shouldn't need to run anything as root in order to start a container that doesn't require extra privileges.

Please clarify what you mean by "access to run ... only their containers". How is that possible?

sudo arguably is another vector of attack

It's curious to me that of the various container options, only lxc/lxd seems to put in the effort required to get truly non-privileged containers working without issues.

Lxc/lxd goes far enough that you can boot a systemd based container, get a sanely partitioned view of /proc, etc. The effort is non-trivial...they had to create cgmanager, lxcfs, etc.

Rkt, runc, etc, are all working on it, but don't seem close to a real solution.

Docker also needs to address the hardcoding of the repo


As someone in the issue mentioned that "local registries don't have authentication", here's a shameless plug for a free software (self-hostable) replacement for the Docker Hub (which has authentication -- even LDAP support, and at SUSE we're working on adding token-based authentication and automated build systems that also rebuild derived images): https://github.com/SUSE/Portus. The solution of "just prepend the registry you use to your image names" is quite frustrating because it means that pushing and pulling is intertwined with the names of the things you're pushing and pulling. It should've been implemented as a flag in the CLI (or even a flag in the daemon for the "default registry").

this issue is incredible

Good presentation. One thing I'd mention is that they talk about the CIS security guide, but it's currently pretty out of date as it covers 1.6 and therefore misses a lot of Docker features like Content Trust, User Namespaces and Seccomp-BPF.

In general I'd say that Docker security is getting better, although I'm really looking forward to getting a better authentication/authorisation model on the docker engine as right now it's all or nothing, which is a pretty blunt instrument. Also this model causes problems when people do things like mount docker.sock inside a container for introspection as anyone compromising that container can take over the host. A better authorisation model would allow safer introspection...
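Until such an authorisation model exists, one stopgap is to check mount requests before they reach the daemon. The Python sketch below flags bind mounts whose host side is the root filesystem or the Docker socket. The function name `risky_mounts` and the list of paths are assumptions for illustration, and it only handles the simple `host:container[:options]` form.

```python
import posixpath

# Host paths that effectively hand a container control of the host (assumed list).
RISKY_SOURCES = {"/", "/var/run/docker.sock", "/run/docker.sock"}


def risky_mounts(volume_specs):
    """Return the specs whose host side is the root filesystem or the Docker socket."""
    flagged = []
    for spec in volume_specs:
        source = spec.split(":", 1)[0]
        if not source.startswith("/"):
            continue  # a named volume rather than a host path
        if posixpath.normpath(source) in RISKY_SOURCES:
            flagged.append(spec)
    return flagged
```

A check like this is a blunt filter rather than an authorisation model, but it catches the two mounts mentioned in this thread.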

Also worth noting as it's not in the presentation, one of the key Docker security features, User Namespaces, is not switched on by default, so you need to enable it on the daemon.

I will never get the Docker rant.

At the beginning I loved it, however the deeper I dug the more I hated it. Actually the only thing I need would be immutable infrastructure and resource isolation. However, for most of these things you don't need Docker at all. You would need some sort of container that could be fully isolated with the kernel cgroups and somehow have a way to isolate the network, but Docker keeps growing into an extremely fat monolithic approach where everything is put in, and still it gets worse every release and the problems will never be fully addressed.

And on top of that, Docker Inc. tries to monetize that with software that other vendors could do as well. Docker Inc. should focus on Docker and not on everything surrounding it.

Docker should've been easy, but now, to make good use of it, your infrastructure will definitely not be as easy as you planned at first. The way you deploy software will change, but I doubt that it's Docker that will make the "global" change. Maybe it was Docker that gave a first impression of how it could look, but it's definitely not the end.

For your convenience here are the URLs from that broken document:

http://blog.bofh.it/debian/id_413
http://reventlov.com/advisories/using-the-docker-command-to-...
http://events.linuxfoundation.org/sites/events/files/slides/...
http://opensource.com/business/14/9/security-for-docker
https://zeltser.com/security-risks-and-benefits-of-docker-ap...
https://benchmarks.cisecurity.org/tools2/docker/CIS_Docker_1...
https://forums.grsecurity.net/viewtopic.php?f=7&t=2522
http://xebia.github.io/cd-with-docker/#/
http://www.schibsted.pl/2015/06/how-we-used-docker-to-deploy...
http://itrevolution.com/the-three-ways-principles-underpinni...

Most of these will already be in your bookmarks if you follow Docker development.

Why is this talk interesting? Is there a video that shows things that happened that are not reflected in the PowerPoint PDF? What was new?

BTW the PowerPoint document wins the "most annoying document of the week" award. We have better ways to publish URLs in 2016 than PowerPoint. Please use reveal or anything else next time, thanks!

Yet another reason to favor unikernels instead.

And why would those have no security holes?

Let me see:

* MirageOS runs on Xen. If you break into Xen, you own all unikernels.

* OSv runs on Linux. If you break into Linux, you own all unikernels.

The fact that Xen is used is just a matter of convenience; nothing prevents MirageOS from running on bare metal if the developers decide to spend the effort writing device drivers.

However they rather focus in other more critical areas, for the time being.

One has to minimize the exposure of code that can be subject to malicious manipulation. I am not sure that a restricted Linux container with tight filters on the syscalls it can use is unconditionally worse than a unikernel on Xen.

Unikernels can run bare metal, the fact that some of them use Xen is just a matter of saving development hours writing device drivers.

The reason they run on a hypervisor is because people need multi-tenancy. Nobody is going to dedicate a whole machine for a single microservice.

Who says it is a micro-service?

Or do you see things like DNS resolution, databases, file servers and so many other possibilities as plain micro-services?

Companies, even in 2016, do dedicate whole machines to those services.

In that case a bug in the unikernel allows an attacker to take over the whole machine, with no option to contain the damage through extra layers of protection.

With the major difference that unikernels are mostly written in memory-safe languages with a very thin exploit surface, when compared with the usual set of C-based OSes which containers are based on.

We're going to need to apply the blockchain to this problem eventually. Someone call me when we're ready.

Does Docker still store credentials in cleartext?

We're working on fixing that in two ways:

1. Implementing oauth: https://github.com/docker/distribution/pull/1418

2. Using credential helpers: https://github.com/docker/docker/pull/20107
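For context on why this matters, here is a Python sketch showing that a base64-encoded `auth` entry is effectively cleartext. It assumes the client config has an `auths` map whose entries carry a base64 `user:password` string, which is the layout Docker clients have used for `~/.docker/config.json` when no credential helper is configured. The registry name and credentials below are made up.

```python
import base64
import json

# Made-up sample in the assumed shape of ~/.docker/config.json.
SAMPLE_CONFIG = json.dumps({
    "auths": {
        "registry.example.com": {
            "auth": base64.b64encode(b"alice:s3cret").decode("ascii"),
        }
    }
})


def recover_credentials(config_text):
    """Return {registry: (user, password)} decoded from base64 `auth` fields."""
    config = json.loads(config_text)
    result = {}
    for registry, entry in config.get("auths", {}).items():
        decoded = base64.b64decode(entry["auth"]).decode("utf-8")
        user, _, password = decoded.partition(":")
        result[registry] = (user, password)
    return result
```

Anyone who can read the file can run the equivalent of `recover_credentials`, which is why moving secrets into an OS keychain via credential helpers is an improvement.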

We changed the url from https://www.insinuator.net/2016/03/docker-devops-security/, which points to this and doesn't seem to have any content of its own.
