

Linux Container Security - akerl_
http://mjg59.dreamwidth.org/33170.html

======
legulere
It simply shows that Unix predates modern requirements for operating
systems. There's no unified privilege hierarchy with capabilities. The only
vertical isolation Unix offers is userspace vs. kernelspace and root vs. other
users, and for horizontal isolation it's different users (which are abused, as
daemons don't really belong to a user) and processes.

Because of modern requirements, features that already exist get abused
(horizontal isolation of daemons through different users) and new features get
tacked onto existing ones. This leads to very complicated code paths that are
prone to privilege-escalation bugs. Users already offer horizontal isolation
similar to containers sitting next to each other. But instead of making users
more general, they are half-emulated in containers.

The way taken to fix this is by adding even more complexity, so I guess it
will take a long time until we reach a point where Linux containers are
reliable.

~~~
taeric
I'm curious what the main downside to using multiple VMs for this is. Seems
like a simpler system, all told. Sure, they are bigger parts, but they are
much more cleanly defined in how they interact with each other.

That is, finding ways to isolate applications on one machine from each other,
while sharing the same resources, just flat out sounds difficult. That two
applications can have different /etc/hosts files, for example, just isn't
that easy to grok.

That two machines could have this is easy.

~~~
legulere
Well, you have to emulate almost all of the hardware in software when doing
virtual machines. This costs a lot of performance (less so if you do
paravirtualisation).

Virtual machine software is essentially an operating system that offers
emulated native hardware as an interface. You don't have to show both
applications the same file; you can mount different filesystems for them for
/etc/, for instance. Or maybe the applications shouldn't read /etc/hosts
directly.
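The idea of showing different applications different filesystems at the same path can be pictured as a per-namespace mount table. The sketch below is a toy model with hypothetical names (`MountNamespace`, `/srv/app-a/etc`), not the real kernel mechanism, which is mount namespaces (`CLONE_NEWNS`):

```python
# Toy model of per-namespace mount tables: each "namespace" maps
# mount points to a backing directory, so the same path can resolve
# to different content for different processes. Illustrative only --
# in a real kernel this is done with mount namespaces (CLONE_NEWNS).

class MountNamespace:
    def __init__(self):
        self.mounts = {}  # mount point -> backing source directory

    def mount(self, source, target):
        self.mounts[target] = source

    def resolve(self, path):
        # Find the longest mount point that is a prefix of the path,
        # then rewrite the path to point at its backing source.
        best = ""
        for target in self.mounts:
            if path.startswith(target) and len(target) > len(best):
                best = target
        if not best:
            return path
        return self.mounts[best] + path[len(best):]

# Two applications, two views of /etc (paths are made up for the demo):
app_a = MountNamespace()
app_b = MountNamespace()
app_a.mount("/srv/app-a/etc", "/etc")
app_b.mount("/srv/app-b/etc", "/etc")

print(app_a.resolve("/etc/hosts"))  # /srv/app-a/etc/hosts
print(app_b.resolve("/etc/hosts"))  # /srv/app-b/etc/hosts
```

Both processes open "/etc/hosts", but each one's lookup lands on a different backing file, which is the effect legulere is describing.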

~~~
taeric
So, /etc/hosts is just a silly example. One that happens to be easy to
consider. You want "backend" for one service to be a different machine than
"backend" for another. (Again, simple example, not necessarily indicative of
how things should be.)

But, this really doesn't answer my question. Sure, emulating a full machine
can be expensive. That is typically why you do it with a very powerful machine
in the first place. If the losses are such that they are causing problems, you
can always go to real hardware, even.

More to the point, this implies that containers are somehow "free." I am
skeptical that could be the case. You are adding a whole new layer onto the
operating system to possibly present multiple copies of files or only subsets.
Simply put, there can be no free lunches. The price is paid somewhere.

~~~
laumars
Containers don't require a whole new layer, since you're effectively using the
kernel as a hypervisor. Very simplistically put, containers are a little bit
like chroot processes, so a guest would have its own /etc directory separate
from the host's (you don't need to build containers this way if you don't wish
to, but this would be the standard type of setup).

With LXC, the guest processes are separated via cgroups, which is the same way
systemd separates processes on the same host during its init.
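Very roughly, a cgroup is a named group of PIDs with resource limits attached; both systemd and LXC work by placing processes into such groups. The sketch below is a toy model (the class and limit names are made up), not the real interface, which is the cgroup filesystem, e.g. writing a PID into `/sys/fs/cgroup/<group>/cgroup.procs`:

```python
# Toy model of cgroup placement: each PID belongs to exactly one
# group per hierarchy, and a limit set on a group applies to every
# PID in it. Illustrative only; real cgroups live in /sys/fs/cgroup.

class CgroupTree:
    def __init__(self):
        self.groups = {}  # group name -> set of PIDs
        self.limits = {}  # group name -> {resource: limit}

    def create(self, name, **limits):
        self.groups[name] = set()
        self.limits[name] = limits

    def attach(self, name, pid):
        # Moving a PID into a group removes it from its old group.
        for pids in self.groups.values():
            pids.discard(pid)
        self.groups[name].add(pid)

    def limit_for(self, pid, resource):
        for name, pids in self.groups.items():
            if pid in pids:
                return self.limits[name].get(resource)
        return None

tree = CgroupTree()
tree.create("guest01", memory_bytes=512 * 1024 * 1024)
tree.create("guest02", memory_bytes=256 * 1024 * 1024)
tree.attach("guest01", 1234)
print(tree.limit_for(1234, "memory_bytes"))  # 536870912
```

This is the accounting-and-limits half of containers; the visibility half (separate /etc, separate PID views) comes from namespaces, not cgroups.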

~~~
taeric
For some reason, I just can't get past the thought that this overhead is more
than just a little.

And, again, my main concern is more of a cognitive one. It is fairly hard to
"pull back the curtains" and think about a system that has different /etc
files for each application/container on it. (Heck, the implication seems to be
that it would be possible for malware to somewhat easily hide stuff from the
other users on the machine.)

I mean, I realize it can be done. And pretty soon it would become the more
standard way to do things. Just seems a larger mental shift to me than you are
making it sound.

~~~
laumars
Just out of interest, have you ever played around with _chroot_? E.g. for SFTP,
or perhaps you've installed Arch Linux lately (chrooting is the standard way to
install Arch Linux these days).

Having a different /etc doesn't require a hypervisor; you're just telling the
kernel that a PID's "/" is actually "/usr/local/containers/guest01/". Linux/UNIX
has been doing this stuff for years and years.
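This can be pictured as plain path translation: after a chroot, every absolute path the process opens resolves beneath the new root. The sketch below only models the resulting view (the helper name is made up; the real mechanism is the `chroot(2)` syscall, which requires root):

```python
import os.path

def chrooted_path(new_root, path):
    # Mimic what chroot does to path resolution: an absolute path
    # inside the "jail" maps to new_root + path on the host side.
    # This is only a model of the view; the real remapping happens
    # in the kernel via the chroot(2) syscall.
    return os.path.normpath(new_root + "/" + path.lstrip("/"))

# The guest thinks it is reading /etc/hosts; the host sees this file:
print(chrooted_path("/usr/local/containers/guest01", "/etc/hosts"))
# /usr/local/containers/guest01/etc/hosts
```

So the guest's view is consistent from the inside, while from the host everything is still visible as ordinary files under the container's root directory, which also answers the "global /" question: the host's "/" is that global view.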

~~~
taeric
I've seen it done. Never done it.

My main concern would still be the cognitive overhead of my looking at it. Or,
is there a way to see a "global" / according to all processes? (That make
sense?)

------
Alupis
This post illustrates what a lot of Docker naysayers have been saying for a
long while. Docker does not provide the security an awful lot of users believe
it does.

Docker was about application portability first and foremost. Then, after a
while, they decided they had better bolt on some security. Unfortunately it's
not quite there. Running multi-tenant VMs on a single host is still a far more
secure way to go than multi-tenant containers on a single host. If you are
running applications on a single-tenant host, then Docker doesn't really
provide much for trusted apps other than portability.

~~~
zobzu
Yep. LXC etc. too, since they all use the same technology. The thing is, this
layer of security never matters much to people. Then the world ends the day
they've been compromised, sadly, and only then do they start caring.

Security is one of these "not going to happen to me" things. Then of course
eventually it does.

------
rlpb
"If you're using KVM, ensure that you're using sVirt (either selinux or
apparmor backed) in order to restrict qemu's privileges."

What's wrong with qemu's support for AppArmor as available in Ubuntu? What
does sVirt do that this doesn't do?

~~~
eeZi
sVirt has an AppArmor backend as well, so it's the same thing. The SELinux
implementation appears to be more polished, though.

------
batbomb
Chris Kemp mentioned that Google runs one VM per machine and all of the
containers in that VM for security reasons. Any googlers out there that can
expand on this?

------
throwaway1979
So keeping containers from different users on the same host is a bad idea
(think VPS setting)? Why? I thought unprivileged containers in the newer
versions of Docker are pretty secure. Is there an issue giving root access in
this case?

------
Alupis
The post title probably should have said "Docker Security" because then it
would have stayed on the front page longer.

------
eeZi
Proxmox has no support for sVirt. All qemu processes run as root, too.

