> I think in the future there will be a big shift off of runc (docker) as the k8...

raesene9 · on Dec 21, 2018

I noticed that Kata have started integrating Firecracker, which will be interesting and should help their performance and security stories going forward.

I'd agree though that Kata (or other VM-based containerization solutions) won't completely replace runc-based solutions.

One of the things I like about standard Linux containers is the ease with which you can remove part of the isolation without turning it all off or on. Being able to easily do `--net=host` or add a capability is very handy in some circumstances.

Also, the security story definitely isn't as clear as VMs > containers. Every isolation layer has had breakouts in the last year: VMs, gVisor, Linux containers.

swozey · on Dec 21, 2018

My last line was bait to get some low level container talk going on in here. Glad it worked ;p

What problems do you see/think arise from kata pretending to be a container?

cyphar · on Dec 21, 2018

Dammit, I got baited.

> What problems do you see/think arise from kata pretending to be a container?

There are a few.

One of the most obvious is that anything that requires fd passing simply cannot work, because file descriptors can't be transferred through the hypervisor (obviously). This means that certain kinds of console handling are completely off the table. In fact this was a pretty big argument between us and the Hyper.sh folks about 2 years ago now: in runc we added this whole `--console-socket` semantic to allow for container-originated PTYs to be passed around, and you cannot do that with VM-based runtimes without some pretty awful emulation. But it turns out that most layers above us now just have high-level PTY operations like resizing (which I think is uglier and less flexible, but that's just my personal opinion).
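
To make the fd-passing point concrete, here is a minimal sketch of handing a PTY master to another party over a Unix socket using `SCM_RIGHTS`. This is not runc's actual code; the function name and the use of a `socketpair` in place of a real console socket are assumptions for illustration, and it needs Python 3.9+ for `socket.send_fds`. The key thing is that the kernel duplicates the open file into the receiver, which only works when both sides share a kernel.

```python
import os
import socket


def pass_pty_master():
    """Open a PTY and pass its master fd to a peer over a Unix socket."""
    # The socketpair stands in for the console socket; in practice the two
    # ends would live in different processes on the same host kernel.
    runtime_end, caller_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    master_fd, slave_fd = os.openpty()
    received_fd = -1
    try:
        # Send the master fd as SCM_RIGHTS ancillary data.
        socket.send_fds(runtime_end, [b"pty"], [master_fd])
        # The receiver gets a new fd number for the same open PTY master.
        msg, fds, _flags, _addr = socket.recv_fds(caller_end, 1024, 1)
        received_fd = fds[0]
        # Output written on the slave side is readable via the received fd.
        os.write(slave_fd, b"hello\n")
        data = os.read(received_fd, 1024)
        return msg, data
    finally:
        for fd in (master_fd, slave_fd, received_fd):
            if fd >= 0:
                os.close(fd)
        runtime_end.close()
        caller_end.close()
```

There is no equivalent of that `send_fds` call across a hypervisor boundary, so a VM-based runtime has to proxy the PTY's bytes instead of handing over the descriptor itself.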

Another one is that runtime hooks (such as LXC or OCI hooks) are now a bit more difficult to use. There's nothing stopping you from doing CNI with Kata, but it's one of those things where either the hook knows that it's working with a VM (which requires hook work) or the hook is tricked into thinking it's dealing with a container (which requires lots of forwarding work, or running the hook in the VM). I'm really not sure how Kata handles this problem -- but the last time I spoke to the Kata folks the answer was "well, we're OCI compliant" which isn't really an answer IMHO (they also cannot be OCI compliant, because OCI compliance testing still doesn't exist -- but that's a different topic). I imagine their point was "we copy runc", which is unfortunately what most people think when they say "OCI compliance".
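
As a rough illustration of why hooks tend to assume a plain container: the OCI runtime spec passes the container state to a hook as JSON on stdin, including a `pid` field, and a typical network hook uses that PID to locate the namespace to configure. The sketch below shows only that lookup; the function name and the sample values are hypothetical, and it says nothing about what Kata actually reports.

```python
import json


def netns_path_from_state(state_json: str) -> str:
    """Derive the network namespace path a hook would open from OCI state."""
    state = json.loads(state_json)
    # With runc, "pid" is a host PID whose namespaces are the container's.
    pid = int(state["pid"])
    return f"/proc/{pid}/ns/net"


# Hypothetical state document, shaped like what a hook reads from stdin.
example_state = json.dumps({
    "ociVersion": "1.0.0",
    "id": "demo",
    "status": "created",
    "pid": 4242,
    "bundle": "/tmp/demo",
})
```

With a VM-based runtime, the workload's processes live inside the guest, so a host-side path like this does not point at the workload's own namespace unless something forwards or translates it.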

There was a recent issue a colleague of mine (who works on Kata) mentioned, which is that currently `docker top` operates by getting the list of PIDs from the runtime and then fetching /proc information about them. Obviously this won't work with Kata, because the container's processes live inside the guest and have no entries in the host's /proc, so handling it will require some pretty big changes to containerd and Docker (though I would argue this would be a good thing overall -- the current way people handle getting host PIDs for container processes is quite dodgy). There is currently some kernel work being done by Christian Brauner to add a new concept called procfds, and all of this work will be completely useless for Kata (even though it'll fix many PID races that exist).
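
The "list of PIDs, then /proc" pattern looks roughly like the sketch below. This is not Docker's or containerd's implementation; `describe_pids` and its `proc_root` parameter are made up for illustration. It also shows where the race comes from: a PID can exit (or be reused) between the runtime reporting it and the lookup.

```python
import os


def describe_pids(pids, proc_root="/proc"):
    """Return {pid: command name} for each PID still visible under proc_root."""
    info = {}
    for pid in pids:
        comm_path = os.path.join(proc_root, str(pid), "comm")
        try:
            with open(comm_path) as f:
                info[pid] = f.read().strip()
        except OSError:
            # The process exited, or was never visible from this host.
            continue
    return info
```

For a runc container the runtime's PIDs are host PIDs, so `describe_pids(pids)` finds them. For PIDs that only exist inside a guest kernel, the host's /proc either has no such entry or has an unrelated process with the same number.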

But as I said, Kata is quite an interesting project (the work done for the agent is quite interesting) and it fulfills a very important need -- people are still worried about container security, and adding a lightweight hypervisor will allay those fears.

swozey · on Dec 22, 2018

Really appreciate the response. I'm pretty new to the more low level aspects of containers.

Can you recommend any great blogs? Jess's is the only one I'm aware of that sometimes dives into multitenancy/container stuff.