
Container Runtime Interface (CRI) in Kubernetes - philips
http://blog.kubernetes.io/2016/12/container-runtime-interface-cri-in-kubernetes.html
======
vidarh
I can't get over what a smell it is to me that an RPC interface is required
in order to interface with tools that may or may not have any good reason to
be running on an ongoing basis.

I'm sure there are cases where you interact with the containers frequently
enough that keeping a daemon around, rather than spawning a process each time,
is actually a worthwhile optimization, but more and more of these
containerisation systems are becoming an unholy mess of daemons that need to
run in order to run and manage containers that need not depend on anything but
the host init/systemd.

E.g. one of the really appealing things about rkt for me is the simplicity -
depending on the level of isolation, everything runs either directly under
systemd, or under an individual isolator like systemd-nspawn.

I disliked this tendency towards a herd of daemons intensely when Docker
carried on as it had started and used HTTP for volume/network plugins, and I
dislike it just as much now.

It's as if someone sat down and thought long and hard about how to add more
complexity and more "fun" failure modes.

~~~
wmf
It's the microservice philosophy: Why use a function call or fork/exec when
you can use RPC? (At least CRI is binary RPC instead of JSON over HTTP/1.)

Also, Go doesn't dlopen AFAIK.

~~~
vidarh
It gets better. Take a look at rktlet, a CRI implementation for rkt (EDIT: I
originally mistakenly wrote Docker). Specifically the runtime [1], which ends
up shelling out to the "rkt" binary.

So you end up running a new daemon that communicates with Kubernetes via gRPC,
that then spawns rkt anyway. So you get to RPC _and_ fork/exec.

I'm sure that ends up "optimized away" at some point by e.g. having a rkt CRI
implementation that just links in the relevant rkt code. But I'm left
wondering why we need this complexity in the first place.

[1] [https://github.com/kubernetes-incubator/rktlet/tree/master/r...](https://github.com/kubernetes-incubator/rktlet/tree/master/rktlet/runtime)
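
The pattern being criticised can be sketched in a few lines of Go (function
and subcommand names invented for illustration; this is not rktlet's actual
code or API): a handler accepts a request over RPC, then just builds an argv
and forks/execs the rkt CLI anyway.

```go
package main

import (
	"fmt"
	"os/exec"
)

// startArgs builds the argv a CRI-style handler would exec; the
// subcommand shown here is hypothetical, for illustration only.
func startArgs(podID string) []string {
	return []string{"rkt", "start", podID}
}

// StartContainer is roughly what the body of a gRPC handler in this
// style looks like: the RPC server receives the call, then shells out.
func StartContainer(podID string) error {
	args := startArgs(podID)
	return exec.Command(args[0], args[1:]...).Run()
}

func main() {
	fmt.Println(startArgs("deadbeef"))
}
```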

~~~
chrissnell
I don't get your argument. If I'm reading this correctly, you're arguing that
system calls and/or a call to a shared library function are cleaner than RPC
to another process?

The overhead of RPC in an application like this is tiny and the cost of an
additional process on 2016 equipment is non-existent.

~~~
vidarh
It's more things that can fail and that now need monitoring. (EDIT:) And in
the specific case of rktlet you _still_ end up forking/execing anyway.

The overhead isn't necessarily a big deal (and can easily go the other way -
if the request frequency is high enough, it's cheaper to keep the process
around), but it does also potentially add up.

------
philips
This was posted last week but here is rkt's roadmap around Kubernetes's CRI
and use of OCI's runc: [https://coreos.com/blog/rkt-and-kubernetes.html](https://coreos.com/blog/rkt-and-kubernetes.html)

------
cyphar
Currently quite a few people from the OCI community (including myself) are
working on implementing a CRI-compliant runtime[1] around runC and the various
OCI specifications, as well as the containers/image and containers/storage
projects. A lot of cool design went into ocid, which means that it doesn't
require a daemon to be constantly running.

[1]: [https://github.com/kubernetes-incubator/cri-o](https://github.com/kubernetes-incubator/cri-o)

~~~
vidarh
Do you have any more specific pointers regarding using it without a daemon?
The examples seem to start with starting a daemon, unless I misunderstand
something. If it doesn't need that, then that's a big plus in my book.

Though, I'm getting more and more disillusioned in general with where these
specs are heading - the complexity seems to be skyrocketing for sometimes very
little benefit.

Not necessarily specific to Kubernetes and/or OCI - Docker is a prime
offender.

E.g. a typical example: the highly coupled nature of many of the networking
alternatives, where routing, fabric, and IP allocation all get muddled
together. Given that there are well-developed, stable, well-tested,
independent and orthogonal alternatives for tunnelling and route propagation,
there's a serious level of Not Invented Here syndrome at work in many of the
container projects. I'm sure _some_ people need all the complexity, but I'm
getting more and more tempted to ditch many of the higher-level tools in
favour of composing smaller, simpler tools.

(Incidentally I'll make one prediction: one good thing likely to come from CRI
is that I suspect it will lead to a new array of Kubernetes "replacements"
from simpler composable tools; the APIs don't look all that bad - I just don't
like the RPC dependency)

~~~
cyphar
> Do you have any more specific pointers regarding using it without a daemon?
> As the examples seem to start with starting a daemon, unless I misunderstand
> something

At the moment, the RPC requirement means that you need to have a process that
can accept RPC requests (a "daemon" if you like). However, unlike Docker (and
containerd), ocid's lifetime is not tied to the lifetime of its containers --
which is one of the main downsides of Docker/containerd IMO. So in principle
you could have ocid set up to only start up when kubelet is telling it to do
anything. The real benefit of the design behind ocid is that _in the future_
we could switch to a fork-exec model with the kubelet and it would still work.

For example, Kubernetes is currently adding a requirement for runtimes to
include a "kpod" binary that can do container and image operations even if the
kubelet is down. My hope is that eventually they will just make the kubelet
shell out to this binary, so that the CRI is defined through some sort of
"here are the CLI flags you need to accept" interface.

> the complexity seems to be skyrocketing for sometimes very little benefit.

I wouldn't call the current CRI "complicated"; it's just that the gRPC
requirement IMO is a bit too much. However, since ocid and rkt both don't
require daemons (well, rkt requires systemd, but that's a given), I would hope
that they'll reconsider their method of communicating with container runtimes.

