
After Docker: Unikernels and Immutable Infrastructure - axelfontaine
https://medium.com/@darrenrush/after-docker-unikernels-and-immutable-infrastructure-93d5a91c849e
======
mato
Rump kernels ([http://rumpkernel.org/](http://rumpkernel.org/)) are
essentially Unikernels for POSIX. I'm currently working on running unmodified
application stacks (base firmware/"not-OS" + rump kernel + userland
application) on Xen and later, KVM and bare metal.

Will be giving a talk on this at
[http://operatingsystems.io/](http://operatingsystems.io/) in London on
November 25th.

~~~
m0th87
How'd you get started with rump kernels? I tried poking around and there's not
much documentation.

~~~
vertex-four
I found that the original research paper is currently the best (only?) proper
introduction to rump kernels.

~~~
justincormack
It is being updated
[https://github.com/rumpkernel/book](https://github.com/rumpkernel/book)

------
nl
I can't believe no one has mentioned ZeroVM[1] yet. The project page is
unfortunately non-descriptive, but Wikipedia has some important details[2]:

 _The ZRT[ZeroVM RunTime] also replaces C date and time functions such as time
to give program a fixed and deterministic environment. With fixed inputs,
every execution is guaranteed to give the same result. Even non-functional
programs become deterministic in this restricted environment. This makes
programs easier to debug since their behavior is fixed._

I've had a play with it - there's a version of python that runs on it, and
it's surprisingly usable.

[1] [http://www.zerovm.org/](http://www.zerovm.org/)

[2]
[https://en.wikipedia.org/wiki/ZeroVM](https://en.wikipedia.org/wiki/ZeroVM)

~~~
mercurial
This reminds me a bit of how (AFAIK) Nix packages are patched to avoid
timestamps and generate deterministic builds when needed.
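The trick generalizes beyond Nix: normalizing timestamps, file order, and ownership is usually enough to make a simple archive build reproducible. A minimal sketch, assuming GNU tar (these flags don't exist in BSD tar):

```shell
# Build the same tree twice and show the archives are bit-identical.
mkdir -p demo && echo 'hello' > demo/file.txt

# Normalize everything that usually varies between builds:
# file ordering, mtimes, and ownership.
tar --sort=name --mtime='@0' --owner=0 --group=0 --numeric-owner \
    -cf build1.tar demo
tar --sort=name --mtime='@0' --owner=0 --group=0 --numeric-owner \
    -cf build2.tar demo

sha256sum build1.tar build2.tar  # identical checksums
```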

------
Cbeck527
> "It remains virtually impossible to create a Ruby or Python web server
> virtual machine image that DOESN’T include build tools (gcc), ssh, and
> multiple latent shell executables."

At work, our tech team has found an interesting way around this for our Python
app. We build out the virtualenv in the docker container, and then run our
ansible-based deployments inside the same container. With that, our virtual
environments are rsync'd to the app servers so we can avoid installing
developer tools.
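A rough sketch of that pattern (image, paths, and hostnames are all made up for illustration): the virtualenv is assembled in a throwaway build container, and only the finished environment ever reaches the app servers.

```dockerfile
# Hypothetical build container: compilers live here, not on the
# app servers.
FROM python:2.7
WORKDIR /src
COPY requirements.txt .
RUN pip install virtualenv \
 && virtualenv /opt/myapp/venv \
 && /opt/myapp/venv/bin/pip install -r requirements.txt
# The deployment step then runs inside this same container, e.g.:
#   rsync -az /opt/myapp/venv/ deploy@app-server:/opt/myapp/venv/
# so the target hosts never need gcc or other developer tools.
```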

~~~
drdaeman
I'm ditching virtualenvs and going with good old Debian packaging and a
private APT repository.

For VMs/containers that already run a single application, except for some
weird edge cases, there's really no point in having a virtual environment in a
virtual environment.

I've had initial success with a few simpler projects, and am now looking into
transitioning more complex ones. Not sure whether it'll go without any hassle,
but it seems worth trying. At worst, I'd just waste my time and return to
virtualenvs.

~~~
grosskur
You could also try virtualenvs inside Debian packages:

[https://github.com/spotify/dh-virtualenv](https://github.com/spotify/dh-virtualenv)

One reason to keep virtualenvs is that the system Python (VM or container)
includes extra Python packages that your app may or may not need. If you use a
virtualenv, you exclude these system-installed packages and guarantee a clean
starting point.
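With dh-virtualenv the packaging side stays tiny; per the project's docs, a minimal `debian/rules` looks roughly like this:

```make
#!/usr/bin/make -f
# debian/rules -- delegate everything to debhelper, with the
# dh-virtualenv addon building the whole venv into the .deb.
%:
	dh $@ --with python-virtualenv
```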

------
vezzy-fnord
Forgive me if I'm totally clueless, but isn't the idea of the unikernel
basically a throwback to the earliest, pre-OS days of computing when all
programs needed routines to initialize the base hardware resources before they
could perform tasks?

The idea of the unikernel and the libOS in general where applications can be
linked with their bare minimum OS runtime and packaged is certainly nifty, but
it's kind of funny that people are being so hyped over what sounds like a more
advanced form of what was regularly done in mainframes 60 years ago.

~~~
pjmlp
Because over 60 years the OS has accumulated layers of abstraction that are
useless when it's used as a server OS alongside programming languages that
come with batteries included.

If the programming language has a rich ecosystem with a runtime that is
already taking care of hardware abstractions and scheduling, why replicate it
a few times in lower layers?

How many schedulers or device drivers are needed to serve network requests?

~~~
krakensden
It's only useless until you discover that you're reimplementing it from
scratch on a full-time basis.

~~~
pjmlp
I assume you program in pure C without libc, using only syscalls for your OS
of choice.

------
wmf
The article is correct that there aren't yet best practices about building
minimal and secure Docker images, but it seems like switching to unikernels
would be much more work. Unikernels also suffer from the lack of VM resizing
and minimum VM sizes being too big in many cases.

~~~
mwcampbell
Agreed. The real problem is with the current tools for building images.

As a proof of concept, several months ago I built a few tiny Docker images
using musl libc and no package manager. But I had to deviate from the normal
image build process to do so.

[http://mwcampbell.us/blog/tiny-docker-musl-images.html](http://mwcampbell.us/blog/tiny-docker-musl-images.html)

~~~
agumonkey
Ha, I wanted to do something similar to build tiny VirtualBox machines for
network labs. I too ended up reading about Sabotage Linux. Great link!

~~~
shykes
I am a big fan of these nano-images and really want to support them better in
Docker's builder. See "nested builds" and "image squashing". Basically, you
should be able to define a Dockerfile for the _build_ environment, then within
that define how to produce the final image. Then you need squashing to avoid
carrying the build layers.

I personally played with Aboriginal Linux, but I believe it's the same idea :)
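The nested-build-plus-squash flow described above is essentially the shape Docker later shipped as multi-stage builds; a minimal sketch (file names illustrative):

```dockerfile
# Fat build stage: compilers, sources, intermediate layers.
FROM gcc AS build
COPY hello.c /src/hello.c
RUN gcc -static -o /hello /src/hello.c

# Final image: starts from an empty base, so none of the build
# layers are carried along -- no gcc, no shell, no sources.
FROM scratch
COPY --from=build /hello /hello
CMD ["/hello"]
```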

------
andrewstuart2
I think the biggest problem with Unikernels that I haven't seen addressed is
hypervisor inefficiency. Emulating any part of a kernel or multiple kernels
will just be slower. You have 20 guests on your hosts? That's 20 probably-
overlapping (uni)kernels running.

Sure you could optimize the heck out of the hypervisor, but now you've created
a kernel. And your applications run on that kernel.

With containers, you have one kernel that won't have to instantiate 20 drivers
for the disk subsystem. It can be smarter because it knows more about the
loads. It's what kernels have been built to do since day 0.

My main concern with unikernels is that eventually the hypervisor will need to
be a kernel to be any more optimized. I just worry it will become something of
a self-defeating concept.

~~~
m0th87
AWS, Rackspace et al. use hypervisors already, so we're trending toward 4
layers:

    hypervisor -> monolithic kernel -> containers -> application

Unikernels collapse it to:

    hypervisor -> unikernel/application

It's certainly more elegant, although I'm skeptical of the purported
performance gains as well, simply because so many optimizations have been
thrown into traditional kernels.

~~~
jdf
If you are using a more minimal hypervisor (see my other comment on the
parent), then there do seem to be some measurable gains. I've seen a few
papers in this style:

[https://www.usenix.org/conference/osdi14/technical-sessions/presentation/peter](https://www.usenix.org/conference/osdi14/technical-sessions/presentation/peter)

 _We describe the hardware and software changes needed to take advantage of
this new abstraction, and we illustrate its power by showing improvements of
2-5× in latency and 9× in throughput for a popular persistent NoSQL store
relative to a well-tuned Linux implementation._

That said, a simple application like memcached might be currently latency-
bound by the kernel's network stack, but a more complex application that reads
from disk (even SSD) won't be.

------
peterwwillis
There's so much wrong with this post, I don't know where to begin. The idea
that security is based on removing files, and not the holistic auditing and
hardening of a system. The idea that you can't remove a compiler from a system
image before packaging and deploying it (seriously? you don't know how to
remove a file before you run a packager?). The idea that you have to ship an
entire image to update a couple files. The idea that the entire design of an
operating system (which is designed to make it easier for programs to run and
interact without having to be tailor-made) is obsolete. It's like this guy has
never held an operations job in his life, yet he's telling people how systems
should be managed.

~~~
jfb
If you have an idempotent service, why do you need the accumulated cruft of 40
years of bad ideas to provide it?

~~~
SwellJoe
So, UNIX boils down to "the accumulated cruft of 40 years of bad ideas"?

I'm not disagreeing on the principle of immutable servers, but that's a pretty
bold claim.

I don't see getting rid of the "accumulated cruft" as being a particularly
interesting reason for exploring the unikernel or immutable server concept.
The benefit is in building for scale and redundancy. The lighter your image,
the easier it becomes to replicate it and maintain it, generally speaking.

Further, there is an argument to be made that building your own cruft into
your system is counter-productive compared to letting the operating system
cruft handle it. The Linux developers are probably better at it than you or
me; unless we understand our usage patterns dramatically better and the
options for optimizing them, it may be best to trust the OS "cruft" to do the
right thing.

In short, I'm not really taking a side on this one. I believe there is
interesting research to be done, and probably useful outcomes to be found, in
this direction. But, why dismiss 40 years of operating system refinement by
some of the brightest minds in the world as "accumulated cruft...of bad
ideas"?

~~~
jfb
Because that cruft has grown up in service of a timesharing model of
computation that no longer holds. Why does my phone have a root user? Why do I
have to escalate privileges to bind to ports below 1024? Why am I context
switching at all?

I think it's a mistake to conflate path dependence and correctness.

------
jessaustin
_Actually, CoreOS is a platform designed for orchestration and management of
Docker instances. It’s not intended to be used as a base image for Docker
containers. Specifically, CoreOS is based on Gentoo Linux, but the recommended
base Docker image is Debian._

Well I learned something...

~~~
jsmthrowaway
You learned wrong. CoreOS is a derivative of Chromium OS, which uses Portage
as its package manager. Simply typing "emerge" into a window does not Gentoo
make, which is a bummer, because it shows a lack of research on the part of
the author (that really came out in other areas, too).

It's also largely irrelevant, because CoreOS should in practice be read-only
once you boot it, and you're not meant to be concerned with the details of how
it's put together (its usage of Portage being one such detail).

------
fideloper
The ideas here are all very interesting, but I don't think we need to even
discuss issues with Docker to find the idea of an immutable server
interesting.

I also don't find the "problems" with Docker overly problematic.

* The use of many images is probably(?) not an issue? Do people just use "any old base image" without further thought?

* An image of a few hundred megabytes isn't small, but it's not terribly large either.

Lastly, I see people's confusion over what CoreOS is as beside the point. What
it is becomes pretty apparent after taking a look at coreos.com.

Overall I really like the idea of an immutable server though!

------
zobzu
One thing to be careful about with these models is that you're moving the
burden of maintaining libraries to the application code.

So instead of updating packages and what not, you rely on the developer to
update the libraries and reship.

Sure, it's not far from today's model if the dev has to ship the whole
container, but it also makes things even harder. How do you know whether you
have lib x or z when it's sometimes just dropped among a bunch of files? I
think it's much worse: it hides the problem and makes it difficult to detect.

I'm suspecting kernels will slowly converge toward plan9-like functionality
instead. It makes more sense. It's faster, more efficient, simpler.

The main barrier so far has been portability - but with more and more apps
being written in very portable languages (Python, Go, C#, ...) it's becoming
easier.

~~~
amirmc
For Mirage OS, all libs are released as packages in OPAM [0], so it's really
straightforward to find out which versions you're using (and
manage/update/remove them). In fact, we just did a set of releases recently
[1]. I'm not sure how it is for the other systems.

[0] [http://opam.ocaml.org](http://opam.ocaml.org)

[1] [https://github.com/ocaml/opam-repository/pull/3028](https://github.com/ocaml/opam-repository/pull/3028)

------
jfb
I've always been a proponent of autarchy in these matters, and so am intrigued
by the idea of the unikernel. I'm also going to spend some time with OCaml, so
Mirage looks like something that might turn out to be really fun.

~~~
amirmc
Please do get involved! If you're learning OCaml, then
[http://realworldocaml.org](http://realworldocaml.org) is a great resource.
When you start trying out Mirage, join the mailing list and let us know how
you get on. Finally, to see where we can take this tech, have a look at
[http://nymote.org](http://nymote.org)

------
marknadal
Immutable systems are on the rise, and I'm glad that it is getting more
developer mind share. I literally just wrote an article about this a couple
days ago ([https://medium.com/@marknadal/rise-of-the-immutable-operating-system-f7945b1da993](https://medium.com/@marknadal/rise-of-the-immutable-operating-system-f7945b1da993))
and since then I've seen like 3 other posts about it on top of HN.

------
jacques_chester
> _Heroku is a great example of immutable servers in action: every change to
> your application requires a ‘git push’ to overwrite the existing version._

Um, no it's not.

Heroku's buildpacks code caches a hell of a lot of stuff on each execution
agent. Still more code has to recognise and try to repair various broken
states. It's mutability, through and through.

------
cyberneticcook
How far away are Unikernels from being adopted in production? Does it make
sense to consider them for a project starting today?

------
preillyme
So what do you think of things like Apache Mesos in this regard?

