
Gone Full Unikernel - deferpanic
https://deferpanic.com/blog/gone-full-unikernel/
======
jrv
It seems this article gleefully admits many of the downsides of unikernels
mentioned in [https://www.joyent.com/blog/unikernels-are-unfit-for-production](https://www.joyent.com/blog/unikernels-are-unfit-for-production),
while being very brief and naive about the upsides (mainly the very contested
security argument).

I admittedly haven't studied the whole unikernel space yet, but intuitively
they do seem unfit for production unless we spend a decade rebuilding tooling
(debuggers, process diagnostics tools, etc.). And even then, other downsides
apply, as laid out in the Joyent article.

Happy to change my mind over time if it proves to be the other way around, but
for now I'm very skeptical.

~~~
brendangregg
I wouldn't say that unikernels were entirely undebuggable. I spent a few hours
hacking and came up with a proof of concept dom0 profiler, and learned some
debugging benefits: one symbol table for the entire binary, one place to turn
on frame pointers for everything, etc.

[http://www.brendangregg.com/blog/2016-01-27/unikernel-profiling-from-dom0.html](http://www.brendangregg.com/blog/2016-01-27/unikernel-profiling-from-dom0.html)

~~~
rjsw
There is nothing stopping people from creating a unikernel for a dynamic
language that also includes the development tools.

A Lisp Machine on Xen would be one model.

~~~
moosingin3space
I feel like Erlang-based unikernels are an extremely compelling alternative to
traditional UNIX deployments. Immutable systems with safe hot swap and
excellent debugging tools like `observer` and `debugger`.

~~~
reycharles
[http://erlangonxen.org/](http://erlangonxen.org/)

------
thekemkid
The good news is, for the average user, unikernels are pretty much guaranteed
to be mainstream and streamlined at some stage in the future, thanks to Docker
acquiring Unikernel Systems and the awesome work that the likes of deferpanic
are doing. :D

[1]: [https://blog.docker.com/2016/01/unikernel/](https://blog.docker.com/2016/01/unikernel/)

[2]: [http://www.linuxjournal.com/content/unikernels-docker-and-why-you-should-care](http://www.linuxjournal.com/content/unikernels-docker-and-why-you-should-care)

~~~
coroutines
I'm with you - Docker has done an excellent job of showing application
developers the minimum they need for a runtime.

Side-thought: Can Android be dockerized?

~~~
superuser2
> Can Android be dockerized?

Why? Aren't Android apps already sufficiently sandboxed?

~~~
johncolanduoni
They're pretty sandboxed, but every Android app gets a JVM, no exceptions
allowed (even Android's pure C++ API is just a wrapper around JNI calls). And
in the case of the old Dalvik VM, it's a _terrible_ JVM.

~~~
sangnoir
> They're pretty sandboxed, but every Android app gets a JVM, no exceptions
> allowed (even Android's pure C++ API is just a wrapper around JNI calls).
> And in the case of the old Dalvik VM, it's a terrible JVM.

Wasn't Dalvik deprecated & replaced by ART (Android Runtime)? ART compiles
apps AoT - IIRC, upon installation pre-Marshmallow, and while the device is
charging/idle from Marshmallow onward.

~~~
johncolanduoni
Yep, hence me specifying the old one. ART cleans up a lot of things: no more
8-16KB main thread stacks, better code gen via AoT, fully precise collection,
moving GC. It has some really ingenious features as well, like switching to a
compacting GC with better throughput and space efficiency when an app goes
into the background and latency is irrelevant. The switch to ART as default was
actually in 5.0 (Lollipop).

It's _almost_ enough to make me stop cursing Android developers and their
children's children. Unfortunately version updates for non-Google devices are
rare and everybody is still stuck supporting the majority of devices that are
pre-Lollipop. Also it didn't make the APIs any better >:(

------
kev009
I can't help but think this is just a severe reaction to the tire fire that
most Linux distros are, especially RedHat/CentOS and Ubuntu. BSD or Alpine
Linux get in the way a lot less, are much more customizable and compact, and
have a smaller attack surface, while still catering to production operations
where you can run shells, profiling, logging, etc. inside the execution
environment.

~~~
bcg1
There's more to unikernels than just being another virtualization technology.
Most of the conversation on HN (as well as the content of this article) seems
centered around unikernels vs. containers vs. a traditional OS in a VM, etc.
But that conversation sort of misses the point.

Rather than just being a competing virtualization solution, "Unikernels" are
really about eschewing the existing OS paradigm altogether. For example, the
Mirage folks seem to have asked themselves about how they could create a
"safe" OS and landed on the solution that they could achieve that by trusting
the OCaml compiler and runtime for "safety" and so wrote a brand new OS from
scratch in OCaml. That is a very different thing than a reaction to the "tire
fire" that you are describing!

Similarly, for rump kernels Antti Kantee (with the help of others I presume)
took several years to re-architect the NetBSD kernel to minimize the inter-
dependency of different components of the kernel through the creation of a
"hypercall" interface[0] and a carefully thought out separation of
concerns.[1] One of the end results of this architecture is that you can run
NetBSD drivers outside of the NetBSD kernel "just" by implementing the
rumpkernel hypercall interface. Want to write your own OS (in a "safe"
language like OCaml, for instance) but don't want to write a TCP stack or a
filesystem implementation or a USB driver from scratch? Rump kernels could be
a solution to that problem. Again, that is a very different problem space
than the "tire fire".

[0]: [http://netbsd.gw.com/cgi-bin/man-cgi?rumpuser++NetBSD-current](http://netbsd.gw.com/cgi-bin/man-cgi?rumpuser++NetBSD-current)

[1]: [http://lib.tkk.fi/Diss/2012/isbn9789526049175/isbn9789526049175.pdf](http://lib.tkk.fi/Diss/2012/isbn9789526049175/isbn9789526049175.pdf)

~~~
pjmlp
The Mirage folks didn't discover anything new in that regard.

It is how the safe OSes from Burroughs, DEC, Xerox PARC, ETHZ and many others
used to work.

Those OSes were written in strongly typed systems programming languages, the
whole stack.

Part of their security was based on the language type system.

~~~
bcg1
Fair enough! I definitely wasn't trying to suggest that is a completely new
feature, nor that it is the only feature of Mirage OS ... really was just
trying to make the point that there is more to the unikernel story than just
figuring out whether it is better or worse for running my buggy crud
application than some other virtualization technique. Thank you for the info
though, I will have to read up on those things you mentioned.

~~~
pjmlp
You can find some links to those systems here

[https://news.ycombinator.com/item?id=11856479](https://news.ycombinator.com/item?id=11856479)

------
ryao
> Try 5, 10, 20 megabyte small.

OpenWRT/LEDE will happily work on a system with 4MB of storage:

[https://www.lede-project.org](https://www.lede-project.org)

QNX had a graphical environment, a web browser, a web server, a text editor,
an image viewer, various games, a package manager, etcetera on a 1.44MB
floppy:

[http://m.youtube.com/watch?v=K_VlI6IBEJ0](http://m.youtube.com/watch?v=K_VlI6IBEJ0)

Less is definitely more, but you do not need a unikernel to achieve such
sizes, and you lose observability by going with a unikernel. If something goes
wrong with your application, such as it becoming non-responsive, you need to
attach gdb or get a core dump like a kernel developer would to understand what
happened. Your production systems are likely EC2 instances that lack such
functionality, which means debugging is much harder with a unikernel than it
would have been with a monolithic, hybrid or micro kernel. Furthermore, disk
space is cheap, which is why few opt for OpenWRT/LEDE over more full-featured
Linux distributions in datacenters.

If you want the experience of a single address space and little more code than
your application, you could run FreeDOS, which also fits on a floppy and has a
code base that is mature. There are guides for doing this online. Here is one
for doing a web server:

[http://www.instructables.com/id/Retro-dos-web-server/?ALLSTEPS](http://www.instructables.com/id/Retro-dos-web-server/?ALLSTEPS)

The world moved away from such designs because the observability and stability
were awful. We might have "safe" languages now that improve the stability of
the application, but those could just run as a process in an environment where
proper debugging can be done when something goes wrong. The few percentage
points of performance that you get from eliminating the mechanisms that enable
you to understand what went wrong do not justify discarding them.

Also, you lose the advantage of a shared memory pool with unikernels, which
are generally intended to run in VMs. Partitioning memory in VMs causes
internal fragmentation, which artificially lowers the density of applications
per machine. It also can lower block IO efficiency from double caching between
the host and guest. Hardware virtualization is a useful technology, but it is
an inefficiency that we need to eliminate with containers, rather than one
that we should embrace with unikernels.

~~~
aseipp
> The world moved away from such designs because the observability and
> stability were awful. We might have "safe" languages now that improve the
> stability of the application, but those could just run as a process in an
> environment where proper debugging can be done when something goes wrong.
> The few percentage points of performance that you get from eliminating the
> mechanisms that enable you to understand what went wrong do not justify
> discarding them.

I think there is a lot of design space here that is unexplored, so I'm not so
sure it is as clear cut as you say. You might like this talk given earlier
this year at Compose Conference, entitled "Composing Network Operating
Systems" (I was a speaker at Compose and I <3'd this talk a lot.)

[https://www.youtube.com/watch?v=uXt4a_46qZ0](https://www.youtube.com/watch?v=uXt4a_46qZ0)

It is not just about performance in all cases. Mirage is the particular case
in question here - but with OCaml functors, it becomes possible to compose
components of a kernel in truly modular ways. I was continuously surprised by
this talk.

Something that needs to write to a block device only needs an abstract functor
describing the interface to the device and some primitives to read or write to
it. There are many implementations of this interface.

This seems quite obvious but it allows powerful ideas. For example, in the
talk, you can see examples similar to this. But what if you want to test your
kernel? You can simply substitute in a _new_ implementation that has failure
modes. You can write a block device that randomly ignores every 100th write;
one that has unexpectedly high latencies, one that outright hangs on all I/O
requests... Doing this kind of fault injection today is possible, but it's
conceptually a lot nicer if it's just a "mock" at the block-device level
that you can easily control and extend. You can do all kinds of other things:
have your system timer freak out, skew in random ways, or run in reverse.

You mention observability, but when your systems are truly modular, this is
nothing more than an obvious follow-up. An example in the talk is interposing
"Irmin", which is a distributed, Git-esque storage system, into the network
subsystem of your kernel driver. Any time interface properties of the device
change, you write entries into the append-only Irmin log, which are then
distributed. Irmin also has a Git interface for read-only analysis.

The short story is that, in the talk, there is a live example where you can
query a Git repository to get a read-only changelog of all the networking
state in your application. In the particular example, I believe it was
interposed into the ARP implementation; every ARP packet and ARP response was
logged into Irmin, and every system change propagated as a result was logged
too. This gives you really amazing levels of persistent analysis and
introspection at very low developer cost. It's true you could do something
similar in a system today, but this is truly modular, works for any
application built to use a particular functorised API, etc. It's a programming
interface! And in theory there's also nothing stopping conventional tools like
`ocamldebug` from working either.

Mirage also abstracts over the true underlying runtime, so that same device
API can be swapped for one that just talks to a POSIX-compliant filesystem,
you get an ELF executable, etc. This all works on normal systems too;
unikernels are merely a different deployment target (for the most part).

This is not to say that Unikernels are the future or we should abandon our
stable systems we have now (I definitely won't be doing so anytime in the
future). But I found myself very surprised at what was quite easily possible,
and I wouldn't so quickly write it all off as a fad. Maybe for Huge
Enterprise, yeah... Operations experience separate from development is very
useful, and a lot easier to find. But there's definitely some really cool uses
for these things, especially in helping rethink and improve on some previous
ideas.

> Also, you lose the advantage of a shared memory pool with unikernels, which
> run in VMs. Partitioning memory in VMs causes internal fragmentation, which
> lowers densities. It also can cause double caching between the host and
> guest, which lowers block IO efficiency.

This is a good point that's often overlooked. But I don't look to Unikernels
for outright performance, either; to me, they are more interesting for
researching newer operating system designs with a much better ROI than
previous methods. I'm glad to see that happening, personally. And I might even
take a performance _loss_ if it meant winning some other guarantees in return.

~~~
ryao
Unikernel proponents seem to assume that hardware virtualization will forever
be the abstraction of cloud computing. However, hardware virtualization is the
wrong abstraction, which is why the industry is beginning to adopt containers.
There is no reason why you cannot run a unikernel in UNIX binary mode inside a
container, but then it is really just a different way of developing a userland
process, even if you could still call it a unikernel. You would get the
advantages of modularity that you specified, and you would have all of the
debuggability and observability that regular applications have today with the
tools that we have today. However, that is rather different than the role in
which unikernels are intended to operate.

I guess my point is that the unikernel is always going to be the equivalent of
a userland process. The question is whether your bare-metal kernel is going to
be a traditional one or a hypervisor. They have definite performance
advantages over a traditional kernel when your bare metal kernel is a
hypervisor, but I believe that is the wrong abstraction when I consider
overhead.

------
haberman
Tell me if I'm missing something, but the premise of Unikernels seems to be
that a ring-0 x86 hardware environment is the perfect fit for a universal
container/host interface.

Or to put it more charitably, since cloud compute services are based around
booting VM images based on this model, we'll just go with it instead of trying
to use an abstraction that is actually designed for this.

Correct me if I'm wrong, but it seems to me that the first thing any unikernel
is going to do when it boots is switch the (virtualized) CPU out of x86 Real
Mode (which all x86 machines boot into for legacy reasons, but virtually no
one has needed since circa 1995) into protected mode.

Is it just me or does this seem a little bit crazy?

~~~
en4bz
We've gone full circle. Originally there was shared hosting, which hosted your
app in ring 3 alongside other users on the same physical machine, with the
host OS running in ring 0. Then we got fancy virtualization hardware where the
hypervisor ran in ring -1, your VM ran in ring 0 and your apps ran in ring 3.
But that's a lot of indirection, so unikernels move your app into ring 0. So
now we're basically back at shared hosting, where your app runs one level
higher than the host OS, except that now your app also bundles a partial OS
and has weak debugging tools. It does have better isolation than shared
hosting, though, so that's a plus.

But containers are basically the same thing but with better debug support and
a more familiar OS environment. Problem is containers need to be deployed on
metal to be effective, not VMs. Unfortunately not many providers do this yet.

So yeah it is all kinda crazy.

~~~
mwcampbell
> Unfortunately not many providers do this yet.

Samsung just acquired Joyent, which provides multi-tenant container hosting on
bare metal via Illumos and LX-branded zones. So to me, the acquisition further
validates that approach.

------
jacques_chester
The EMC reference is probably to UniK[0], which provides a Docker-compatible
API and can be used under the supervision of Kubernetes or Cloud Foundry.

I was happy to see UniK, because I've long seen unikernels as an
"architecture-buster" for Cloud Foundry. Yet in practice the shift to Diego
made it much less painful than expected.

[0] [https://www.cloudfoundry.org/unik-build-run-unikernels-with-ease/](https://www.cloudfoundry.org/unik-build-run-unikernels-with-ease/)

Disclaimer: I work for Pivotal, the majority contributor of engineers to Cloud
Foundry. EMC is a major shareholder in Pivotal.

------
4ad
I never understood why these people didn't bother at all with upstreaming
their Go port. I even offered to be the person responsible for the review. I
would gladly do that.

I'm very happy about more Go ports, I've done the arm64 and the Solaris ports,
and now I am finishing the sparc64 port, but ports need to live upstream.

~~~
deferpanic
If you want to help out we would be more than grateful for this - there's a
lot of work involved.

------
thinkMOAR
"Nowadays very few developers interact with actual real hardware - it’s been
completely abstracted away."

And I think this is a very big problem: for them it's some magic pizza box,
and they complain when things don't perform the way they expect them to.

Good read though, thanks!

------
bogomipz
Can someone explain why these rump kernels cannot be run on AWS if deferpanic
has Xen as a target? AWS is Xen-based. I understand that there currently isn't
a target for Docker, so that takes Google Cloud out of the equation. The
following two statements seem to be contradictory:

Can I use Google Cloud or AWS? You could - although you won’t write much more
than a toy app - not until things are changed.

DeferPanic offers managed services for both public and private cloud
environments and its platform targets KVM, Xen, bare metal, and ESX.

Perhaps that falls under the "unfit" statement about these cloud providers,
but that seems pretty nebulous for such a technical discussion.

~~~
jsolson
> I understand that there currently isn't a target for Docker so that takes
> Google Cloud out of the equation.

Google Compute Engine runs a lot more than just Docker images. It allows you
to run arbitrary x86 VMs, just like EC2. It is not based on Xen, however (it
is a combination of KVM and a non-QEMU VMM about which I wish I could say a
whole lot more, but I don't think we're prepared to do that just now).

~~~
bogomipz
Right, I believe that GCE is Docker but it runs in a KVM container; I'm not
sure why they do that, however. Maybe someone else can explain? My guess would
be that it's a hedge on container security.

However, what they hand you is a Docker container, I believe, so provided
there's a Docker target for whatever rump kernel, it should _theoretically_
just work. No?

It sounds like you work on GCE?

~~~
jsolson
GCE is just plain old VMs, no Docker involved.

There's also GKE which is managed Kubernetes complete with Docker containers.

(And yes, I work on the virtual machine monitor backing GCE)

------
edwintorok
What languages do you support for unikernels? Is it just Go, or do you plan to
support others in the future (e.g. OCaml for MirageOS)?

~~~
deferpanic
So this is kind of a two-part question:

1) What languages right now - Go, PHP, JavaScript, Ruby through the rumpkernel
project - rumpkernel.org.

2) We are implementing support for user-supplied images, which will let you
run mostly anything in the very near future. We plan to be completely
agnostic.

~~~
viraptor
Not from deferpanic, but there's also rumprun for Rust:
[https://gandro.github.io/2015/09/27/rust-on-rumprun/](https://gandro.github.io/2015/09/27/rust-on-rumprun/)
(it's even easier these days - it's integrated as a cargo target)

------
wangchow
This whole movement seems strange to me. It's like they're statically linking
the entire OS to run a single app. Why not ditch the OS completely? I'd say
this is taking the whole container concept a bit far, but who knows what will
come next!

~~~
zenlikethat
If I'm not mistaken, the whole idea _is_ to ditch the OS completely: to avoid
a fully functioning kernel with lots of juicy device drivers to exploit, code
intended to work on systems your app will never need to worry about running
on, and layers upon layers of abstraction ready and waiting to be exploited
(e.g. shells).

One reason MirageOS uses OCaml, for instance, is for its memory safety
properties. A truly staggering amount of vulnerabilities (e.g. Heartbleed) are
due to abusing unintended ways of accessing memory in programs which face the
public Internet. Since we've proven over and over again at this point that we
can't reliably write safe C code, there's a reason folks are interested in
eliminating as much of it as possible, all the way down to the hypervisor
level. Since so many devices will be Internet-connected soon, having a way to
write apps without even a possibility of "Oops" bugs like this is even more
critical.

~~~
wangchow
Interesting. I'll have to do a bit more research on this.

------
Ericson2314
Traditional VMs suck. Containers a la Joyent and unikernels are really two
points on the same spectrum of distribution of complexity between host and
client. Eventually they will converge, because neither POSIX nor (virtual)
hardware is an interface designed for this purpose.

The one-language library-centric ideology of e.g. MirageOS especially is
really orthogonal to questions of provisioning data centers. It is truly a
huge step in the right direction, and before the unikernel-container
convergence, could be applied to the host OS of a container rig.

------
josh_carterPDX
Docker is investing a lot of time and energy in unikernels as well. It makes a
lot of sense, but I agree with a lot of the comments here: it might be a year
or so before we start to see faster adoption.

------
voodootrucker
For debugging you can use qemu, and for "hypervisor level orchestration" you
can use CloudFormation with AWS.

Maybe I'm missing something?

------
justaaron
awesome sauce! Why did you go (sorry :D) with Golang?

(Garbage collection would seem to be an issue for a bare-metal language, or?)

vs Rust...

~~~
moosingin3space
Not OP, but deferpanic is pretty invested in the Go ecosystem according to
their past HN posts.

