My First Unikernel (roscidus.com)
267 points by the_eradicator on July 30, 2014 | 51 comments



If it helps to add clarity to what this is about, Xen still requires a host (dom0) operating system, usually Linux but also NetBSD and OpenSolaris. Mirage OS (the basis for building your own unikernel) allows building applications as domU kernels. This is advantageous because it minimizes context-switching and allows all of the code in a Xen VM (domU) to be written in a safer language than C.

So the dom0 OS (Linux) still determines which devices are exposed to the unikernel. The unikernel implementation is simple relative to a full "bare metal" OS because it only needs to support the Xen interfaces to block devices, the network, etc., and does not have to deal with physical disk drives, Ethernet controllers, and the like.

If you are running your own hardware, you're probably better off using something like FreeBSD + Jails or Linux + LXC (Docker). The Unikernel approach is more appropriate for situations where you want to deploy applications on Xen-based cloud infrastructure (Amazon EC2, Rackspace Cloud Servers) and do not want to waste resources or increase security risk by running a full OS. The physical servers at Amazon and Rackspace are already running a full dom0 OS (probably Linux), so running another full OS on top of that just to run your app is inefficient.
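To give a flavour of what "only supporting the Xen interfaces" looks like from the application side, here is a minimal hello-world sketch, assuming the Mirage 1.x-era API (the V1_LWT module types) that was current when this thread was written; the device implementation behind CONSOLE is chosen at build time:

    (* unikernel.ml: the app is a functor over an abstract console;
       under Xen this is the Xen console driver, under Unix it wraps
       stdout. *)
    open Lwt

    module Main (C : V1_LWT.CONSOLE) = struct
      let start c = C.log_s c "hello from domU"
    end

    (* config.ml: pull in only the devices the app actually needs;
       `mirage configure --xen && make` then emits a bootable image. *)
    open Mirage

    let main = foreign "Unikernel.Main" (console @-> job)
    let () = register "hello" [ main $ default_console ]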


Even if you have your own hardware, there are scenarios where you might want stronger isolation, increased density, and/or heterogeneous deployments, all of which are achievable with unikernels running on Xen.


How would unikernels running on Xen give you better density than containers (which are really just namespaced processes) running on a shared Unix kernel on bare metal? Wouldn't the Xen-based approach have more overhead?


Process switching isn't supported yet: unikernels on Xen don't currently switch address spaces at all (CR3 on x86). We're thinking about how to increase density by supporting process switching on Xen, which isn't hard, but it needs to be done carefully.

Xen also supports memory sharing among VMs when running in a hardware (HVM) container, and Hwanju Kim recently added support for "PVH" mode to MiniOS, which unlocks this functionality in Mirage. These are early days for this work, but fun ones, since we aren't bound by the compatibility constraints of Linux containers yet have access to the same hardware resources.


Sure, but then I'm limited to OCaml and have to run in kernel mode when developing apps.

Occasionally a good tradeoff, but a significant one.


There are other approaches that yield Unikernels [1]. I don't quite understand your point about kernel mode and why that would be a problem under this scenario. The code is your own and you only use the OS components you require. The ASPLOS article has more detail on the approach and trade-offs [2].

[1] see table 2 at http://queue.acm.org/detail.cfm?id=2566628

[2] http://anil.recoil.org/papers/2013-asplos-mirage.pdf


Debugging kernel-mode code is painful in general, even on a VM.


Perhaps I'm wrong but I feel you've either (1) missed the point or (2) haven't actually looked at the references. Everything is written in a high-level language and the VM is an artefact produced as a result of the process (you don't debug on the VM, you debug the code that generated it).


Sure, and then when production errors happen I'm left having to debug a VM process rather than a user-mode process :)

Enough can go wrong in the packaging and distribution that it's a real risk.

At the end of the unikernel generation pipeline I get a little Xen VM spat out that I can run. How do I debug this guy? It's running on a hypervisor, which is essentially a different environment. I'd love to be able to take code written in one environment and be 100% sure it'd work in another, but that ain't going to happen soon.


Even though Amazon EC2 is Xen-based (internally), you have to run a lightweight Linux distro to use Mirage on AWS EC2 (at least for now), as I learned from the MirageOS posting a few weeks ago.


(mirage dev here)

Not true -- you can run Mirage Xen kernels directly as EC2 guest VMs by using their AKI feature to boot a custom kernel. Instead of a custom Linux kernel, just boot a unikernel and don't bother with user space :-)

Go to http://github.com/mirage/mirage and check out scripts/ec2.sh for a shell script that automates this (it needs to be more user-friendly; patches welcome).


So I have been following this stuff with a lot of interest, but I am not very familiar with OCaml and have only managed to look at the Mirage docs in some limited downtime.

If OCaml can only handle one core and does not do SMP, how does it do in the cloud? Does this mean Mirage unikernels use only one processor/core on Amazon and elsewhere?


Two answers. First: scaling through structured distributed-systems abstractions is a key aim in Mirage. We deliberately want each VM to be predictable and single-vCPU, scaling out via multiple VMs. It's far more efficient to scale via lots of small VMs that are scheduled independently than via a few big multi-vCPU VMs (which have a lot of overhead, since CPU synchronisation also needs to be virtualised). See our ASPLOS 2013 paper for some simple workloads on this topic.

We are building this library to simplify this style of distributed programming: http://openmirage.org/blog/introducing-irmin

Other answer: we will be talking about our multicore OCaml implementation at ICFP in September. I still don't want to see it in Mirage, though :-)


> It's far more efficient to scale via lots of small VMs that are scheduled independently than via a few big multi-vCPU VMs (which have a lot of overhead, since CPU synchronisation also needs to be virtualised).

Are you assuming that vCPUs are scheduled or pinned? It wasn't clear from skimming the paper. I would agree that two levels of scheduling are bad, but if that's the problem, pinning should fix it.

The situation may change if you eliminate the hypervisor. My intuition is that a single multi-threaded process with work stealing will be faster than separate processes or VMs, due to better load balancing (see PX vs. 1X dynos) and faster inter-thread communication (if your app does any communication).


If you eliminate the hypervisor, you don't have vCPUs at all, so I'm not sure how this is a useful comparison.

The question of IPC performance is an interesting one. We've been building up a database of open-source results that shows wildly diverging numbers across architectures. Surf through http://fable.io for it, or read this work in progress:

http://anil.recoil.org/papers/drafts/2012-usenix-ipc-draft1....

http://anil.recoil.org/talks/fosdem-io-2012.pdf (FOSDEM slides)

The TL;DR of these numbers is that it's very hard to make firm performance hypotheses about IPC across architectures, NUMA and hypervisors.


More on point:

Am I understanding correctly that message passing will most likely be the paradigm you encourage?

And, less on point:

> Other answer: we will be talking about our multicore OCaml implementation at ICFP in September. I still don't want to see it in Mirage, though :-)

Wait what what what? So it is ready for prime time? (Note, I did not say production.) I remember reading about it on Jane Street, but thought there were still proposals and lots of theoretical and planning wrinkles to iron out before an implementation became reality.

I do not want to derail this informative post of yours any further, but do you have a link to more on that topic?

And I remember seeing mention of Irmin and not getting how it fit into your work, avsm. Now, seeing it as an answer to my question, it makes sense.

And thanks for your work on Real World OCaml. I have decided to get back into programming, and some of my co-workers were taking CS classes; the systems programming course (ironically at Harvard Extension School; I can't afford it, but good for them) used OCaml. This reminded me to come take a look, and I find your book, and the OCaml resurgence, fascinating.

Maybe one day I will look at the Haxe compiler and go: "it is not so complicated, I get that now." Only in OCaml could I see people doing such crazy shit.


> ... do you have a link to more on that topic?

The following may be of interest [1]. There's been more progress since and we'll be talking about it at OCaml 2014 [2].

[1] http://www.cl.cam.ac.uk/~sd601/multicore.md

[2] http://ocaml.org/meetings/ocaml/2014/program.html


I doubt it. Here is another approach: http://erlangonxen.org/


> If OCaml can only handle one core and does not do SMP, how does it do in the cloud?

+1 to this question. Anyone experienced enough in OCaml to answer this?


One could argue that threading is not strictly necessary to utilize a multicore machine. A lot of applications (think web apps) can be implemented just fine as a single thread and then scaled up by increasing the number of processes (or the number of virtual machines in this case) rather than the number of threads.

A lot of popular web frameworks follow this paradigm and I would guess that most application code running "in the cloud" is in fact single threaded.
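To make the single-threaded style concrete, here is a small sketch using OCaml's Lwt library (the cooperative threading library Mirage builds on; this plain-Unix version assumes the lwt.unix package is installed):

    (* Two "requests" overlap in time on a single core, with no OS
       threads: the Lwt event loop interleaves them at the sleeps. *)
    open Lwt

    let handle name delay =
      Lwt_unix.sleep delay >>= fun () ->
      Lwt_io.printlf "%s finished after %.1fs" name delay

    let () =
      Lwt_main.run (Lwt.join [ handle "req-a" 0.2; handle "req-b" 0.1 ])

Scaling out then means running more copies of this process (or VM), not adding threads.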


You'd probably have to move to a CSP-style concurrency model rather than threads if you were to leverage multicore machines. Erlang doesn't allow individual processes to share memory with each other, but it is undoubtedly a great platform for building concurrent apps (depending on the use case). Something tells me the use cases one would use Mirage for are those more likely to suit Erlang-style concurrency semantics than highly multithreaded code using shared memory.


Yeah, I am familiar with the theory of the alternative methods in a general way (CSP, message passing between full or lightweight processes, green threads, etc.), but what is the story with OCaml and Mirage? I am curious because, if it is the same as stock OCaml, I wonder how much performance can be eked out of it without SMP.


This sounds like an awesome and fun project to hack on!

However, optimizing around context switches and task preemptions is something you would usually do only if your application is actually bound by IO/context switching, is extremely latency-sensitive, or if you are trying to squeeze the last bits of performance out of a machine. Why did you choose to build such a micro-optimized system in a garbage-collected language? Doesn't this defeat the purpose of the whole exercise?

I feel the need for more powerful types and built-in/standardized exception handling too, but since performance seems to be one of your major goals, wouldn't something like C++ be a better fit here? You'd get proper error handling and a good type system (with some tradeoffs) without a significant performance penalty.

On a side note, I agree that code written in a functional language with a strong type system tends to be easier to get right than bare C, but this doesn't imply that all low-level code is bug-ridden and unsafe. In fact, the Linux kernel is one of the most stable and reliable pieces of software I've had the pleasure of working with so far. Suggesting there is a problem with the Linux kernel because it contains "a large amount of C code in security-critical places" seems a bit dishonest.


Don't think of it as optimizing around context switches—think of it as just omitting what you don't need, and losing context-switches as a side benefit. A multi-user OS has a lot of stuff which might not be necessary for a single virtualized service (various security mechanisms, lots of file system niceties, running other services, &c), so writing a unikernel like this allows you to select exactly as much as you need for your particular service.

OCaml isn't all that much slower than C++. To use the Programming Language Shootout as a rough estimation[^1], it can even come close to matching C++ in certain programs, and is rarely more than three times as slow. And of course your type system will catch more errors and your resulting code will be much shorter (and, in my opinion at least, easier to understand).

Finally: the article didn't say "the Linux kernel"—it said "Ubuntu." The kernel itself might be secure and reliable, but a running Linux system is much, much more than just the kernel. And while C can be security-audited, many of the properties that are important to verify in a C program come entirely for free from something like OCaml—e.g., an arbitrary piece of C code might not segfault given certain input, but a given piece of OCaml code definitely won't. So maybe a running Linux system is "secure enough", but a unikernel like this will have a much smaller attack surface and stronger inherent security properties with basically no extra work.
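To make the memory-safety point concrete, here is a tiny sketch of what "definitely won't segfault" means in practice: an out-of-range access is checked and raises an exception you can handle, rather than reading arbitrary memory:

    (* Bounds-checked access: the bad index cannot corrupt memory or
       crash the process; it raises Invalid_argument instead. *)
    let () =
      let a = [| 1; 2; 3 |] in
      try Printf.printf "%d\n" a.(10)
      with Invalid_argument _ -> print_endline "caught: index out of bounds"

The equivalent C compiles happily and reads past the end of the array; what happens then is undefined.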

Disclaimer: I'm not the original author, I'm just speaking generally.

[^1]: http://benchmarksgame.alioth.debian.org/u64/benchmark.php?te...


> so writing a unikernel like this allows you to select exactly as much as you need for your particular service.

I agree this is a big upside of the author's approach. Fewer dependencies lead to fewer problems caused by external/upstream changes.

> OCaml isn't all that much slower than C++. To use the Programming Language Shootout as a rough estimation[^1], it can even come close to matching C++ in certain programs, and is rarely more than three times as slow.

I wasn't referring only to raw execution performance but also to GC pauses and GC overhead, which I think are the bigger issue. The benchmark you linked tests compute-bound workloads, so this doesn't really show up there. Anecdotal point: in most real-world apps I have worked on, the GC was a limiting factor.

> Finally: the article didn't say "the Linux kernel"—it said "Ubuntu." The kernel itself might be secure and reliable, but a running Linux system is much, much more than just the kernel.

That's the beauty of having a kernel though. If one of those userland processes is broken it won't affect the whole system.

> an arbitrary piece of C code might not segfault given certain input, but a given piece of OCaml code definitely won't.

This assumes that the OCaml compiler/interpreter and the hardware are free of bugs...


Regarding GC pauses, remember that you already have them if your application stack is written in Scala, Java, Go, OCaml, or Haskell. But you also have other manual memory management going on everywhere!

With Mirage, it's all amortised in one consistent, fast GC! To give you a sense of the malloc vs OCaml GC trade off, see http://anil.recoil.org/papers/2007-eurosys-melange.pdf

Malloc and free-list management is remarkably complex compared to a fast, simple GC. It would be interesting to build an OCaml runtime in Rust to experiment with these trade-offs in a more controlled fashion.
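If you want to see what the OCaml GC is actually costing a given program, the runtime exposes counters through the standard Gc module; a minimal sketch:

    (* Report how many minor/major collections this program has
       triggered so far, using the stock Gc module. *)
    let () =
      let s = Gc.stat () in
      Printf.printf "minor collections: %d\nmajor collections: %d\n"
        s.Gc.minor_collections s.Gc.major_collections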


> That's the beauty of having a kernel though. If one of those userland processes is broken it won't affect the whole system.

With a unikernel there is nothing else to affect. You are running a virtual machine anyway, so even if you kill the kernel you only take down yourself.

> This assumes that the OCaml compiler/interpreter and the hardware are free of bugs...

They are not free of bugs; for most purposes you can consider everything to have bugs in it. However, there is a big difference between a compiler/interpreter bug (which usually just makes your system run differently) and a service bug (which can lead to all kinds of problems).


Additionally, Linux wasn't as secure as it is now right out of the starting gate. It's a very effective and well-written piece of software, but there have been high-profile bugs and exploits. C code isn't impossible to make safe, but there is an argument to be made that it needs more attention to safety (and therefore developer time) than a language that makes certain types of bugs impossible. The space shuttle software was written in assembly language, but it damn sure had a ton of people working on it who valued safety (not necessarily security as we know it now) as one of the highest priorities.


OSv does something similar to Mirage, it seems, and is written in C++. It can also run different languages; I believe I saw Lua support, and they commercially target Java too.


> You'd get proper error handling and a good type system

Except you wouldn't, which is a major reason to not use C++.


I'm a little confused about the direction things are going in. I like high-level languages, and I like that the OS manages certain resources so I don't have to. This guy is writing directly to block devices from OCaml. Don't get me wrong, it's all pretty cool, but there is some kind of dissonance there I can't reconcile. Is Xen the new OS now?


Note, though, that he is indeed writing to a block device--which is much higher-level than writing to a disk.

Remember back in the 80s, when the BIOS was actually an effective hardware abstraction layer--giving you a defined interrupt to ask the BIOS to, say, write to a disk--and the OS was just for module loading and scheduling and policy-based security? (Not that DOS did either of the latter.)

Well, Xen isn't the new OS; instead, the domU is the new BIOS, and hypercalls are the new BIOS interrupts.

I really hope to see Linux redone (or another *nix created) in this "unikernel" style, where everything hardware-like or HAL-like is taken out, and instead things like filesystem drivers are implemented directly in terms of hypercalls.

I also hope to see UEFI reimplemented as a resident domU, such that a plain old desktop or notebook computer could treat its user OS as a container-image to be slung around, rather than having it "own" the hardware. UEFI actually already supports this mode of operation--allowing you to boot "UEFI applications" that keep UEFI around to provide BIOS-like functionality--but I don't know of a single OS that makes use of that, rather than overwriting the processor interrupt vectors and claiming all of physical memory for itself.


>I also hope to see UEFI reimplemented as a resident domU, such that a plain old desktop or notebook computer could treat its user OS as a container-image to be slung around, rather than having it "own" the hardware. UEFI actually already supports this mode of operation--allowing you to boot "UEFI applications" that keep UEFI around to provide BIOS-like functionality--but I don't know of a single OS that makes use of that, rather than overwriting the processor interrupt vectors and claiming all of physical memory for itself.

This is really the opposite of the direction things are going, especially on x86_64. PV is horrendously inefficient in 64-bit mode because of the removal of CPU rings 1 and 2.

http://wiki.xen.org/wiki/Virtualization_Spectrum#Problems_wi...

HVM allows PCI passthrough if your CPU and chipset support it, which means domU now has direct access to the hardware, with no dom0/qemu layer to get in the way and slow things down.


We have a feature issue tracking UEFI in Mirage, but no one's written the bootloader code yet: https://github.com/mirage/mirage/issues/187

Interestingly, it looks like the easiest way to support Azure...


Yes, basically. You let the host manage memory, disk, network, etc., and your program is a guest OS that doesn't need all the extraneous processes that usually go with a whole OS. No background services or cron jobs.

Edit: as enduser pointed out, in addition to saving resources, this means there is no context switching from kernel space to user space in your guest VM!


You let the host manage memory, disk, network, etc., and your program is a [process] that doesn't need all the extraneous processes that usually go with a whole OS. No background services or cron jobs.

So it's like Docker.


No, this is fundamentally different from docker.

Docker/cgroups/namespaces allow you to isolate multiple applications running in userspace on the same kernel to a very high degree.

This gets rid of isolation and multiprocessing altogether and runs a single application without any kernel at all (i.e. with just a very limited amount of support code linked in).

Put simply: the technology behind Docker improves isolation between processes, and this does the exact opposite.


I don't see how this "does the exact opposite" of what Docker does; both are very different approaches to the similar goal of having individual processes run in isolated ways. Docker does it by isolating individual running Linux processes under a single kernel, whereas unikernel-based systems do it by building those processes as lightweight programs that get compiled to their own kernel images, which are then run on a hypervisor to achieve isolation.


Yes, you are right: this is not a binary classification, and it really depends on the perspective. What I was trying to say is that, on some levels, the two approaches are almost antithetical.


> This guy is writing directly to block devices from OCaml. Don't get me wrong it's all pretty cool but there is some kind of dissonance there I can't reconcile. Is Xen the new OS now?

For some applications, removing this overhead might be interesting; e.g., databases tend to fight with the peculiarities of the host OS (scheduler, caching, disk access patterns).

The major upside of this approach, IMO, is the added robustness you get: deploying turns into a very deterministic procedure, and there are no underlying OS updates and whatnot to break your app.


See also this discussion of MirageOS from 2.5 months ago:

https://news.ycombinator.com/item?id=7726748


As someone who knows nothing about lower level OS type stuff, I found this article very easy to understand, interesting and well written.

Sounds like a lot of fun.

Thank you.


Mirage was just featured on FLOSS Weekly: http://twit.tv/show/floss-weekly/302


Since a unikernel still needs to run on a hypervisor, I believe the performance is worse than Docker's.

Am I right?


Interesting, although the old mantra of "hardware is cheap, developers are expensive" is still true. You could hire ten Perl/C/C++/JavaScript/PHP/etc. devs with ease for your project, but struggle to find one OCaml dev. And even then, your OCaml dev will need to know the Mirage library and Xen, and have good knowledge of way more stuff than your project scope.

That said, it's still really cool, but it's not something I'd use, and especially not in production.


There is nothing stopping people from implementing something similar in other languages.

The developer using this doesn't need to know anything about Xen; they just see a single-address-space system that runs their code.


For Haskell, there's HaLVM: https://github.com/GaloisInc/HaLVM

For Erlang, there's Erlang on Xen: http://try.erlangonxen.org/zerg

There are others, if you care to search for them.


I didn't say there was, but it's still complicating the task.

I'm not saying it isn't cool, but it's a very roundabout way of doing something, and as such it becomes more expensive in development time and skill required.


More roundabout than the current paradigm of duplicate Linux stacks and supporting services for DevOps just to get stuff deployed? That doesn't make sense to me.

Your argument is actually "It seems very new. Not enough people know/use it. Therefore, I won't use it." -- which is fine, but please don't mischaracterise it in terms of increased complexity or expense.


My argument is that you'll need better developers, who cost more money. Not only that, but you're making the project more complicated: if the project is to create X, then you also have to create Y first, and it's harder to debug. And you've also got the added overhead of running Xen.

All in all it's a nice trick, but it's not very practical, because if it were practical we'd all be using DOS for our VMs, since you get the bare metal and a little bit of an environment to bootstrap from.


> My argument is that you'll need better developers, who cost more money.

Frankly I wouldn't want a developer who's too dumb to learn OCaml anywhere near my production code. Is this thing new? Yes. Will developers take time (=your money) to get up to speed on it? Yes. But do you need "better" developers, long-term? I don't think so.

> Not only that, but you're making the project more complicated: if the project is to create X, then you also have to create Y first, and it's harder to debug.

Maybe a valid concern, but I remember very similar arguments from C++ programmers in the early days of the JVM. It turns out the JVM is rock-solid and nowadays has better debugging tools than those for C++. There's no reason that couldn't be true for this approach. Or, if it's easy to make a multi-target project that builds both a Linux binary and a unikernel image, then debugging would be no harder than it is for existing OCaml code (see the sketch at the end of this comment).

> And you've also got the added overhead of running Xen.

If you're already running Linux-on-Xen then this is reducing overhead. Even if you're not, it could still improve overall performance by reducing context switching, in the same way user-mode networking stacks do.

> it's a nice trick, but it's not very practical, because if it were practical we'd all be using DOS for our VMs

This sounds rather like "this can't be a good idea, because if it were we'd have it already." It's only in the last few years that Xen and the "cloud" approach have become so popular, so a lot of new ideas and approaches are still being found.
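On the multi-target point above: because Mirage code is written as functors over abstract device signatures, the same logic can plausibly be linked against a Unix-backed stub for ordinary debugging and against real drivers for the Xen image. A sketch under that assumption (the LOG signature and all names here are illustrative, not the actual Mirage API):

    (* Write the unikernel body against a narrow signature you define,
       then swap the backend: a stub for Unix debugging, a real device
       driver for the Xen build. Illustrative only. *)
    module type LOG = sig
      val log : string -> unit Lwt.t
    end

    module Logic (L : LOG) = struct
      let start () = L.log "same logic, either target"
    end

    (* Unix-side stub used while debugging as a normal process: *)
    module Stdout_log = struct
      let log msg = Lwt_io.printl msg
    end

    module Test = Logic (Stdout_log)
    let () = Lwt_main.run (Test.start ())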



