The key concept here is security.
Imagine you could compile your web application or website into an appliance containing nothing but the required network drivers and the minimal hardware drivers needed to execute. That is (as far as I can tell from a brief look) what Mirage does. With such an approach there's nothing for hackers to log in to - no shell to use, no other insecure cruft on the system potentially leaving holes. The attack surface of such a system is extremely low relative to putting fully powered, configurable, interactive operating systems onto the Internet.
Hackers are showing that almost everything is hackable without expensive, constant, expert attention to every aspect of your systems' security. Even then, zero-day exploits can compromise systems within seconds or minutes of an exploit becoming known.
This is not to say that such systems cannot be hacked, but rather to say that in such an approach the attack surface is dramatically lower than in current approaches to deployment.
Perhaps more likely to change how people think about web application development in the longer term is that the application-as-server model allows for "high-resolution server instances": a virtual machine that can be started in milliseconds, perhaps to service only one query, and then vanish again. Instead of renting an Amazon instance that is billed per hour and does nothing most of the time, your infrastructure becomes truly scalable, with nothing running at all until it's needed, and then precisely as much computing power used as is required to meet the inbound demand. No longer will you need to "start a web server" that chugs and rumbles into existence as all its various unneeded-at-most-times processes start. Instead, connect your application/server AMI to the Amazon spot instance marketplace.
On a more on-topic note, however: aside from the benefits of running any kernel, such as Windows, I wonder if there are any practical benefits of something like this versus something like CoreOS/Docker. I'm sure that since it's Xen you could run a Windows kernel alongside Linux, but aside from that, I don't know.
There are millions fewer lines of code involved in the deployed Xen unikernel, since there's no Linux userspace/kernel divide involved any more.
You can also use the same application source code just fine with CoreOS/Docker if you prefer, since it can also compile to normal POSIX binaries that use kernel sockets (via `mirage configure --unix` instead of `mirage configure --xen`). This is the main benefit of using modular OCaml code that can swap out entire subsystems at compile time.
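The compile-time subsystem swapping mentioned above can be sketched in plain OCaml with a functor. This is only an illustration - the module and function names below are invented, not Mirage's real interfaces, which are considerably richer:

```ocaml
(* A minimal sketch of compile-time subsystem swapping via OCaml
   functors. Names are hypothetical, not Mirage's actual API. *)

module type NETWORK = sig
  val send : string -> unit
end

(* One backend per deployment target. *)
module Unix_net : NETWORK = struct
  let send msg = print_endline ("unix socket: " ^ msg)
end

module Xen_net : NETWORK = struct
  let send msg = print_endline ("xen netfront: " ^ msg)
end

(* The application is a functor over whatever network it runs on. *)
module App (N : NETWORK) = struct
  let run () = N.send "hello"
end

(* The "configure" step picks the backend; the app code is unchanged. *)
module Main = App (Unix_net)

let () = Main.run ()
```

Swapping `Unix_net` for `Xen_net` changes the entire transport without touching `App` - which is, roughly, what `mirage configure --unix` versus `--xen` does for whole subsystems.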
A bare-bones MirageOS deployment (e.g. a hello world TCP service) is only a few MB if I remember correctly.
I do agree with you that Xen is probably better at providing isolation than Linux, i.e. it is better at what an OS is supposed to provide :)
Also, if you are in an environment where your only option is to deploy Xen domUs, e.g. on EC2, then you probably have a performance advantage as well, because you just eliminated one layer.
In the end not much changes conceptually compared to a traditional application:
* instead of being linked with libc it is linked with Mirage's runtime
* its "OS" is now Xen instead of Linux
* the drivers of the OS are unchanged (running Linux in dom0)
* developing a Mirage unikernel is much like developing a traditional application, if you restrict yourself to the Mirage provided interfaces
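To make that last point concrete, here is a hedged sketch of what a minimal unikernel looks like, written in the style of the Mirage 2.x tutorials (the exact configuration API has changed across Mirage releases, so treat this as illustrative rather than current):

```ocaml
(* unikernel.ml -- the application, written against a Mirage-provided
   module signature rather than against libc. *)
module Main (C : V1_LWT.CONSOLE) = struct
  let start c = C.log_s c "hello from a unikernel"
end
```

```ocaml
(* config.ml -- declares which devices the unikernel needs; the
   mirage tool turns this into a build for the chosen backend. *)
open Mirage

let main = foreign "Unikernel.Main" (console @-> job)

let () = register "hello" [ main $ default_console ]
```

`mirage configure --unix && make` then builds an ordinary POSIX binary, while `--xen` builds a standalone Xen kernel image from the same source.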
Well, you can drop into a driver domain model and not have a full Linux dom0 (if you don't mind fixing on a particular hardware model).
But don't forget that Mirage is about modularity -- we have a kernel module version under development too, and a bare-metal Raspberry Pi one. The idea is that as the number of libraries grows, it becomes easier to pick and choose the set you need for the particular deployment environment you want to build an application on (including, but not exclusively, a Xen unikernel).
In the case of Erlang, for instance, it provides a built-in remote shell which allows one to connect to a running Erlang process and get a multitude of information about the VM, loaded code, etc.
As long as there is a convenient way to maintain a running system, that same way can be used for hacking.
My point is that the whole workflow of creating and deploying applications becomes much simpler, as the final VM can be an (almost) disposable item, rebuilt at will. This is pretty much what the configuration management tools are all striving for, but unikernels work this way by design. Therefore, our notions of managing such systems will also change.
I've been doing Project Euler in OCaml recently so this is a perfect excuse to expand my horizons.
I made a start on modifying Movitz to build a Lisp system that would run on top of Xen.
Those three have in common the trait that each supports a single language runtime: Erlang for Erlang on Xen, OCaml for MirageOS, and Haskell for HaLVM.
I was thinking, earlier, that this approach could be taken even further: rather than relying on Xen, it'd be nice to extract the drivers from the Linux kernel into their own project, an exokernel library with a defined ABI. Platform runtimes like Erlang, OCaml, Haskell, etc. could each build a "complete bare-metal kernel-binary" version of themselves, simply by linking to that exokernel library.
(And Linux could obviously move over to consuming the exokernel library itself. I guess a POSIX environment would technically just be the "complete bare-metal C kernel-binary.")
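To sketch what that shared driver library's contract might look like from a runtime's point of view - entirely hypothetical, all names invented for illustration - each runtime would bind to a narrow device interface along these lines:

```ocaml
(* A hypothetical sketch of an exokernel-library ABI as seen from
   OCaml: a narrow device contract each language runtime binds to.
   Nothing here corresponds to a real project's interface. *)

module type NET_DEVICE = sig
  type t
  val probe : unit -> t option    (* find and claim a NIC, if present *)
  val mac   : t -> string         (* hardware address of the device *)
  val read  : t -> bytes option   (* next received frame, if any *)
  val write : t -> bytes -> unit  (* transmit one frame *)
end
```

A runtime linked against a C library exporting these entry points could then ship as a self-contained kernel binary, with its network stack built on top of `read`/`write`.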
Those "components" and "virtualized network" really resemble "processes" and "IPC" to me - except that in MirageOS/Foo-on-Xen/etc. each process has its own device drivers. And, indeed, those drivers beg to be factored out into a sort of shared component.
But... aren't we almost reinventing microkernels here, with the only distinction being that shared libraries are used instead of server processes?
Though there's another advantage that comes specifically from thinking of each process as a separate "machine": each process gets to participate fully as an Internet peer, with its own unique, publicly routable IPv6 address.
One advantage of using a BSD rather than Linux is licensing, as linking Linux kernel code into your application will probably force you to release it as GPL. NetBSD also has an initial portability advantage, since you can always cross-compile it.
Anyway, also see OSv (http://osv.io/): a similar idea, but with the JVM. I believe some pretty well known Clever People are working on it - can't remember who, though... anyone know?
This looks like something a really educated/disciplined company could use for creating some crazy cool infrastructure. It's hard to see it taking off for the average Joe.
An anecdote I use when describing the benefits is the story of a smart fridge that got hacked and became part of a botnet sending spam emails. Why did that fridge even have code that allowed it to send email? It wasn't necessary for its functioning. We should write software differently if we're going to be deploying it to 10x the number of devices compared to today.
More personally, I worry about the software that's going to find its way into the embedded health devices of the future (cf. pacemakers). These devices will inevitably be 'connected', and I want to make sure that the code they use is safe and secure.
If it has a remote code execution vulnerability, it's trivial to make it send spam (or do all kinds of things) whether a MUA was already present or not.
It's much easier to reason about a bunch of OCaml code, so say the authors, than it is to understand the interaction between independently changing pieces of your "stack" that are written in different languages and integrate in disparate ways.
I'm not convinced that 'Xen + Mirage unikernel' has a performance advantage over 'Linux baremetal + Mirage direct mode', but I definitely see an advantage compared to 'Xen + Linux domU + Mirage direct mode'.
You have to also remember there is a historic connection between the Cambridge OCaml people and Xen, with some of Xen being written in OCaml.