It has been something of a puzzle why they have gained no significant traction.
My conclusion is that there are other technologies (i.e. containers/Docker) that are not as good as unikernels, but are good enough to satisfy the core industry demand for lightweight nested virtualization. And indeed, containers solve that problem better than unikernels in two important ways. First, boot time - and I'm talking here about booting useful unikernel-based applications, as opposed to theoretical unikernel OS boot times, which can be milliseconds. Second, containers don't need nested hardware virtualization enabled in the CPU, which obviously matters for cloud computing, where nested virtualization has now arrived but is not widespread.
And good enough is all that is needed.
It's hard for me to imagine a time when unikernels gain more acceptance than they already have, which is basically little more than none, plus the staunch backing of operating systems researchers and a few true believers who hang on to the theoretical benefits.
A big chunk of managing distributed systems at scale is trying to figure out what's wrong with them when they don't behave. Most operators will gladly trade some performance overhead for access to a suite of tools they're accustomed to.
A couple of examples. The GNU Debugger (gdb) supports connecting to qemu-backed virtual machines, so you can do everything gdb can do on a unikernel as well, provided the unikernel is implemented in something gdb groks.
In addition, we've created a trace layer, similar to strace, that allows you to run strace-style tracing inside the VM itself. Since the POSIX libs are just libs, wrapping them inside something like strace is rather trivial.
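The claim that wrapping plain library calls is trivial can be shown with a toy sketch. This is not any real unikernel's trace layer; `open_file`, `read_file`, and their return values are hypothetical stand-ins for POSIX-style library functions linked into the image:

```python
import functools
import time

def traced(fn):
    """Wrap a library-level 'syscall' so each call is logged, strace-style."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        micros = (time.perf_counter() - start) * 1e6
        print(f"{fn.__name__}({', '.join(map(repr, args))}) = {result!r} <{micros:.0f}us>")
        return result
    return wrapper

# Hypothetical POSIX-style library functions inside a unikernel image:
@traced
def open_file(path, flags):
    return 3  # pretend file descriptor

@traced
def read_file(fd, size):
    return b"config data"

fd = open_file("/etc/app.conf", 0)
data = read_file(fd, 1024)
```

Because the "syscalls" are ordinary in-process function calls, the decorator is the whole tracer; no ptrace machinery is needed.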
Also, unikernels have the huge advantage of having everything in the same memory space, meaning that you can observe the application and the operating system at the same time.
High level languages like Python or Node.js can typically have debuggers loaded dynamically and as such would be able to offer full debuggability without having to do tricks with gdb.
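As a minimal illustration of that point, Python's built-in `sys.settrace` hook can be attached and detached at runtime, no gdb involved; the `handle_request` function here is just a made-up example workload:

```python
import sys

calls = []

def tracer(frame, event, arg):
    # Record the name of every function entered while tracing is active.
    if event == "call":
        calls.append(frame.f_code.co_name)
    return tracer

def handle_request(x):
    return x * 2

sys.settrace(tracer)   # attached dynamically, at runtime
result = handle_request(21)
sys.settrace(None)     # detached just as easily
```

Full debuggers like pdb (or the Node.js inspector) build on exactly this kind of runtime hook, which is why such runtimes stay debuggable regardless of what they boot on.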
All this doesn't have to induce bloat. The so-called bloat in POSIX systems is mostly caused by the security barriers raised by a design that forces strict separation between unprivileged and privileged access to the system. In addition, a POSIX kernel will typically be built to support "everything" a user might need, whereas a unikernel can decide at build time what to include in the OS.
If one were starting from scratch, a uniform measurement/management interface along the lines of osquery, sysctl, or /proc would be a lot more straightforward, on both the back end and the front end, than the current weird mix of command-line tools that Linux presents.
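What such a uniform interface could look like can be sketched as a registry of sysctl-style dotted keys mapped to probe callables; the `Sysinfo` class and the key names are invented for illustration, not any existing tool's API:

```python
import os
import time

class Sysinfo:
    """A uniform, sysctl-style measurement interface: dotted keys -> probes."""
    def __init__(self):
        self._probes = {}

    def register(self, key, probe):
        self._probes[key] = probe      # probe is any zero-arg callable

    def read(self, key):
        return self._probes[key]()     # every stat is read the same way

    def keys(self):
        return sorted(self._probes)    # discoverable, like `sysctl -a`

info = Sysinfo()
info.register("proc.pid", os.getpid)
info.register("clock.monotonic", time.monotonic)

pid = info.read("proc.pid")
```

Both a CLI and a remote management front end could then be thin wrappers over `keys()` and `read()`, instead of scraping the output of a dozen differently-formatted tools.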
I think the most interesting question, as others have mentioned in the thread, is providing decent debugging and introspection in the distributed context. The development of those kinds of tools seems pretty orthogonal to the size of your kernel and whether or not it runs systemd.
Do you even need to?
If something so simple breaks, why not use, say, JTAG?
Connect a low-end SBC to the server's JTAG port and have it connected to the network. JTAG is a beautiful tool for debugging because it requires no modification of the software being debugged.
On the other hand, it should be possible for the hypervisor to provide roughly the same capabilities in software, without having to modify the software inside the virtual machine. Indeed, most hypervisors already provide a GDB stub for this purpose, including, e.g., Xen; it's just that public clouds don't typically give customers access to it. It would be nice if that changed.
This is an irrelevant objection that seems to get massive traction.
Just like some politicians have scathing criticisms of other politicians, which, despite being untrue, are easily and widely believed.
It's not even a relevant point. Nothing I ever did with unikernels hit this as even an issue on the horizon. But if you want to knock down the technology, this is the most effective way to do it, because it very effectively frightens off people considering it.
The origin of "Unikernels can be difficult to debug" is someone who works at a company that bet everything on containers. 'Nuff said.
Since the application and kernel are joined together so closely, there aren't any parallel processes that could be monitoring the behavior of the unikernel app. You can't really log in on the affected machine and pull logs, because the process may not even be able to write a log about what's failing when it's in the process of failing.
If you have a situation where a single server is having performance issues in a unikernel approach, there is no UNIX-style environment to log in and poke the hardware configuration and health. On a more normal stack you could log in and see that NetworkManager has degraded the connection from 10Gbps down to 100Mbps because it can't negotiate a higher speed successfully anymore. Then you can configure your alerting system to watch that stat, remove any boxes in that state from the load balancer, and inform DevOps about the bad hardware.
All of that to then say that I really do like the idea of Unikernels, and I believe they have their place, but that place tends to be very well-defined and infrequently changing problem domains that need very high performance, such as in supercomputers.
It doesn't work as well elsewhere because the application, by being joined so closely to the kernel, is also responsible for the things operating systems have traditionally taken care of for us, and problems in the application code can have an even more catastrophic effect than normal. Most applications are neither well-defined enough, time-invariant enough, nor latency-constrained enough for the unikernel tradeoff to make sense.
They have, for example, a distributed tracing system integrated into their RPC layer that lets you instantly view the complete set of RPCs kicked off by a request, the services touched, and statistics for latency either for a single request or across many requests. This means that if you're a developer trying to integrate a new service into the search stack, you can get an answer within a couple of minutes as to "Your service will be on the critical path, so you need to count microseconds in implementation" vs. "Your service is off the critical path, go wild" vs. "You have a budget of 25ms, as long as you come in under that you should be fine." It lets you tell whether your application requests were slow because you triggered an edge-case in your code vs. because that BigTable your dependency relies upon was undergoing a compaction. Combined with some other application-level logging, you could tell if your ranking algorithm was slow vs. it happened to share a machine with a process that hogged the machine resources.
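The core idea, stripped of everything that makes a real tracing system hard, can be sketched in a few lines. The span names and nesting below are invented, and a real system would propagate trace context across process boundaries, which this toy does not:

```python
import time
from contextlib import contextmanager

spans = []  # (name, parent, seconds) records for one request's trace

@contextmanager
def span(name, parent=None):
    """Record how long a (possibly nested) RPC took."""
    start = time.perf_counter()
    try:
        yield name
    finally:
        spans.append((name, parent, time.perf_counter() - start))

# One incoming request fanning out to two backends:
with span("frontend") as root:
    with span("search.backend", parent=root):
        time.sleep(0.010)   # simulated backend latency
    with span("ads.backend", parent=root):
        time.sleep(0.005)

slowest = max(spans, key=lambda s: s[2])
```

From the parent pointers you can reconstruct the RPC tree and compute the critical path; aggregating the same records across many requests gives you the latency statistics described above.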
It's probably one of the features I miss most now that I'm no longer at Google and doing my own startup, and I've tried to apply the same development philosophy to my own code. It's been invaluable, for instance, to tell whether I'm not getting results because I can't parse a page but forgot to write the error-handler to mark it as done & unparseable vs. because a complicated algorithm I use is accidentally exponential-time in some cases vs. because an HTTP connection is taking too long to complete vs. because CloudFlare blocks robots and the site really doesn't want to be crawled so I should give up on it vs. because an algorithm I wrote to avoid overloading hosts has a bug that makes it get stuck performing requests once every month. Some of these could perhaps be caught with lsof or top, but they'd give me nowhere near as precise info when trying to track down the problem.
All this is to say that perhaps the application (or a library used by it) really is the right place for those tools to go, because it knows far more about its operation and the likely questions you'll ask than a generic system-wide tool. This'd require a change in how we write applications and what we consider to be the responsibilities of an application developer, which probably explains why we don't have mainstream adoption yet. But in a world that's moving from multiple-apps-on-one-computer to multiple-computers-for-one-app, it makes sense, and so I wonder if the long-term trend will go in that direction.
But there are a whole class of application failures that are handled out-of-band, which is what I was referring to: If you need to debug issues involving hardware then not having an interface to that hardware separate from your application can make it harder to inspect and determine what failure you're running into.
And again I point to application failures within the application frameworks themselves having issues (such as logs not being written out at all) that having an out-of-band monitoring process with a mostly-uncorrelated probability of failure seems to me to be very necessary for handling these less common but potentially more catastrophic issues.
I'll admit that this next statement is a bit of an appeal-to-authority fallacy, but you'll note that not even Uber or Google use unikernels in production, and I believe these two problem domains are why.
Very little of this infrastructure has been built yet, which is why we don't see this in production deployments. I'm talking decades out - I'll predict that the software industry will eventually tend towards this architecture because it's more efficient and better fits what we actually do with software now, but until it's a pressing pain point there's little urgency in getting there. Much like how all the folks who predicted in the 80s that CPUs would tend toward RISC were right, but it took 30 years and the ARM/mobile revolution to get there, and most desktops still have a hardware x86 emulation layer on top of RISC microcode.
But unikernels running on top of a hypervisor seem like a terrible solution to the problem? You're trading away the convenience of the OS tooling built up over the past 40+ years to get closer to the metal and squeeze out more performance, then injecting virtualization in between and throwing that performance gain back away. Why not just build a regular application at that point and not throw away all of the runtime debugging support? A unikernel application running on top of a hypervisor basically turns the unikernel part into just an incredibly large and inefficient libc, in my mind.
A few reasons:
- There exist large public clouds willing to run any code you want under a hypervisor, but services that give you shell access to a user under a traditional kernel have mostly died out. This can partially be justified by security concerns: traditional syscall interfaces tend to be more complex and thus have more attack surface than the VM<->hypervisor interface. Hypervisors also tend to make it easier to divide system resources, e.g. by giving each VM a fixed RAM allocation.
- Some clouds, like EC2 with the Elastic Network Adapter, give virtual machines direct access to (custom) networking hardware, rather than making them trap to the hypervisor for every send and receive. This should mitigate much of the performance overhead of using a VM, at least as far as networking is concerned.
- Anyway, unikernels can put everything from filesystems and TCP to threading and even page table management "in-process"; this can reduce the number of syscalls that have to be performed and thus syscall overhead, even for operations that do ultimately delegate to "syscalls" in the hypervisor. In other words, they shouldn't be compared to just libc; they're also taking over many of the functions of a traditional kernel (just not all of them).
Shared hosting is alive and well, and to my knowledge shell access is becoming more common.
>Hypervisors also tend to make it easier to divide system resources, e.g. by giving each VM a fixed RAM allocation.
Cgroups: LXC and Docker are capable of enforcing RAM limits nowadays.
>traditional syscall interfaces tend to be more complex and thus have more attack surface than the VM<->hypervisor interface
A syscall interface is an assembly instruction and several registers that may point to some memory. The kernel is very thorough in checking the validity of such pointers (unless you use an ancient non-LTS kernel).
> give virtual machines direct access to (custom) networking hardware, rather than making them trap to the hypervisor for every send and receive.
Yes, VFIO and IOMMU have been around for a while. They do get pretty close to native performance (close enough for gaming, at least). It's not exactly new tech, and full VMs with Linux images have been able to utilize full-speed networking for a while now too.
Also note that virtio adapters are close to bare metal even without passthrough of the adapter.
>Anyway, unikernels can put everything from filesystems and TCP to threading and even page table management "in-process"; this can reduce the number of syscalls that have to be performed and thus syscall overhead
For all the stuff you mentioned, syscall overhead isn't the driving performance factor unless you're at Google or Facebook scale. TCP, threading, and filesystems spend most of their time waiting for DMA or other interrupts.
Page table management usually doesn't even involve a syscall, but rather an interrupt from the CPU; the performance difference should be negligible.
> In other words, they shouldn't be compared to just libc; they're also taking over many of the functions of a traditional kernel (just not all of them).
It should be compared to libc because in the ideal deployment scenario there should be no difference.
It's pretty standard for non final builds of games to have a number of custom debugging and performance tools built in. These may display information in game or capture to a local log but commonly (especially for console games) they connect to an external tool running on another machine over a socket or custom debugging interface. This is both because it can be easier to build UIs on a PC and because, particularly for performance profiling, displaying the information locally can impact what you're trying to measure significantly.
Consoles are normally debugged using a remote debugger running on a PC, but that's often useful even for debugging PC games. Often console devkits had special hardware to support this which wouldn't be present in retail hardware, which made debugging issues that only showed up on retail hardware challenging.
While engines typically have some graphics debugging and profiling tools built in, it is also common to use external tools, often connecting from another machine. There are many of these: PIX, Visual Studio Graphics Debugger, NVIDIA Nsight, Intel Graphics Performance Analyzer, GPUView, Windows Performance Analyzer... They all have strengths and weaknesses so it's common to use more than one of them.
Prior generation consoles were often unikernel-like as all OS type functionality was statically linked into your executable and you'd boot right into your game. Current generation consoles have something more like a full OS but you still get much more clearly defined guaranteed minimum resources and deal with hardware at a lower level than on a PC.
Overall there's a wide range of both custom and generic but specialized tooling used for debugging and performance analysis and that's been a fairly stable reality for many years. I don't think this is a situation where one particular paradigm has to win out in the long run, more a case of using the right tool for the job.
If anything, your own response here is like the typical politician's: avoiding the issue, providing anecdotes that promote your view, going on the offense, and changing the subject.
It is just a matter of tool maturity.
Err.. today's remote debugging is strongly based on the premise that issues are isolated to some degree: a broken app will not take down the network stack over which you run your remote session; a misbehaving network stack doesn't impact serial communication, so using gdbserver is still possible (though already significantly harder); etc.
Will it be the same with unikernels?
When I talk to JEE servers via JMX, I don't care if they run as an OS process, in a container, on a hypervisor, or even on a bare-metal JVM.
Likewise I get to enjoy cluster monitoring tools like VisualVM, Java Mission Control, New Relic APM, DataDog and many others, all blissfully unaware of how the JEE servers are actually deployed.
So I don't see why unikernels can't provide debugging access points for such kind of tooling.
But I think the whole point of the thread was that making all this possible on unikernels would require putting back a big part of the code that went away when making the unikernel lean and mean (and interesting because of that).
An optional debug layer for a JVM, Erlang, or CLR running bare-metal (just as possible language examples) is tailor-made for such runtimes and much thinner than all the services a general-purpose OS needs to provide.
I'm not saying applications don't need to be debugged; that would be silly. What I am saying is that this "Unikernels are hard to debug" line is a hand-wavy way of knocking down the technology, and it has really worked - witness this subthread.
The unikernel is the final build, the package, the output of application development and debugging which happens upstream, before you put it into the package.
There are always ways to expose the inner workings of a unikernel to gain insight into whatever might be going wrong - if you really need to do it at the stage where it is a running unikernel - but in most cases it's just not necessary, because application development and debugging happen, as I say, upstream of the final building of the thing into a unikernel.
Applications absolutely need to be debugged in production, debugging does not only happen upstream before deployments. Unexpected things happen so you might need to list the current TCP connections, view the socket buffer sizes, produce a core dump, view metrics you didn't anticipate at development - even attach a debugger directly on a very bad day. (Please don't get hung up on these specific examples though - the point is more that the last decades have allowed us great system introspection into a production environment running on a traditional OS).
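As one concrete example of introspection an application can carry built in rather than borrow from the OS: a stack snapshot of every live thread, sketched here with Python's `sys._current_frames`. The `worker` thread is just a stand-in for real application threads; in a unikernel, a hook like this would have to ship inside the image:

```python
import sys
import threading
import traceback

started = threading.Event()
stop = threading.Event()

def worker():
    started.set()   # signal that this thread is up...
    stop.wait()     # ...then park, so the snapshot catches it mid-wait

t = threading.Thread(target=worker)
t.start()
started.wait()

# On-demand introspection: dump every live thread's stack to a string,
# the sort of thing you'd wire to a signal or a debug endpoint.
lines = []
for thread_id, frame in sys._current_frames().items():
    lines.append(f"--- thread {thread_id} ---\n")
    lines.extend(traceback.format_stack(frame))
snapshot = "".join(lines)

stop.set()
t.join()
```

In production you would trigger this from a watchdog or network endpoint instead of inline; the point is that the capability lives in the app, with no shell or OS tooling required.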
The perfect world where all that would be unnecessary for most people is not even close yet. People are largely not yet willing to give up on being able to ssh into their server and use the available tools to do troubleshooting, nor are they willing to change the development practices by which that could largely be avoided.
Exceptions? Could be raised at any level. Most of the time you can't predict at all which exceptions can happen.
Server-side errors? Could happen at any level. DNS switch/your provider forgot to restart DNS/Server out of RAM/...
Now try to do something along these lines at 1M requests/day. I mean, requests to several API providers who are probably not even providing good docs for their APIs.
There's no way you can run a scaled-up IT business without online debugging. Most of my time is spent debugging issues that I could absolutely not predict (or realistically, could only have predicted if my budget were a multiple of what it was) when I was writing the program.
Real-life problems are based on sets of contradicting requirements. And they are contradicting due to them being at the edge of our understanding; and that's why they are problems.
I believe the objection is clearly about debugging in production, not during development. This is absolutely essential if you want to be able to fix things in production that broke unexpectedly long after development/deployment.
There's no issue with logging over the network, as opposed to logging to a file that you need to pull to see the contents (containers).
It's also common to have a debugging interface over network that you can use to inspect program state (eg. Clojure nREPL, node.js inspector protocol, and this can be built into any environment), I'm pretty sure even Rust can have networked REPL that can satisfy most debugging needs.
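To make that concrete, here is a toy version of such a network inspection endpoint. This is not nREPL or the Node.js inspector protocol, just the bare idea: `socket.socketpair` stands in for a real listening TCP socket, and the `STATE` dict for real program state:

```python
import socket
import threading

STATE = {"requests": 0, "errors": 0}   # live program state worth inspecting

def inspector(conn):
    """Tiny debug endpoint: receive a variable name, reply with its value."""
    with conn:
        name = conn.recv(1024).decode().strip()
        conn.sendall(repr(STATE.get(name)).encode())

# socketpair stands in for an accepted connection on a listening socket
server_end, client_end = socket.socketpair()
t = threading.Thread(target=inspector, args=(server_end,))
t.start()

STATE["requests"] = 7                  # ...the app does some work...
client_end.sendall(b"requests")        # an operator asks for a stat
reply = client_end.recv(1024).decode()
t.join()
client_end.close()
```

A real implementation would add authentication and a richer query language, but the mechanism - a side channel into live program state, over the network - is this simple.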
Except in unikernels that have proudly eliminated the network stack.
Unikernels have network stack, just not BSD sockets.
Writing to the network can certainly be done by whatever debugging framework or production variables they end up providing.
Remote debugging and cluster monitoring tools like JEE servers provide would already be quite good.
Debugging UNIX way is not the only path.
I think the hardest part is like you say, docker "works well enough" and the pressure isn't on to deliver the unikernel solution.
I solved this in my research, but it's still not enough to get unikernels any real traction because containers have won the battle.
I love this project, but I wonder if getting behind something like feL4 might have more interesting payoffs: a really small, lightweight kernel with only the drivers and libraries you need. Running sidecar debuggers would become easier, I would think, and make such systems more debuggable. Redox probably fits in this category as a potential, if less stress-tested, option.
Then there is Nebulet, which, by using wasm, would remove the system-call expense.
All of these are exciting, it makes me happy to see people continuing to play around in this area.
I'm not sold on this, but perhaps the next natural step in the progression after unikernels is serverless architectures.
If that is the case, we may never see the widespread use of unikernels. Containers are good enough. The full promise of unikernels has not fully matured. And serverless architectures seem to be gaining some traction. In the end, maybe unikernels just get passed over.
That would be a shame for me personally. When I read my first blog post on unikernels, some vague thoughts on the boundary between code and environment snapped into place. Unikernels seemed like the only game in town that made perfect sense when thinking from first principles. I have been hankering for an opportunity to fit unikernels into a serious project ever since.
There is a lot of stuff that isn't done in containers or at least isn't done well. Network functions, CPU-based IoT, serverless.
Packaging up a complex application in a container and moving it around seems to work quite well. You get to abstract away the core operating system and that is all well and good.
- layered filesystem
- a full-fledged OS, which is useful for debugging
In addition, I haven't seen something comparable to Docker Hub (yet).
Stuff like the ATmega chips runs the application bare-metal, yes, but they are cheap microcontrollers with 2KiB of RAM and no MMU. You can't run a kernel on that, so your application runs bare.
A lot of the higher-tiered embedded stuff usually runs either a Linux kernel or DOS (and older systems), in my experience.
Systems running Linux are the exception.
Linux does not fulfill the security requirements of high integrity systems nor the real time constraints.
Embedded systems running Linux are extremely common, and becoming more and more popular because it runs on so many architectures (esp. ARM, but also PPC). For many applications, hard real-time is not necessary, and Linux is FAR easier to develop on.
They are not "the exception" by any means. Look at almost any automotive "infotainment" system on current models: it's likely running on Linux. There's countless others.
For high-integrity systems for critical safety applications, you would not use Linux, you'd use a dedicated RTOS, of which there are countless choices. But these all have big disadvantages: poor performance (they favor determinism over performance), poor development support, high license costs, etc. If you're doing avionics, then sure you'd use an RTOS, but if you're making a "smart TV" to play Netflix, you wouldn't, you'd use Linux.
Also, Linux does have real-time capabilities, it just isn't hard real-time, only soft. It works well enough for controlling hobbyist-level CNC machines. I've used it myself for a pretty critical real-time system where ease of development and performance were more important than absolute real-time guarantees.
This means you also have no numbers to back up your claim.
Apparently I was wrong, see it wasn't that hard to find.
I think the perception of unikernels as something that is supposed to replace containers is a bit artificial. They have a much stronger appeal in replacing Linux where Linux struggles: CPU-based IoT devices where Linux security is a problem, serverless architectures where unikernel boot times of 5-10ms can provide more elasticity, or network functions where the GNU/Linux footprint is significant.
Certain containers could become a unikernel use case down the line when the tooling improves. Once you can run "npm unikernel" and it spits out a bootable node app, node-based unikernels might become attractive. But nobody is there yet, AFAIK.
Roughly, K8s and many other container solutions try to commoditize hardware. They require evolutionary, and modest, changes to the application development lifecycle, and they emphasize horizontal scaling.
Unikernels require app developers to substantially change how their applications are written, and they emphasize vertical scaling. It's imaginable that hardware vendors could build a cluster-like machine with lots of CPU/GPU/RAM/spindles/SSDs/accelerators and offer a unikernel scheduler, which would greatly reduce the influence of the higher-level container-based solutions.
I am interested in learning how others think about this view point.
But once you add a hypervisor and a scheduler, you've essentially reinvented the OS. I struggle to see the point: just compile static executables and run them on Linux, natively installed on physical hardware, like we did 30 years ago. Deployment is just as easy (i.e. copy a file), performance should be comparable (calling into the kernel should be comparable to calling into the hypervisor), and the tooling is way more mature.
What I imagine is a giant machine with thousands of cores, running a unikernel scheduler, for horizontal-scaling apps.
That means hardware vendors could sell their products directly to application developers, instead of being shoved behind the scenes by cloud vendors.
That might mean reinventing the OS, but the same can be argued of K8s, which also reinvents similar OS concepts (process management == job management, threading == sharding, IPC == SOA, kernel scheduling == container scheduling).
But the point is that it offers a product with technical superiority, and hardware vendors would have more expertise to build it; i.e., a unikernel would have a much more intimate relationship with bare metal than containers do, so hardware vendors would naturally know better how to optimize.
I thought memory safety comes from enforcing proper memory usage through ownership rules (the borrow checker), not through zero-cost abstractions, which simply mean that high-level language features are translated into efficient system-level code.
For the 'larger masses' of 'application-in-a-box', however, I think a NetBSD(-like) system with the .NET CLR (or even .NET Core) would be a much better route.
This was discussed on HN in 2016.