Unikernels are unfit for production (joyent.com)
359 points by anujbahuguna on Jan 22, 2016 | 318 comments

Bryan may certainly be right (I neither know him nor much about unikernels), but some parts of his argument seem incredibly weak.

  The primary reason to implement functionality in the
  operating system kernel is for performance...
OK, this seems like a promising start. Proponents say that unikernels offer better performance, and presumably he's going to demonstrate that in practice they have not yet managed to do so, and offer evidence that indicates they never will.

  But it’s not worth dwelling on performance too much; let’s 
  just say that the performance arguments to be made in favor 
  of unikernels have some well-grounded counter-arguments and 
  move on.
"Let's just say"? You start by saying that the "primary reason" for unikernels is performance, and finish the same paragraph with "it’s not worth dwelling on performance"? And this is because there are "well-grounded counter-arguments" that they cannot perform well?

No, either they are faster, or they are not. If someone has benchmarks showing they are faster, then I don't care about your counter-argument, because it must be wrong. If you believe there are no benchmarks showing unikernels to be faster, then make a falsifiable claim rather than claiming we should "move on".

Are they faster? I don't know, but there are papers out there with titles like "A Performance Evaluation of Unikernels" with conclusions like "OSv significantly exceeded the performance of Linux in every category" and "[Mirage OS's] DNS server was significantly higher than both Linux and OSv". http://media.taricorp.net/performance-evaluation-unikernels....

I would find the argument against unikernels to be more convincing if it addressed the benchmarks that do exist (even if they are flawed) rather than claiming that there is no need for benchmarks because theory precludes positive results.

Edit: I don't mean to be too harsh here. I'm bothered by the style of argument, but the article can still be valuable even if just as expert opinion. Writing is hard, finding flaws is easy, and having an article to focus the discussion is better than not having an article at all.

I have no dog in this fight. That is, I'm neither financially nor emotionally invested in Docker or any sort of container technology, nor their Solaris OS thing, nor CoreOS, nor any unikernel. I read this blog not knowing who he is and very little about Joyent so I held no previous malcontent against the man (edit: nor do I now, in case that clause implied otherwise). That being said - you weren't even remotely harsh.

He made a bunch of wild claims without backing them up even with simple links to bugtraq/securityfocus/whatever providing evidence that hypervisors are inherently additive to your 'surface area' by which you can be attacked. He also, as you mentioned, failed to provide even cursory benchmarks, much less cite any 3rd party, academic, peer reviewed analyses. Thirdly, he asserted a false choice between unikernels and on-the-metal. There's nothing stopping you from firing up a heterogeneous environment, using unikernels when they perform well, and containers when the situation dictates them. So yeah, you weren't too harsh - IMO, your post was more well-balanced and thought out than his entire blog post. But hey, who knows, maybe he intentionally wanted to be incendiary so we'd all be talking about his company's product (in some capacity at least) on a slow Friday afternoon.

Unikernels, to me, are basically a resurrection of the type of operating system exemplified by Mac OS 9, 8, 7... No memory protection; programs just duking it out against each other with pointers in the same arena, like gladiators. But to invoke an image of disciplined violence in the Roman Empire is too kind; really, this is more like a regression to the Stone Age.

Right, there aren't supposed to be multiple applications in there? But eventually there will be. Things only start small, as a rule.

Look, we have had pages and memory management units since the 1960's already. Protection was considered performant enough to be worthwhile on mainframes built from discrete integrated circuits on numerous circuit boards, topping out at clock speeds of some 30 MHz. Fast forwarding 30 years, people were happily running servers using 80386 and 80486 boxes with MMU-based operating systems.

Why would I want to give up the user/kernel separation and protection on hardware that blows away the protected-memory machine I had 20 years ago?

That would be true if applications ran on only one computer at a time. But nowadays applications run across many computers - sometimes tens of thousands of them. These applications don't need the operating system to protect processes from each other, because they are not running on the same computer.

Now that hypervisors are a mature commodity, this model is practical at smaller scale too: instead of running in separate physical computers, processes run in separate virtual computers.

In short: unikernels make way more sense if you zoom out and think of your entire swarm of computers as a single computer.

That would be true if computers ran only one unikernel at a time, but you need to protect the unikernels from each other.

Computers do run only one unikernel at a time. It's just that sometimes they are virtual computers. Remember that virtualization is increasingly hardware-assisted, and the software parts are mature. So for many use cases it's reasonable to separate concerns and just assume that VMs are just a special type of computer.

For the remaining use cases where hypervisors are not secure enough, use physical computers instead.

For the remaining use cases where the overhead of 1 hypervisor per physical computer is not acceptable, build unikernels against a bare metal target instead. (The tooling for this still has a ways to go.)

If for your use case hypervisors are not secure enough, 1 hypervisor per physical machine is too much overhead, and the tooling for bare metal targets is not adequate, then unikernels are not a good solution for your use case.

Yes, and processes in a traditional OS have hardware support too, through the MMU and things. At the end of the day something needs to schedule processes/VMs, and something needs to coordinate disk writes. What I'm driving at here is that whether you call them processes and a kernel or VMs and a hypervisor, you arrive at the same thing, except operating systems are mature at that task, and hypervisors are not.

Why do you need to have both a kernel and a hypervisor, though? You've got two levels of abstraction that do the same thing, and just like with M:N threading over processes, they often work at cross purposes.

For most cloud deployments nowadays, the hypervisor is a given. Given that, why not get rid of the kernel?

Except you don't get rid of the kernel, they're not called unikernels for nothing I presume.

Surely, you should be saying: why do you need all of a kernel and hypervisor and an app when you could subsume the app into the kernel and just run the hypervisor and the kernelized app (or single-appified kernel, call it what you want).

I'm having a hard time seeing the benefits given the obvious increase in complexity.

What features of a full-fat OS do unikernels retain? If the answer is very little because hypervisors provide all the hardware access then it would be fair to say that hypervisor has become the OS and the traditional kernel (a Linux one in this case I presume) has become virtually * ahem * redundant.

> hypervisor has become the OS

A hypervisor is an OS. A rose by any other name.

> I'm having a hard time seeing the benefits given the obvious increase in complexity.

What? It's simpler. You remove all the overhead of separating the kernel and app. And that of running a full-featured multiuser, multiprocess kernel for the sake of a single app.

> What features of a full-fat OS do unikernels retain? If the answer is very little because hypervisors provide all the hardware access then it would be fair to say that hypervisor has become the OS and the traditional kernel (a Linux one in this case I presume) has become virtually * ahem * redundant.

Yes, that's entirely fair.

>hypervisor has become the OS and the traditional kernel (a Linux one in this case I presume) has become virtually * ahem * redundant.

That's the point of unikernels right there :)

I think we could spend a long time discussing the relative strengths and weaknesses of hypervisors and traditional operating systems. It's definitely not a one-size-fits-all situation (which is kind of what you're implying).

In any case, I was not arguing that hypervisors are superior to traditional operating systems. I was simply pointing out why the comparison of unikernels to macos8 and calling it a "regression to the stone age" was missing the point entirely, because of the distributed nature of modern applications.

The distributed nature just means that if attackers find one exploit, they can apply it repeatedly to the distributed application, to give themselves an entire botnet.

All code is privileged, so any remote execution exploit in any piece of code makes you own the whole machine (physical or virtual, as the case may be). A buffer overflow in some HTML-template-stuffing code is as good as one in an ethernet interrupt routine. Wee!

> The distributed nature just means that if attackers find one exploit, they can apply it repeatedly to the distributed application, to give themselves an entire botnet

That may or may not be true... In any case it's completely orthogonal to unikernels. Distributed applications, and any security advantages/disadvantages, are a fact of life.

> All code is privileged, so any remote execution exploit in any piece of code makes you own the whole machine (physical or virtual, as the case may be). A buffer overflow in some HTML-template-stuffing code is as good as one in an ethernet interrupt routine. Wee!

I'm afraid you're parroting what you learned about security without really understanding it. Yes, an exploit will give you access to the individual machine. But what does that mean if the machine is a single trust domain to begin with, with no privileged access to anything other than the application that is already compromised? In traditional shared systems, running code in ring0 is a big deal because the machine hosts multiple trust domains and privileged code can hop between them. That doesn't exist with unikernels.

Add to that the tactical advantages of unikernels: vastly reduced attack surface, a tendency to use unikernels for "immutable infrastructure" which means you're less likely to find an opportunity to plant a rootkit before the machine is wiped, and the fact that unikernels are vastly less homogeneous in their layout (because more happens at build time), making each attack more labor-intensive. The result is that the security story of unikernels, in practice and in theory, is very strong.

You're assuming here that there aren't and never will be exploits that break out of the hypervisor. This is not the world we live in. In literally exactly the same way that you can break out of an application into kernel space, you can break out of a guest VM into hypervisor space. VM guests are processes, and hypervisors are operating systems. We've switched the terminology around a bit, but in doing so we've given up decades of OS development.

> You're assuming here that there aren't and never will be exploits that break out of the hypervisor. This is not the world we live in.

Really? Here's what I wrote in this very thread, just above your message: If for your use case hypervisors are not secure enough, 1 hypervisor per physical machine is too much overhead, and the tooling for bare metal targets is not adequate, then unikernels are not a good solution for your use case. [1]

At this point I believe we are talking past each other, you are not addressing (and apparently not reading) any of my points, so let's agree to disagree.

[1] https://news.ycombinator.com/item?id=10956899

Well hopefully your VMs are at least as well isolated as a linux process is.

On the contrary, you could argue that there is _more_ isolation, insofar as the multiple applications will be separate unikernels on the same hypervisor, and that the hypervisor will enforce a stricter separation between VMs/unikernels than your OS will between processes.

Yes, sometimes I wonder why the unikernel isn't called the resurrection of DOS.

Because we're not using DOS?

held no previous malcontent against the man

That does not sound practical at all.

Writing is Nature's way of letting you know how sloppy your thinking is -- Guindon

From what I understand of MirageOS an impetus for the "security theatre," is that they believe libc itself is a vulnerability. Therefore no matter how secure their applications are they will always be vulnerable. They will be vulnerable to the host OS as well as any process it is running and the vulnerabilities they expose. It's not security by obscurity but a reduction in attack surface which is a well-known and encouraged tactic. I don't see any prepositional tautology there.

Yes they will still be reliant on Type-1 hypervisors... and Xen has had its share of vulnerabilities in the last year. That's a much smaller surface to worry about.

The other benefit is that jitsu could be one interesting avenue to further reduce the attack surface. Summoning a unikernel on-demand to service a single request has been demonstrated to be plausible. Instead of leaving a long-running process running you have a highly-restricted machine summon your application process in an isolated unikernel for a few milliseconds before it's gone.

The kinds of architectures unikernels enable have yet to be fully explored. The ideas being explored by MirageOS are by no means new but they haven't been given serious consideration. They may not be "ready for production" yet, but given some experimentation and formal specification they may yet prove fruitful.

> Summoning a unikernel on-demand to service a single request has been demonstrated to be plausible. Instead of leaving a long-running process running you have a highly-restricted machine summon your application process in an isolated unikernel for a few milliseconds before it's gone.

From a "far enough" pov (but not too far...), how is that system different from a kernel running processes on-demand? Why would any replacement for libc contain fewer vulnerabilities? Same question for replacing a kernel with an "hypervisor".

I feel I still don't know enough on these subjects to conclude that this whole game in the end consists of renaming various components, rewriting parts of them in the process for no real reason. But maybe this is actually it.

Why don't you read http://www.skjegstad.com/blog/2015/08/17/jitsu-v02/ and decide for yourself?

The gist of it is that for a typical DNS server you can boot a unikernel per-request to service the query instead of leaving a long-running process going on a typical server. You can boot these things fast enough to even map them to URLs and create a unikernel to serve each page of your site.
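For anyone trying to picture the activation pattern, here is a rough process-level analogue, not jitsu itself: a fresh, short-lived worker is started for each query and is gone once it has answered. Everything in it is an assumption made for illustration: the `serve_once` helper, the `WORKER` script, and the one-entry record table (which uses a reserved documentation address). Jitsu boots a unikernel VM where this sketch starts an ordinary process.

```python
import subprocess
import sys

# Hypothetical worker: answers exactly one lookup, then exits.
# The record table is made up; 192.0.2.1 is a reserved documentation address.
WORKER = r"""
import sys
records = {"www.example.test": "192.0.2.1"}
print(records.get(sys.argv[1], "NXDOMAIN"))
"""


def serve_once(name: str) -> str:
    """Start a fresh worker for a single query and wait for it to exit."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER, name],
        capture_output=True,
        text=True,
        check=True,
        timeout=30,
    )
    return result.stdout.strip()
```

Nothing survives between calls to `serve_once`, which is the property being argued about here: there is no long-running process left to compromise, at the price of paying the startup cost on every request.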

It's hard to approach it from the perspective of what you already know to be true and reliable. It's still experimental and we've only begun to explore the possibilities.

Honestly to my untrained eyes, it still looks like a weird operating system except more complicated, all of that to implement features you could have implemented more simply with a less crazy architecture that does not involve launching a program as if it was the complete system of a standalone computer.

Now I'm not saying that it is bad to experiment, but if you don't come up beforehand with actual things to test that were not possible on current modern systems or at least substantially more difficult to do instead of quite the opposite, it's not a very useful experiment but just a curious contraption.

> Why would any replacement for libc contain fewer vulnerabilities?

Because it would be written in better languages.

> Same question for replacing a kernel with an "hypervisor".

Because the hypervisor implements a much smaller, better specified API than a kernel does.

That's because his argument is orthogonal to the performance and security arguments. His argument is basically even if unikernels are faster and even if they are just as secure, they are still operationally broken because you cannot debug them.

He doesn't need to present a great argument against security or performance. There doesn't even need to be such an argument. If you've ever spent six months trying to find out why a content management system blows up under the strangest of conditions, even when you have a full debug stack, you understand why that argument may be able to stand alone.

The place where his argument falls down, IMO, is, like others have said, in assuming that everything is binary: everything is unikernel or it is not. And that's just silly.

His argument is basically even if unikernels are faster and even if they are just as secure, they are still operationally broken because you cannot debug them.

I personally agree that this would be a stronger argument, but unfortunately it's not the argument he's making. Instead, he's "pleading in the alternative", which is less logical, but in some situations can be more effective. The classic example is from a legendary defense lawyer nicknamed "Racehorse" Haynes:

“Say you sue me because you say my dog bit you,” he told the audience. “Well, now this is my defense: My dog doesn’t bite. And second, in the alternative, my dog was tied up that night. And third, I don’t believe you really got bit.” His final defense, he said, would be: “I don’t have a dog.”

It maps excellently: "As everyone knows, unikernels never have a performance advantage. And even when they are faster, they are always terribly insecure. And even after people solve the security nightmare, they're still impossible to debug. But what's the point in spending time talking about something that doesn't even exist!"


The Racehorse example isn't the best example of arguing in the alternative, because the first three "alternatives" are fully compatible with one another; you could easily argue that all three were true. The real alternative branch is "my dog doesn't bite, and in the alternative, I don't have a dog".

The place where his argument falls down is... that you can actually debug unikernels. I do it almost every day.

So if the performance and security arguments are just distractions, and the core argument that they're "undebuggable" is just baldly incorrect, then what's left?

It would be a great argument if it were true. But while he mentions rumprun, he doesn't seem to have noticed that it can do all the things he claims unikernels can't do. Nor is there a claim that the current methods are necessarily ideal; it is an exploration of what else is possible and how to make it work in practice.

What it means is that it doesn't matter if they are faster or not because the OS isn't the bottleneck. The bottleneck is the framework or the user application in most of the cases.

one does not typically find that applications are limited by user-kernel context switches.

the OS isn't the bottleneck.

Curious, then why are we seeing articles here all the time on bypassing the linux kernel for low latency networking?

You don't bypass the kernel, you bypass the TCP stack of the kernel and this is for very specific applications.

Bypassing the kernel entirely is pretty normal in HPC applications. Infiniband implementations typically memory-map the device's registers into user-space so that applications can send and receive messages without a system call.
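A loose analogy for the mechanism, as a minimal sketch: once a region is memory-mapped, updates are ordinary memory stores instead of one system call per update. This maps a temporary file, not device registers, so it only illustrates the idea and is not how an InfiniBand stack is written. The `write_via_mapping` helper is a hypothetical name.

```python
import mmap
import os
import tempfile


def write_via_mapping(data: bytes) -> bytes:
    """Write data through a memory mapping, then read it back from the file."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, len(data))  # the mapping needs a non-empty backing file
        with mmap.mmap(fd, len(data)) as m:
            m[:] = data  # a plain memory store, no write() call per update
            m.flush()    # push the stores out to the backing file
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.close(fd)
        os.unlink(path)
```

The kernel is still involved in setting up the mapping (and here in flushing it), which is roughly the distinction the replies in this subthread are arguing over.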

This is not bypassing a kernel. This is called Remote Direct Memory Access (RDMA) and there is still a kernel.

FYI most of the devices inside your computer work through DMA.

One in particular is network capturing via libpcap.

It's basically an alternative driver that comes with additional capabilities. Such as capturing promiscuously, and filtering captures.

Just curious, how does a unikernel solve network latency?

The issue GP talks about comes from the cost of context-switching on a syscall (going into "kernel mode", performing the call, then going back into "application mode"). There's no context switch in a unikernel.
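If you want a feel for the per-call cost being discussed, a crude micro-benchmark is enough to get started. This is only a sketch under loud assumptions: `noop`, `per_call_ns` and `compare` are hypothetical helpers, `os.getppid()` is used as a stand-in for "a trivial system call" (on Linux it enters the kernel on every call), and interpreter overhead is included in both numbers, so only the difference between them is loosely meaningful.

```python
import os
import timeit


def noop() -> int:
    """Pure user-space call used as the baseline."""
    return 0


def per_call_ns(fn, calls: int = 200_000) -> float:
    """Average wall-clock cost of one call to fn, in nanoseconds."""
    total_seconds = timeit.timeit(fn, number=calls)
    return total_seconds / calls * 1e9


def compare(calls: int = 200_000) -> dict:
    """Time a user-space call against a trivial system call."""
    return {
        "user_space_call_ns": per_call_ns(noop, calls),
        # Stand-in for a trivial syscall; includes interpreter overhead too.
        "syscall_ns": per_call_ns(os.getppid, calls),
    }
```

A C version would give cleaner numbers. The point is just that the user/kernel transition is a measurable fixed cost per call, and whether it matters depends on how much work each call does.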

Well, unless you count the hypervisor context switch, which you do.

And if you need super high performance, you can run a unikernel on bare metal.

I guess in the glorious future everyone will be using SR-IOV.

Are you assuming SR-IOV passthrough (which has its own performance profile)? Because normal virt -definitely- hits a context switch when it goes from the unikernel's virtual NIC to the real NIC, if not twice.

Unless you are running a load balancer, static http server or in-memory key-value store. Those aren't exactly fringe use cases.

Even in those examples, it applies mostly to the subset of users who have very large numbers of very simple requests.

The ratio of kernel to userland work is bad if you're receiving an entire network request just to increment a word in memory but usually quite tolerable if, say, you're shoveling video files out and most of the time is spent in something like sendfile() or if your small request actually requires a non-trivial amount of computation.
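To make the sendfile() case concrete, here is a minimal sketch in Python. `socket.sendfile()` uses `os.sendfile()` where the platform supports it and falls back to plain `send()` otherwise. The helper names and the local socket pair standing in for a real client connection are assumptions for illustration only.

```python
import os
import socket
import tempfile
import threading


def send_whole_file(path: str, sock: socket.socket) -> int:
    """Send a file over a connected stream socket; returns bytes sent."""
    with open(path, "rb") as f:
        # One call hands the whole transfer to the kernel where possible.
        return sock.sendfile(f)


def demo(payload: bytes) -> bytes:
    """Round-trip a payload through a local socket pair (stand-in for a client)."""
    sender, receiver = socket.socketpair()
    received = bytearray()

    def reader() -> None:
        while chunk := receiver.recv(4096):
            received.extend(chunk)

    t = threading.Thread(target=reader)
    t.start()
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    try:
        sent = send_whole_file(tmp.name, sender)
    finally:
        sender.close()  # signals end-of-stream to the reader
        os.unlink(tmp.name)
    t.join()
    receiver.close()
    assert sent == len(payload)
    return bytes(received)
```

The application makes one call per file and the copying happens on the kernel side, so the time spent crossing the user/kernel boundary is small relative to the data moved.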

If you're doing nontrivial computation, that absolutely dominates the kernel overhead (almost by definition). But how frequent is that? A huge amount of programming boils down to CRUD and very basic transformations, maybe a little bit of fanning out/in, all of which involve minimal actual compute.
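One way to put a bound on this back-and-forth is Amdahl's law: if a fraction f of request time is kernel-side overhead, removing all of it speeds the request up by at most 1/(1 - f). A tiny sketch, where `max_speedup` is a hypothetical helper and the fractions you feed it should come from profiling your own workload:

```python
def max_speedup(kernel_fraction: float) -> float:
    """Upper bound on speedup if all kernel-side time were eliminated."""
    if not 0.0 <= kernel_fraction < 1.0:
        raise ValueError("kernel_fraction must be in [0, 1)")
    return 1.0 / (1.0 - kernel_fraction)
```

If half the request time is kernel overhead, `max_speedup(0.5)` gives 2.0. If it is a few percent, the ceiling is a few percent. Which regime a given service is in is exactly what the two sides here disagree about.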

Developers of such software might care about unikernels, users, probably less.

This is a correct approach from a developer's point of view; however, it is out of scope from the operator's point of view. You can always improve your software stack as a developer. These are not mutually exclusive, but they are different and should not be conflated.

That, and you're giving up a lot to gain that marginal increase in performance.

the perf gains are just as possible without a unikernel, most high perf net apps at this point are onto userspace net stack to avoid interrupt+switch cost a la intel dpdk, an opensource example would be https://github.com/SnabbCo/snabbswitch/blob/master/README.md but tons of others in commercial spaces.

ie. as per what bryan said there's plenty of counter examples on perf

It's indeed not worth dwelling on performance (even if it is one of the main arguments for unikernels) if your main topic is that they are poor from a security standpoint.

You missed the whole section where he compared unikernels on Xen vs. Linux (or Solaris) on metal. Unikernels have to run on Xen so either way, the best you're going to do is have one level of abstraction (OS or Hypervisor) between you and your application.

Rumpkernel can run on bare metal, too.

Bryan Cantrill seems to have some personal interest in denigrating OS research (defined as virtually everything post-Unix) as all being part of a misguided "anti-Unix Dark Ages of Operating Systems". He has expressed this sentiment multiple times before, and places a great deal of faith on Unix being a timeless edifice which needs only renovation. Naturally, he regards DTrace and MDB to be the pinnacles of OS design in the past 20 years and never stops yapping on about them, this article being no exception. It's his thought-terminating cliche.

He voiced all this here [1], and so I countered by listing stuck paradigms in traditional monolithic Unixes, as well as reopening my inquiry on Sun's Spring research system, which he seems to scoff at, but over which I am impressed by the academic research it yielded. He has yet to respond to my challenge.

[1] https://news.ycombinator.com/item?id=10324211

There's a lot of stuff Solaris got right, long before the Linux world did, and in some ways, it's still catching up.

DTrace? Sure, there's a plethora of dynamic tracing tools in linux, but it's honestly just now starting to catch up with eBPF. If eBPF stack traces make it into 4.5, that might be the first time I can look at Linux dynamic tracing and go "Yep, it's arrived"

Systemd certainly apes quite a bit of its functionality from SMF (and, personally, I'd argue SMF still does basically everything better...)

ZFS? Still the king of filesystem/logical volume management. Btrfs might catch up someday

Zones? Again, still ahead of the linux equivalent in functionality. And lx branded zones are awesome.

I'm not saying nothing needs to advance ever again in the world of OS research, but I think we need to be honest about how much we owe to Sun for having created the modern template for a great amount of functionality that is only just now coming into existence on Linux, and it's not like things at Joyent, OmniTI, Nexenta, and other Illumos developing shops have stagnated either.

I often disagree with Bryan on specific points, but I think you do him a disservice. While many of the things Sun did might be viewed as an incremental update to existing concepts, it's still something that the Linux community has yet to catch up with.

Anyway, even when he is wrong, I think he often brings up a viewpoint that results in some interesting discussion.

Joyent is claiming that they are getting pretty impressive numbers out of running docker images on Solaris, by using some sort of funky shim layer to run Linux in a zone.

I also like to point out that Java, another Sun project, was a big impetus for a bunch of OSes to get properly functioning pre-emptive threading implemented. At the beginning were dark times, and even Solaris was no cake walk. C has copied the Java Memory Model for shared state concurrency, and most of the best papers on Garbage Collection were written in the mid to late nineties, often targeting Java. Subsequently in the 00's, escape analysis made pretty giant leaps forward, leading to the memory system you have in Rust today.

I think people who argue about the best data models for shared state concurrency often forget that they're taking us-vs-them stances on conventions that were all documented by the same individual, the Late, Great, Sir Tony Hoare.

There have really only been two breakthroughs since then. Transactional memory, which is still in a state of discovery and who knows when we'll get forward motion on that (especially since Intel's hardware support went belly up), and object ownership based on escape analysis, as used internally in recent JVMs, and explicitly in a few new languages, like Rust.

Now if only they'd pushed type theory forward, instead of regressing and taking the rest of us with them...

  > Subsequently in the 00's, escape analysis made pretty 
  > giant leaps forward, leading to the memory system you 
  > have in Rust today.
Rust doesn't use escape analysis, its analyses are based on Cyclone's regions system (which did arise in the mid-'00s). Or are you implying that there's a lineage leading from escape analysis to regions systems? (I'm unclear on Cyclone's own heritage.)

I may be making a causal inference based on a correlation. What I had read about the compile time guarantees in Rust indicated that escape analysis was being used to implement them, but I don't know if I read that or assumed.

I hadn't encountered Cyclone, but I'm reading through a paper about it now, and it sounds a lot like reading about how Java uses escape analysis with its generational garbage collection to decide where to allocate objects. In fact, Java 8 goes a lot farther: it can eliminate monitors, memory fences, and copy constructors that have no observable side effects.

Of course if Java is ever conservatively wrong (retaining a dead allocation), the GC will take care of it later. In a language without a collector you have to be right in both directions.

Escape analysis uses somewhat similar machinery in that it cares about the scopes in which variables are used, but the similarity pretty much ends there. Lifetimes, by contrast, are a type system feature. The closest thing to lifetimes in terms of implementation is actually generics (which makes sense, because lifetimes are just another kind of generic type parameter).

Late, Great, Sir Tony Hoare

Tony Hoare is very much still alive to the best of my knowledge!

Yup, in fact I met him recently at a conference, looked great.

It's quite possible that he meant that he believes he has a time management problem, but finds it irrelevant and endearing due to his greatness.

Oh gosh, I have my deceased founding fathers mixed up.

I never said anything of Linux (I am critical of it), and I don't know how you came up with this odd Linux/Solaris dichotomy.

What are the alternatives, in the *nix world? The BSDs are in much the same place as Linux on most of those specifics, or further behind, or heavily using the tech from the Solaris side.

HP-UX? AIX? What do you think is doing better than Solaris or Illumos on those things?

You don't do it in UNIX because UNIX is fundamentally broken. That's the point. You do it through specialist software that works around the various problems. A common approach (i.e. compromise) in high-assurance security was to split systems between an untrusted, UNIX VM and critical components running directly on a microkernel/hypervisor. The kernels themselves were built as robust as possible sometimes with every potential, control or error state known. The apps outside VM's often used Ada or Java runtimes whose assurance and features were customized for the apps' needs. Robust middleware handled mediation and sometimes recovery. Many systems like this survived strong pentesting and worked without reported failures in the field (or only failed-safe).

Then, there's the shit most people are doing and that Bryan advocates with UNIX. I also called him out on it listing numerous counters... with references backing them... to his comment at vezzy-fnord above.


Secure UNIX had already been tried by geniuses for a decade or two with always the same results: architecture, language, and TCB were inherently so complex and insecure that even the smallest versions had many 0-days and dozens of covert channels. He didn't respond, likely since he uses assertions instead of evidence. And all published evidence that I've ever seen contradicts his view of UNIX model superiority in robustness of any kind, sometimes performance.

Now, he's making all kinds of assertions about unikernels, deliberately avoiding evidence on some (eg performance), praising UNIX model/tech, and ignoring faults in his own approach. Should we take his word this time given prior result? Definitely not.

I think we can say there is a kind of 'Blub user' in regards to OS, from those that only know UNIX and Windows.

That is why it is so important to spread the word of old OS designs and welcome any attempts to move forward.

This is why I like the changes in mobile OSes architecture, pushing safer languages and with a stack where the kernel type is actually irrelevant to user applications.

I agree. Tanenbaum noticed this, too. He calls it the television model. Skip to 1:25 in the video below to watch him hilariously contrast the television and computing experience for the lay buyer:


Now, tablets and smartphones are closer to the television model. Mac OS X got fairly close to it for desktops. So, I know it can be done. There's just this willingness to... DO IT.

In parallel, we can do the Burroughs model for servers and the Genera model for hackers. Cuz seriously, what real hacker is using C on inefficient black boxes of black boxes? Do they know what they're missing?

I've seen this comment about Unix being fundamentally broken a few times... could you point to a source that explains this in more detail?

(This question is not meant to be disagreeable, I'm genuinely interested in finding out what the issues are).

> Unikernels are entirely undebuggable

Funky, the Lisp Machine uni-kernel OS was probably one of the most debuggable OSes ever... with the most sophisticated error handling system, processes, backtraces, self-descriptive data-structures, full source code integration, seamless switching between compiled and interpreted code, run-time type checking, runtime bounds checking, inspectors, integrated IDE, ...

I would add Mesa/Cedar and Smalltalk to the list, but I have only seen the original ones in Xerox documents.

Oberon was also quite good, but I guess less than Lisp Machines.

Did you see my list in reply to that comment? Genera and Oberon are on it. I'll consider Mesa/Cedar and Smalltalk. The former was in the Hansen overview I started with. Might deserve individual mention, might not. Throw in your opinion on why if so as I just can't recall its traits.

Smalltalk, too. Especially as the topic is stuff that's still better than UNIX in some attribute. I haven't studied it enough to know what you like about it past probably safety and great component architecture.

Yeah I read it. It is a very nice overview.

I just didn't know which comment was better to reply to.

As for Smalltalk, I loved its expressiveness, especially since my experience with it was in the mid-90's with VisualWorks at the university, before Java was introduced to the world.

But back then one still needed to code the VM primitives in Assembly. Meanwhile with Pharo and Squeak it is turtles all the way down.

Still waiting for a Dynabook though.

re language

Ahh. I believe it was VisualWorks mentioned when I last looked at it. The impression I had from that description was that it was the ultimate, component language. They said you don't have a main function and directives like most languages. You really just have a pile of objects that you glue together with the glue being the main application. And that this was highly integrated into the IDE's to make that easy to manage.

Was that the experience you had?

re VM primitives in assembly

I'm actually fine with that. I'm not like other people that think language X's runtime must always be written in X. Maybe a version of that for testing and reference implementation. People can understand it. Yet, I'll throw the best ASM coder I can at a VM or even its critical paths if I have to get highest performance that way. Can't let C++ win so easily over the dynamic languages. ;)

I learned OOP with Turbo Pascal 5.5, and already used a few other versions up to Delphi 1.0, C++ and VB, before I got to use Smalltalk.

So I was already comfortable with the concepts as such.

But playing with one of the foundations of OOP concepts had some fun to it: the environment was fully dynamic, so you could explore and change anything (better save the image first).

Also it was my first contact with FP, given the Lisp influence on Smalltalk blocks and collection methods. The original LINQ if you wish.

Then the mind-boggling idea of meta-classes and the interesting things one could do with them.

Smalltalk, given its Language OS, didn't have any main as such; you were supposed to use the Transcript (REPL) or the class browser to launch applications.

As an IDE, you could prune the image and select a specific class as entry point to create a production executable.

But after that semester, I lost access to it, so eventually I spent more time reading those wonderful Xerox PARC books about Smalltalk-80 than using VisualWorks.

As for the VM primitives in Assembly, I also liked it, but many people see that as a disadvantage, like you mention.

Thanks for the detailed reply. Yeah, that is interesting. Closer to the LISP machines than most development environments. Might make me revisit Smalltalk just because it's so hard to get people to try LISP.

Not organized to be able to just throw out a reference and many disappeared over time as old web faded. It's more something you see relative to other OS's than an absolute. I really need to try to integrate all the examples sometime. Here's a few, esp from past, that give you an idea.

Historical look at quite a few http://brinch-hansen.net/papers/2001b.pdf

Note: A number were concurrency safe, had a nucleus that preserved consistency, or were organized in layers that could be tested independently. UNIX was actually a watered-down MULTICS & he's harsh on it there. I suggest you google it too.

Burroughs B5000 Architecture (1961-) http://www.smecc.org/The%20Architecture%20%20of%20the%20Burr...

Note: Written in ALGOL variant, protected stack, bounds checks, type-checked procedure calls dynamically, isolation of processes, froze rogue ones w/ restart allowed if feasible, and sharing components. Forward thinking.

IBM System/38 (became AS/400) https://homes.cs.washington.edu/~levy/capabook/Chapter8.pdf

Note: Capability architecture at HW level. Used intermediate code for future-proofing. OS mostly in high-level language. Integrated database functionality for OS & apps. Many companies I worked for had them and nobody can remember them getting repaired. :)

Oberon System http://www.projectoberon.com/ http://www.cfbsoftware.com/modula2/Lilith.pdf

Note: Brilliance started in Lilith where two people in two years built HW, OS, and tooling with performance, safety, and consistency. Designed ideal assembly, safe system language (Modula-2), compiler, OS, and tied it all together. Kept it up as it evolved into Oberon, Active Oberon, etc. Now have a RISC processor ideal for it. Hansen did similar on the very PDP-11 UNIX was invented on with the Edison system, which had safety & Wirth-like simplicity.

OpenVMS https://en.wikipedia.org/wiki/OpenVMS

Note: Individual systems with good security architecture & reliability. Clustering released in 80's with up to 90 nodes at hundreds of miles w/ uptime up to 17 years. Rolling upgrades, fault-tolerance, versioned filesystem using "records," integrated DB, clear commands, consistent design, and great cross-language support since all had to support calling convention and stuff. Used in mainframe-style apps, UNIX-style, real-time, and so on. Declined, pulled off market, and recently re-released.

Genera LISP environment http://www.symbolics-dks.com/Genera-why-1.htm

Note: LISP was easy to parse, had REPL, supported all paradigms, macro's let you customize it, memory-safe, incremental compilation of functions, and even update apps while running. Genera was a machine/OS written in LISP specifically for hackers with lots of advanced functionality. Today's systems still can't replicate the flow and holistic experience of that. Wish they could, with or without LISP itself.

BeOS Multimedia Desktop http://birdhouse.org/beos/byte/29-10000ft/ https://www.youtube.com/watch?v=BsVydyC8ZGQ

Note: Article lists plenty of benefits that I didn't have with alternatives for a long time and still barely do. Mainly due to great concurrency model and primitives (eg "benaphores"). Skip ahead to 16:10 to be amazed at what load it handled on older hardware. Haiku is an OSS project to try to re-create it.

EROS http://www.eros-os.org/papers/IEEE-Software-Jan-2002.pdf

Note: Capability-secure OS that redid things like networking stacks and GUI for more trustworthiness. It was fast. Also had persistence where a failure could only lose so much of your running state. MINIX 3 and Genode-OS continue the microkernel tradition in a state where you can actually use them today. MINIX 3 has self-healing capabilities. QNX was first to pull it off with POSIX/UNIX compatibility, hard real-time, and great performance. INTEGRITY RTOS bulletproofs the architecture further with good design.

SPIN OS http://www-spin.cs.washington.edu/

Note: Coded OS in safe Modula-3 language with additions for better concurrency and type-safe linking. Could isolate apps in user-mode then link performance-critical stuff directly into the kernel with language & type system adding safety. Like Wirth & Hansen, eliminates all the abstraction gaps & inconsistency in various layers on top of that.

JX OS http://www4.cs.fau.de/Projects/JX/publications/jx-sec.pdf

Note: Builds on language-oriented approach. Puts drivers and trusted components in Java VM for safety. Microkernel outside it. Internal architecture builds security kernel/model on top of integrity model. Already doing well in tests. Open-source. High tech answer is probably Duffy's articles on Microsoft Midori.

So, there's a summary of OS architectures that did vastly better than UNIX in all kinds of ways. They range from 1961 mainframes to 1970-80's minicomputers to 1990's-2000's desktops. In many cases, aspects of their design could've been ported with effort but just weren't. UNIX retained unsafe language, root, setuid, discretionary controls, heavyweight components (apps + pipes), no robustness throughout, GUI issues and so on. Endless problems many others lacked by design.

Hope the list gives you stuff to think about or contribute to. :)


I believe that vezzy-fnord was saying that the unix world was not the ultimate in OS design. Therefore asking where in the unix world is better is missing the point.

Off topic: How do you get an asterisk in your comment without HN flipping it into italics?

But what is better? Even outside of it?

Nowhere in my comment was I saying that the *Nix world is perfect and it needs no further improvement. I'm saying that attacking Cantrill on his thoughts on OS research is kind of disingenuous because Sun and the Illumos contributors are still on the cutting edge for a lot of features as far as widespread OS distributions go.

But what is better? Even outside of it?

If you don't know the answer to that question, then you are in no position to draw sweeping false dichotomies and grand proclamations as you did above. I have no interest in being your history teacher. I've posted plenty of papers on here, as has nickpsecurity here in the comments, and the link in my parent post provides a decent academic summary by one of the greats in the field.

(I'm not even sure if illumos can be described as "widespread" in any fair sense. In any event, being better than some relative competitors doesn't give one the carte blanche to be a pseudohistorian and denialist.)

I haven't spent my entire time on HN reading over every single comment you've made or link you've posted, so I don't think it's particularly fair or productive to act as if it is my fault that I don't have knowledge from something you wrote on some undetermined other comment. I wouldn't expect you to know something I said in some other random place. If you are going to operate under the assumption that anyone you engage in discourse with is going to be familiar with the entire body of your comment history and blog posting work, I feel you're often going to be disappointed.

I'm trying to engage in some fairly civil discourse here, and I am coming into it with an open mind - I am not an academic. Nor am I a systems programmer. My frame of discussion is purely based on what I know, which is mainstream, or relatively mainstream, systems. I work with these systems on a (very) large scale. I know what frame of reference that gives me, and how that applies to me.

I've looked over the other links in these comments. As best I can tell, they deal with academic scenarios, or things that are no longer in widespread use. Cantrill's article is dealing with the here and now of today's production demands. As best I can tell, your argument is that there has been work outside of today's widespread deployments that may be superior to what is currently in it.

It's apparent I misunderstood your original comment to a large extent, and I apologize for that - but the argument I made was in good faith, and while I'm not expecting you to give me a history lesson, I don't think it's unfair to ask that if you are going to engage in online discourse, you be prepared to elaborate on your point if someone is trying to rationally discuss it with you.

I enjoy a lot of what vezzy-fnord has posted (e.g. discussions of init systems) when I see it, but I am also woefully behind on it. I'd love to put these links together in some way that's good for people to browse through and quickly get familiar with options for OS and plumbing design -- would you or vezzy-fnord be interested in helping out with such a website?

I'll warn you that I have a serious bias towards systems that are or could be in wide use, though I think part of helping systems get to wide use, or at least get to the point of informing current development, is making information on them easily accessible. I've discovered a few interesting OS projects in this thread alone.

Especially when you assert X, and someone asks for more information about X, mocking them for not knowing X and saying that you're not willing to be their teacher... that's just being highly rude. Even to say, "as I said, see [URL]" would be helpful, and would take no longer to write than what you [vezzy-fnord] wrote.

Mate, I have nothing but the greatest of respect for you as you know, but this comes off a bit... arrogant. I think I know you well enough now that you aren't, but it would be good to answer the question, even if it's to say "have a look at this resource here (insert URL)".

I'm definitely no fan of Bryan Cantrill, who I consider to be insufferably arrogant and hypocritical (he of the "have you ever kissed a girl", but would fire someone who he didn't employ over a personal pronoun...) but he does have a lot of experience and whilst his views are often controversial, it's probably better to counter them with information :-)

When I find myself in this sort of situation repeatedly, as it appears you do on this topic in this forum, my solution has generally been to assemble a blog post that covers not frequently asked questions but instead frequently delivered answers.

Then rather than getting aggravated at having to repeat myself I just hand them a link to said post and move on.

(and in case you're curious, yes I'm mst on IRC as well, though I think your network got disabled in the last or last but one config prune)

Just found a brief formatting guide that says you put exactly one space after the first asterisk. asterisk space stuff in between asterisk.

Let's try it: * no italics*

EDIT: Ok, space is visible after first and second one disappears. Now I want to know what the commenter is doing too lol.

EDIT 2: The closing asterisk reappeared after I added text after it. Stranger.

It's working for me because I'm only using one asterisk, but not using any special formatting.

As long as I only say *Nix once, no italics...

Ohhh. Yeah, you only used it once. So using it twice is still open to experimentation. Hackers, have at it!

We could use ﹡these﹡ instead. U+FE61, small asterisk.

We could use ∗these∗ instead. U+2217, asterisk operator.

Interestingly, U+2731, heavy asterisk, is stripped by HN it seems.

But to be honest, none of them feel the same as the one true *. I looked at https://en.wikipedia.org/wiki/Unicode_subscripts_and_supersc... hoping there was a way to elevate and scale down a character by some special code to make it look like a regular asterisk but didn't see any. Also, with unusual code points, font support is generally lacking so to some readers of HN, the above mentioned variations will probably render as something such as just squares. Furthermore, one of those I used above messes with apparent line spacing for me.
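If anyone wants to double-check which look-alikes are which, here is a small Python sketch that prints the official Unicode name for each code point mentioned above. It only uses the standard `unicodedata` module; the `describe` helper and the `CANDIDATES` list are names made up for this example, and it says nothing about how HN itself filters characters.

```python
import unicodedata

# The plain asterisk plus the three look-alikes mentioned in this thread.
CANDIDATES = ["\u002A", "\uFE61", "\u2217", "\u2731"]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(ch, describe(ch))
```

Whether a given glyph renders properly still depends on the reader's fonts, which this cannot tell you.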

Ha! I was looking for one of those in character map! Quit when one glitched me. The second one is pretty close. I agree that none look right compared to the real one.

"Furthermore, the ones I used above mess with apparent line spacing."

Yeah, yeah, that's the glitch I was talking about. Dropped 2-3 lines on me and I couldn't even see the character.

Note: We could use the second one in comments the way people use asterisks. Just to screw with people who will wonder why ours are showing. If they ask about it, just act like it looks fine on your end: all italics. Haha.

We'll settle on the second, then.

They look the best as you said and they are also the ones that ∗didn't∗ mess with apparent line spacing.

Regarding messing with people, use the browser dev tools to replace them with the italics open and end tags, then screenshot the result and say to people, "what are you talking about? looks fine to me" and link said screenshot. I'm afraid our scheme will soon be thwarted by other HNers though who will jump in to say "yeah, I see asterisk as well" and then someone else will say "they are using a different unicode character".

Saving them in a text file. :)



  I want to say * nix, and then to *italicize * nix*.

I want to say * nix, and then to italicize nix.

That also left things in italics, which I'll close now.

Test 2:

  I want to say * nix, and then to *italicize *nix*.
Result 2:

I want to say * nix, and then to italicize nix.

Again, this left italics on.

So there doesn't seem to be a way to do what I was trying to do. This is why I said "unix" rather than, you know...

More tests.






(print (+ * (i "NIX")))

EDIT: Darn, I was almost sure the last one would activate one of pg's old routines here. Note that the asterisk disappeared when a backslash came first. Did get NIX italicized but couldn't get rid of the space. Let me try something...

EDIT 2: Didn't work. May not be possible due to formatting rule here. Be nice if they modified it to add an escape code or something to let us drop an arbitrary asterisk.

Please tell me you aren't fuzz testing HN? :-)

With a human computer, no less! Just one comment worth, though. No disruption. Guidelines didn't have a rule against it either. Hopefully in the clear. ;)

lol! All good, just amusing to see you doing this, almost like it's muscle-memory.

It was but more from various forums and programming languages lol...

Your opinion on the author has no bearing on the contents of the article, which i found well argued.

They're overwhelmingly argument by assertion. I'm not sure that counts as "well argued"?

Like the stuff at the end about STM and M:N scheduling. There are systems that use both to tremendous effect. Erlang only offers an M:N scheduling model, and it works extremely well for building the kinds of systems Erlang was built for.

Yeah, both examples plus him not backing up his performance assertion showed he was full of crap. Haskell did nearly bulletproof STM that I hear. I've seen other examples. Your Erlang example is good on M:N. Microkernel with work in AI planning still using it successfully with ever more efficient algorithms. JeOS performance characteristics of the past suggest unikernels might improve performance over monoliths or plain virtualization. Need real-world data rather than assertions.

He seems to be all talk. He's also more critical of unikernel level security and robustness over well-known issues like POLA while ignoring that his own platform violates both POLA and other rules we've learned about INFOSEC over time. The only proven approaches were separation kernels a la Nizza architecture, capability model at OS or component level, and language-based protection w/ extra care on HW interfaces. He doesn't mention those because they oppose UNIX approach, which he ideologically supports, plus show his product is likely not trustworthy.

This article isn't objective or reliable.

I think Bryan Cantrill is referring to both STM and M:N in the context of high performance applications (aka, the norm for him, since he's a kernel engineer). In that sense, both of them do have problems.

Circa 2000, Solaris, Linux, and some (all?) BSDs had M:N threading, and all of them threw it away because there are numerous performance artifacts caused by two schedulers (OS and the M:N) conflicting with each other. Mr. Cantrill was there when this all happened, and even wrote a paper on it: ftp://ftp.cs.brown.edu/pub/techreports/96/cs96-19.pdf

This doesn't affect Erlang as much simply because Erlang is really slow. I think of it as a waterline -- the lower the overhead of the language, the more likely you will run into the serious problems that M:N typically inflicts. E.g. priority inversion, and long and unpredictable latencies when the OS swaps out a thread that M:N wanted running.

As for Haskell and STM, all I know on the topic is that some pretty damn smart engineers at IBM spent a good while trying to make STM work well, and gave up. See this ACM article: https://queue.acm.org/detail.cfm?id=1454466. Perhaps Haskell managed to pull it off through the language restrictions it allows, but I’d like to see some compelling articles on that by engineers who know a thing or two; just because a language has STM doesn’t mean it does it well.

Honestly, I think your comments about M:N and STM are more revealing of your ignorance rather than Cantrill's. While you might disagree with him, there’s sufficient literature to give his stance some legitimacy. Sweepingly calling his stance "full of crap" says a thing or two.

Erlang's M:N scheduling isn't what makes it slow. And not all of Erlang is slow either for that matter.

The reason it works for Erlang is because M:N is the only model. There's no alternate supported model trying to maintain some competing set of semantics to muck up the works.

You can induce problems by running code that doesn't fit inside the model such as NIF code. But as of R18 you can consume a separately maintained threadpool for that stuff.

When the paradigms collide there are sometimes bizarre non-deterministic interactions, but there's nothing wrong with M:N scheduling, craploads of core infrastructure systems are running on M:N scheduling kernels/runtimes.

If you've used money in any way whatsoever ever at any point in your life you've critically depended on this "fad" and will continue to do so for a long, long, long time.

At least in NetBSD, M:N was abandoned not for performance reasons but because it was too complex and could not be fully debugged. I don't know the details, but I could at least easily imagine multiple models being the main issue.

It's something that, if it could be optimal, would only be optimal in specific situations. A lot of tactics CompSci discovers are like that. So, we experiment with them, stash them, whatever. Mainstream and crowd effects in academia both have a tendency to occasionally jump on something like a crusade. Many promises are made, much money thrown at it, and it never delivers. Inability to integrate with models of legacy systems, either at all or with desired attributes, is often one result of this. I knew the moment I saw the activity that it couldn't be trusted without further review. It was at least useful in security kernel model but most of those are gone now. So, we stash it in case it benefits another problem one day or someone has a use for it likely in a clean-slate system.

That's how these things go. I wouldn't expect it to work on NetBSD or any UNIX. Even the good ideas often fail to integrate into UNIX. Half-ass ones like M:N more so. UNIX model is just too broken. Hence, my opposing it here.

> In that sense, both of them do have problems.

POSIX thread scheduling implemented as M:N has problems. General M:N or 1:N schedulers don't necessarily have to have problems, as it is demonstrated in practice by virtualization. Virtualized kernels implement their own scheduler on top of the hypervisor (which also has a scheduler), and it all works great. Performance overhead of virtualization is usually in other places, not in scheduler interactions. But even when it is, people do use VMs, so it works well enough, certainly not "pathologically bad"
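To make the terminology concrete, here is a toy sketch of M:N scheduling in Python: M cooperative tasks (generators) multiplexed onto N OS threads by a user-level run queue. The names `task` and `run_m_on_n` are made up for illustration, and this is nothing like a real POSIX threads implementation. It only shows the two layers being argued about: the run queue here, and the OS scheduler underneath that decides when each worker thread actually runs.

```python
import queue
import threading


def task(name, steps):
    """A user-level task: a generator that yields at each cooperative switch point."""
    for i in range(steps):
        yield f"{name}:{i}"


def run_m_on_n(tasks, n_workers):
    """Multiplex M generator tasks over N OS threads; return the log of executed steps."""
    ready = queue.Queue()  # user-level run queue shared by all workers
    for t in tasks:
        ready.put(t)
    log = []
    log_lock = threading.Lock()

    def worker():
        while True:
            try:
                t = ready.get_nowait()
            except queue.Empty:
                return  # nothing runnable right now; this OS thread retires
            try:
                step = next(t)  # run the task until its next yield
            except StopIteration:
                continue  # task finished, drop it
            with log_lock:
                log.append((threading.current_thread().name, step))
            ready.put(t)  # task is runnable again, on whichever thread picks it up

    threads = [
        threading.Thread(target=worker, name=f"os-thread-{i}")
        for i in range(n_workers)
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return log


if __name__ == "__main__":
    for thread_name, step in run_m_on_n([task(f"t{i}", 3) for i in range(5)], 2):
        print(thread_name, step)
```

The user-level queue has no idea when the OS preempts a worker in the middle of a step, which is the kind of two-scheduler interaction the earlier comments describe.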

I'm not sure about virtualised kernels on hypervisors working well. Under the right conditions they can probably be made to, but I've seen cases where apps that could scale reasonably well to 32 cores on a bare-metal OS had difficulty going beyond 4 cores when inside hardware virtualisation; the hypervisor was migrating threads between cores, which the guest OS was unaware of.

My own experience with KVM has been that performance is problematic, thus I don't really find it compelling evidence that M:N schedulers can work well. Maybe other hypervisors can do it better, but I don't see how they could get around fundamental realities: there usually are more hypervisor threads than real cores available to service them, yet guest OS's aren't designed to gracefully handle virtual cores unexpectedly freezing for periods of time.

Scheduler binding topologies go to great lengths to mitigate this problem.

Don't over-subscribe the cores, disable hyperthreading, and bind scheduler threads to physical cores if your platform supports it.
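A rough sketch of the "bind to a core" part, in Python and assuming a Linux-like platform where `os.sched_setaffinity` exists. `pin_current_thread` is a hypothetical helper, not part of any runtime mentioned here. It pins to a logical CPU only; mapping logical CPUs to physical cores and avoiding hyperthread siblings is platform-specific and not shown.

```python
import os


def pin_current_thread(cpu):
    """Try to restrict the caller to one logical CPU.

    Returns the resulting affinity set, or None if the platform does not
    support affinity calls or refuses the request.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. macOS or Windows: no such call in the os module
    try:
        allowed = os.sched_getaffinity(0)  # 0 means "the caller"
        if cpu not in allowed:
            raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
        os.sched_setaffinity(0, {cpu})
        return os.sched_getaffinity(0)
    except OSError:
        return None  # the kernel or a sandbox denied the request


if __name__ == "__main__":
    print(pin_current_thread(0))
```

Pinning like this is also what leaves cores idle when the workload is erratic, so it only helps when you control the workload.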

Unfortunately, that raises the question: what is over-subscription?

Unless you're prepared to have many potentially-idle cores around, due to 1-1 bindings with an unknown, erratic, or uncontrollable workload, we're suddenly in a pretty tricky realm. You can get around this, but only in very specific and known use-cases you have control over; and I suspect the majority of use-cases fall well outside this.

If you want to more efficiently take advantage of free resources with relatively unpredictable or uncontrollable workloads (e.g. most cloud providers), binding threads to cores is a problem.

"I think Bryan Cantrill is referring to both STM and M:N in the context of high performance applications (aka, the norm for him, since he's a kernel engineer). In that sense, both of them do have problems."

"Honestly, I think your comments about M:N and STM are more revealing of your ignorance rather that Cantrill's."

Probably was. Idk who he is or what he does. I'm aware there were fads to claim the two could solve every problem under the Sun in their category. I'm one of those few that ignore fads to focus on any lessons we can learn from certain tech or research. He made a blanket assertion against at least three technologies in one article with no context. Anyone reading might think the techs don't work well in any context. So, I briefly called that.

As evidence, I gave an example of STM that worked. In OS's, it had mixed results unless hardware accelerated and one typically had to protect I/O stuff with locks due to irreversibility. Far as M:N, only found OS use for it in security kernels (eg GEMSOS) to help suppress covert, timing channels. Didn't expect him to know that use-case, though, since the cloud products are usually full of covert channels. And then there's MirageOS on unikernel end using type- and memory-safety for robust, network applications already showing fewer 0-days than alternatives did.
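For readers who haven't seen STM, here is a minimal sketch of the optimistic idea in Python, assuming a single global commit lock and per-variable version numbers. `TVar`, `Transaction`, and `atomically` are illustrative names borrowed loosely from Haskell's vocabulary, not any library's API, and this toy makes no performance claims. The transaction body is re-run whenever validation fails, which is exactly why irreversible I/O inside a transaction is a problem.

```python
import threading

_commit_lock = threading.Lock()  # one global lock, held only briefly


class TVar:
    """A transactional variable: a value plus a version bumped on every commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version seen at first read
        self.writes = {}  # TVar -> value to publish at commit

    def read(self, tvar):
        if tvar in self.writes:
            return self.writes[tvar]  # read your own pending write
        with _commit_lock:  # take a consistent (value, version) pair
            value, version = tvar.value, tvar.version
        self.reads.setdefault(tvar, version)
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value  # buffered, invisible to others until commit

    def commit(self):
        with _commit_lock:
            # Validate: fail if anything we read was changed by another commit.
            if any(tv.version != seen for tv, seen in self.reads.items()):
                return False
            for tv, value in self.writes.items():
                tv.value = value
                tv.version += 1
            return True


def atomically(body):
    """Run body(tx) until it commits; body must have no irreversible side effects."""
    while True:
        tx = Transaction()
        result = body(tx)
        if tx.commit():
            return result


if __name__ == "__main__":
    a, b = TVar(100), TVar(0)

    def transfer(tx):
        tx.write(a, tx.read(a) - 10)
        tx.write(b, tx.read(b) + 10)

    atomically(transfer)
    print(a.value, b.value)
```

A doomed transaction can still observe a stale mix of values before it fails validation, and nothing here bounds the number of retries, so the livelock question raised elsewhere in the thread applies to this sketch too.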

So, all three already have real-world, beneficial use. Quite the opposite of his claim that leads one to believe that never happens.

"Sweepingly calling his stance "full of crap" says a thing or two."

In other comments, I did present data and specific examples to reject his claims about UNIX being a usable foundation for this sort of thing among others. His approach to such things was to ignore all counter evidence and/or leave the discussion. Eventually, a new article shows him making an adverti... err, a bunch of claims, some backed up and some assertions, about all kinds of tech in an anti-unikernel rant.

Strange that you're fine with him ignoring his critics' counterpoints... with references... to his major claims but have trouble with how I handle minor, tangential ones. Says a thing or two indeed.

I comment on the things I know about. Strange indeed.

Did you know that bulletproof, useful STM implementations existed? And that covert channel suppression was a pre-requisite for secure, cloud applications? And that M:N had been used to help handle that? It seemed to me you didn't know these things.

Instead, you agreed with the blanket statement that those two techs were totally useless instead of a more qualified one that they overpromised and couldn't deliver for "legacy" systems whose models didn't fit. I also bet money that Cantrill's product is still vulnerable to covert, timing channels. I'll win by default because they show up every time a resource is shared, that's his business model, and countering them usually requires ditching UNIX or x86. Want to place that bet? ;)

Citations please. This is the second time I'm asking.

If there are bulletproof STM implementations, there should be some papers which back that up (preferably not academic toys). What throughput and latency profiles did the STM implementations get, how did that scale with cores, what work profiles do they work well with, and how did all that compare with regular threads? How were livelock and side-effects handled? Was this put in production, and what shortcomings were there (there are always shortcomings)? Etc, etc, etc.

Also, reread my initial post more carefully. I qualified it at the very beginning with "in the context of high performance applications”, and I qualified it again near the end; I didn’t simply agree with any sweeping claims. Yet here I am, getting lectured by someone who says that "performance assertion showed he was full of crap"... which I have demonstrated isn't true.

Back your claims up; maybe I'll learn something interesting.

"Citations please. This is the second time I'm asking."

I was giving you no references, only stuff you could Google, because you opened with an insult. I don't do homework for people that do that. That said, I think I figured out where this really started:

"by someone who says that "performance assertion showed he was full of crap"... which I have demonstrated isn't true"

This assertion wasn't about M:N or STM. Those came way after the performance assertion. In his third paragraph, he tried to dismiss unikernels with a blanket assertion about [basically] how their performance can't be good enough. Sums it up as saving context switches isn't enough, as if that's supporters' key metric. Other commenters wanted actual data on different types of unikernels and alternatives supporting his assertion. Instead of providing that, he says "let’s just say that the performance arguments to be made in favor of unikernels have some well-grounded counter-arguments and move on."

Seriously? He just opened a counter to unikernels by saying their bad performance makes them worthless compared to alternatives like his, gives context-switches as a counterpoint, and then asks the reader to just take his word otherwise that there's a good counter to anything they'd ask? Unreal... How could I not look at it as either arrogant or propaganda piece after that?

That was the performance assertion some others and I were talking about. That's what I meant when I said "plus" the performance assertion. I can see how it could be read like it was connected to other two. The M:N and STM claim I made came from this sweeping statement he did for show far as I can tell:

"the impracticality of their approach for production systems. As such, they will join transactional memory and the M-to-N scheduling model as"

That was outside his section on performance toward the end after he talked about all kinds of issues. So, he just says they were impractical for production systems. Despite being used in production systems beneficially. Not for ultra-high performance, like you said, but delivered on claims in some deployments and not purely a fad. So, he doesn't substantiate his prior claims on HN when asked (with references supporting counters), his opening claim there, and doesn't qualify those two. So, I called him out on all of it with the assumption this is an ego or ad piece more than any collection of solid, technical arguments. Only debugging claim really panned out and only partially.

I think we just got to talking at cross-purposes from then on where the discussion conflated what the performance assertion was with my gripe about his overstatements on other two technologies. So, I'm dropping that tangent from here on now that I see what happened. Pointless as I don't disagree with your performance assertion about two technologies so much as his unsubstantiated assertion about unikernels and overstatements about two technologies. So, I'm done there.

Far as my original comment, we're still waiting on him to back up that performance assertion he made that unikernels provide no performance advantage despite reports to contrary from users. I'm also waiting for his security claims on OS vs hypervisors as I've seen more robust, even formally-verified, hypervisors than OS's in general with ZERO verified or secure UNIX's. Merely attempts. I've got citations going back to the 70's supporting a minimal-TCB, kernelized model as superior to unrestrained, monolithic kernels in security. Love to see what his counterpoint is on that. Well, counterpoints given the volume and quality of evidence on my side. :)

I don't often see you commenting.

I personally, as a Rust guy, was disappointed by the loss of M:N threading; without that, you cannot effectively use threads as a design tool rather than simply ("simply" :-)) a parallelization/performance tool.

Erlang has a lot of neat ideas.

I thought the idea in Rust was to provide solid primitives so that libraries can then provide m:n/green threading if wanted. And sure enough we have stuff like coio and mioco which provide coroutines. Is there something missing in those that you'd want from m:n threading?

m:n does not an erlang make though (witness Go), and Rust was not headed in that direction.

I know, I know. And Rust is a better systems language for it.

Still, I can dream.

Take a look at pony-lang. It's sorta like what Rust was once going to be. It's sort of implicitly going for a bare-metal Erlang-shaped thing.

There's a lot missing, not the least of which is a set of abstractions over the rich primitives which provide simple programming interfaces that don't force a lot of the language's complexity on you, but if what you want is something that might one day be a really fast Erlangish it's probably the front-runner at the moment.

I still have high hopes for BEAMJIT existing as a production-grade thing someday. That way I don't have to drop into C in my Erlang applications as often. Sure BEAMJIT will probably never be as fast, but if it can be good-enough while also not forcing me across the safety boundary to get to that good-enough state... that's a huge win.

I'd say he's half-right on the M:N scheduling.

For implementing POSIX threads, M:N threading has a lot of extra complexity and few advantages, and worse performance.

But if you don't have POSIX threads, but something higher-level that's managed by a runtime anyway, you don't need to worry about things like ensuring enough VM space and at least one allocated page for each thread stack.

Erlang's and GHC's threads start out needing much less space than a single page, so you can start 10x more threads in the same amount of RAM.
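
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The figures are illustrative assumptions, not measurements: one 4 KiB page as the minimum up-front cost of a page-backed stack, a hypothetical 400-byte initial footprint for a runtime-managed thread, and 1 GiB of RAM set aside for stacks.

```python
PAGE_SIZE = 4096           # bytes; a common page size, assumed here
LIGHTWEIGHT_INITIAL = 400  # bytes; hypothetical runtime-managed thread


def max_threads(ram_bytes: int, per_thread_bytes: int) -> int:
    """How many threads fit if each needs per_thread_bytes up front."""
    return ram_bytes // per_thread_bytes


ram = 1 << 30  # 1 GiB reserved for thread stacks (assumption)

page_backed = max_threads(ram, PAGE_SIZE)
lightweight = max_threads(ram, LIGHTWEIGHT_INITIAL)

# Ratio of lightweight threads to page-backed threads in the same RAM.
ratio = lightweight / page_backed
```

With these assumed sizes the ratio comes out at roughly 10, which is where the "10x" figure comes from; real runtimes will differ in the exact initial size.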

Solaris used to have an M:N scheduling model and they've mostly abandoned it, didn't they? I wonder what Erlang got right that Solaris got wrong.

Erlang is a low-throughput, low-latency language. It makes no apologies for the former and it even says so on the tin. That allows it to do stuff other languages/runtimes can't.


His point if there is one is that:

>> Unikernels are entirely undebuggable.

As evidence he offers the sound of some applause at some random conference, and I don't believe the claim is even remotely true, unless it is so general as to be vacuous.

Your notion of well-argued differs greatly from mine.

Unikernels solve an important class of problems that Joyent is interested in solving with another platform: its own.

I see no mention of a significant benefit of unikernels: their decreased attack surface.

There is a fairly new community web site at http://unikernel.org/ for those interested.

  I see no mention of a significant benefit
  of unikernels: their decreased attack surface.
The entire 4th paragraph of the article is about the decreased attack surface.

Edit: OK, not the entire paragraph, but it covers the topic.

Don't forget the following argument:


Yes losing more respect for the guy with every passing hour.

Don't Google his exchanges with David Miller then.

That happened in 1996. People make mistakes. Give it a rest.

No, it's valid. This same person hasn't ever apologised, but then turned around making himself out to be the bastion of women's equality. The hypocrisy is stunning.

The comment about using it as a reference for an OS class greatly disturbed me. Hopefully it was facetious.

Oh - where did he say that?

Well if you can't run more than one process then how are you supposed to debug your application?

It's also worth considering that Joyent is likely feeling threatened by Docker picking up unikernel wonks, because until now Joyent has been able to keep Docker in a "containerization and deployment" box, competing on other things in SmartOS and other parts of their stack. Now Docker is pretty plainly coming for more of their pie (with the winds of "hot new valley thing" behind them, which Joyent distinctly lacks), and this wouldn't be the first time Cantrill has blogged out of feeling threatened or angry.

I don't agree with minimizing arguments based upon personal opinions of the author. Please don't interpret my comment that way. I'm just pointing out context, which is worth considering when evaluating the arguments.

I don't think the posting had anything to do with Joyent's strategy in the market, but everything to do with him being goaded on Twitter into writing the article: https://twitter.com/bcantrill/status/690215406317875200

The man is his own worst enemy.

The arguments are either sound or they're not. Why does it matter who wrote them and what motives he or his employer might have?

Cherry-picking examples that match own interests. If you know less about the subject than the author then you can't verify whether they base their arguments on all available information or not.

Or put another way. You have two people presenting arguments for database designs. Both base them on known and cited research papers. Both provide sound arguments for their design. One is an independent researcher and one works for Oracle. What are the chances one of the designs will be very similar to Oracle's database?

When someone offers, with no evidence, a bunch of claims that smoking is in fact safe, and their employer is BAT, it's quite relevant.

Big upvotes for this article. I'm glad it was written, because I've seen nothing but hype for Unikernels on Hacker News (and in ACM, etc.) for the last 2 years. It's great to see the other side of the story.

The biggest problem with Unikernels like Mirage is the single language constraint (mentioned in the article). I actually love OCaml, but it's only suitable for very specific things... e.g. I need to run linear algebra in production. I'm not going to rewrite everything in OCaml. That's a nonstarter.

And I entirely agree with the point that Unikernel simplicity is mostly a result of their immaturity. A kernel like seL4 is also simple, because like unikernels, it doesn't have that many features.

If you want secure foundations, something like seL4 might be better to start from than Unikernels. We should be looking at the fundamental architectural characteristics, which I think this post does a great job on.

It seems to me that unikernels are fundamentally MORE complex than containers with the Linux kernel. Because you can't run Xen by itself -- you run Xen along with Linux for its drivers.

The only thing I disagree with in the article is debugging vs. restarting. In the old model, where you have a sys admin per box, yes you might want to log in and manually tweak things. In big distributed systems, code should be designed to be restarted (i.e. prefer statelessness). That is your first line of defense, and a very effective one.

> The only thing I disagree with in the article is debugging vs. restarting. In the old model, where you have a sys admin per box, yes you might want to log in and manually tweak things. In big distributed systems, code should be designed to be restarted (i.e. prefer statelessness). That is your first line of defense, and a very effective one.

But if you never understand why it was a bad state in the first place you're doomed to repeat it. Pathologies need to be understood before they can be corrected. Dumping core and restarting a process is sometimes appropriate. But some events, even with stateless services, need in-production, live, interactive debugging in order to be understood.

> But some events, even with stateless services, need in-production, live, interactive debugging in order to be understood.

The question then becomes if it is reproducible since "debuggable when not running normally" seems to be the common thread of unikernels, such as being able to host the runtime in Linux directly rather than on a VM.

I think if you try a low level language these kinds of things are going to bite you, but a fleshed out unikernel implementation could be interesting for high level languages, since they typically don't require the low level debugging steps in the actual production environment.

In either case unikernels have a lot of ground to cover before they can be considered for production.

You don't need to be able to log in to be able to support a remote debugging stub, though.

Running unikernels on SEL4 is a perfectly sane thing to do. SEL4 does not provide the network stack, or much application interface, so a unikernel is a great thing to put on top.

This would be awesome. Running unikernels on top of a formally verified layer sounds really interesting.

I really do feel like at the end of this container experiment we're going to reinvent microkernels. Possibly badly, but I'm hopeful that it won't work out that way.

Same here.

" I'm glad it was written, because I've seen nothing but hype for Unikernels on Hacker News (and in ACM, etc.) for the last 2 years. "

That much is true. I'm countering what I can where it gets overblown. Just part of something going mainstream... crowd effects and so on...

"The only thing I disagree with in the article is debugging vs. restarting. In the old model, where you have a sys admin per box, yes you might want to log in and manually tweak things. In big distributed systems, code should be designed to be restarted (i.e. prefer statelessness). That is your first line of defense, and a very effective one."

Additionally, there's no inherent reason that I see that unikernels are impossible to debug. We can debug everything from live hardware up to apps with existing tooling. So, if unikernels are lacking, it just means they're still young and nobody has adapted proven techniques to debugging them. I imagine the simple ones on simpler HW will make that even easier.

Apparently, you can't Google search for them:



Enough said about what? That there's no current debugging (my claim) or that it's impossible (his)?

"Now, could one implement production debugging tooling in unikernels? In a word, no"

That's a lie: you could implement it. Debugging has been implemented for everything from ASIC's to kernel mode to apps in other categories. Whether unikernel crowd wants to or will is another question. Not looking good there per Google. Him talking like it's impossible shows he has an agenda, though.

I'm guessing he didn't tell everyone to ditch the entire concept of Joylent's offering early on because they were lacking some important features or properties. Just a guess but I'd bet on it.

EDIT: Changing search terms to "debugging" "xen" "guests" got me two results showing foundations are already built. Weak compared to UNIX but there.



Yeah, that's kind of my point :-) I was being a tad sarcastic with the hashtag.

Gotcha. Cool.

> The biggest problem with Unikernels like Mirage is the single language constraint

Do you write all your software in C? Of course not. The single language constraint doesn't exist, for the same reasons we can write Go software that runs on the Linux kernel

But the entire point is that unikernels aren't like the Linux kernel, that there isn't a userspace/kernelspace boundary in the way that there is on Linux and other traditional OSes.

From the paper (http://unikernel.org/files/2013-asplos-mirage.pdf):

> We present unikernels, a new approach to deploying cloud services via applications written in high-level source code. Unikernels are single-purpose appliances that are compile-time specialised into standalone kernels, and sealed against modification when deployed to a cloud platform. In return they offer significant reduction in image sizes, improved efficiency and security, and should reduce operational costs. Our Mirage prototype compiles OCaml code into unikernels that run on commodity clouds and offer an order of magnitude reduction in code size without significant performance penalty.

> An important decision is whether to support multiple languages within the same unikernel. [...] The alternative is to eschew source-level compatibility and rewrite system components entirely in one language and specialise that toolchain as best as possible. [...] existing non-OCaml code can be encapsulated in separate VMs and communicated with via message-passing [...]

> We did explore applying unikernel techniques in the traditional systems language, C, linking application code with Xen MiniOS, a cut-down libc, OpenBSD versions of libm and printf, and the lwIP user-space network stack.

That is, there is absolutely a single language constraint, on purpose, as arguably the primary differentiation from non-unikernels.

Just because some concrete system call interface is missing doesn't mean you can't glue layers of code together from different languages, that's some logical jump I don't understand.

The single language constraint you're continuing to claim exists is addressed by their own blog posts: https://mirage.io/blog/modular-foreign-function-bindings . After reading this, I can see approximately 50 lines of code that would let me run a Python interpreter (another mostly memory-safe language, btw) within the unikernel, assuming enough of the Unix syscall interface was already implemented in shims back into OCaml land, something I presume will likely be released in the coming months in preparation for making their stuff useful to the general public.

Yes, you can, but you shouldn't. The continuation of the third passage I quoted explains why they think that several of the advantages of unikernels disappear if you try writing things in not-OCaml. It looks like the instructions you're linking are intended for third-party libraries that already exist in C, not for entire applications (although, yes, that would work).

I mean, you also can port applications to Linux kernelspace. (Remember the Tux web server?) But that's not really the point of Linux, and if you want that, you should... use a unikernel.

That said, sure, it's entirely possible that as they shift focus from an academic project to a commercial one, they'll give up on this distinction and its performance advantages, and start marketing a product that lets you just write C. (Just like they may well give up on hypervisor-based parallelism and add fork().) But that's not how they're currently envisioning the concept.

High-assurance security approaches on separation/MILS kernels have been doing this successfully for years. It's common for those RTOS's to have a native target on microkernel, a safety-critical runtime for Ada/Java, a featured runtime for them, a POSIX layer, user-mode Linux... all containing pieces of the system or even an application working together through robust middleware.

So, it's a proven model that's literally flying through the air right now due to aerospace take-up. It would likely work for unikernels, too, so long as they included same checks/mediation at interface points or middleware that prior model required. The only real questions should be about the resulting attributes of that system: is it a good approach vs regular unikernels w/ performance, containment, etc (theory vs practice)? Or just ditch them to enhance separation kernels, micro-hypervisor platforms, or capability systems?

Personally, I'm not sold on unikernels for resilience: prior security models were better, field-proven, and survived advanced pentesting. Under-utilized imho. Cross-language is similar in both, though, with attributes of one application likely carrying to other. The real problem is the TCB being complex & insecure, breaking isolation paradigm.

Non-C programs on Linux either implement the C ABI (or the Linux system call interface) or run via interpreters that do.

Can your preferred language link to the OCaml ABI? Do you have an interpreter written in OCaml for it?

It's much easier than you think. OCaml can expose functions with a C ABI. You can put newlib on top of MirageOS, calling down to OCaml for basics (and newlib needs only a handful of IO and memory functions to be provided, so it isn't particularly hard), and then you have instant POSIX. Now you can run most anything you want from C-land.

Are we talking about microservices that talk over TCP of some sort? If that is the case it is a moot point as the other end need not even compile against a unikernel at all, it can sit on a traditional OS.

Exactly. They seem to be creating unsolvable, strawman problems instead of assuming we start with the middleware approach to getting things to work together. And there's both very efficient and very robust tooling available for that these days.

No, we're talking about running things under unikernels.

I mean yes, you could have some of your app implemented in OCaml and then it talks over TCP to some Fortran running on Linux to do linear algebra, but that approach has its own problems.

You are splitting the problem at the wrong place. Microservices is a well known thing and is what is providing the fire under unikernel research, you cannot simply ignore the aspect of "do the hard thing in a different environment" when talking about microservices.

IMO the popularity of microservices is a lot more about languages that provide very weak isolation guarantees (i.e. Ruby). OCaml is a strongly typed language with an excellent module system, and so microservices offer relatively little value.

That is no different from the days C was only meaningful to UNIX users.

The languages have to implement the OS ABI, whatever that is.

Lisp Machines had Fortran, Ada and C compilers available.

Interesting article. Rather than arguing what can or cannot be done or what might or might not work, here's some code, and some history.

Here's full-mixed-language programmable, locally- and fully-remote-debuggable, mixed-user and inner-mode processing unikernel, and with various other features...

This from 1986...


FWIW, here's a unikernel thin client EWS application that can be downloaded into what was then an older system, to make it more useful for then-current X11 applications...

From 1992...


Anybody that wants to play and still has a compatible VAX or that wants to try the VCB01/QVSS graphics support in some versions of the (free) SIMH VAX emulator, the VAX EWS code is now available here:


To get an OpenVMS system going to host all this, HPE has free OpenVMS hobbyist licenses and download images (VAX, Alpha, Itanium) available via registration at:


Yes, this stuff was used in production, too.

Not only was it used in production, from the first-hand anecdotal accounts I've heard, the VAX/VMSclusters were near-z/OS level of reliability. For a brief time, it was used both in mission-critical environments as well as in academic institutions (basically two of the three large markets that existed during that era).

Every 10 years, the same thing gets re-invented. Take network block devices/clustered sharing. VMS had high-availability and each node you joined could use its local disk as an aggregate resource. In the 90s you had AndrewFS and CODA (CMU's golden age IMO). Then Linux had the whole DRBD era which gained traction about 10 years ago right around the time Hadoop was gaining traction. OpenStack has Cinder. 10 years from now we'll have something else.

Anyways, great points and good post. VAXstations are available on ebay for pretty cheap, but I'd personally go with a hobbyist OpenVMS Alpha license running on ES40. I threw a setup together a few years back and it was neat. Thanks for the data-sheet, my father will get a huge kick out of it.

There are presently OpenVMS servers and clusters in production in a number of locations, and new configurations are being installed — primarily for existing applications, obviously.

The most recent OpenVMS release shipped in June 2015, and the next release is due to ship in March 2016.

There's a port to x86-64 underway, as well.

For those looking for hardware for hobbyist use, used Integrity Itanium servers are usually cheaper than used working Alpha and VAX gear, and newer — working VAX and Alpha gear has become more expensive in recent years. Various VAX and Alpha emulators are available, either as open source or available to hobbyists at no cost.

Well that is pretty provocative :-) Bryan might be surprised to learn that for the first 15 years of its existence NetApp filers were Unikernels in production. And they outperformed NFS servers hosted on OSes quite handily throughout that entire time :-).

The trick though is they did only one thing (network attached storage) and they did it very well. That same technique works well for a variety of network protocols (DNS, SMTP, Etc.). But you can do that badly too. We had an orientation session at NetApp for new employees which helped them understand the difference between a computer and an appliance, the latter had a computer inside of it but wasn't programmable.

When NetApp started publishing about how their filers worked, I saw it as a great demonstration of the power of computer science.

Their patents kept direct competitors away, but the core ideas of the write-ahead filesystem had a big impact on an industry that was going ahead anyway and implementing these in kernels, userspace, etc.

Then came the SSD and they had no brilliant idea of how to make more of that...

Had they had a more mainstream platform, they could have caught up with "code moving" instead of "data moving", because you could run established software like Hadoop. If your core OS is Linux, Windows or something mainstream you can get all kinds of software, but if you have to port it to something codesigned in 1993 you probably won't do it.

Hi Chuck!

At risk of speaking for Bryan, I think the difference between a NetApp and a unikernel-in-a-hypervisor is the sharedness of it. Without taking a position on Bryan's article (it's always entertaining to read his thoughts), I think his point is that the advantages of a unikernel are largely washed away, and the disadvantages are emphasized.

While Bryan is somewhat bombastic (more fun to read), there's a lot of smart in this article, I think.

Some of the original "unikernel in a shared environment" research and development was done on IBM's VM system in the 70's. Their motivation was that there were many customers who used a dedicated application on what was then a mainframe to process their data (this would be analogous to their unikernel) and they wanted to consolidate their hardware, so they got a bigger mainframe (like an IBM 370 at the time) and they would run all of these dedicated applications in their own logical partition (LPAR). It brought three huge benefits to the table (error containment, hardware consolidation, and backward compatibility). Because IBM had experienced the same effect that we're seeing today, their new mainframes in the 70's were so much more powerful than the ones from the 50's and 60's, and for them what was worse the ones from the 50's and 60's running their dedicated apps were still fine. So they developed a way to make computing more cost effective for their customer and at the same time opened the market further for their mainframes.

Today, a typical dual core 8GB x86 machine can, as a dedicated machine run a lot of things. At the same time the evolution of open systems have brought "continuous configuration integration" into the mainstream, all major OSs from OS X to Windows to Linux have weekly, sometimes daily, reconfiguration events.

And while the number of the changes in aggregate are high, the number of changes on particular subsystems are low. Unikernels answer the need of creating a stable enough snapshot of the world to allow for better configuration management. Look at the example of the FreeBSD system taken down after 20 years. Some services can just run and run and run.

Image isolation is a thing, and you can only be as good as your underlying software and hardware can make you, but it can also be a big boost to operational efficiency if that can simplify your security auditing and maintenance.

So my take on Bryan's article was that he came at the argument from one direction, which is fine, but to be more thorough it would help to look at it from several directions. What was worse was that he made some assertions (like never being in production) before defining precisely what he means by a Unikernel, which leaves him wide open to examples like NetApp and IBM's VM system to counter his assertion.

The nice thing about computers these days is that many of the problems we experience have been experienced in different ways and solved in different ways already, and we can learn from those. The Unikernel discussion is not complete without looking through the history of machines which are dedicated appliances (from Routers, to Filers, to Switches, to Security camera archivers).

Like most things, I don't think unikernels are a panacea but they also aren't the end of the world and have been applied in the past with great success.

> the difference between a computer and an appliance, the latter had a computer inside of it but wasn't progammable.

An appliance is a computer that is programmable, just not by the person that owns it.

Sometimes it is owned by the vendor who rents it to the user.

Isn't it BSD based or that happened only after "first 15 years"?

> Unikernels are entirely undebuggable

I'm pretty sure you debug an Erlang-on-Xen node in the same way you debug a regular Erlang node. You use the (excellent) Erlang tooling to connect to it, and interrogate it/trace it/profile it/observe it/etc. The Erlang runtime is an OS, in every sense of the word; running Erlang on Linux is truly just redundant, since you've already got all the OS you need. That's what justifies making an Erlang app a unikernel.

But that's an argument coming from the perspective of someone tasked with maintaining persistent long-running instances. When you're in that sort of situation, you need the sort of things an OS provides. And that's actually rather rare.

The true "good fit" use-case of Unikernels is in immutable infrastructure. You don't debug a unikernel, mostly; you just kill and replace it (you "let it crash", in Erlang terms.) Unikernels are a formalization of the (already prevalent) use-case where you launch some ephemeral VMs or containers as a static, mostly-internally-stateless "release slug" of your application tier, and then roll out an upgrade by starting up new "slugs" and terminating old ones. You can't really "debug" those (except via instrumentation compiled into your app, ala NewRelic.) They're black boxes. A unikernel just statically links the whole black box together.
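
The start-new-then-terminate-old rollout described above can be sketched in a few lines of Python. This is an in-memory illustration only: `Instance`, `launch` and `rolling_upgrade` are hypothetical names, and `launch` stands in for whatever actually boots a VM or unikernel image.

```python
from dataclasses import dataclass
from itertools import count
from typing import List

_ids = count(1)  # hypothetical source of unique instance ids


@dataclass(frozen=True)
class Instance:
    """An immutable release slug: never patched after launch."""
    instance_id: int
    version: str


def launch(version: str) -> Instance:
    # Stand-in for booting a VM or unikernel image (assumption).
    return Instance(next(_ids), version)


def rolling_upgrade(pool: List[Instance], new_version: str) -> List[Instance]:
    """Return a new pool where every instance has been replaced.

    Each replacement is launched before its predecessor is dropped,
    so capacity never falls below the original pool size.
    """
    current = list(pool)
    for old in pool:
        current.append(launch(new_version))  # start the new slug first
        current.remove(old)                  # then terminate the old one
    return current


old_pool = [launch("v1") for _ in range(3)]
new_pool = rolling_upgrade(old_pool, "v2")
```

Nothing is ever modified in place: the old instances still exist unchanged in `old_pool`, and the new pool consists entirely of freshly launched instances.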

Keep in mind, "debugging" is two things: development-time debugging and production-time debugging. It's only the latter that unikernels are fundamentally bad at. For dev-time debugging, both MirageOS and Erlang-on-Xen come with ways to compile your app as an OS process rather than as a VM image. When you are trying to integration-test your app, you integration-test the process version of it. When you're trying to smoke-test your app, you can still use the process version—or you can launch (an instrumented copy of) the VM image. Either way, it's no harder than dev-time debugging of a regular non-unikernel app.

> You don't debug a unikernel, mostly; you just kill and replace it

Curious as to how you would drive to root cause the bugs that caused the crash in the first place? If you don't root cause, won't subsequent versions still retain the same bugs?

There are bugs that can only manifest themselves in production. Any system where we don't have the ability to debug and reproduce these classes of problems in prod is essentially a non-starter for folks looking to operate reliable software.

Personally, I wouldn't use unikernels for my custom app-tier code† (unless my app-tier either was Erlang, or had a framework providing all the same runtime infrastructure that Erlang's OTP does.)

Instead, unikernels seem to me to be a good way to harden high-quality, battle-tested server software. Of two main kinds:

• Long-running persistent-state servers that have their own management capabilities. For example, hardening Postgres (or Redis, or memcached) into a unikernel would be a great idea. A database server already cleans and maintains itself internally, and already offers "administration" facilities that completely moot OS-level interaction with the underlying hardware. (In fact, Postgres often requires you to avoid doing OS-level maintenance, because Postgres needs to manage changes on the box itself to be sure its own ACID guarantees are maintained. If the software has its own dedicated ops role—like a DBA—it's likely a perfect fit for unikernel-ification.)

• Entirely stateless network servers. Nginx would make a great unikernel, as would Tor, as would Bind (in non-authoritative mode), as would OpenVPN. This kind of software can be harshly killed+restarted whenever it gets wedged; errors are almost never correlated/clustered; and error rates are low enough (below the statistical noise-floor where the problem may as well have been a cosmic ray flipping bits of memory) that you can just dust your hands of the responsibility for the few bugs that do occur (unless you're operating at Facebook/Google scale.) This is the same sort of software that currently works great containerized.


† The best solution for your custom app-tier, if you really want to go the unikernels-for-everything route, might be a hybrid approach. If you run your app in both unikernel-image, and OS-process-on-a-Linux-VM-image setups, you'll get automatic instrumentation "for free" from the Linux-process instances, and production-level performance+security from the unikernel instances. You could effectively think of the Linux-process images as being your "debug build."

Two ways of deploying such hybrid releases come to mind:

• Create cluster-pools consisting of nodes from both builds and load-balance between them. Any problems that happen should eventually happen on an instrumented (OS-process) node. This still requires you to maintain Linux VMs in production, though.

• Create a beta program for your service. Run the Linux-process images on the "beta production" environment, and the unikernel-images on the "stable production" environment. Beta users (a self-selected group) will be interacting with your new code and hopefully making it fall down in a place where it's still instrumented; paying customers will be working with a black-box system that—hopefully—most of the faults will have been worked out of. Weird things that happen in stable can be investigated (for a stateful app) by mirroring the state to the beta environment, or (for a stateless app) by accessing the data that causes the weird behavior through the beta frontend.
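
For the mixed-pool option, a quick Python sketch shows how "eventually" depends on the share of instrumented nodes. It assumes uniform random load balancing and, importantly, that a fault is equally likely on either build; a bug that only manifests in the unikernel build breaks that assumption. The function names are made up for illustration.

```python
def p_seen_on_instrumented(instrumented_fraction: float, occurrences: int) -> float:
    """Chance that at least one of `occurrences` independent faults
    lands on an instrumented node."""
    return 1.0 - (1.0 - instrumented_fraction) ** occurrences


def occurrences_needed(instrumented_fraction: float, target: float) -> int:
    """Smallest number of occurrences giving at least `target` probability."""
    if not 0.0 < instrumented_fraction <= 1.0:
        raise ValueError("instrumented_fraction must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 1
    while p_seen_on_instrumented(instrumented_fraction, n) < target:
        n += 1
    return n


# Example: 10% of the pool runs the instrumented Linux-process build.
after_ten = p_seen_on_instrumented(0.1, 10)
needed_for_95 = occurrences_needed(0.1, 0.95)
```

With 10% instrumented nodes, ten occurrences give about a 65% chance of catching the fault on an instrumented node, and it takes 29 occurrences to reach 95%. Whether that is acceptable depends on how often the fault happens and how costly each occurrence is.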

> In fact, Postgres often requires you to avoid doing OS-level maintenance, because Postgres needs to manage changes on the box itself to be sure its own ACID guarantees are maintained.

Huh? What are you talking about?

With unikernels you get a lot more consistency. E.g. I once saw a bug that came down to one server using reiserfs and another using ext2. But there's no way to have that problem with a unikernel.

But sure, you need a debugger. So you use one. I'm not sure why the author seems to think that's so hard.

> But sure, you need a debugger. So you use one. I'm not sure why the author seems to think that's so hard.

The author wrote and continues to contribute to DTrace, which is an incredibly advanced facility for debugging and root causing problems. GDB (for example) doesn't help you solve performance problems or root-cause them, because now your performance problem has become ptrace (or whatever tracing facility GDB uses on that system).

The point he was making is that there are problems with porting DTrace to a unikernel (it violates the whole "let's remove everything" principle, and you couldn't practically modify what DTrace probes you're using at runtime because the only process is your misbehaving app -- good luck getting it to enable the probes you'd like to enable).

You can't modify them from within your app, sure. You modify them from a (privileged) outside context. Allowing the app to instrument itself that thoroughly violates the principle of least privilege the author was so fond of.

> With unikernels you get a lot more consistency

That's not unique to unikernels. You can get that with Docker containers or EC2 instances on AWS.

There's a lot more that can happen differently there. Docker doesn't hide all the details of the filesystem, kernel version, or the like. With EC2 instances you're still running a kernel that has a lot of moving parts of its own.

> With EC2 instances you're still running a kernel that has a lot of moving parts of its own.

I'm sure that's true, but it's not a statement directly about consistency.

It's a lot harder to get consistency out of a non-unikernel system running on EC2 - e.g. IME the linux boot/hardware probing process can behave nondeterministically before it even starts running your user program.

The problem is how to debug sporadic production problems. It doesn't matter how good your tooling is to debug a problem in a dev environment if you have no idea how to reproduce it there.

You need some level of decent logging in the production environment (with optional extra logging that can be turned on) to capture WTF went wrong. THEN you can try to reproduce it. When a logged system goes boom, you need it to come out of production and remain around until those logs are saved somewhere.

It is old, but http://lkml.iu.edu/hypermail/linux/kernel/0009.1/1307.html is an interesting data point about IBM's experience. They implemented exactly the kind of logging that I advocate in all of their systems until OS/2. And the fact that they didn't have it in OS/2 was a big problem.

You're right. Coming from the Erlang world, I almost implicitly assume good logging (and extremely useful task-granular crashdumps.) If you're implementing your app on a custom unikernel "platform" that isn't basically-an-OS, the first and most important step is logging the hell out of everything.

But that's the case for every system of sufficient complexity (that I've seen).

There's an alternate approach that I was mentally contrasting with. You tend to see it with e.g. Ruby apps on Heroku: you can't "log into" a live app, but you can launch an IRB REPL—with all the app's dependencies loaded—in the context of the live app.

Having this ability to "futz around in prod" frequently obviates the need for prescient levels of logging. You can poke at the problem until it crashes for you.

Ruby folks, don't let not being in Heroku stop you from taking this approach, because it's the best. pry-remote (are you using irb instead of pry? please stop) will give you the same fantastic behavior. It's part of every persistent Ruby service I deploy (guarded for internal access only, of course, don't be crazy).


I haven't tried deploying a unikernel in production yet - I've been mostly using/debugging only OCaml code on Linux in production - but it should be possible to implement the kind of logging you describe. For example I've seen a project that would collect&dump its log ring-buffer when an exception was encountered in a unikernel, or one that collects very detailed profiling information from a unikernel.

It would be nice to have some kind of a "shell" to inspect details about the application when something goes wrong, but that applies equally to non-unikernel applications. The difference is that with unikernels the debugging tool would be provided as a library, whereas with traditional applications debugging is provided as a language-independent OS tool.
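For what it's worth, the "keep a debug-level ring buffer and dump it on exception" idea is easy to provide as a library, which is the shape it would have to take in a unikernel. Below is a minimal sketch using only Python's standard `logging` module. It is not the code of the project mentioned above; `RingBufferHandler` and `run_with_crash_dump` are hypothetical names chosen for this example.

```python
import collections
import logging
import sys


class RingBufferHandler(logging.Handler):
    """Keep only the last `capacity` formatted records in memory.

    Nothing is written anywhere until dump() is called, so debug-level
    logging stays cheap during normal operation.
    """

    def __init__(self, capacity=1000):
        super().__init__(level=logging.DEBUG)
        self.records = collections.deque(maxlen=capacity)

    def emit(self, record):
        # Old entries fall off the left end once the deque is full.
        self.records.append(self.format(record))

    def dump(self, stream):
        # Write the buffered records, oldest first, then start fresh.
        for line in self.records:
            stream.write(line + "\n")
        self.records.clear()


def run_with_crash_dump(task, logger, ring, stream=sys.stderr):
    """Run task(); on an unhandled exception, log it, dump the ring buffer
    to `stream`, and re-raise."""
    try:
        return task()
    except Exception:
        logger.exception("unhandled exception, dumping ring buffer")
        ring.dump(stream)
        raise
```

Usage would be roughly: attach a `RingBufferHandler` to your logger at startup, then wrap the top-level request handler (or main loop) in `run_with_crash_dump`. In a real unikernel the `stream` would presumably be a console or a network sink rather than stderr.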

Just to add some links for those (I assume these are the ones you mean):

Dumping the debug-level log ring-buffer on exception:


Detailed profiling (interactive JavaScript viewer for thread interactions):


I use Go on App Engine and have never been able to SSH or GDB those machines. Nevertheless, I am still able to debug issues in my app.

I admit that debugging is easier on platforms where you get more control.

> It's only the latter that unikernels are fundamentally bad at.

Well, that's the point of the article.

What good does restarting your service do if the issue will stay there, and come back again later?

It's also not true though.

Running a LING unikernel gets you a huge set of tracing and debugging capabilities that aren't 100% compatible with what's provided by BEAM, but it's close and a couple orders of magnitude better than what almost any other language provides for mucking with a live system; unikernel or not.

Running Erlang via Rumprun gets you a standard BEAM/erts release packaged as a VM image or a bootable ISO, and that is just straight up Erlang. So all the absolutely excellent debugging, tracing, and profiling facilities that any random Erlang application has access to are also accessible when deployed as a unikernel (rumpkernel).

Exactly my thoughts. In addition, I doubt that the author is familiar with the principles of OTP (or immutable services in general), or he wouldn't dismiss restarting a service once it misbehaves as impractical.

I think most people are unfamiliar with Erlang/OTP. Even if I converted my entire team over, I don't know how I'd hire new people without developing strong internal training.

Perhaps that means the real argument is "Unikernels are unfit for production for people who aren't comfortable with the Erlang/OTP way of doing things." Yes, this isn't a technical argument, but—most people, most customers, don't even see SmartOS + Linux emulation as suitable for production (and I imagine the author knows that), not for any technical reason but just for unfamiliarity. And that's still a UNIX working in UNIXy ways.

Hell, a lot of the base docker images I've been using lately don't even come with netstat or vim on them.

I'm already feeling the "can't debug" pinch and we haven't even started getting into unikernels yet.

I thought the idiomatic way of debugging containerized applications was from the host system, not from inside the container. The tooling surely can still be improved, but fundamentally the host has full view to the innards of containers.

I can't tell if I'm exposing the wrong port(s) (or rather, PORTing the wrong EXPOSES, please shoot whoever inverted that terminology). I also can't easily tell if I'm getting relative paths wrong, and I can't rapidly iterate on fixing the problem. That's where you really feel the debugging problems. The drag of day to day troubleshooting of new configuration or tools can still be pretty challenging.

Stuff tends to stay working once you get it working though, so that evens up the score a bit, but I'd really love it if getting there were smoothed out a bit.

Are you talking about debugging code running in a container or debugging Docker? Because we're all talking about the former, not the latter.

Ah, I tend to define "debug" not as "the act of using a debugger tool" but as "diagnosing the broken things".

Makes me fairly useful in a crisis, but possibly less so in this thread.

I would think a remote debugger would work just fine in a container, at least once you documented how, but to zokier's point the only reason you'd do that is if the problem isn't repeatable outside the container. To me that means a couple possibilities: a bug in the docker scripts, a bug in the unikernel generator, or a timing issue, which a debugger won't help you with very much (timing bugs are often also heisenbugs).

It may well be the case that unikernels as currently envisioned by unikernel proponents are impossible to make fit for production; it may also well be the case that there exists a product that is closer to a unikernel than current kernels, that is quite production-suitable, and unikernels are fruitful research to that point.

For instance, you could imagine a unikernel that did support fork() and preemptive multitasking, but took advantage of the fact that every process trusts every other one (no privilege boundaries) to avoid the overhead of a context switch. Scheduling one process over another would be no more expensive than jumping from one green (userspace) thread to another on regular OSes, which would be a huge change compared to current OSes, but isn't quite a unikernel, at least under the provided definition.
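As a toy illustration of why switching between mutually trusting tasks can cost about as much as a function call, here is a minimal round-robin scheduler over Python generators. Note the big caveat: this is cooperative (tasks yield voluntarily), not the preemptive multitasking described above, and the names `run_round_robin` and `worker` are invented for the sketch. The point is only that a "context switch" here is resuming a generator, with no privilege boundary to cross.

```python
from collections import deque


def run_round_robin(tasks):
    """Run generator-based tasks in round-robin order until all finish.

    `tasks` maps a task name to a generator. Returns the list of
    (name, yielded_value) pairs in the order the steps actually ran.
    """
    ready = deque(tasks.items())
    trace = []
    while ready:
        name, task = ready.popleft()
        try:
            # The "context switch": resume the task until its next yield.
            step = next(task)
        except StopIteration:
            # Task finished; do not put it back on the ready queue.
            continue
        trace.append((name, step))
        ready.append((name, task))
    return trace


def worker(n):
    """A task that does n units of 'work', yielding control after each."""
    for i in range(n):
        yield i


if __name__ == "__main__":
    for name, step in run_round_robin({"a": worker(2), "b": worker(3)}):
        print(name, step)
```

A preemptive version would need a timer interrupt to force the switch, but the switch itself could stay this cheap as long as every task trusts every other one.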

Along similar lines, I could imagine a lightweight strace that has basically the overhead of something like LD_PRELOAD (i.e., much lower overhead than traditional strace, which has to stop the process, schedule the tracer, and copy memory from the tracee to the tracer, all of which is slow if you care about process isolation). And as soon as you add lightweight processes, you get tcpdump and netstat and all that other fun stuff.

On another note, I'm curious if hypervisors are inherently easier to secure (not currently more secure in practice) than kernels. It certainly seems like your empirical intuition of the kernel's attack surface is going to be different if you spend your time worrying about deploying Linux (like most people in this discussion) vs. deploying Solaris (like the author).

> you could imagine a unikernel that did support fork() and preemptive multitasking but took advantage of the fact that every process trusts every other one

No need to imagine, this is exactly how Microsoft Singularity worked (it benefited from a language expressive enough to make that trust possible)

Yeah, Singularity is an amazing existence proof of lots of cool stuff in the unikernel-ish space (though, sadly, not quite of anything being suitable for production).

Is Singularity a unikernel? More specifically, would the unikernel.org folks consider a production-ready kernel inspired by Singularity and targeting a hypervisor to be a unikernel? The 2013 paper's introduction section contains the sentence, "By targeting the commodity cloud with a library OS, unikernels can provide greater performance and improved security compared to Singularity [4]," so I'd imagine no. But I don't see any expansion on that point, so I suspect it was added to appease a reviewer.

I agree. SPIN did some similar stuff in a monolithic model that could probably help with such efforts:


Each tactic is obsolete in that CompSci has surpassed them since then. So, in reality, it would be better than SPIN for sure. The Midori write-ups are probably going to be our new baseline for what hybrid models can pull off.

> Scheduling one process over another would be no more expensive than jumping from one green (userspace) thread to another on regular OSes, which would be a huge change compared to current OSes, but isn't quite a unikernel, at least under the provided definition.

Sounds more or less like any embedded kernel that runs on an MMU-less cpu. Do unikernels handle interrupts? If so that means they can have preemptive threads, rather than green.

This article is mostly FUD I think.

It comes off as a slew of strawman arguments ... for example the idea that unikernels are defined as applications that run in "ring 0" of the microprocessor... and that the primary reason is for performance...

All of the unikernel implementations he mentioned (mirageos, osv, rumpkernels) run on top of some other hardware abstraction (xen, posix, etc) with perhaps the exception of a "bmk" rumpkernel.

We currently have a situation in "the cloud" where we have applications running on top of a hardware abstraction layer (a monolithic kernel) running on top of another hardware abstraction layer (a hypervisor). Unikernels provide a (currently niche) solution for eliminating some of the 1e6+ lines of monolithic kernel code that individual applications don't need and that introduce performance and security problems. To dismiss this as "unfit for production" is somewhat specious.

I wonder if Joyent might have a vested interest in spreading FUD around unikernels and their usefulness.

Your argument in favor of unikernels assumes that we're stuck with hardware virtualization as the lowest layer of the software stack. What if cloud providers offered secure containers on bare metal, under a shared OS kernel? That's what Joyent provides. So yes, Joyent has a vested interest in calling out the problems with unikernels. But I think their primary motive is that they truly believe containers on bare metal are a superior solution.

Speaking for myself, that's exactly why I work at Joyent. I believe in OS virtualisation (whether you call them zones or containers) for multi-tenancy, in high quality tools for debugging both live (DTrace) and post mortem (mdb), and in open source infrastructure software (SmartOS, SDC, etc).

I also believe that as an industry and a field, we should continue to build on the investments we've already made over many decades. The Unikernel seems, to me at least, to be throwing out almost everything; not just undesirable properties, but also the hard-won improvements in system design that have been fired so long in the twin kilns of engineering and operations.

Isn't it possible, therefore, that Unikernel and Joyent's virtualisation serve different purposes and are definitely not meant to be interchangeable?

>What if cloud providers offered secure containers on bare metal, under a shared OS kernel?

Then they're offering a very similar thing. And the questions then are things like:

What should the interface between the contained code and the outside look like?

Is there value in running a traditional unix userland inside the container?

What kind of code do we want to run inside the container?

IMO the unikernel answers to these questions are better. The Unix userland is an accident of history; if unikernels had come first we wouldn't even think of it.

I'm still a little confused as to why he assumed that unikernels will run under virtualization in production. The whole point of a unikernel is to get rid of a layer of abstraction, isn't it? So why would it make sense to replace that with another, thicker layer? Any non-toy unikernels are going to have to run on bare metal to actually get those performance gains. Kinda like how you don't run your docker host in a VM - you don't need to, because it replaces VMs. If you assume that a unikernel (or docker host) is going to run in a VM, of course it will look pointless.

I think the problems with this article are well covered already. Just a suggestion for Joyent: articles like this are damaging to your excellent reputation, would suggest a thin layer of review before hitting the post button!

Some additional meat:

- The complaint about Mirage being written in OCaml is nonsense, it's trivial to create bindings to other languages, and in 40 years this has never stopped us from interfacing, e.g., our Python with C.

- A highly expressive type/memory safe language is not "security through obscurity", an SSL stack written in such a language is infinitely less likely to suffer from some of the worst kinds of bugs in recent memory (Heartbleed comes to mind)

- Removing layers of junk is already a great idea, whether or not MirageOS or Rump represent good attempts at that. It's worth remembering that SMM, EFI and microcode still exist on every motherboard, using some battle-tested middleware like Linux doesn't get you away from this.

- Can't comment on the vague performance counterarguments in general, but reducing accept() from a microseconds affair to a function call is a difficult benefit to refute in modern networking software.

> an SSL stack written in such a language is infinitely less likely to suffer from some of the worst kinds of bugs in recent memory (Heartbleed comes to mind)

While you are right about OCaml being safer than C, Heartbleed was a pretty lame bug, it doesn't even give an attacker remote code execution. Something like CVE-2014-0195 is far more dangerous than Heartbleed but it didn't have a marketing name and large amounts of press coverage.

> While you are right about OCaml being safer than C, Heartbleed was a pretty lame bug, it doesn't even give an attacker remote code execution.

But it did give the attacker your server's private keys. And client private data.

A bug in DTLS will not get attention because people don't run DTLS.

Yeah, probably a poor choice of an OpenSSL vulnerability. I was assuming this was on by default even when using TLS, like lots of other OpenSSL features, but then I found this line: "Only applications using OpenSSL as a DTLS client are affected."[1]

CVE-2012-2110 is probably a better choice.

[1]: https://www.openssl.org/news/secadv/20140605.txt

> CVE-2012-2110 is probably a better choice.

From the openssl advisory[0], "In particular the SSL/TLS code of OpenSSL is not affected.".

[0] - https://www.openssl.org/news/secadv/20120419.txt

I'm happy for this article because it does hit some points on the head. Other points are deeply entrenched in Bryan's biases, but I can't really fault him for that.

In particular, I am suspicious of the idea that unikernels are more secure. Linux containers make the application secure in several ways that neither unikernels nor hypervisors can really protect from. Point being a unikernel (as defined) can do anything it wishes to on the hardware. There is no principle of least-privilege. There are no unprivileged users unless you write them into the code. It's the same reason why containers are more secure than VMs.

Users are only now, and slowly, starting to understand the idea that containers can be more secure than a VM. False perspectives and promises of unikernel security only conflate this issue.

That said, I do think the problems with unikernels might eventually go away as they evolve. Libraries such as Capsicum could help, for instance. Language-specific or unikernel-as-a-vm might help. Frameworks to build secure unikernels will help. Whatever the case, the problems we have today are not solved or ready for protection -- yet.

This blog post was clearly spurred by the acquisition made by Docker (of which I am alumnus). I think it's a good move for them to be ahead of the technology, despite the immediate limitations of the approach.

The essential point the lengthy article makes revolves around debugging facilities for unikernels. While mostly true for MirageOS and the rest of the unikernel world today, OSv showed that it is quite possible to provide good instrumentation tooling for unikernels.

The smaller point about porting applications (whether targeting unikernels that are specific to a language runtime or more generic ones like OSv and rumpkernels) is the most salient; it will probably restrict unikernel adoption.

For docker, if only to provide a good substrate for providing dev environments for people running Windows or Mac computers, it is very promising.

What porting? We have quite a few pieces of software in rumprun-packages which require ZERO (0) porting or patches to function as unikernels, e.g. haproxy, mpg123 and php. Feel free to check them out for yourself if you don't want to take my word for it.

I think Bryan Cantrill and Joyent are doing a number of interesting things, but this reads more like an ad than a genuine critique of Unikernels.

    The primary reason to implement functionality in the
    operating system kernel is for performance: by avoiding
    a context switch across the user-kernel boundary,
    operations that rely upon transit across that boundary
    can be made faster.
I haven't heard this argument made once. There are performance benefits (smaller footprint, compiler optimization across system call boundaries, etc...). However, the primary benefit is not performance from eliminating the user/kernel boundary.

    Should you have apps that can be unikernel-borne, you
    arrive at the most profound reason that unikernels are
    unfit for production — and the reason that (to me,
    anyway) strikes unikernels through the heart when it
    comes to deploying anything real in production:
    Unikernels are entirely undebuggable.
If this were true, and an issue, FPGAs would also be completely unusable in production.

"If this were true, and an issue, FPGAs would also be completely unusable in production."

BOOM! And kernels. And ASICs. And so on. Yet, we have tools to debug all of them. But unikernels? Better off trying to build a quantum computer than something that difficult...

It's not the same as using something like DTrace on a live system, but he's describing it as though eliminating the flexibility implies some sort of event horizon.

This also bothered me...

    virtualizing at the hardware layer carries with it an inexorable performance tax
Hardware has been adding a lot of virtualization support over the last decade, and it's generally been a net performance gain as far as I'm aware. I could see things like IO scheduling potentially being an issue. Although, IO performance on top of operating systems in general does not particularly inspire confidence.

Next time you see that, point out that the hardware is itself basically virtualized to monolithic OS's with microcode, shared I/O, and multiplexed buses. A version could work for unikernels if that worked for UNIX. Performance issues come more from how it's applied than the concept itself.

I think you guys might find Arrakis interesting: https://arrakis.cs.washington.edu/ it won Best Paper at OSDI '14 and demonstrates a possible way to better use things like virtio.

That was interesting. Thanks for the link. I'll read the full papers later. Meanwhile, both Arrakis and FlexNIC are now in my collection. :)

First, let's put aside the start of the blog post, which consists entirely of empirical questions. Each potential adopter of unikernels will have to figure out for themselves whether their specific use-case justifies the cost and benefit of this particular technology, just like all others.

Putting that aside, debuggability is an obvious and pressing issue to production use-cases. Any proponent of unikernels that denies that should be defenestrated. I haven't come across any that do.

How to go about debugging unikernels is unclear because it certainly is still early days. However, I don't think the lack of a command-line in principle precludes debuggability, nor does it, to my mind, even preclude using some of the traditional tools that people use today. For example, I could imagine a unikernel library that you could link against that would allow for remote dtrace sessions. Once you have that, you can start rebuilding your toolchain.

P.S. Bryan, where's my t-shirt?

Exactly. The JVM has dandy debugging tools, but nothing command line-ish is in any of them.

From TFA: "At best, unikernels amount to security theater, and at worst, a security nightmare."

As a security engineer, that's a good one sentence summary from my point of view of unikernels, since, forever.

I think the reason why unikernels are being developed is due mostly to ignorance, and if any of them is successful, it will morph into an OS that is closer to Mesos, Singularity, or even Plan9. That's faster, safer, more logical, etc.

I'm not sure how this is different from containers security. Provided you strip it down properly and not "my container is whole system + my binary", how is the exposure different exactly?

Both will prevent persistence, both are restricted outside, not internally. If anything I'd say that a reduced number of devices gives you a lower attack surface over hypercalls (unikernels) than having direct access to all the syscalls (container).

What's the huge difference and where's the theater?

If you look at proper OS virtualization implementations (like Zones and jails), where syscalls that aren't safe just don't work, then the difference is more apparent.

The key thing to realize, I think, is that if you're using virtualization, a unikernel is nothing more than a process that uses a very strange system call API.

Yep. Which also means that whatever advantages they bring, we can probably get in a traditional system call API if designed the right way.

There was a highly interesting research project along these lines: https://arrakis.cs.washington.edu/

The Arrakis paper also shows how much overhead the traditional UNIX architecture imposes (contrary to the author's assertions) for many popular workloads. It is now merged back into the project it was forked from (Barrelfish), which in itself is a very interesting research project. Well worth studying for those who don't believe UNIX is the last word in OS design.

Sure. Unikernels are just a way to a) use an extremely restricted system call API to enforce separation between processes b) have my processes be VMs for memory-safe languages for safety c) not implement a user/kernel mode distinction within a single application because it's just overhead at that point.

> a) use an extremely restricted system call API to enforce separation between processes b) have my processes be VMs for memory-safe languages for safety c) not implement a user/kernel mode distinction within a single application because it's just overhead at that point.

But the second two points are already covered by just writing an application and not running it in a virtual machine. Remember, your VMs are already running on an OS.

And the first -- I'm not convinced that qemu and x86 is all that much more restricted than a well jailed process. Given the complexity of the PC, and the number of critical Xen/KVM/... vulnerabilities, it certainly isn't trivial to emulate securely.

Note, there is one advantage to unikernels, and that's lower overhead access to network hardware than you get with the socket API. This advantage is also available with netmap.

> And the first -- I'm not convinced that qemu and x86 is all that much more restricted than a well jailed process. Given the complexity of the PC, and the number of critical Xen/KVM/... vulnerabilities, it certainly isn't trivial to emulate securely.

Xen has their own priorities and I have my views on their code quality. The interface is that much smaller that it should be much more possible to implement securely than the unix API (which isn't even well-defined). People elsewhere are talking about seL4; I hope we'll one day see a formally verified hypervisor. I don't think we'll ever see a formally verified unix container.

In the long run you're right, there's ultimately no difference between compiling my OCaml to a binary that runs on a secure formally verified microkernel and building it into a unikernel that runs on a secure formally verified hypervisor. But I can take steps towards the latter now - I can deploy unikernel systems to EC2 today, and while they may not be more secure than deploying processes to Joyent today, they're using an interface that should make it possible to run them on a more secure environment.

You can use seccomp to reduce the scope of the Linux system call API.

I think the real point here is that AWS exists, and so virtualization is the new baremetal, but the advantage over baremetal is the range of "hardware" is much more limited. You have virtio / XenPV disks, not twenty different SCSI devices you need to write drivers for and debug. Therefore writing interesting kernels directly to the virt layer and running those in the cloud makes sense.

In particular one where you get a fixed amount of memory and it's very hard to allocate any more.

I can't help but respond to each of these incredibly myopic comments that suggest some deficiency with unikernels that doesn't already exist with a traditional OS: this exact same problem exists on Linux, except there you have multiple layers of allocators managing the fixed 'heap' available.

There's no reason fewer allocators can't cope as well, or why Mirage couldn't implement support for memory ballooning like Linux does, etc.

Except the OP is talking about how running a Unikernel on qemu is basically the same as running a process on the host, except with a strange system call API (and one which might even be more complex than the real API). So talking about how ballooning or RAM hotplugging is clunky is very relevant indeed here.

It's not by any means the main point of the article, but: I'm not sure citing the Rust mailing list post on M:N scheduling is proof that it's a dead idea. The popularity of Go is a huge counterexample.

+Erlang, +Pony, +Clojure (also STM waves hello), +Haskell

I also found that reference highly dubious.

I mean the reasons Rust abandoned it were quite legitimate. As a systems language with originally segmented stacks that performed poorly and were thus removed, which mitigated anything resembling the "lightweight" promise of lightweight-tasks, and then combined with the overhead of having to maintain two distinctly different IO interaction interfaces due to a lack of unification between the standard runtime and the libuv based M:N scheduler... I mean of course Rust needed to punt that semantic out of the core runtime and move it to a domain where that kind of functionality could be implemented as a library/framework instead. Otherwise writing consistent IO libraries for Rust would be a massive pain in the ass.

The point of Rust isn't to be an opinionated framework that provides a set of prescriptive models for solving problems like Erlang/OTP does. The point is to be a generic systems language that you could use to build a new Erlang/OTP shaped thing with.

I realize I'm quite literally preaching to the head of the International Choir Association right now though. ;-)

I think Cantrill is doing a massive favour for those who are pro-Unikernels - he's essentially trolling them and will force them to come up with responses to some of the issues he's making.

Given how invested Joyent is in their current positions, I can see why Unikernels may seem a threat, but none of the things Cantrill has raised as concerns seem insurmountable.
