The primary reason to implement functionality in the
operating system kernel is for performance...
But it’s not worth dwelling on performance too much; let’s
just say that the performance arguments to be made in favor
of unikernels have some well-grounded counter-arguments and
No, either they are faster, or they are not. If someone has benchmarks showing they are faster, then I don't care about your counter-argument, because it must be wrong. If you believe there are no benchmarks showing unikernels to be faster, then make a falsifiable claim rather than claiming we should "move on".
Are they faster? I don't know, but there are papers out there with titles like "A Performance Evaluation of Unikernels" with conclusions like "OSv significantly exceeded the performance of Linux in every category" and "[Mirage OS's] DNS server was significantly higher than both Linux and OSv". http://media.taricorp.net/performance-evaluation-unikernels....
I would find the argument against unikernels to be more convincing if it addressed the benchmarks that do exist (even if they are flawed) rather than claiming that there is no need for benchmarks because theory precludes positive results.
Edit: I don't mean to be too harsh here. I'm bothered by the style of argument, but the article can still be valuable, even if just as expert opinion. Writing is hard, finding flaws is easy, and having an article to focus the discussion is better than not having an article at all.
He made a bunch of wild claims without backing them up, even with simple links to bugtraq/securityfocus/whatever, as evidence that hypervisors are inherently additive to the 'surface area' by which you can be attacked. He also, as you mentioned, failed to provide even cursory benchmarks, much less cite any third-party, academic, peer-reviewed analyses. Thirdly, he asserted a false choice between unikernels and on-the-metal. There's nothing stopping you from firing up a heterogeneous environment, using unikernels when they perform well and containers when the situation dictates them. So yeah, you weren't too harsh - IMO, your post was more well-balanced and thought out than his entire blog post. But hey, who knows, maybe he intentionally wanted to be incendiary so we'd all be talking about his company's product (in some capacity at least) on a slow Friday afternoon.
Right, there aren't supposed to be multiple applications in there. But eventually there will be. Things only start small, as a rule.
Look, we have had pages and memory management units since the 1960s already. Protection was considered performant enough to be worthwhile on mainframes built from discrete integrated circuits on numerous circuit boards, topping out at clock speeds of some 30 MHz. Fast-forwarding 30 years, people were happily running servers using 80386 and 80486 boxes with MMU-based operating systems.
Why would I want to give up the user/kernel separation and protection on hardware that blows away the protected-memory machine I had 20 years ago?
Now that hypervisors are a mature commodity, this model is practical at smaller scale too: instead of running in separate physical computers, processes run in separate virtual computers.
In short: unikernels make way more sense if you zoom out and think of your entire swarm of computers as a single computer.
For the remaining use cases where hypervisors are not secure enough, use physical computers instead.
For the remaining use cases where the overhead of 1 hypervisor per physical computer is not acceptable, build unikernels against a bare metal target instead (the tooling for this still has a ways to go).
If for your use case hypervisors are not secure enough, 1 hypervisor per physical machine is too much overhead, and the tooling for bare metal targets is not adequate, then unikernels are not a good solution for your use case.
For most cloud deployments nowadays, the hypervisor is a given. Given that, why not get rid of the kernel?
Surely, you should be saying: why do you need all of a kernel and hypervisor and an app when you could subsume the app into the kernel and just run the hypervisor and the kernelized app (or single-appified kernel, call it what you want).
I'm having a hard time seeing the benefits given the obvious increase in complexity.
What features of a full-fat OS do unikernels retain? If the answer is very little because hypervisors provide all the hardware access then it would be fair to say that hypervisor has become the OS and the traditional kernel (a Linux one in this case I presume) has become virtually * ahem * redundant.
A hypervisor is an OS. A rose by any other name.
What? It's simpler. You remove all the overhead of separating the kernel and app. And that of running a full-featured multiuser, multiprocess kernel for the sake of a single app.
> What features of a full-fat OS do unikernels retain? If the answer is very little because hypervisors provide all the hardware access then it would be fair to say that hypervisor has become the OS and the traditional kernel (a Linux one in this case I presume) has become virtually * ahem * redundant.
Yes, that's entirely fair.
That's the point of unikernels right there :)
In any case, I was not arguing that hypervisors are superior to traditional operating systems. I was simply pointing out why the comparison of unikernels to macos8 and calling it a "regression to the stone age" was missing the point entirely, because of the distributed nature of modern applications.
All code is privileged, so any remote execution exploit in any piece of code makes you own the whole machine (physical or virtual, as the case may be). A buffer overflow in some HTML-template-stuffing code is as good as one in an ethernet interrupt routine. Wee!
That may or may not be true... In any case it's completely orthogonal to unikernels. Distributed applications, and any security advantages/disadvantages, are a fact of life.
> All code is privileged, so any remote execution exploit in any piece of code makes you own the whole machine (physical or virtual, as the case may be). A buffer overflow in some HTML-template-stuffing code is as good as one in an ethernet interrupt routine. Wee!
I'm afraid you're parroting what you learned about security without really understanding it. Yes, an exploit will give you access to the individual machine. But what does that mean if the machine is a single trust domain to begin with, with no privileged access to anything other than the application that is already compromised? In traditional shared systems, running code in ring0 is a big deal because the machine hosts multiple trust domains and privileged code can hop between them. That doesn't exist with unikernels.
Add to that the tactical advantages of unikernels: vastly reduced attack surface, a tendency to use unikernels for "immutable infrastructure" which means you're less likely to find an opportunity to plant a rootkit before the machine is wiped, and the fact that unikernels are vastly less homogeneous in their layout (because more happens at build time), making each attack more labor-intensive. The result is that the security story of unikernels, in practice and in theory, is very strong.
Really? Here's what I wrote in this very thread, just above your message: If for your use case hypervisors are not secure enough, 1 hypervisor per physical machine is too much overhead, and the tooling for bare metal targets is not adequate, then unikernels are not a good solution for your use case. 
At this point I believe we are talking past each other, you are not addressing (and apparently not reading) any of my points, so let's agree to disagree.
That does not sound practical at all.
From what I understand of MirageOS, an impetus for the "security theatre" is that they believe libc itself is a vulnerability. Therefore, no matter how secure their applications are, they will always be vulnerable - to the host OS, as well as to any process it is running and the vulnerabilities those expose. It's not security by obscurity but a reduction in attack surface, which is a well-known and encouraged tactic. I don't see any propositional tautology there.
Yes they will still be reliant on Type-1 hypervisors... and Xen has had its share of vulnerabilities in the last year. That's a much smaller surface to worry about.
The other benefit is that jitsu could be one interesting avenue to further reduce the attack surface. Summoning a unikernel on-demand to service a single request has been demonstrated to be plausible. Instead of leaving a long-running process around, you have a highly-restricted machine summon your application process in an isolated unikernel for a few milliseconds before it's gone.
The kinds of architectures unikernels enable have yet to be fully explored. The ideas being explored by MirageOS are by no means new but they haven't been given serious consideration. They may not be "ready for production," yet but given some experimentation and formal specification it may yet prove fruitful.
From a "far enough" pov (but not too far...), how is that system different from a kernel running processes on-demand? Why would any replacement for libc contain fewer vulnerabilities? Same question for replacing a kernel with a "hypervisor".
I feel I still don't know enough about these subjects to say, but it looks like this whole game in the end consists of renaming various components, rewriting parts of them in the process for no real reason. But maybe that's actually it.
The gist of it is that for a typical DNS server you can boot a unikernel per-request to service the query instead of leaving a long-running process going on a typical server. You can boot these things fast enough to even map them to URLs and create a unikernel to serve each page of your site.
It's hard to approach it from the perspective of what you already know to be true and reliable. It's still experimental and we've only begun to explore the possibilities.
Now, I'm not saying that it is bad to experiment. But if you don't come up beforehand with actual things to test that were not possible on current modern systems, or at least substantially more difficult to do (instead of quite the opposite), it's not a very useful experiment but just a curious contraption.
Because it would be written in better languages.
> Same question for replacing a kernel with an "hypervisor".
Because the hypervisor implements a much smaller, better specified API than a kernel does.
He doesn't need to present a great argument against security or performance. There doesn't even need to be such an argument. If you've ever spent six months trying to find out why a content management system blows up under the strangest of conditions, even when you have a full debug stack, you understand why that argument may be able to stand alone.
The place where his argument falls down, IMO, is, like others have said, in assuming that everything is binary: everything is unikernel or it is not. And that's just silly.
I personally agree that this would be a stronger argument, but unfortunately it's not the argument he's making. Instead, he's "pleading in the alternative", which is less logical, but can in some situations be more effective. The classic example is from a legendary defense lawyer nicknamed "Racehorse" Haynes:
“Say you sue me because you say my dog bit you,” he told the audience. “Well, now this is my defense: My dog doesn’t bite. And second, in the alternative, my dog was tied up that night. And third, I don’t believe you really got bit.” His final defense, he said, would be: “I don’t have a dog.”
It maps excellently: "As everyone knows, unikernels never have a performance advantage. And even when they are faster, they are always terribly insecure. And even after people solve the security nightmare, they're still impossible to debug. But what's the point in spending time talking about something that doesn't even exist!"
So if the performance and security arguments are just distractions, and the core argument that they're "undebuggable" is just baldly incorrect, then what's left?
the OS isn't the bottleneck.
Curious, then: why are we seeing articles here all the time on bypassing the Linux kernel for low-latency networking?
FYI most of the devices inside your computer work through DMA.
It's basically an alternative driver that comes with additional capabilities. Such as capturing promiscuously, and filtering captures.
The ratio of kernel to userland work is bad if you're receiving an entire network request just to increment a word in memory but usually quite tolerable if, say, you're shoveling video files out and most of the time is spent in something like sendfile() or if your small request actually requires a non-trivial amount of computation.
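To make the sendfile() case concrete, here's a toy Python sketch of handing bulk data movement to the kernel. The `kernel_copy` helper is made up for illustration, and file-to-file `os.sendfile` assumes a reasonably modern Linux kernel (the classic use has a socket as the destination):

```python
import os
import tempfile

def kernel_copy(src_path, dst_path):
    """Copy a file via sendfile(2): the kernel shovels the bytes between
    descriptors, with no read()/write() round trips through userland."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        offset = 0
        while remaining > 0:
            # os.sendfile returns how many bytes the kernel moved this call.
            sent = os.sendfile(dst.fileno(), src.fileno(), offset, remaining)
            if sent == 0:
                break
            offset += sent
            remaining -= sent

# Tiny demo: copy a temp file and keep the bytes around for checking.
payload = b"static asset bytes\n" * 4096
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    src_name = f.name
dst_name = src_name + ".copy"
kernel_copy(src_name, dst_name)
with open(dst_name, "rb") as f:
    copied = f.read()
```

In the video-file scenario above, the destination descriptor would be a socket rather than a file, which is exactly where the kernel-to-userland work ratio pays off: userland issues one syscall per chunk and the kernel does the actual shoveling.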
ie. as per what bryan said there's plenty of counter examples on perf
He voiced all this here, and so I countered by listing stuck paradigms in traditional monolithic Unixes, as well as reopening my inquiry on Sun's Spring research system, which he seems to scoff at, but which impresses me for the academic research it yielded. He has yet to respond to my challenge.
DTrace? Sure, there's a plethora of dynamic tracing tools in linux, but it's honestly just now starting to catch up with eBPF. If eBPF stack traces make it into 4.5, that might be the first time I can look at Linux dynamic tracing and go "Yep, it's arrived"
Systemd certainly apes quite a bit of its functionality from SMF (and, personally, I'd argue SMF still does basically everything better...)
ZFS? Still the king of filesystem/logical volume management. Btrfs might catch up someday
Zones? Again, still ahead of the linux equivalent in functionality. And lx branded zones are awesome.
I'm not saying nothing needs to advance ever again in the world of OS research, but I think we need to be honest about how much we owe to Sun for having created the modern template for a great amount of functionality that is only now coming into existence in Linux, and it's not like things at Joyent, OmniTI, Nexenta, and other Illumos developing shops have stagnated either.
I often disagree with Bryan on specific points, but I think you do him a disservice. While many of the things Sun did might be viewed as an incremental update to existing concepts, it's still something that the Linux community has yet to catch up with.
Anyway, even when he is wrong, I think he often brings up a viewpoint that results in some interesting discussion.
I also like to point out that Java, another Sun project, was a big impetus for a bunch of OSes to get properly functioning pre-emptive threading implemented. The early days were dark times, and even Solaris was no cake walk. C has copied the Java Memory Model for shared state concurrency, and most of the best papers on garbage collection were written in the mid-to-late nineties, often targeting Java. Subsequently, in the '00s, escape analysis made pretty giant leaps forward, leading to the memory system you have in Rust today.
I think people who argue about the best data models for shared state concurrency often forget that they're taking us-vs-them stances on conventions that were all documented by the same individual, the Late, Great, Sir Tony Hoare.
There have really only been two breakthroughs since then. Transactional memory, which is still in a state of discovery and who knows when we'll get forward motion on that (especially since Intel's hardware support went belly up), and object ownership based on escape analysis, as used internally in recent JVMs, and explicitly in a few new languages, like Rust.
Now if only they'd pushed type theory forward, instead of regressing and taking the rest of us with them...
> Subsequently in the 00's, escape analysis made pretty
> giant leaps forward, leading to the memory system you
> have in Rust today.
I hadn't encountered Cyclone, but I'm reading through a paper about it now, and it sounds a lot like reading about how Java uses escape analysis with its generational garbage collection to decide where to allocate objects. In fact, Java 8 goes a lot farther, and it can eliminate monitors, memory fences, and copy constructors that have no observable side effects.
Of course, if Java is ever conservatively wrong (retaining a dead allocation), the GC will take care of it later. In a language without a collector you have to be right in both directions.
Tony Hoare is very much still alive to the best of my knowledge!
HP-UX? AIX? What do you think is doing better than Solaris or Illumos on those things?
Then, there's the shit most people are doing and that Bryan advocates with UNIX. I also called him out on it listing numerous counters... with references backing them... to his comment at vezzy-fnord above.
Secure UNIX had already been tried by geniuses for a decade or two, with always the same results: the architecture, language, and TCB were inherently so complex and insecure that even the smallest versions had many 0-days and dozens of covert channels. He didn't respond, likely since he uses assertions instead of evidence. And all published evidence that I've ever seen contradicts his view of the UNIX model's superiority in robustness of any kind, and sometimes in performance.
Now, he's making all kinds of assertions about unikernels, deliberately avoiding evidence on some (eg performance), praising UNIX model/tech, and ignoring faults in his own approach. Should we take his word this time given prior result? Definitely not.
That is why it is so important to spread the word of old OS designs and welcome any attempts to move forward.
This is why I like the changes in mobile OS architectures, pushing safer languages, with a stack where the kernel type is actually irrelevant to user applications.
Now, tablets and smartphones are closer to the television model. Mac OS X got fairly close to it for desktops. So, I know it can be done. There's just this willingness to... DO IT.
In parallel, we can do the Burroughs model for servers and the Genera model for hackers. Cuz seriously, what real hacker is using C on inefficient black boxes of black boxes? Do they know what they're missing?
(This question is not meant to be disagreeable, I'm genuinely interested in finding out what the issues are).
Funky, the Lisp Machine uni-kernel OS was probably one of the most debuggable OSes ever... with the most sophisticated error handling system, processes, backtraces, self-descriptive data structures, full source code integration, seamless switching between compiled and interpreted code, run-time type checking, runtime bounds checking, inspectors, an integrated IDE, ...
Oberon was also quite good, but I guess less than Lisp Machines.
Smalltalk, too. Especially as the topic is stuff that's still better than UNIX in some attribute. I haven't studied it enough to know what you like about it past, probably, safety and a great component architecture.
I just didn't know which comment was better to reply to.
As for Smalltalk, I loved its expressiveness, especially since my experience with it was in the mid-90s with VisualWorks at the university, before Java was introduced to the world.
But back then one still needed to code the VM primitives in Assembly. Meanwhile with Pharo and Squeak it is turtles all the way down.
Still waiting for a Dynabook though.
Ahh. I believe it was VisualWorks mentioned when I last looked at it. The impression I had from that description was that it was the ultimate component language. They said you don't have a main function and directives like most languages; you really just have a pile of objects that you glue together, with the glue being the main application. And that this was highly integrated into the IDEs to make it easy to manage.
Was that the experience you had?
re VM primitives in assembly
I'm actually fine with that. I'm not like other people that think language X's runtime must always be written in X. Maybe a version of that for testing and reference implementation. People can understand it. Yet, I'll throw the best ASM coder I can at a VM or even its critical paths if I have to get highest performance that way. Can't let C++ win so easily over the dynamic languages. ;)
So I was already comfortable with the concepts as such.
But playing with one of the foundations of OOP concepts had some fun to it: the environment is fully dynamic, so you could explore and change anything (better save the image first).
Also it was my first contact with FP, given the Lisp influence on Smalltalk blocks and collection methods. The original LINQ if you wish.
Then there's the mind-boggling idea of meta-classes and the interesting things one could do with them.
Smalltalk, given its Language OS, didn't have a main as such; you were supposed to use the Transcript (REPL) or the class browser to launch applications.
As an IDE, you could prune the image and select a specific class as entry point to create a production executable.
But after that semester, I lost access to it, so eventually I spent more time reading those wonderful Xerox PARC books about Smalltalk-80 than using Visual Works.
As for the VM primitives in Assembly, I also liked it, but many people see that as a disadvantage, like you mention.
Historical look at quite a few
Note: A number were concurrency safe, had a nucleus that preserved consistency, or were organized in layers that could be tested independently. UNIX's was actually a watered-down MULTICS & he's harsh on it there. I suggest you google it too.
Burroughs B5000 Architecture (1961-)
Note: Written in ALGOL variant, protected stack, bounds checks, type-checked procedure calls dynamically, isolation of processes, froze rogue ones w/ restart allowed if feasible, and sharing components. Forward thinking.
IBM System/38 (became AS/400)
Note: Capability architecture at HW level. Used intermediate code for future-proofing. OS mostly in high-level language. Integrated database functionality for OS & apps. Many companies I worked for had them and nobody can remember them getting repaired. :)
Note: Brilliance started in Lilith, where two people in two years built HW, OS, and tooling with performance, safety, and consistency. Designed an ideal assembly, a safe systems language (Modula-2), a compiler, an OS, and tied it all together. Kept it up as it evolved into Oberon, Active Oberon, etc. Now there's a RISC processor ideal for it. Hansen did similar on the very PDP-11 UNIX was invented on with the Edison system, which had safety & Wirth-like simplicity.
Note: Individual systems with good security architecture & reliability. Clustering released in 80's with up to 90 nodes at hundreds of miles w/ uptime up to 17 years. Rolling upgrades, fault-tolerance, versioned filesystem using "records," integrated DB, clear commands, consistent design, and great cross-language support since all had to support calling convention and stuff. Used in mainframe-style apps, UNIX-style, real-time, and so on. Declined, pulled off market, and recently re-released.
Genera LISP environment
Note: LISP was easy to parse, had a REPL, supported all paradigms, macros let you customize it, memory-safe, incremental compilation of functions, and you could even update apps while running. Genera was a machine/OS written in LISP specifically for hackers with lots of advanced functionality. Today's systems still can't replicate the flow and holistic experience of that. Wish they could, with or without LISP itself.
BeOS Multimedia Desktop
Note: Article lists plenty of benefits that I didn't have with alternatives for long time and still barely do. Mainly due to great concurrency model and primitives (eg "benaphors"). Skip ahead to 16:10 to be amazed at what load it handled on older hardware. Haiku is an OSS project to try to re-create it.
Note: Capability-secure OS that redid things like networking stacks and the GUI for more trustworthiness. It was fast. Also had persistence, so a failure could only lose so much of your running state. MINIX 3 and Genode-OS continue the microkernel tradition in a state where you can actually use them today. MINIX 3 has self-healing capabilities. QNX was first to pull it off with POSIX/UNIX compatibility, hard real-time, and great performance. INTEGRITY RTOS bulletproofs the architecture further with good design.
Note: Coded OS in safe Modula-3 language with additions for better concurrency and type-safe linking. Could isolate apps in user-mode then link performance-critical stuff directly into the kernel with language & type system adding safety. Like Wirth & Hansen, eliminates all the abstraction gaps & inconsistency in various layers on top of that.
Note: Builds on language-oriented approach. Puts drivers and trusted components in Java VM for safety. Microkernel outside it. Internal architecture builds security kernel/model on top of integrity model. Already doing well in tests. Open-source. High tech answer is probably Duffy's articles on Microsoft Midori.
So, there's a summary of OS architectures that did vastly better than UNIX in all kinds of ways. They range from 1961 mainframes to 1970-80's minicomputers to 1990's-2000's desktops. In many cases, aspects of their design could've been ported with effort but just weren't. UNIX retained unsafe language, root, setuid, discretionary controls, heavyweight components (apps + pipes), no robustness throughout, GUI issues and so on. Endless problems many others lacked by design.
Hope the list gives you stuff to think about or contribute to. :)
Off topic: How do you get an asterisk in your comment without HN flipping it into italics?
Nowhere in my comment was I saying that the *Nix world is perfect and it needs no further improvement. I'm saying that attacking Cantrill on his thoughts on OS research is kind of disingenuous because Sun and the Illumos contributors are still on the cutting edge for a lot of features as far as widespread OS distributions go.
If you don't know the answer to that question, then you are in no position to draw sweeping false dichotomies and grand proclamations as you did above. I have no interest in being your history teacher. I've posted plenty of papers on here, as has nickpsecurity here in the comments, and the link in my parent post provides a decent academic summary by one of the greats in the field.
(I'm not even sure if illumos can be described as "widespread" in any fair sense. In any event, being better than some relative competitors doesn't give one the carte blanche to be a pseudohistorian and denialist.)
I'm trying to engage in some fairly civil discourse here, and I am coming into it with an open mind - I am not an academic. Nor am I a systems programmer. My frame of discussion is purely based on what I know, which is mainstream, or relatively mainstream, systems. I work with these systems on a (very) large scale. I know what frame of reference that gives me, and how that applies to me.
I've looked over the other links in these comments. As best I can tell, they deal with academic scenarios, or things that are no longer in widespread use. Cantrill's article is dealing with the here and now of today's production demands. As best I can tell, your argument is that there has been work outside of today's widespread deployments that may be superior to what is currently in it.
It's apparent I misunderstood your original comment to a large extent, and I apologize for that - but the argument I made was in good faith, and while I'm not expecting you to give me a history lesson, I don't think it's unfair to ask that if you are going to engage in online discourse, you be prepared to elaborate on your point if someone is trying to rationally discuss it with you.
I'll warn you that I have a serious bias towards systems that are or could be in wide use, though I think part of helping systems get to wide use, or at least get to the point of informing current development, is making information on them easily accessible. I've discovered a few interesting OS projects in this thread alone.
I'm definitely no fan of Bryan Cantrill, who I consider to be insufferably arrogant and hypocritical (he of the "have you ever kissed a girl", but would fire someone who he didn't employ over a personal pronoun...) but he does have a lot of experience and whilst his views are often controversial, it's probably better to counter them with information :-)
Then rather than getting aggravated at having to repeat myself I just hand them a link to said post and move on.
(and in case you're curious, yes I'm mst on IRC as well, though I think your network got disabled in the last or last but one config prune)
Let's try it: * no italics*
EDIT: Ok, space is visible after first and second one disappears. Now I want to know what the commenter is doing too lol.
EDIT 2: The closing asterisk reappeared after I added text after it. Stranger.
As long as I only say *Nix once, no italics...
We could use ∗these∗ instead. U+2217, asterisk operator.
Interestingly, U+2731, heavy asterisk, is stripped by HN it seems.
But to be honest, none of them feel the same as the one true *. I looked at https://en.wikipedia.org/wiki/Unicode_subscripts_and_supersc... hoping there was a way to elevate and scale down a character with some special code to make it look like a regular asterisk, but didn't see any. Also, with unusual code points, font support is generally lacking, so to some readers of HN the above variations will probably render as just squares. Furthermore, some of those I used above mess with apparent line spacing for me.
"Furthermore, the ones I used above mess with apparent line spacing."
Yeah, yeah, that's the glitch I was talking about. Dropped 2-3 lines on me and I couldn't even see the character.
Note: We could use the second one in comments the way people use asterisks. Just to screw with people who will wonder why ours are showing. If they ask about it, just act like it looks fine on your end: all italics. Haha.
They look the best as you said and they are also the ones that ∗didn't∗ mess with apparent line spacing.
Regarding messing with people, use the browser dev tools to replace them with the italics open and end tags, then screenshot the result and say to people, "what are you talking about? looks fine to me" and link said screenshot. I'm afraid our scheme will soon be thwarted by other HNers though, who will jump in to say "yeah, I see asterisks as well" and then someone else will say "they are using a different unicode character".
I want to say * nix, and then to *italicize * nix*.
I want to say * nix, and then to italicize nix.
That also left things in italics, which I'll close now.
I want to say * nix, and then to *italicize *nix*.
Again, this left italics on.
So there doesn't seem to be a way to do what I was trying to do. This is why I said "unix" rather than, you know...
(print (+ * (i "NIX")))
EDIT: Darn, I was almost sure the last one would activate one of pg's old routines here. Note that the asterisk disappeared when a backslash came first. Did get NIX italicized but couldn't get rid of the space. Let me try something...
EDIT 2: Didn't work. May not be possible due to formatting rule here. Be nice if they modified it to add an escape code or something to let us drop an arbitrary asterisk.
Like the stuff at the end about STM and M:N scheduling. There are systems that use both to tremendous effect. Erlang only offers an M:N scheduling model, and it works extremely well for building the kinds of systems Erlang was built for.
He seems to be all talk. He's also critical of unikernels' security and robustness over well-known issues like POLA, while ignoring that his own platform violates POLA and other rules we've learned about INFOSEC over time. The only proven approaches were separation kernels a la the Nizza architecture, the capability model at OS or component level, and language-based protection w/ extra care on HW interfaces. He doesn't mention those because they oppose the UNIX approach, which he ideologically supports, plus they show his product is likely not trustworthy.
This article isn't objective or reliable.
Circa 2000, Solaris, Linux, and some (all?) BSDs had M:N threading, and all of them threw it away because there are numerous performance artifacts caused by two schedulers (OS and the M:N) conflicting with each other. Mr. Cantrill was there when this all happened, and even wrote a paper on it: ftp://ftp.cs.brown.edu/pub/techreports/96/cs96-19.pdf
This doesn't affect Erlang as much simply because Erlang is really slow. I think of it as a waterline -- the lower the overhead of the language, the more likely you will run into the serious problems that M:N typically inflicts. E.g. priority inversion, and long and unpredictable latencies when the OS swaps out a thread that M:N wanted running.
As for Haskell and STM, all I know on the topic is that some pretty damn smart engineers at IBM spent a good while trying to make STM work well, and gave up. See this ACM article: https://queue.acm.org/detail.cfm?id=1454466. Perhaps Haskell managed to pull it off through the language restrictions it allows, but I’d like to see some compelling articles on that by engineers who know a thing or two; just because a language has STM doesn’t mean it does it well.
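For readers unfamiliar with what STM actually promises, here's a minimal optimistic-STM sketch in Python. All names are illustrative, and this toy single-commit-lock design is nothing like GHC's actual implementation; it only shows the core idea of buffering writes, validating reads, and retrying on conflict:

```python
import threading

class TVar:
    """A transactional variable: a value plus a version counter."""
    def __init__(self, value):
        self.value = value
        self.version = 0

_commit_lock = threading.Lock()  # taken only briefly, at commit time

def atomically(txn):
    """Run txn(read, write) optimistically; retry on version conflict."""
    while True:
        read_set = {}    # TVar -> version seen when first read
        write_set = {}   # TVar -> buffered new value

        def read(tv):
            if tv in write_set:              # read-your-own-writes
                return write_set[tv]
            read_set.setdefault(tv, tv.version)
            return tv.value

        def write(tv, value):
            write_set[tv] = value

        result = txn(read, write)
        with _commit_lock:
            # Validate: nothing we read was changed by another commit.
            if all(tv.version == v for tv, v in read_set.items()):
                for tv, value in write_set.items():
                    tv.value = value
                    tv.version += 1
                return result
        # Validation failed: another transaction committed under us; retry.

# Usage: an atomic transfer between two transactional variables.
a, b = TVar(100), TVar(0)

def transfer(read, write):
    write(a, read(a) - 30)
    write(b, read(b) + 30)

atomically(transfer)
```

Even a toy like this surfaces the issues the ACM article discusses: transactions must be re-runnable (no irreversible I/O inside them), and contention causes retries whose cost is invisible in the source code.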
Honestly, I think your comments about M:N and STM are more revealing of your ignorance rather than Cantrill's. While you might disagree with him, there's sufficient literature to give his stance some legitimacy. Sweepingly calling his stance "full of crap" says a thing or two.
The reason it works for Erlang is because M:N is the only model. There's no alternate supported model trying to maintain some competing set of semantics to muck up the works.
You can induce problems by running code that doesn't fit inside the model such as NIF code. But as of R18 you can consume a separately maintained threadpool for that stuff.
When the paradigms collide there are sometimes bizarre non-deterministic interactions, but there's nothing wrong with M:N scheduling, craploads of core infrastructure systems are running on M:N scheduling kernels/runtimes.
If you've used money in any way whatsoever ever at any point in your life you've critically depended on this "fad" and will continue to do so for a long, long, long time.
That's how these things go. I wouldn't expect it to work on NetBSD or any UNIX. Even the good ideas often fail to integrate into UNIX. Half-ass ones like M:N more so. UNIX model is just too broken. Hence, my opposing it here.
POSIX thread scheduling implemented as M:N has problems. General M:N or 1:N schedulers don't necessarily have to have problems, as is demonstrated in practice by virtualization. Virtualized kernels implement their own scheduler on top of the hypervisor (which also has a scheduler), and it all works great. Performance overhead of virtualization is usually in other places, not in scheduler interactions. But even when it is, people do use VMs, so it works well enough; certainly not "pathologically bad".
My own experience with KVM has been that performance is problematic, thus I don't really find it compelling evidence that M:N schedulers can work well. Maybe other hypervisors can do it better, but I don't see how they could get around fundamental realities: there usually are more hypervisor threads than real cores available to service them, yet guest OS's aren't designed to gracefully handle virtual cores unexpectedly freezing for periods of time.
Don't over-subscribe the cores, disable hyperthreading, and bind scheduler threads to physical cores if your platform supports it.
Unless you're prepared to have many potentially-idle cores around, due to 1-1 bindings with an unknown, erratic, or uncontrollable workload, we're suddenly in a pretty tricky realm. You can get around this, but only in very specific and known use-cases you have control over; and I suspect the majority of use-cases fall well outside this.
If you want to more efficiently take advantage of free resources with relatively unpredictable or uncontrollable workloads (e.g. most cloud providers), binding threads to cores is a problem.
"Honestly, I think your comments about M:N and STM are more revealing of your ignorance rather than Cantrill's."
Probably was. Idk who he is or what he does. I'm aware there were fads claiming the two could solve every problem under the Sun in their category. I'm one of those few who ignore fads to focus on any lessons we can learn from certain tech or research. He made a blanket assertion against at least three technologies in one article with no context. Anyone reading might think the techs don't work well in any context. So, I briefly called that out.
As evidence, I gave an example of STM that worked. In OS's, it had mixed results unless hardware accelerated, and one typically had to protect I/O stuff with locks due to irreversibility. Far as M:N, I only found OS use for it in security kernels (eg GEMSOS) to help suppress covert, timing channels. Didn't expect him to know that use-case, though, since the cloud products are usually full of covert channels. And then there's MirageOS on the unikernel end using type- and memory-safety for robust network applications, already showing fewer 0-days than alternatives did.
So, all three already have real-world, beneficial use. Quite the opposite of his claim that leads one to believe that never happens.
"Sweepingly calling his stance "full of crap" says a thing or two."
In other comments, I did present data and specific examples to reject his claims about UNIX being a usable foundation for this sort of thing among others. His approach to such things was to ignore all counter evidence and/or leave the discussion. Eventually, a new article shows him making an adverti... err, a bunch of claims, some backed up and some assertions, about all kinds of tech in an anti-unikernel rant.
Strange that you're fine with him ignoring his critic's counterpoints... with references... to his major claims but have trouble with how I handle minor, tangential ones. Says a thing or two indeed.
Instead, you agreed with the blanket statement that those two techs were totally useless instead of a more qualified one that they overpromised and couldn't deliver for "legacy" systems whose models didn't fit. I also bet money that Cantrill's product is still vulnerable to covert, timing channels. I'll win by default because they show up every time a resource is shared, that's his business model, and countering them usually requires ditching UNIX or x86. Want to place that bet? ;)
If there are bulletproof STM implementations, there should be some papers which back that up (preferably not academic toys). What throughput and latency profiles did the STM implementations get, how did that scale with cores, what work profiles do they work well with, and how did all that compare with regular threads? How were livelock and side-effects handled? Was this put in production, and what shortcomings were there (there are always shortcomings)? Etc, etc, etc.
Also, reread my initial post more carefully. I qualified it at the very beginning with "in the context of high performance applications", and I qualified it again near the end; I didn't simply agree with any sweeping claims. Yet here I am, getting lectured by someone who says that "performance assertion showed he was full of crap"... which I have demonstrated isn't true.
Back your claims up; maybe I'll learn something interesting.
I was giving you no references, only stuff you could Google, because you opened with an insult. I don't do homework for people that do that. That said, I think I figured out where this really started:
"by someone who says that "performance assertion showed he was full of crap"... which I have demonstrated isn't true"
This assertion wasn't about M:N or STM. Those came way after the performance assertion. In his third paragraph, he tried to dismiss unikernels with a blanket assertion about [basically] how their performance can't be good enough. He sums it up as: saving context switches isn't enough, as if that's supporters' key metric. Other commenters wanted actual data on different types of unikernels and alternatives supporting his assertion. Instead of providing that, he says "let's just say that the performance arguments to be made in favor of unikernels have some well-grounded counter-arguments and move on."
Seriously? He just opened a counter to unikernels by saying their bad performance makes them worthless compared to alternatives like his, gives context-switches as a counterpoint, and then asks the reader to just take his word otherwise that there's a good counter to anything they'd ask? Unreal... How could I not look at it as either arrogant or propaganda piece after that?
That was the performance assertion some others and I were talking about. That's what I meant when I said "plus" the performance assertion. I can see how it could be read like it was connected to the other two. The M:N and STM claim I made came from this sweeping statement he made for show, as far as I can tell:
"the impracticality of their approach for production systems. As such, they will join transactional memory and the M-to-N scheduling model as"
That was outside his section on performance, toward the end, after he talked about all kinds of issues. So, he just says they were impractical for production systems. Despite being used in production systems beneficially. Not for ultra-high performance, like you said, but they delivered on claims in some deployments and weren't purely a fad. So, he doesn't substantiate his prior claims on HN when asked (with references supporting counters), doesn't substantiate his opening claim there, and doesn't qualify those two. So, I called him out on all of it with the assumption this is an ego or ad piece more than any collection of solid, technical arguments. Only the debugging claim really panned out, and only partially.
I think we just got to talking at cross-purposes from then on, where the discussion conflated what the performance assertion was with my gripe about his overstatements on the other two technologies. So, I'm dropping that tangent from here on now that I see what happened. Pointless, as I don't disagree with your performance assertion about the two technologies so much as with his unsubstantiated assertion about unikernels and his overstatements about the two technologies. So, I'm done there.
Far as my original comment, we're still waiting on him to back up that performance assertion he made that unikernels provide no performance advantage, despite reports to the contrary from users. I'm also waiting for his security claims on OS vs hypervisors, as I've seen more robust, even formally-verified, hypervisors than OS's in general, with ZERO verified or secure UNIX's. Merely attempts. I've got citations going back to the 70's supporting a minimal-TCB, kernelized model as superior to unrestrained, monolithic kernels in security. Love to see what his counterpoint is on that. Well, counterpoints, given the volume and quality of evidence on my side. :)
Erlang has a lot of neat ideas.
Still, I can dream.
There's a lot missing, not the least of which is a set of abstractions over the rich primitives which provide simple programming interfaces that don't force a lot of the language's complexity on you, but if what you want is something that might one day be a really fast Erlangish it's probably the front-runner at the moment.
I still have high hopes for BEAMJIT existing as a production-grade thing someday. That way I don't have to drop into C in my Erlang applications as often. Sure BEAMJIT will probably never be as fast, but if it can be good-enough while also not forcing me across the safety boundary to get to that good-enough state... that's a huge win.
For implementing POSIX threads, M:N threading has a lot of extra complexity and few advantages, and worse performance.
But if you don't have POSIX threads, but something higher-level that's managed by a runtime anyway, you don't need to worry about things like ensuring enough VM space and at least one allocated page for each thread stack.
Erlang's and GHC's threads start out needing much less space than a single page, so you can start 10x more threads in the same amount of RAM.
His point, if there is one, is that:
>> Unikernels are entirely undebuggable.
For which the evidence he provides is the sound of some applause at some random conference, and which I don't believe is even remotely true, or is so general as to be vacuous.
Your notion of well-argued differs greatly from mine.
Unikernels solve an important class of problems that Joyent is interested in solving with another platform: its own.
I see no mention of a significant benefit of unikernels: their decreased attack surface.
There is a fairly new community web site at http://unikernel.org/ for those interested.
"I see no mention of a significant benefit of unikernels: their decreased attack surface."
Edit: OK, not the entire paragraph, but it covers the topic.
I don't agree with minimizing arguments based upon personal opinions of the author. Please don't interpret my comment that way. I'm just pointing out context, which is worth considering when evaluating the arguments.
Or put another way. You have two people presenting arguments for database designs. Both base them on known and cited research papers. Both provide sound arguments for their design. One is an independent researcher and one works for Oracle. What are the chances one of the designs will be very similar to Oracle's database?
The biggest problem with Unikernels like Mirage is the single language constraint (mentioned in the article). I actually love OCaml, but it's only suitable for very specific things... e.g. I need to run linear algebra in production. I'm not going to rewrite everything in OCaml. That's a nonstarter.
And I entirely agree with the point that unikernel simplicity is mostly a result of their immaturity. A kernel like seL4 is also simple because, like unikernels, it doesn't have that many features.
If you want secure foundations, something like seL4 might be better to start from than Unikernels. We should be looking at the fundamental architectural characteristics, which I think this post does a great job on.
It seems to me that unikernels are fundamentally MORE complex than containers with the Linux kernel. Because you can't run Xen by itself -- you run Xen along with Linux for its drivers.
The only thing I disagree with in the article is debugging vs. restarting. In the old model, where you have a sys admin per box, yes you might want to log in and manually tweak things. In big distributed systems, code should be designed to be restarted (i.e. prefer statelessness). That is your first line of defense, and a very effective one.
But if you never understand why it was a bad state in the first place you're doomed to repeat it. Pathologies need to be understood before they can be corrected. Dumping core and restarting a process is sometimes appropriate. But some events, even with stateless services, need in-production, live, interactive debugging in order to be understood.
The question then becomes whether it is reproducible, since "debuggable when not running normally" seems to be the common thread of unikernels, such as being able to host the runtime in Linux directly rather than on a VM.
I think if you try a low-level language these kinds of things are going to bite you, but a fleshed-out unikernel implementation could be interesting for high-level languages, since they typically don't require the low-level debugging steps in the actual production environment.
In either case unikernels have a lot of ground to cover before they can be considered for production.
That much is true. I'm countering what I can where it gets overblown. Just part of something going mainstream... crowd effects and so on...
"The only thing I disagree with in the article is debugging vs. restarting. In the old model, where you have a sys admin per box, yes you might want to log in and manually tweak things. In big distributed systems, code should be designed to be restarted (i.e. prefer statelessness). That is your first line of defense, and a very effective one."
Additionally, there's no inherent reason that I see that unikernels are impossible to debug. We can debug everything from live hardware up to apps with existing tooling. So, if unikernels are lacking, it just means they're still young and nobody has adapted proven techniques to debugging them. I imagine the simple ones on simpler HW will make that even easier.
"Now, could one implement production debugging tooling in unikernels? In a word, no"
That's a lie: you could implement it. Debugging has been implemented for everything from ASIC's to kernel mode to apps in other categories. Whether unikernel crowd wants to or will is another question. Not looking good there per Google. Him talking like it's impossible shows he has an agenda, though.
I'm guessing he didn't tell everyone to ditch the entire concept of Joyent's offering early on because they were lacking some important features or properties. Just a guess, but I'd bet on it.
EDIT: Changing search terms to "debugging" "xen" "guests" got me two results showing foundations are already built. Weak compared to UNIX but there.
Do you write all your software in C? Of course not. The single language constraint doesn't exist, for the same reasons we can write Go software that runs on the Linux kernel.
From the paper (http://unikernel.org/files/2013-asplos-mirage.pdf):
> We present unikernels, a new approach to deploying cloud services via applications written in high-level source code. Unikernels are single-purpose appliances that are compile-time specialised into standalone kernels, and sealed against modification when deployed to a cloud platform. In return they offer significant reduction in image sizes, improved efficiency and security, and should reduce operational costs. Our Mirage prototype compiles OCaml code into unikernels that run on commodity clouds and offer an order of magnitude reduction in code size without significant performance penalty.
> An important decision is whether to support multiple languages within the same unikernel. [...] The alternative is to eschew source-level compatibility and rewrite system components entirely in one language and specialise that toolchain as best as possible. [...] existing non-OCaml code can be encapsulated in separate VMs and communicated with via message-passing [...]
> We did explore applying unikernel techniques in the traditional systems language, C, linking application code with Xen MiniOS, a cut-down libc, OpenBSD versions of libm and printf, and the lwIP user-space network stack.
That is, there is absolutely a single language constraint, on purpose, as arguably the primary differentiation from non-unikernels.
The single language constraint you're continuing to claim exists is addressed by their own blog posts: https://mirage.io/blog/modular-foreign-function-bindings . After reading this, I can see approximately 50 lines of code that would let me run a Python interpreter (another mostly memory-safe language, btw) within the unikernel, assuming enough of the Unix syscall interface was already implemented in shims back into OCaml land, something I presume will likely be released in the coming months in preparation for making their stuff useful to the general public.
I mean, you also can port applications to Linux kernelspace. (Remember the Tux web server?) But that's not really the point of Linux, and if you want that, you should... use a unikernel.
That said, sure, it's entirely possible that as they shift focus from an academic project to a commercial one, they'll give up on this distinction and its performance advantages, and start marketing a product that lets you just write C. (Just like they may well give up on hypervisor-based parallelism and add fork().) But that's not how they're currently envisioning the concept.
So, it's a proven model that's literally flying through the air right now due to aerospace take-up. It would likely work for unikernels, too, so long as they included same checks/mediation at interface points or middleware that prior model required. The only real questions should be about the resulting attributes of that system: is it a good approach vs regular unikernels w/ performance, containment, etc (theory vs practice)? Or just ditch them to enhance separation kernels, micro-hypervisor platforms, or capability systems?
Personally, I'm not sold on unikernels for resilience: prior security models were better, field-proven, and survived advanced pentesting. Under-utilized, imho. Cross-language is similar in both, though, with attributes of one application likely carrying over to the other. The real problem is the TCB being complex and insecure, breaking the isolation paradigm.
Can your preferred language link to the OCaml ABI? Do you have an interpreter written in OCaml for it?
I mean yes, you could have some of your app implemented in OCaml and then it talks over TCP to some Fortran running on Linux to do linear algebra, but that approach has its own problems.
The languages have to implement the OS ABI, whatever that is.
Lisp Machines had Fortran, Ada and C compilers available.
Here's a full mixed-language-programmable, locally- and fully-remote-debuggable, mixed-user and inner-mode processing unikernel, with various other features...
This from 1986...
FWIW, here's a unikernel thin client EWS application that can be downloaded into what was then an older system, to make it more useful for then-current X11 applications...
Anybody that wants to play and still has a compatible VAX or that wants to try the VCB01/QVSS graphics support in some versions of the (free) SIMH VAX emulator, the VAX EWS code is now available here:
To get an OpenVMS system going to host all this, HPE has free OpenVMS hobbyist licenses and download images (VAX, Alpha, Itanium) available via registration at:
Yes, this stuff was used in production, too.
Every 10 years, the same thing gets re-invented. Take network block devices/clustered sharing. VMS had high availability, and each node you joined could use its local disk as an aggregate resource. In the 90s you had the Andrew File System (AFS) and Coda (CMU's golden age, IMO). Then Linux had the whole DRBD era, which gained traction about 10 years ago, right around the time Hadoop was gaining traction. OpenStack has Cinder. 10 years from now we'll have something else.
Anyways, great points and good post. VAXstations are available on ebay for pretty cheap, but I'd personally go with a hobbyist OpenVMS Alpha license running on ES40. I threw a setup together a few years back and it was neat. Thanks for the data-sheet, my father will get a huge kick out of it.
The most recent OpenVMS release shipped in June 2015, and the next release is due to ship in March 2016.
There's a port to x86-64 underway, as well.
For those looking for hardware for hobbyist use, used Integrity Itanium servers are usually cheaper than used working Alpha and VAX gear, and newer — working VAX and Alpha gear has become more expensive in recent years. Various VAX and Alpha emulators are available, either as open source or available to hobbyists at no cost.
The trick though is they did only one thing (network attached storage) and they did it very well. That same technique works well for a variety of network protocols (DNS, SMTP, etc.). But you can do that badly too. We had an orientation session at NetApp for new employees which helped them understand the difference between a computer and an appliance; the latter had a computer inside of it but wasn't programmable.
Their patents kept direct competitors away, but the core ideas of the write-anywhere filesystem (WAFL) had a big impact on an industry that was going ahead anyway and implementing these ideas in kernels, userspace, etc.
Then came the SSD, and they had no brilliant idea of how to make more of that...
Had they had a more mainstream platform, could they have caught up with "code moving" instead of "data moving"? Because you could run established software like Hadoop. If your core OS is Linux, Windows, or something else mainstream, you can get all kinds of software; but if you have to port it to something designed in 1993, you probably won't bother.
At risk of speaking for Bryan, I think the difference between a NetApp and a unikernel-in-a-hypervisor is the sharedness of it. Without taking a position on Bryan's article (it's always entertaining to read his thoughts), I think his point is that that advantages of a unikernel are largely washed away, and the disadvantages are emphasized.
While Bryan is somewhat bombastic (more fun to read), there's a lot of smart in this article, I think.
Today, a typical dual-core 8GB x86 machine can, as a dedicated machine, run a lot of things. At the same time, the evolution of open systems has brought "continuous configuration integration" into the mainstream; all major OSs from OS X to Windows to Linux have weekly, sometimes daily, reconfiguration events.
And while the number of changes in aggregate is high, the number of changes to particular subsystems is low. Unikernels answer the need to create a stable-enough snapshot of the world to allow for better configuration management. Look at the example of the FreeBSD system taken down after 20 years. Some services can just run and run and run.
Image isolation is a thing, and you can only be as good as your underlying software and hardware can make you, but it can also be a big boost to operational efficiency if that can simplify your security auditing and maintenance.
So my take on Bryan's article was that he came at the argument from one direction, which is fine, but to be more thorough it would help to look at it from several directions. What was worse was that he made some assertions (like never being in production) before defining precisely what he means by a unikernel, which leaves him wide open to examples like NetApp and IBM's VM system that counter his assertion.
The nice thing about computers these days is that many of the problems we experience have been experienced in different ways, and solved in different ways, already, and we can learn from those. The unikernel discussion is not complete without looking through the history of machines that were dedicated appliances (from routers, to filers, to switches, to security camera archivers).
Like most things, I don't think unikernels are a panacea but they also aren't the end of the world and have been applied in the past with great success.
An appliance is a computer that is programmable, just not by the person that owns it.
I'm pretty sure you debug an Erlang-on-Xen node in the same way you debug a regular Erlang node. You use the (excellent) Erlang tooling to connect to it, and interrogate it/trace it/profile it/observe it/etc. The Erlang runtime is an OS, in every sense of the word; running Erlang on Linux is truly just redundant, since you've already got all the OS you need. That's what justifies making an Erlang app a unikernel.
But that's an argument coming from the perspective of someone tasked with maintaining persistent long-running instances. When you're in that sort of situation, you need the sort of things an OS provides. And that's actually rather rare.
The true "good fit" use-case of Unikernels is in immutable infrastructure. You don't debug a unikernel, mostly; you just kill and replace it (you "let it crash", in Erlang terms.) Unikernels are a formalization of the (already prevalent) use-case where you launch some ephemeral VMs or containers as a static, mostly-internally-stateless "release slug" of your application tier, and then roll out an upgrade by starting up new "slugs" and terminating old ones. You can't really "debug" those (except via instrumentation compiled into your app, ala NewRelic.) They're black boxes. A unikernel just statically links the whole black box together.
Keep in mind, "debugging" is two things: development-time debugging and production-time debugging. It's only the latter that unikernels are fundamentally bad at. For dev-time debugging, both MirageOS and Erlang-on-Xen come with ways to compile your app as an OS process rather than as a VM image. When you are trying to integration-test your app, you integration-test the process version of it. When you're trying to smoke-test your app, you can still use the process version—or you can launch (an instrumented copy of) the VM image. Either way, it's no harder than dev-time debugging of a regular non-unikernel app.
Curious as to how you would drive to root cause the bugs that caused the crash in the first place? If you don't root cause, won't subsequent versions still retain the same bugs?
There are bugs that can only manifest themselves in production. Any system where we don't have the ability to debug and reproduce these classes of problems in prod is essentially a non-starter for folks looking to operate reliable software.
Instead, unikernels seem to me to instead be a good way to harden high-quality, battle-tested server software. Of two main kinds:
• Long-running persistent-state servers that have their own management capabilities. For example, hardening Postgres (or Redis, or memcached) into a unikernel would be a great idea. A database server already cleans and maintains itself internally, and already offers "administration" facilities that completely moot OS-level interaction with the underlying hardware. (In fact, Postgres often requires you to avoid doing OS-level maintenance, because Postgres needs to manage changes on the box itself to be sure its own ACID guarantees are maintained. If the software has its own dedicated ops role—like a DBA—it's likely a perfect fit for unikernel-ification.)
• Entirely stateless network servers. Nginx would make a great unikernel, as would Tor, as would Bind (in non-authoritative mode), as would OpenVPN. This kind of software can be harshly killed+restarted whenever it gets wedged; errors are almost never correlated/clustered; and error rates are low enough (below the statistical noise-floor where the problem may as well have been a cosmic ray flipping bits of memory) that you can just dust your hands of the responsibility for the few bugs that do occur (unless you're operating at Facebook/Google scale.) This is the same sort of software that currently works great containerized.
† The best solution for your custom app-tier, if you really want to go the unikernels-for-everything route, might be a hybrid approach. If you run your app in both unikernel-image, and OS-process-on-a-Linux-VM-image setups, you'll get automatic instrumentation "for free" from the Linux-process instances, and production-level performance+security from the unikernel instances. You could effectively think of the Linux-process images as being your "debug build."
Two ways of deploying such hybrid releases come to mind:
• Create cluster-pools consisting of nodes from both builds and load-balance between them. Any problems that happen should eventually happen on an instrumented (OS-process) node. This still requires you to maintain Linux VMs in production, though.
• Create a beta program for your service. Run the Linux-process images on the "beta production" environment, and the unikernel-images on the "stable production" environment. Beta users (a self-selected group) will be interacting with your new code and hopefully making it fall down in a place where it's still instrumented; paying customers will be working with a black-box system that—hopefully—most of the faults will have been worked out of. Weird things that happen in stable can be investigated (for a stateful app) by mirroring the state to the beta environment, or (for a stateless app) by accessing the data that causes the weird behavior through the beta frontend.
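The mixed-pool routing in the first option can be sketched in a few lines; the node names and the 10% share below are made up purely for illustration:

```python
import random

# Hypothetical pools; node names and the 10% share are illustrative.
INSTRUMENTED = ["linux-proc-1", "linux-proc-2"]    # debuggable OS-process builds
UNIKERNEL = ["uni-1", "uni-2", "uni-3", "uni-4"]   # black-box unikernel builds
INSTRUMENTED_SHARE = 0.10  # fraction of traffic routed to instrumented nodes

def pick_backend(rng=random):
    """Route one request: mostly unikernels, a small slice to instrumented nodes."""
    pool = INSTRUMENTED if rng.random() < INSTRUMENTED_SHARE else UNIKERNEL
    return rng.choice(pool)
```

The idea is simply that any fault occurring at a non-trivial rate will eventually land on an instrumented node, where it can be debugged with normal tooling.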
Huh? What are you talking about?
But sure, you need a debugger. So you use one. I'm not sure why the author seems to think that's so hard.
The author wrote and continues to contribute to DTrace, which is an incredibly advanced facility for debugging and root causing problems. GDB (for example) doesn't help you solve performance problems or root-cause them, because now your performance problem has become ptrace (or whatever tracing facility GDB uses on that system).
The point he was making is that there are problems with porting DTrace to a unikernel (it violates the whole "let's remove everything" principle, and you couldn't practically modify what DTrace probes you're using at runtime because the only process is your misbehaving app -- good luck getting it to enable the probes you'd like to enable).
That's not unique to unikernels. You can get that with Docker containers or EC2 instances on AWS.
I'm sure that's true, but it's not a statement directly about consistency.
You need some level of decent logging in the production environment (with optional extra logging that can be turned on) to capture WTF went wrong. THEN you can try to reproduce it. When a logged system goes boom, you need it to come out of production and remain around until those logs are saved somewhere.
It is old, but http://lkml.iu.edu/hypermail/linux/kernel/0009.1/1307.html is an interesting data point about IBM's experience. They implemented exactly the kind of logging that I advocate in all of their systems until OS/2. And the fact that they didn't have it in OS/2 was a big problem.
Having this ability to "futz around in prod" frequently obviates the need for prescient levels of logging. You can poke at the problem until it crashes for you.
It would be nice to have some kind of a "shell" to inspect details about the application when something goes wrong, but that applies equally to non-unikernel applications.
The difference is that with unikernels the debugging tool would be provided as a library, whereas with traditional applications debugging is provided as a language-independent OS tool.
Dumping the debug-level log ring-buffer on exception:
I admit that debugging is easier on platforms where you get more control.
Well, that's the point of the article.
What good does restarting your service do if the issue will stay there, and come back again later?
Running a LING unikernel gets you a huge set of tracing and debugging capabilities that aren't 100% compatible with what's provided by BEAM, but it's close, and a couple orders of magnitude better than what almost any other language provides for mucking with a live system, unikernel or not.
Running Erlang via Rumprun gets you a standard BEAM/erts release packaged as a VM image or a bootable ISO, and that is just straight up Erlang. So all the absolutely excellent debugging, tracing, and profiling facilities that any random Erlang application has access to are also accessible when deployed as a unikernel (rumpkernel).
Perhaps that means the real argument is "Unikernels are unfit for production for people who aren't comfortable with the Erlang/OTP way of doing things." Yes, this isn't a technical argument, but—most people, most customers, don't even see SmartOS + Linux emulation as suitable for production (and I imagine the author knows that), not for any technical reason but just for unfamiliarity. And that's still a UNIX working in UNIXy ways.
I'm already feeling the "can't debug" pinch and we haven't even started getting into unikernels yet.
Stuff tends to stay working once you get it working though, so that evens up the score a bit, but I'd really love it if getting there were smoothed out a bit.
Makes me fairly useful in a crisis, but possibly less so in this thread.
I would think a remote debugger would work just fine in a container, at least once you documented how, but to zokier's point the only reason you'd do that is if the problem isn't repeatable outside the container. To me that means a couple of possibilities: a bug in the Docker scripts, a bug in the unikernel generator, or a timing issue, which a debugger won't help you with very much (timing bugs are often also heisenbugs).
For instance, you could imagine a unikernel that did support fork() and preemptive multitasking, but took advantage of the fact that every process trusts every other one (no privilege boundaries) to avoid the overhead of a context switch. Scheduling one process over another would be no more expensive than jumping from one green (userspace) thread to another on regular OSes, which would be a huge change compared to current OSes, but isn't quite a unikernel, at least under the provided definition.
Along similar lines, I could imagine a lightweight strace that has basically the overhead of something like LD_PRELOAD (i.e., much lower overhead than traditional strace, which has to stop the process, schedule the tracer, and copy memory from the tracee to the tracer, all of which is slow if you care about process isolation). And as soon as you add lightweight processes, you get tcpdump and netstat and all that other fun stuff.
On another note, I'm curious if hypervisors are inherently easier to secure (not currently more secure in practice) than kernels. It certainly seems like your empirical intuition of the kernel's attack surface is going to be different if you spend your time worrying about deploying Linux (like most people in this discussion) vs. deploying Solaris (like the author).
No need to imagine, this is exactly how Microsoft Singularity worked (it benefited from a language expressive enough to make that trust possible)
Is Singularity a unikernel? More specifically, would the unikernel.org folks consider a production-ready kernel inspired by Singularity and targeting a hypervisor to be a unikernel? The 2013 paper's introduction section contains the sentence, "By targeting the commodity cloud with a library OS, unikernels can provide greater performance and improved security compared to Singularity," so I'd imagine no. But I don't see any expansion on that point, so I suspect it was added to appease a reviewer.
Each of those tactics is obsolete; CompSci has surpassed them since then. So, in reality, it would be better than SPIN for sure. The Midori write-ups are probably going to be our new baseline for what hybrid models can pull off.
Sounds more or less like any embedded kernel that runs on an MMU-less cpu. Do unikernels handle interrupts? If so that means they can have preemptive threads, rather than green.
It comes off as a slew of strawman arguments ... for example the idea that unikernels are defined as applications that run in "ring 0" of the microprocessor... and that the primary reason is for performance...
All of the unikernel implementations he mentioned (mirageos, osv, rumpkernels) all run on top of some other hardware abstraction (xen, posix, etc) with perhaps the exception of a "bmk" rumpkernel.
We currently have a situation in "the cloud" where we have applications running on top of a hardware abstraction layer (a monolithic kernel) running on top of another hardware abstraction layer (a hypervisor). Unikernels provide a (currently niche) solution for eliminating some of the 1e6+ lines of monolithic-kernel code that individual applications don't need and that introduce performance and security problems. To dismiss this as "unfit for production" is somewhat specious.
I wonder if Joyent might have a vested interest in spreading FUD around unikernels and their usefulness.
I also believe that as an industry and a field, we should continue to build on the investments we've already made over many decades. The Unikernel seems, to me at least, to be throwing out almost everything; not just undesirable properties, but also the hard-won improvements in system design that have fired so long in the twin kilns of engineering and operations.
Then they're offering a very similar thing. And the questions then are things like:
What should the interface between contained and outside look like?
Is there value in running a traditional unix userland inside the container?
What kind of code do we want to run inside the container?
IMO the unikernel answers to these questions are better. The Unix userland is an accident of history; if unikernels had come first we wouldn't even think of it.
Some additional meat:
- The complaint about Mirage being written in OCaml is nonsense; it's trivial to create bindings to other languages, and in 40 years this never stopped us interfacing our Python (say) with C.
- A highly expressive type/memory safe language is not "security through obscurity", an SSL stack written in such a language is infinitely less likely to suffer from some of the worst kinds of bugs in recent memory (Heartbleed comes to mind)
- Removing layers of junk is already a great idea, whether or not MirageOS or Rump represent good attempts at that. It's worth remembering that SMM, EFI and microcode still exist on every motherboard, using some battle-tested middleware like Linux doesn't get you away from this.
- Can't comment on the vague performance counterarguments in general, but reducing accept() from a microseconds affair to a function call is a difficult benefit to refute in modern networking software.
While you are right about OCaml being safer than C, Heartbleed was a pretty lame bug, it doesn't even give an attacker remote code execution. Something like CVE-2014-0195 is far more dangerous than Heartbleed but it didn't have a marketing name and large amounts of press coverage.
But it did give the attacker your server's private keys. And client private data.
CVE-2012-2110 is probably a better choice.
From the openssl advisory, "In particular the SSL/TLS code of OpenSSL is not affected.".
 - https://www.openssl.org/news/secadv/20120419.txt
In particular, I am suspicious of the idea that unikernels are more secure. Linux containers make the application secure in several ways that neither unikernels nor hypervisors can really protect from. Point being a unikernel (as defined) can do anything it wishes to on the hardware. There is no principle of least-privilege. There are no unprivileged users unless you write them into the code. It's the same reason why containers are more secure than VMs.
Users are only now, and slowly, starting to understand the idea that containers can be more secure than a VM. False perspectives and promises of unikernel security only conflate this issue.
That said, I do think the problems with unikernels might eventually go away as they evolve. Libraries such as Capsicum could help, for instance. Language-specific or unikernel-as-a-VM approaches might help. Frameworks to build secure unikernels will help. Whatever the case, the problems we have today are not solved -- yet.
This blog post was clearly spurred by the acquisition made by Docker (of which I am alumnus). I think it's a good move for them to be ahead of the technology, despite the immediate limitations of the approach.
The smaller point about porting applications (whether targeting unikernels that are specific to a language runtime, or more generic ones like OSv and rumpkernels) is the most salient; it will probably restrict unikernel adoption.
For Docker, if only to provide a good substrate for dev environments for people running Windows or Mac computers, it is very promising.
The primary reason to implement functionality in the
operating system kernel is for performance: by avoiding
a context switch across the user-kernel boundary,
operations that rely upon transit across that boundary
can be made faster.
Should you have apps that can be unikernel-borne, you
arrive at the most profound reason that unikernels are
unfit for production — and the reason that (to me,
anyway) strikes unikernels through the heart when it
comes to deploying anything real in production:
Unikernels are entirely undebuggable.
BOOM! And kernels. And ASICs. And so on. Yet, we have tools to debug all of them. But unikernels? Better off trying to build a quantum computer than something that difficult...
This also bothered me...
virtualizing at the hardware layer carries with it an inexorable performance tax
Putting that aside, debuggability is an obvious and pressing issue to production use-cases. Any proponent of unikernels that denies that should be defenestrated. I haven't come across any that do.
How to go about debugging unikernels is unclear because it certainly is still early days. However, I don't think the lack of a command-line in principle precludes debuggability, nor to my mind does it even preclude using some of the traditional tools that people use today. For example, I could imagine a unikernel library that you could link against that would allow for remote dtrace sessions. Once you have that, you can start rebuilding your toolchain.
P.S. Bryan, where's my t-shirt?
As a security engineer, that's a good one sentence summary from my point of view of unikernels, since, forever.
I think the reason why unikernels are being developed is due mostly to ignorance, and if any of them is successful, it will morph into an OS that is closer to Mesos, Singularity, or even Plan9. That's faster, safer, more logical, etc.
Both will prevent persistence; both are restricted from the outside, not internally. If anything, I'd say that the reduced number of devices gives you a lower attack surface with hypercalls (unikernels) than with direct access to all the syscalls (containers).
What's the huge difference and where's the theater?
There was a highly interesting research project along these lines: https://arrakis.cs.washington.edu/
The Arrakis paper also shows how much overhead the traditional UNIX architecture imposes (contrary to the author's assertions) for many popular workloads. It is now merged back into the project it was forked from (Barrelfish), which is itself a very interesting research project. Well worth studying for those who don't believe UNIX is the last word in OS design.
But the second two points are already covered by just writing an application and not running it in a virtual machine. Remember, your VMs are already running on an OS.
And the first -- I'm not convinced that qemu and x86 is all that much more restricted than a well jailed process. Given the complexity of the PC, and the number of critical Xen/KVM/... vulnerabilities, it certainly isn't trivial to emulate securely.
Note, there is one advantage to unikernels, and that's lower overhead access to network hardware than you get with the socket API. This advantage is also available with netmap.
Xen has their own priorities and I have my views on their code quality. The interface is that much smaller that it should be much more possible to implement securely than the unix API (which isn't even well-defined). People elsewhere are talking about seL4; I hope we'll one day see a formally verified hypervisor. I don't think we'll ever see a formally verified unix container.
In the long run you're right, there's ultimately no difference between compiling my OCaml to a binary that runs on a secure formally verified microkernel and building it into a unikernel that runs on a secure formally verified hypervisor. But I can take steps towards the latter now - I can deploy unikernel systems to EC2 today, and while they may not be more secure than deploying processes to Joyent today, they're using an interface that should make it possible to run them on a more secure environment.
I think the real point here is that AWS exists, and so virtualization is the new baremetal, but the advantage over baremetal is the range of "hardware" is much more limited. You have virtio / XenPV disks, not twenty different SCSI devices you need to write drivers for and debug. Therefore writing interesting kernels directly to the virt layer and running those in the cloud makes sense.
There's no reason other allocators can't cope as well, or why Mirage couldn't implement support for memory ballooning like Linux does, or etc. etc.
I also found that reference highly dubious.
I mean, the reasons Rust abandoned it were quite legitimate. As a systems language, Rust originally had segmented stacks, which performed poorly and were thus removed; that undermined anything resembling the "lightweight" promise of lightweight tasks. Combine that with the overhead of maintaining two distinctly different IO interaction interfaces, due to the lack of unification between the standard runtime and the libuv-based M:N scheduler, and of course Rust needed to punt that semantic out of the core runtime and move it to a domain where that kind of functionality could be implemented as a library/framework instead. Otherwise writing consistent IO libraries for Rust would be a massive pain in the ass.
The point of Rust isn't to be an opinionated framework that provides a set of prescriptive models for solving problems like Erlang/OTP does. The point is to be a generic systems language that you could use to build a new Erlang/OTP shaped thing with.
I realize I'm quite literally preaching to the head of the International Choir Association right now though. ;-)
Given how invested Joyent is in their current positions, I can see why Unikernels may seem a threat, but none of the things Cantrill has raised as concerns seem insurmountable.