
Unikernels: The Next Stage of Linux's Dominance [pdf] - telotortium
https://www.cs.bu.edu/~jappavoo/Resources/Papers/unikernel-hotos19.pdf
======
rwmj
Oh, it's a paper I am (rather peripherally) involved with. If you have
questions then ask away, although it's possible I might not have all the
answers ...

Edit: In case you can't read the paper there's a copy here:
[https://www.cs.bu.edu/~jappavoo/Resources/Papers/unikernel-h...](https://www.cs.bu.edu/~jappavoo/Resources/Papers/unikernel-hotos19.pdf) (Thanks anonymousDan in the comments below for linking to it)

~~~
antsoul
How do they relate to microkernels?

Since the last HN post about GNU Hurd, I've been wondering: why not join that
Free microkernel project instead of looking in another kernel's direction?

~~~
panpanna
Architecturally, microkernels and unikernels are direct opposites.

Unikernels strive to minimize communication complexity (and size and footprint,
but that is not relevant to this discussion) by putting everything in the same
address space. This gives them many advantages, among which performance is
often mentioned, but ease of development is IMHO equally important.

However, the two are not mutually exclusive. Unikernels often run on top of
microkernels or hypervisors.

~~~
rwmj
Right answer.

While a unikernel might be structured to some extent internally into modules,
it's basically all linked into a single blob and running in a single address
space. Some _programming languages_ may support some kind of PL-level
separation, but there is no hardware enforcement. In the case of what Ali is
doing, because all the code is written in C (it's all Linux, glibc and
memcached), there is neither software nor hardware separation internally.

~~~
ratmice
To give a list of these PL-level separation mechanisms and their languages,
these are the kernels I know of that work in this fashion:

SPIN (Modula-3), Singularity (Sing#), Midori (M#), TockOS (safe Rust).

As far as I know, TockOS is the only one with some form of both PL-level and
hardware enforcement of separation, albeit on an MPU rather than a full MMU:
PL-level for kernel modules, plus an MPU-protected userspace.

It's also worth noting that none of these separation mechanisms are actually
mutually exclusive.

------
ggm
Long ago, an OS I worked inside had a tool which took shared-library indirect
calls and wired them out, so you didn't pay the cost of a nested indirect
function call but went directly from the call to the backend, without passing
through the stub function mapping into the specific .so library. It made
things faster by one layer of the onion skin.

Strip, as a UNIX tool, removes the textual crap in a debuggable binary,
leaving behind only the bits you need.

It feels to me like if you can do runtime checks of code calls, and confirm
which _actual_ calls you make, then stripping out the unneeded bits of libc
and associated libraries, being left with only the strictly required calls,
and then by extension the syscalls, and then by extension the kernel elements,
is actually possible a lot of the time.

So, library -> reduced library -> reduced calls -> reduced syscalls -> reduced
kernel state is a sequence (or set, or something) of applied minimisations
which can be done, if you can predict all the call paths in your code and
their dependencies.
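The library -> reduced library -> reduced kernel chain described above is roughly what section-based dead-code elimination does today. A minimal sketch with gcc and GNU ld on Linux (the file and function names here are invented for illustration):

```shell
# Hypothetical example program with one function that is never called.
cat > app.c <<'EOF'
#include <stdio.h>

/* called from main() */
void used(void) { puts("used"); }

/* never called anywhere */
void unused(void) { puts("unused"); }

int main(void) { used(); return 0; }
EOF

# Put every function and data item in its own ELF section...
gcc -O2 -ffunction-sections -fdata-sections -c app.c
# ...then let the linker garbage-collect sections nothing references.
gcc -Wl,--gc-sections app.o -o app

./app
# If the linker dropped it, 'unused' no longer appears in the symbol table:
nm app | grep ' unused' || echo "unused() was stripped by --gc-sections"
```

This is the per-binary version of the idea; extending it through libc and down into the kernel is the hard part the thread is discussing.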

But then it's not a generic OS any more: it's an application-specific binding
to a general-purpose CPU.

Why not go all the way, and work into the ALU and remove the bits you don't
need? And then go into the micro code, and the associated FPGA, and everything
else..

~~~
sp332
You have to make sure your test cases exercise every possible code path. The
German demogroup .theprodukkt once made a demo, .kkrieger, a first-person
shooter game in 96kb. They used a technique like this to strip every line of
source code that was not hit during a playtest. But the playtester didn't
press the up-arrow in the main menu, so that key doesn't work in the release.
More importantly, they never got hit by any bullets, so in the final release
the player is invulnerable, because the code that handles damage was never
compiled in.

Edit: more informative link
[https://fgiesen.wordpress.com/2012/04/08/metaprogramming-for...](https://fgiesen.wordpress.com/2012/04/08/metaprogramming-for-madmen/)

~~~
ggm
Yes. And it's almost impossible to do this without a huge investment of time
and effort, except: if you use languages with FP, and are rigorous, then I
believe (possibly wrongly) it's actually somewhat simpler to understand,
because the style of coding in FP exposes much of this "better" than classic
imperative coding does.

So yes, I think you're right: great dream, hugely hard to do, if not actually
impossible in most cases, near as damnit. But there are places where program
proofs take you to this; Military/Space/Medical needs to know the code calls
don't have unexpected outcomes. And language choices, which shape _how_ you
express code, can also help. Probably not enough.

~~~
oblio
I think it's impossible, or close to it. Isn't it just another form of the
halting problem?
[https://en.wikipedia.org/wiki/Halting_problem](https://en.wikipedia.org/wiki/Halting_problem)

~~~
striking
Well, the alternative is that you limit the set of things you can do, such
that you no longer have a Turing machine. But that's certainly less fun, isn't
it?

~~~
AnimalMuppet
Depends on your definition of fun.

------
gandalfgeek
For those who don't want to read the whole paper, I made a short ten-minute
explainer video going through it:

[https://youtu.be/3NWUgBsEXiU](https://youtu.be/3NWUgBsEXiU)

~~~
sitkack
I love your approachable style. Keep it up!

------
squarefoot
The article is Tl;Wbmloc;Dr (too long, way beyond my level of comprehension,
didn't read), so forgive any stupid questions. Could unikernel technology be
used to build extremely small Linux distros that contain just the bare
minimum to satisfy all dependencies for running, say, a single piece of
software or a small group of them? I'm not interested in virtual machines for
network-related services but rather in small embedded systems where the user
might need to run just one application for the entire board's lifetime.
Security wouldn't be a concern there: assuming the user already has complete
access to the hardware, that layer could be moved outside of the box by
inspecting and firewalling the network traffic.

If yes, does anyone know if there is (or will be) any automated way to analyze
a piece of software (for example, a .deb package containing a user
application), build a list of its dependencies, and create a bootable image
for the requested architecture which will run just that software?

~~~
vidarh
It doesn't really make much difference in terms of size in itself, I suspect.
You're still carrying along the whole kernel and the application code; they're
just linked together into a single binary. You can certainly then trim things
down a lot by stripping the kernel to its bones, removing drivers etc., but
you can do that even without linking it all together. You can also go a lot
further than the normal kernel configuration mechanism allows if you're
prepared to strip out functionality you know you'll never need, but I'm hard
pressed to think of approaches like that which will only work if you move
your application code to kernel space.

In the past I've brought up Linux on embedded boards with 4MB RAM and 4MB
flash for both the kernel and the application, and we gained most by linking
statically against a smaller libc on the user side and stripping out drivers
on the kernel side, ditching bash for ash (nowadays dash), and using a
stripped-down barebones init. We could have ditched ash and init as well and
booted straight into the app, but ash made debugging and testing slightly
easier and we could fit it in. 4MB at the time was not even a challenge.

I think the gap between something where you can fit a Linux kernel and an app
mashed together into a unikernel, and being able to run them separately, will
be very small. So in most instances, if you can make it work as a unikernel,
odds are you can produce nearly as small a solution without it.

Going full on unikernel potentially costs you a lot of resilience against
failure, it's not just about security. The main reason for wanting to do it is
to reduce the amount of context switches and data copying to reduce _latency_.

~~~
rwmj
The linking stage automatically removes all parts of Linux (e.g. drivers)
that are not needed. We can also apply various link-time and whole-program
profile-guided optimizations. Apparently most of the performance improvement
comes from getting rid of system call overhead, although we still have a
student measuring these things more precisely.

~~~
vidarh
> The linking stage automatically removes all parts of Linux (eg. drivers)
> that are not needed.

So does the normal kernel build process, once you've specified which drivers
you need compiled in. That's in no way unique to a unikernel approach.
There'll be extremely little code that can be stripped from a unikernel that
cannot be stripped from a kernel built for the purpose of running a specific
application.

It's just not where the benefits of a unikernel are. As I said, and you
reiterated, it's really cutting latency from syscalls that is the big benefit
of a unikernel.

~~~
rwmj
That's not how most distro kernels are built, but sure, you can build a custom
kernel with everything statically linked in. What you can't do is link your
application in as well and do whole-program optimizations on the whole binary,
which is what Ali is trying to do here. We're also hoping that the eventual
solution will be distro-friendly enough that we can pack the "kernel library"
into a future Fedora, which you can use to link your own servers into
unikernels.

~~~
vidarh
Of course it's not, but the comment I replied to was about whether using it to
build binaries for extremely small embedded systems makes sense. In that
scenario a distro kernel from one of the desktop distros is a total
non-starter.

If a few MB of space matters, then you'll want a suitably configured kernel
that disables a ton of stuff that is totally irrelevant for such a system, and
you'll need to take that configuration into account whether you're building a
normal kernel or a unikernel, as it involves not just excluding drivers and
the like, but disabling functionality that, if left enabled, will not get
removed by the linker, because there will be calls into it from code paths
that are reachable.

Do that, and you'll find it is pretty straightforward to get a Linux kernel
down under 1MB.
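For what it's worth, the mainline kernel build system already ships a baseline for this kind of minimisation. A sketch, assuming you're inside a Linux kernel source tree (a config workflow, not something runnable standalone; the re-enabled option is just an example):

```shell
# Inside a Linux kernel source tree: start from the smallest
# configuration the build system can generate, then add back
# only the functionality your one application needs.
make tinyconfig                          # minimal allnoconfig-based baseline
./scripts/config --enable CONFIG_PRINTK  # example: re-enable something you need
make olddefconfig                        # resolve option dependencies
make -j"$(nproc)" bzImage
```

Iterating on that config against your application's actual syscall footprint is how sub-1MB kernels get made, unikernel or not.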

Now, it's possible your tooling makes that _easier_, but my point is that it
is equally possible to disable those parts of the kernel for a non-unikernel
system, and you're not going to make significant additional savings. And if
you do want more savings, then ditching glibc for one of the smaller libcs
would be a better starting point.

This is not a criticism of this project - size is simply not what most people
look to unikernels to address, especially not one built on a general purpose
kernel.

------
rapsey
> Unikernels have demonstrated enormous advantages over Linux in many
> important domains, causing some to propose that the days of Linux's
> dominance may be coming to an end.

Where are unikernels widely used?

~~~
c0l0
I don't think that this statement describes the reality we live in.

To me, unikernels feel quite a bit like statically linked server binaries
running under an unprivileged UID - except you're choosing to trust not
Linux's (or any other kernel's) user separation facilities, but your
hypervisor's domU separation facilities instead. In exchange, you lose
virtually all of your existing OS's amazing debugging and performance
analysis/tuning tools. It's not a tradeoff I'd readily consider.

~~~
mirceal
I think this kind of misses the mark. A unikernel is both the kernel (just the
things you need) and the app together. It's built for one thing: whatever the
app is doing.

Tooling/perf/etc. is not needed when running (do you really want to debug in
production?), but tooling can be used in the process of development.

Why consider unikernels? 1) They're stupid fast: you have only the things you
need (booting in nanoseconds? yepp). 2) Small attack surface: your app is the
only attack surface; you don't have the cruft that builds up in your OS/kernel
over years. 3) Light resource usage: you could run thousands of these on the
same physical machine. 4) True isolation via the hypervisor.

definitely worth keeping an eye on the developments in this space

~~~
imbriaco
> Tooling/Perf/etc is not needed when running (do you really want to debug in
> production) but tooling can be used in the process of development.

Whether or not you want to debug in production, reality often means that you
will see things in a live environment that you will not see in other
environments.

Unikernels are very interesting and have a number of compelling attributes,
but let's not pretend that the current state of available tooling for
troubleshooting, instrumentation, and general debugging isn't a challenge.

~~~
weberc2
You're moving the goal posts and invoking a straw man. No one is advocating
for "pretending", and the anti-unikernel argument is that the inherent cost of
unikernels in general is the loss of kernel debugging tools; not simply that
"right now the unikernel debugging experience is subpar".

~~~
convolvatron
In general, right now, running on a full-featured monolithic kernel, the
debugging experience is really pretty bad, especially in the target
environment of horizontally clustered services or lots of little
microservices.

So I actually believe there is an opportunity here to focus on the important
pieces (network messages, control-flow tracing, memory footprints, etc.) after
ejecting a huge amount of irrelevant stuff.

~~~
imbriaco
Totally agree; there's definitely an opportunity there if the surface area of
what the system is doing gets smaller. The big difference today is that it's a
lot easier to compose a debugging suite using additional tools on top of a
traditional host-based runtime.

Definitely excited to see how the technology evolves over the next few years.
It hasn't moved as fast as I'd have expected over the last 2-3 years but I'd
love to see that accelerate.

------
ch_123
Is there any way I can read this if I don't have an ACM subscription?

~~~
enriquto
In this case the authors themselves put the pdf on their website.

In the general case, you can almost always download an article from Sci-Hub by
searching for the publication's DOI.

~~~
beefhash
> In this case the authors themselves put the pdf on their website.

OT: This should honestly be a criminal offense, at least if done before a year
passes. The publisher does valuable work in vetting contributions and checking
the paper before publishing. They shouldn't be cheated out of their fair share
of money that comes with due process just because authors go rogue.

~~~
tecleandor
Can't really tell if this is sarcasm or not, but journals are paid to do that
checking (it's not free), and they use unpaid people for peer review, so I'd
say they aren't really losing money.

------
dwheeler
This is really cool, though it looks like the _security_ impact is nuanced.

Unikernel plus: A lot of code, including code with vulnerabilities, can
disappear. So _some_ vulnerabilities disappear.

Unikernel minus: The unikernel and application don't have any security
isolation, since the kernel is essentially linked into the application. As a
result, the application has all the privileges of its virtual machine (not the
subset that would normally be imposed by the kernel), which is normally more
privilege than it really needs. So any remaining vulnerabilities can have a
more devastating effect.

That trade-off in terms of security is really hard to evaluate.

But the kinds of performance improvement shown here, with relatively modest
changes, is a really really big deal. So I'd expect a lot of people to
investigate this further; it certainly seems promising.

~~~
edwintorok
If you write your unikernel in a high-level language then you can mitigate
some of these security concerns. The unikernel should only have access to the
data it needs, so doing multi-tenancy or isolation inside the unikernel is
probably not what you want: just spin up a new unikernel for each tenant/thing
you want to isolate. The hypervisor provides the isolation in this case.

There is a downside: if there is a vulnerability, the exploit can probably
make hypercalls straight to the hypervisor. But a hypervisor can have less of
an attack surface than a full OS.

------
vectorEQ
All unikernels so far suck a lot regarding security. I wouldn't recommend any
of them to run in any production environment unless you want to lose all of
your data.

NCC Group did a good article on unikernels and how crap they are, if you want
to know more.

A better idea, as already suggested, would be to create an OS which builds
itself according to a target application and system, to host that application
on said system specifically. That would remove about as much code, but keep
potential security mechanisms like ASLR, stack protection, user/kernel
separation etc. intact. (Right now, kernels/OSes are built to target a system,
but not an application. The application would only use a subset of the kernel,
so the kernel could be built to target the application, reducing the kernel to
what's needed.)

Don't try to be cheap for performance and skip security; we're not in the
damned 80s anymore.

/endrant

~~~
roddux
I'd argue the Mirage unikernel (built almost wholly in OCaml) is one of the
most robust platforms out there. The NCC paper you mention looks at two rather
old-fashioned unikernels in isolation. I don't think the idea of unikernels
should be discarded because the current implementations are slightly
lacklustre -- it just shows that there's a fair way to go yet.

> [..] A better idea would be to host that application and reduce the kernel
to what's needed

This is a unikernel.

~~~
mlinksva
The authors of the NCC paper are evaluating MirageOS as well. IIRC from
listening to their talk on the paper and ongoing research
[https://www.youtube.com/watch?v=b68VFuB_y5M](https://www.youtube.com/watch?v=b68VFuB_y5M)
it has more of the problems other unikernels do than I'd have assumed. I'm
pretty ignorant, but the paper gave me the impression that there's a long
(rather than fair) way to go yet, especially relative to the seemingly
widespread assumption that unikernels are inherently more secure.

------
fulafel
Can anyone comment on how this approach differs from LinuxKit? Or is the
LinuxKit plan at all unikernel (MirageOS) based anymore?

I hope people will only use unikernels with memory-safe languages or otherwise
well sandboxed runtimes.

~~~
eyberg
It's worth noting that Linux itself is written in C (not Rust).

------
compilers
Anil Madhavapeddy's work on unikernels is also quite interesting to look at.

------
pkilgore
> a support layer for Emacs

The best part of this joke was that I can't tell if it's shade or earnest
appreciation.

------
ElijahLynn
This is a paid article. Am I missing something?

~~~
iamnotacrook
I'm getting this link:
[https://dl.acm.org/citation.cfm?id=3321445](https://dl.acm.org/citation.cfm?id=3321445)

The link to the PDF contains IDs and stuff which differ each time I visit,
which seems unnecessary.

Maybe get it from here:
[https://www.reddit.com/r/programming/comments/cb17mn/read_a_...](https://www.reddit.com/r/programming/comments/cb17mn/read_a_paper_unikernels_the_next_stage_of_linuxs/)

which links to:
[https://www.cs.bu.edu/~jappavoo/Resources/Papers/unikernel-h...](https://www.cs.bu.edu/~jappavoo/Resources/Papers/unikernel-hotos19.pdf)

------
mtgx
I'd rather see unikernels written in Rust or any other memory-safe language.
If we're going to start from scratch with this, let's do it right this time.

70%-90% of bugs seem to be memory-related. Let's end that with a little bit of
effort upfront for much fewer headaches in the long-term.

[https://twitter.com/LazyFishBarrel/status/115341007092920320...](https://twitter.com/LazyFishBarrel/status/1153410070929203201)

[https://www.zdnet.com/article/microsoft-70-percent-of-all-se...](https://www.zdnet.com/article/microsoft-70-percent-of-all-security-bugs-are-memory-safety-issues/)

[https://security.googleblog.com/2019/05/queue-hardening-enha...](https://security.googleblog.com/2019/05/queue-hardening-enhancements.html)

~~~
rwmj
The entire point of this paper is _not_ to start over from scratch, but to
reuse existing software (Linux and memcached in this case), and fiddle with
the linker command line and a little bit of glue to link them into a single
binary. If you want to start over from scratch using a safe language then see
MirageOS.

 _Edit:_ I don't mean to say you have to use C for the "userspace" part of
this. It should be possible -- in future -- to use any language for that.
However at the end of the day you'll still be linking that with Linux (written
in C) and glibc into a single binary that runs in one address space.

~~~
edwintorok
So this would be basically like rumpkernels but with Linux instead of BSD?

~~~
rwmj
It's kind of the opposite (AIUI). Rumpkernels run kernel code in userspace. We
run userspace code and the kernel in kernel space.

