
The benefits and costs of writing a POSIX kernel in a high-level language - thrill
https://www.usenix.org/conference/osdi18/presentation/cutler
======
xenadu02
An unoptimized research kernel using GC (instead of ref-counting or Rust-style
lifetimes) manages to get within 5-15% of a traditional kernel for the things
they measured. That is impressive.

Lots of asterisks and caveats apply I'm sure, but it bolsters the argument
that writing kernels and device drivers in a high-level safe language wouldn't
impose the 2x or worse perf penalty detractors like to claim. I know I'd trade
10% perf for the elimination of whole classes of security vulnerabilities.

(For the same reason, I want the standards committee to produce a bounds-
checked dialect of C where I can choose to pay some perf cost to get real
dynamic and bounds-checked arrays)
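
For comparison, Go (the language Biscuit is written in) already makes this trade: slices carry their length, and every index is checked at run time. A minimal sketch of what that looks like, where `safeGet` is a hypothetical helper name used only for illustration:

```go
package main

import "fmt"

// safeGet is a hypothetical helper (not from the paper). It returns buf[i],
// or an error if i is out of range, by recovering from the panic that Go's
// run-time bounds check raises.
func safeGet(buf []int, i int) (v int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("index %d rejected: %v", i, r)
		}
	}()
	return buf[i], nil
}

func main() {
	buf := make([]int, 4) // dynamic array; its length travels with it
	buf[3] = 42

	fmt.Println(safeGet(buf, 3)) // in range
	fmt.Println(safeGet(buf, 4)) // one past the end: caught, not a silent overread
}
```

The cost is a compare-and-branch per access that the compiler cannot prove safe, which is the kind of overhead an opt-in C dialect would presumably pay as well.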

~~~
davidcuddeback
For what it's worth:

 _The conventional wisdom shared by many of today's software engineers calls
for ignoring efficiency in the small; but I believe this is simply an
overreaction to the abuses they see being practiced by penny-wise-and-pound-
foolish programmers, who can't debug or maintain their "optimized" programs.
In established engineering disciplines a 12% improvement, easily obtained, is
never considered marginal; and I believe the same viewpoint should prevail in
software engineering._

Donald Knuth. _Structured Programming with go to Statements_. (It's the same
paper that the oft-misquoted "premature optimization" quote comes from.)
[http://www.cs.sjsu.edu/~mak/CS185C/KnuthStructuredProgrammin...](http://www.cs.sjsu.edu/~mak/CS185C/KnuthStructuredProgrammingGoTo.pdf)

~~~
naasking
> In established engineering disciplines a 12% improvement, easily obtained,
> is never considered marginal

That's right, but "improvement" doesn't necessarily mean performance. Adding a
GC would yield more than 12% reduction in vulnerabilities and bug count, which
are also "improvements".

~~~
majewsky
> Adding a GC would yield more than 12% reduction in vulnerabilities and bug
> count

Also when compared to Rust-style lifetimes? (Did Redox OS have any CVEs yet?)

~~~
lossolo
Is anyone using Redox OS? I don't see how you can compare an OS used on
hundreds of millions of servers to one that is used nowhere when it comes to
CVEs. Probably no one is searching for CVEs in Redox OS the way they do in the
Linux kernel.

------
ilovecaching
I would love to see a production ready kernel written in Rust adopted by the
industry. I am exclusively using Go and Rust at home and at work, and I find
myself equally productive in both. Go lacks a lot of 'high level' features for
a GC language, so I don't find the GC to be that advantageous. Rust is safe,
fast, and a joy to write. I'm not surprised by these numbers, I would be
interested to see a breakdown of compile times for similarly sized kernels in
Rust, Go, and C.

~~~
dorfsmay
Production ready means different things to different people, and I don't know how
many people use it in Prod, but have you looked at redox:

[https://github.com/redox-os/redox](https://github.com/redox-os/redox)

~~~
kbenson
It would be really interesting to see a Linux ABI layer implemented on top of
Redox, a la illumos, to allow software designed for Linux to run on it.

Given that Bryan Cantrill is gung-ho about Rust, and he's got quite a lot of
knowledge in this area, maybe someone can woo him to help (even if in an
advisory role).

~~~
mlinksva
The Redox leader did ask
[https://twitter.com/jeremy_soller/status/1042261893367250944](https://twitter.com/jeremy_soller/status/1042261893367250944)

------
thrill
The paper contributes Biscuit, a kernel written in Go that implements enough
of POSIX (virtual memory, mmap, TCP/IP sockets, a logging file system, poll,
etc.) to execute significant applications. In experiments comparing nearly
identical system call, page fault, and context switch code paths written in Go
and C, the Go version was 5% to 15% slower.

~~~
pjmlp
I would expect that slowdown to be more a consequence of optimizer
implementations than anything else, given the quality of the code C compilers
generated about 30 years ago.

------
DblPlusUngood
Well that was fast! I'm an author of this paper and would be happy to answer
questions.

~~~
frankmcsherry
I'm not at OSDI or I would have asked there, but:

1\. Your skepticism for using Rust cites [26], the PLOS paper from the same
people who did [27]. They've essentially recanted, in that their PLOS talk was
right after Niko's keynote, and he then talked them through how interior
mutability works in Rust. Other than [26], are there other lingering concerns?
I could imagine so, but it would be great to take stock.

2\. You don't seem to page to disk, and do OOM shootdowns instead. If you did
enable disk paging, how comically bad an idea does a tracing GC become? :D
Naively, I could imagine RC-GC being less horrible, but perhaps that's my
built-in (and absolutely for real) bias.

~~~
DblPlusUngood
1\. I wouldn't say we are skeptical of Rust, I'm sure it could be made to work
well and we would love to see such a Rust kernel! I do wonder whether it would
be harder to implement highly-concurrent data structures in Rust though, such
as the directory cache.

2\. GC performance would be basically unaffected by paging, since kernels
don't page their own heaps out to disk and thus a kernel heap access would
never have to wait for a disk IO. Or maybe you had something else in mind?

~~~
frankmcsherry
1\. Jon sits two desks away, right? ;)

Re 2., I assumed (incorrectly, it seems) that virtual memory mappings can
eventually spill to disk. If you grab 1/32 of RAM for the kernel, then you
could only virtually map 16x RAM if the page tables have 512 entries.
Admittedly, I took my OS class in the previous millennium, so I might be out
of date here...
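
The 16x figure can be checked with a little arithmetic. This is only a sketch under assumptions that are not stated in the thread: 4 KiB pages, each page table occupying exactly one page, the whole 1/32 budget spent on leaf page tables, and upper levels of the page-table tree ignored. The function name `mappableBytes` is made up for the example.

```go
package main

import "fmt"

const pageSize uint64 = 4096 // assumed 4 KiB pages

// mappableBytes returns how much virtual memory can be mapped when
// 1/budgetDivisor of RAM is spent on leaf page tables.
func mappableBytes(ramBytes, budgetDivisor, entriesPerTable uint64) uint64 {
	tableBudget := ramBytes / budgetDivisor    // RAM set aside for page tables
	tables := tableBudget / pageSize           // each table occupies one page
	return tables * entriesPerTable * pageSize // each entry maps one page
}

func main() {
	ram := uint64(32) << 30 // 32 GiB, an arbitrary example size
	mapped := mappableBytes(ram, 32, 512)
	fmt.Println(mapped / ram) // 512 entries / 32 = 16
}
```

Each page of page table maps 512 pages, so spending 1/32 of RAM on tables maps 512/32 = 16 times RAM, independent of the RAM size chosen.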

------
lima
While not a full kernel, Google's gVisor is a userland Linux kernel emulator
written in Go:

[https://github.com/google/gvisor](https://github.com/google/gvisor)

They use it in production for App Engine sandboxing.

~~~
XorNot
"Production" \- they won't commit to actually supporting it for production for
the rest of us (currently got app engine flex burning a hole in our pocket at
work because the old runtime was too problematic and sans it being out of beta
we can't move to the new one).

------
Skunkleton
Considering all of the stuff that Go does for you this is hugely impressive.
It is especially impressive that a research kernel is within 10% of Linux's
performance.

~~~
justincormack
It is missing a lot of stuff that is potentially time consuming, like access
control.

------
jaunkst
GC can make high-demand applications stutter. From a performance and security
perspective I think Rust makes a great candidate for an HLL kernel.

~~~
pjmlp
GC enabled system programming languages like D, Active Oberon or Modula-3,
among others, allow for many ways to allocate memory, it is not GC all the way
down.

------
Senderman
The misplaced hyphenations mid-paragraph in that abstract made me vocalise the
text in my head as a reader with hiccups.

~~~
lapinot
That's the sad state of (I guess) automatic PDF text extraction, a consequence
of research papers being exclusively in PDF, itself a consequence of (La)TeX
coming from another age. I love and praise TeX for what it allows you to do,
but my opinion is that now is the time to get past it, learn everything it has
done right, and apply the new knowledge we have in language design to get a
better surface language (lower-friction syntax, higher-level semantics that
separate structured content from typography and let you extract things other
than a visual document from a source file). TeX being so good means it has such
a monopoly that this kind of project has to be tremendously good to have a
chance (which is probably a good thing).

------
xvilka
POSIX itself is an outdated standard. Some things were designed without an
understanding of modern security and portability problems. Either a new
iteration of the standard is required, or a new standard from scratch.

------
bitwize
But what did the memory footprint look like compared to C?

~~~
nineteen999
Last time I looked (which was a while back) both Go and Rust binaries for
"Hello world" were over a megabyte with out-of-the-box settings, compared to
just a few kilobytes for C and slightly more for C++. I should qualify that
they were the simplest of Hello world programs, I didn't add any fancy options
to try and reduce the size at all.

I'd be keen to see what the smallest binaries the reference implementations of
these languages can produce are.

~~~
steveklabnik
Rust's standard library is statically linked, C's is not. You have to compare
the total size. Additionally, Rust includes jemalloc by default on many
platforms.

If you really want to get extreme with this,
[http://mainisusuallyafunction.blogspot.com/2015/01/151-byte-...](http://mainisusuallyafunction.blogspot.com/2015/01/151-byte-static-linux-binary-in-rust.html), while a bit old, talks about various things
that are still true, though some stuff has changed a bit. There's also
[https://www.rust-lang.org/en-US/faq.html#why-do-rust-program...](https://www.rust-lang.org/en-US/faq.html#why-do-rust-programs-have-larger-binary-sizes-than-C-programs)

Additionally, some people have done more work since then; the smallest binary
rustc has ever produced is now 145 bytes: [https://github.com/tormol/tiny-rust-executable](https://github.com/tormol/tiny-rust-executable)

~~~
nineteen999
Thanks - yes I meant to mention, that statically linked (and stripped of
symbols) I get 745k for the C hello world with gcc 7.2, so the gap isn't as
big as all that. I understand that shared libraries/dynamic linking is still a
WIP for Rust so it's not exactly a fair comparison yet. With a shared standard
library you also get the benefit of having the RAM shared across multiple
processes.

The 145-byte example also isn't a fair comparison, since you've left out
the standard library! Obviously C can do that too, but it's pretty neat to see
the progress Rust is making.

(anyway, I'm going off topic since we're discussing kernel programming here).

~~~
steveklabnik
Don't forget that the stripping also isn't automatic on the Rust side either,
so you have to do that if you do it for the C :)

------
Animats
Interesting. It's not that Go is a "high level" language, it's that it is
garbage collected. What happens if it runs out of memory?

~~~
cdoxsey
This is covered in the paper. They use static analysis to know the memory
cost of a kernel function and rather than optimistically run the kernel
function and then unwind it if they run out of memory, they delay it until
memory is available.
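
A rough sketch of that "reserve first, run later" idea, for readers who want something concrete. This is not Biscuit's code: `heapBudget`, `reserve`, `release`, and `run` are hypothetical names, and the `maxBytes` argument stands in for whatever bound the static analysis would produce.

```go
package main

import (
	"fmt"
	"sync"
)

// heapBudget is a hypothetical admission gate for kernel operations.
type heapBudget struct {
	mu   sync.Mutex
	cond *sync.Cond
	free int // bytes not yet reserved
}

func newHeapBudget(total int) *heapBudget {
	b := &heapBudget{free: total}
	b.cond = sync.NewCond(&b.mu)
	return b
}

// reserve blocks until n bytes are available, then claims them.
// A request larger than the total budget would wait forever.
func (b *heapBudget) reserve(n int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for b.free < n {
		b.cond.Wait()
	}
	b.free -= n
}

// release returns n bytes and wakes any waiting callers.
func (b *heapBudget) release(n int) {
	b.mu.Lock()
	b.free += n
	b.mu.Unlock()
	b.cond.Broadcast()
}

// run starts op only once its worst-case allocation is reserved, so op
// never has to be unwound halfway through for lack of memory.
func (b *heapBudget) run(maxBytes int, op func()) {
	b.reserve(maxBytes)
	defer b.release(maxBytes)
	op()
}

func main() {
	budget := newHeapBudget(100)
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Each call needs up to 60 bytes, so only one fits at a time.
			budget.run(60, func() { fmt.Println("syscall", id, "ran within budget") })
		}(i)
	}
	wg.Wait()
}
```

The design choice being illustrated is that waiting happens before any work starts, which avoids having to write rollback paths for every allocation site.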

------
throwaway5250
Intriguing. Interesting that it implements fork(2), even though Go itself
cannot really deal with that call.

~~~
anyfoo
Though implementing fork() in a kernel is entirely distinct from using fork()
in user space. Implementing fork() is mostly "just" manipulating a bunch of
data structures the right way. Copy the right bytes and set the right values.

It's like how you can implement Java, C++, Python, Haskell and any other
(practical) language in C, despite C having no support for e.g. classes or
lambdas. Or how humans can design and build combustion engines, despite us not
having the capability to ingest gasoline in order to rotate an axle at 6000rpm
ourselves.
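
To make the "manipulating a bunch of data structures" point concrete, here is a toy sketch. The `proc` and `page` types and the `fork` function are invented for illustration and are not taken from Biscuit or any real kernel; a real implementation also deals with registers, locks, TLB flushes, and reference counts on frames.

```go
package main

import "fmt"

// page is a hypothetical mapping record.
type page struct {
	frame    int  // physical frame number
	writable bool // cleared to force a copy-on-write fault on the next write
}

// proc is a hypothetical process control block.
type proc struct {
	pid   int
	ppid  int
	pages map[uintptr]page // virtual address -> mapping
	fds   []int            // open file descriptors
}

var nextPID = 2

// fork builds the child purely by copying and adjusting the parent's records.
func fork(parent *proc) *proc {
	child := &proc{
		pid:   nextPID,
		ppid:  parent.pid,
		pages: make(map[uintptr]page, len(parent.pages)),
		fds:   append([]int(nil), parent.fds...), // child gets its own table
	}
	nextPID++
	for va, pg := range parent.pages {
		pg.writable = false // both sides now share the frame read-only
		parent.pages[va] = pg
		child.pages[va] = pg
	}
	return child
}

func main() {
	initProc := &proc{
		pid:   1,
		pages: map[uintptr]page{0x1000: {frame: 7, writable: true}},
		fds:   []int{0, 1, 2},
	}
	child := fork(initProc)
	fmt.Println(child.pid, child.ppid, child.pages[0x1000].writable)
}
```

Nothing here depends on the implementation language having a `fork` of its own; the kernel code only edits tables that describe user processes.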

~~~
throwaway5250
Yeah, I get the difference in levels of abstraction. Still interesting, at
least to me. Among other things, it limits how "thin" the POSIX emulation can
be. So, for example, if I had a POSIXy super-CPU, an implementation of fork()
in C, running on that CPU, could be rather thin. An implementation of fork()
in Go seems like it'd necessarily have to be considerably thicker, as fork()
is more foreign to Go.

But yes, I understand that this noodling has little to do with what the
authors were going for, nor really with current reality.

Definitely a cool--amazing even--project.

~~~
aidenn0
I fail to see how the implementation of fork in C would be any thinner than an
implementation of fork in Go. I expect the implementations would look very
similar.

------
jstewartmobile
5-15% seems like a bargain compared to the massive slowdown everyone noticed
after the first meltdown patches (not that this would have fixed that problem,
of course).

If this approach took off, maybe it would light a fire under the chip industry
to put a little more thought into hardware-assisted garbage collection.

------
duckqlz
Since when did C stop being a high level language O_o?

~~~
Const-me
I think it's a moving target. 70 years ago, assembly was high-level.

~~~
aaronmdjones
Some would argue that it still is.

[https://blog.erratasec.com/2015/03/x86-is-high-level-languag...](https://blog.erratasec.com/2015/03/x86-is-high-level-language.html)

