
The cost of a system call [pdf] - caustic
http://www.cs.cmu.edu/~chensm/Big_Data_reading_group/papers/flexsc-osdi10.pdf
======
jws
The FlexSC paper from 2010.

They observe the cache damage from traditional system calls and propose batch
queueing them and ideally using a different core to service them. This is not
the traditional Unix programming model, so they create a threading package that
transparently makes your traditional Unix synchronous system calls work. They
benchmark Apache with all of this new apparatus and it performs very well.

~~~
ape4
Video of the paper presentation...
[https://www.usenix.org/conference/osdi10/flexsc-flexible-sys...](https://www.usenix.org/conference/osdi10/flexsc-flexible-system-call-scheduling-exception-less-system-calls)

------
Animats
A callback-oriented kernel call mechanism. Hmm. The callback-oriented
framework people should love this. It looks like you have to keep polling the
shared page to see when your system call is done, though.
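
To make the polling concrete, here is a toy user-space simulation in Python. It is not FlexSC's code. The entry fields (status, call name, arguments, return value) only loosely follow the paper's description of a syscall page; `SyscallPage`, `HANDLERS`, and the two fake calls are hypothetical names made up for this sketch, and an ordinary thread stands in for the kernel-side worker.

```python
import threading
import time
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    FREE = 0
    SUBMITTED = 1
    BUSY = 2
    DONE = 3


@dataclass
class Entry:
    status: Status = Status.FREE
    name: str = ""
    args: tuple = ()
    ret: object = None


# Hypothetical "syscalls": plain functions standing in for kernel work.
HANDLERS = {"add": lambda a, b: a + b, "upper": lambda s: s.upper()}


class SyscallPage:
    """Shared page of entries; assumes one submitting thread and one worker."""

    def __init__(self, size=8):
        self.entries = [Entry() for _ in range(size)]
        self.stop = threading.Event()

    def submit(self, name, *args):
        """User side: claim a free entry, fill it in, then publish it."""
        for i, e in enumerate(self.entries):
            if e.status is Status.FREE:
                e.name, e.args = name, args
                e.status = Status.SUBMITTED  # written last so the worker sees a full entry
                return i
        raise RuntimeError("syscall page full")

    def wait(self, i):
        """User side: poll the entry until the worker marks it done."""
        e = self.entries[i]
        while e.status is not Status.DONE:
            time.sleep(0)  # yield; a user-level scheduler could run other work here
        ret = e.ret
        e.status = Status.FREE  # hand the slot back for reuse
        return ret

    def serve(self):
        """Worker side: scan the page and execute submitted entries."""
        while not self.stop.is_set():
            for e in self.entries:
                if e.status is Status.SUBMITTED:
                    e.status = Status.BUSY
                    e.ret = HANDLERS[e.name](*e.args)
                    e.status = Status.DONE
            time.sleep(0)


def demo():
    page = SyscallPage()
    worker = threading.Thread(target=page.serve, daemon=True)
    worker.start()
    # Batch: submit several calls before waiting on any of them.
    ids = [page.submit("add", 1, 2), page.submit("upper", "flexsc")]
    results = [page.wait(i) for i in ids]
    page.stop.set()
    worker.join()
    return results
```

The point of the sketch is that the caller never traps into the worker: it writes an entry, goes off to do something else, and checks the status field later.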

It's painful to realize that, after a context switch, modern CPUs can need
11,000 cycles to get back to full speed, with the right stuff in the caches
and pipelines. Maybe we need CPUs which handle context switches better.

~~~
Sanddancer
A lot of that is dumping cache and then trying to refill it after the context
switch. Regarding context switches, one of the things that they suggested is
to pin one core to stay in the system context and just handle servicing
syscalls, etc. Given a modern server can have up to a few hundred logical
cores, that's not as big of a thing to ask for as it was even a few years ago.
Even "cheap" servers these days have 8-16, so pinning there might even make
sense as well.
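
A rough sketch of what reserving a core looks like from user space, in Python. This only illustrates partitioning cores and setting affinity; it is not how FlexSC dedicates a core inside the kernel. `split_cores` and `pin_caller` are hypothetical helper names, and `os.sched_setaffinity` is only available on some platforms (Linux among them), so the sketch checks for it first.

```python
import os


def split_cores(cores, n_syscall=1):
    """Reserve the highest-numbered cores for syscall work, the rest for user code."""
    ordered = sorted(cores)
    if len(ordered) <= n_syscall:
        raise ValueError("need at least one core left for user code")
    return set(ordered[:-n_syscall]), set(ordered[-n_syscall:])


def pin_caller(cores):
    """Restrict the caller to `cores`; returns False where affinity is unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, cores)  # 0 means "the caller"
    return True


def pin_to_user_cores():
    """Example use: keep this program off the core set aside for syscall work."""
    if not hasattr(os, "sched_getaffinity"):
        return False
    user_cores, _syscall_cores = split_cores(os.sched_getaffinity(0))
    return pin_caller(user_cores)
```

On an 8-16 core box, giving up one core this way costs a small fraction of the machine, which is the trade being described above.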

~~~
nickpsecurity
That's what I did in some of my designs. It was more about covert channel
mitigation by ensuring the secrets and untrusted stuff used separate CPUs.
A side benefit was better performance from fewer cache flushes. It works.

------
PascLeRasc
I found this paper _incredibly_ interesting, and I think I'd love to work in
research in this area. Does anyone have some resources to learn more/keywords
to search? I'm currently a biomedical engineering undergraduate, so the most
relevant course I've had has been Digital Logic, which I absolutely loved and
did very well in, but I'd really appreciate advice on additional courses to
try to take.

~~~
eru
On the most basic level, you should be able to write a simple operating system
from scratch. I heard
[http://pages.cs.wisc.edu/~remzi/OSTEP/](http://pages.cs.wisc.edu/~remzi/OSTEP/)
is good for an introduction to OS writing.

Some papers I read in no particular order:

Synthesis OS
([http://valerieaurora.org/synthesis/SynthesisOS/](http://valerieaurora.org/synthesis/SynthesisOS/))
might be interesting for you. They do lots of runtime code synthesis.

Exokernels (follow links from
[https://en.wikipedia.org/wiki/Exokernel#Bibliography](https://en.wikipedia.org/wiki/Exokernel#Bibliography)).
And more recently Mirage ([https://mirage.io/](https://mirage.io/)) and HaLVM
([https://github.com/GaloisInc/HaLVM](https://github.com/GaloisInc/HaLVM))

(I assume you already know how to program. Otherwise, brush up on that as step
0. C is still the canonical choice for OS work. But if you are feeling
adventurous there's more choice.)

------
wyldfire
The interesting bit I was eagerly anticipating is buried way down in "3.1
Exception-Less Syscall Interface." How'd they do it? Syscall pages. This
sounds really interesting, though apparently not terribly new.

I'd be really concerned about trust issues, but I'm sure it could be done
safely. Lots of room for corner cases, especially w/NUMA.

------
circlingthesun
We should just let V8 run in the kernel and do away with system calls.
[https://www.destroyallsoftware.com/talks/the-birth-and-death...](https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript)

------
ape4
This is cool but it certainly makes things more complicated. It adds a kernel
mode thread per process.

~~~
prewett
I think you can do it with just one kernel mode thread for all processes,
using one (or more) pages of memory per process/thread. The kernel thread can
read all the pages, but each page is otherwise readable only by its own process.

It looks like this is not what the article's implementation does, but I think
it would be possible.
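
A minimal sketch of that idea in Python, purely as a model: one worker object holds every page, while each process is handed only its own. `Kernel`, `attach`, `submit`, and the two fake calls are hypothetical names for illustration, and nothing here enforces real memory protection.

```python
class Kernel:
    """One worker that can see every page; a process only holds its own page."""

    def __init__(self):
        self._pages = {}  # pid -> list of request dicts

    def attach(self, pid):
        """Create a page for `pid` and return it to that process alone."""
        page = []
        self._pages[pid] = page
        return page

    def run_once(self, handlers):
        """A single pass of the one worker over all pages; returns calls served."""
        served = 0
        for pid, page in self._pages.items():
            for req in page:
                if req["status"] == "submitted":
                    req["ret"] = handlers[req["name"]](pid, *req["args"])
                    req["status"] = "done"
                    served += 1
        return served


def submit(page, name, *args):
    """Process side: append a request to the caller's own page."""
    req = {"name": name, "args": args, "status": "submitted", "ret": None}
    page.append(req)
    return req


def demo():
    # Hypothetical calls; each handler receives the pid of the requesting page.
    handlers = {"getpid": lambda pid: pid, "add": lambda pid, a, b: a + b}
    kernel = Kernel()
    page_a, page_b = kernel.attach(101), kernel.attach(202)
    r1 = submit(page_a, "getpid")
    r2 = submit(page_b, "getpid")
    r3 = submit(page_b, "add", 2, 3)
    served = kernel.run_once(handlers)
    return served, r1["ret"], r2["ret"], r3["ret"]
```

Because the worker knows which page a request came from, it can attribute each call to the right process without needing a thread per process.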

------
robinanil
This is fantastic. Any plans to get this into the mainstream Linux kernel?

~~~
burfog
www.google.com/patents/US20140149781

~~~
kazagistar
Red Hat has the patent... that's good, right?

------
erichocean
This seems like the kind of thing you'd expect to see implemented in Redox.[0]

[0] [http://www.redox-os.org/](http://www.redox-os.org/)

~~~
ketralnis
How so? This talks about changing how syscalls work (batching, callbacks,
pinning the syscall handler to one core) but it doesn't look like redox does
anything special with syscalls ([https://doc.redox-os.org/doc/kernel/syscall/index.html](https://doc.redox-os.org/doc/kernel/syscall/index.html))

> The system call interface is very similar to POSIX's system calls

------
nwmcsween
iirc xok exokernel had the same thing by doing 'scheduler activations' through
a vdso. How does flexsc handle cancellation points? What about latency?

------
TerryADavis
TempleOS is ring-0-only. No system call overhead.

