
Preemptive Scheduling of Erlang NIFs - hansihe
http://hansihe.com/erlang/elixir/c/2016/07/26/erlang-nif-preemptive-scheduling.html
======
rdtsc
Very cool article. I like learning about these kinds of hacks.

Btw, it seems the author is also the author of this cool project:

https://github.com/hansihe/Rustler

It is a library that helps you write Rust NIFs for Erlang; I am following it.

I see Erlang/Elixir and Rust as two of the best platforms today that focus on
practical safety and fault tolerance. Combining the two is a great idea in a
large system. Erlang's VM is solid and battle tested, at the core of many
vital projects and systems. Rust probably brought the most innovative
language feature recently -- compile-time lifetime and safety checking.

~~~
cpeterso
With people writing NIFs in Rust, how long until someone writes a new BEAM-
compatible VM in Rust? The existing C implementation of the BEAM VM could be
incrementally rewritten in Rust. I was browsing the BEAM source code recently
and the C coding style is rather dated, e.g. lots of #ifdefs and even function
declarations without prototypes.

------
derefr
To add to the point of one of the "better solutions": if you've got a native
library that does heavy lifting (for example, a game physics engine), and you
want to "integrate" it with Erlang, please don't use NIFs. Use ports. Write a
small C-process wrapper around your native library and talk to it over streams
or sockets or shared memory. Use the OS's pre-emptive scheduler for what it's
for.

Externalizing your bulky native code into its own port-programs heavily
increases your system's robustness. Very few native libraries implement a
Rust-like set of safe abstractions and then rely solely on them; most just do
stupid native things with pointers _et al_. So most native libraries likely
_will_ abort() at some point or another. When the one you're relying on does,
it will of course take down "its own OS process." You don't want that OS
process to be _your Erlang node_. You want that crash to be isolated such that
the only state that's destroyed is the corrupted state of the subsystem that
caused the crash. And then you want a supervisor running in your Erlang node
to see the dead port and restart it, so the system can keep chugging along.

Of course, if you want that to be a _low-overhead_ solution, then you should
try as much as possible to _cache_ anything you'd have to repeatedly send to
the C process, within the C process. Treat your externalized native-library
"processor" server like an SQL database server: insert complex state into it,
get handles back, then manipulate the state _without retrieving it_ by using
high-level queries and commands on those handles. (I say cache, but I don't
mean _persist_. The state your Erlang node hands a port-program should all be
_derived_ state that the Erlang node holds the canonical copies of, that you
can feed into the port-program again _when_ it crashes. Treat port-program
state like state in memcached.)
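
A minimal sketch in C of the handle idea above, assuming a made-up
physics-style state (`body_t`) and hypothetical function names (`obj_insert`,
`obj_step`, `obj_position`, `obj_drop`); none of these come from a real
library. The port program keeps the state, hands back a small integer handle,
and only ever returns the small derived value:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OBJECTS 64 /* hypothetical fixed capacity */

typedef struct {
    int in_use;
    double pos, vel; /* hypothetical "complex state" */
} body_t;

static body_t objects[MAX_OBJECTS];

/* Map a handle (1..MAX_OBJECTS) to live state, or NULL if it is stale. */
static body_t *lookup(uint32_t handle) {
    if (handle == 0 || handle > MAX_OBJECTS) return NULL;
    body_t *b = &objects[handle - 1];
    return b->in_use ? b : NULL;
}

/* "Insert": copy state into the port program, return a handle (0 = full). */
uint32_t obj_insert(double pos, double vel) {
    for (uint32_t i = 0; i < MAX_OBJECTS; i++) {
        if (!objects[i].in_use) {
            objects[i] = (body_t){1, pos, vel};
            return i + 1;
        }
    }
    return 0;
}

/* "Command": advance the state in place; nothing is shipped back. */
int obj_step(uint32_t handle, double dt) {
    body_t *b = lookup(handle);
    if (b == NULL) return -1;
    assert(dt >= 0.0);
    b->pos += b->vel * dt;
    return 0;
}

/* "Query": return only the small value the caller asked for. */
int obj_position(uint32_t handle, double *pos_out) {
    body_t *b = lookup(handle);
    if (b == NULL) return -1;
    *pos_out = b->pos;
    return 0;
}

/* Release the slot; later use of the handle fails instead of crashing. */
void obj_drop(uint32_t handle) {
    body_t *b = lookup(handle);
    if (b != NULL) b->in_use = 0;
}
```

If the port program dies, every handle dies with it, so the Erlang side would
re-insert from its canonical copies after the supervisor restarts the port.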

---

ETA: I've never seen this design myself in the wild, but I've heard it
suggested a couple of times, and it's kind of a cool idea:

Instead of a native C port-program, you can get the same set of advantages as
the above from making your library a set of "dirty" NIFs running in _its own
isolated Erlang node_. The Erlang runtime itself doesn't have much overhead,
so it's pretty cheap to use Erlang as the way to "network-enable" each library
you have into its own server process. Then your "business logic" Erlang node
can communicate with your native-library-wrapper Erlang node, over the Erlang
distribution protocol (which is very cheap if they're on the same machine).
That saves you a bunch of hassle in trying to beat C into a form that does IPC
protocol-handling well.

~~~
yuubi
> trying to beat C into a form that does IPC protocol-handling well.

The opposite worked at least once: have Erlang code deal with C-friendly
structures.

Last time I got to work on an Erlang project, we needed an external C program
to do some network stuff we couldn't do in Erlang. We just defined a simple
protocol that is easy to parse from C, with {packet, 4} framing (each
message was preceded by a 4-byte length) wrapped around the messages. Pretty
much any format easy to parse or generate in C is trivial to deal with in
Erlang.
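
For what it's worth, the C side of {packet, 4} framing is short. A sketch,
assuming the port talks over plain file descriptors and using hypothetical
helper names (`read_packet`, `write_packet`); the 4-byte header is the payload
length in big-endian order:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly len bytes: 1 = ok, 0 = EOF before any byte, -1 = error. */
static int read_exact(int fd, unsigned char *buf, size_t len) {
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n == 0) return got == 0 ? 0 : -1; /* EOF mid-message is an error */
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        got += (size_t)n;
    }
    return 1;
}

/* Write all len bytes: 0 = ok, -1 = error. */
static int write_all(int fd, const unsigned char *buf, size_t len) {
    size_t put = 0;
    while (put < len) {
        ssize_t n = write(fd, buf + put, len - put);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        put += (size_t)n;
    }
    return 0;
}

/* Returns the payload length, -1 on clean EOF, -2 on error or oversize. */
ssize_t read_packet(int fd, unsigned char *buf, size_t cap) {
    unsigned char hdr[4];
    int rc = read_exact(fd, hdr, sizeof hdr);
    if (rc <= 0) return rc == 0 ? -1 : -2;
    uint32_t len = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
                   ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
    if (len > cap) return -2;
    if (len > 0 && read_exact(fd, buf, len) != 1) return -2;
    return (ssize_t)len;
}

/* Sends the big-endian length header followed by the payload. */
int write_packet(int fd, const unsigned char *buf, uint32_t len) {
    unsigned char hdr[4] = {
        (unsigned char)(len >> 24), (unsigned char)(len >> 16),
        (unsigned char)(len >> 8), (unsigned char)len
    };
    assert(buf != NULL || len == 0);
    if (write_all(fd, hdr, sizeof hdr) != 0) return -1;
    return write_all(fd, buf, len);
}
```

A port program would typically loop on `read_packet(0, ...)` and exit when it
returns -1, since EOF on stdin means the Erlang side closed the port.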

~~~
toast0
If you can share, I'd love to know what network stuff you couldn't do in
Erlang?

~~~
dozzie
Raw sockets? ICMP? AF_UNIX? netlink? IPsec? tun/tap devices (under Linux)?
Pick any.

------
wahern
On Linux you can use makecontext to create a new stack and swapcontext to jump
to it. That API was once defined by POSIX, but it was marked obsolescent and
then removed in POSIX.1-2008, which points applications at pthreads instead.
It's still usable on Linux and several other OSs, with
the caveat that you probably don't want to load or enable pthreads for your
process. (I think you can mix the two in glibc today as glibc references
thread-local data structures through a dedicated register, whereas a long time
ago it was kept at the base of the current stack and so incompatible with
alternate stacks. But other OSs might have problems and things might change in
the future wrt glibc, too.)
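
A rough sketch of the makecontext/swapcontext approach, assuming glibc on
Linux. The names (`task`, `run_task_in_slices`, `steps_completed`) and the
64 KiB stack size are made up for illustration, and the yield here is
cooperative rather than timer-driven:

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024) /* assumed to be enough for this toy task */

static ucontext_t scheduler_ctx, task_ctx;
static int steps_done;
static int task_finished;

/* Runs on its own stack and yields to the scheduler after each step. */
static void task(void) {
    for (int i = 0; i < 3; i++) {
        steps_done++;
        swapcontext(&task_ctx, &scheduler_ctx); /* yield */
    }
    task_finished = 1;
    /* Returning from here resumes uc_link, i.e. scheduler_ctx. */
}

/* Drives the task to completion; returns how many times we switched into
 * it, or -1 on failure. */
int run_task_in_slices(void) {
    int switches = 0;
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) return -1;

    steps_done = 0;
    task_finished = 0;
    if (getcontext(&task_ctx) == -1) {
        free(stack);
        return -1;
    }
    task_ctx.uc_stack.ss_sp = stack;
    task_ctx.uc_stack.ss_size = STACK_SIZE;
    task_ctx.uc_link = &scheduler_ctx; /* where to go when task() returns */
    makecontext(&task_ctx, task, 0);

    while (!task_finished) {
        if (swapcontext(&scheduler_ctx, &task_ctx) == -1) {
            free(stack);
            return -1;
        }
        switches++;
    }
    assert(task_finished);
    free(stack); /* safe: the task has returned and is never resumed */
    return switches;
}

int steps_completed(void) { return steps_done; }
```

The scheduler loop is where a real system would decide what to run next; here
it just resumes the same task until it reports that it has finished.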

Another solution which is actually (probably?) POSIX compatible is to use
sigaltstack to create a new stack, save your current context with setjmp,
raise a signal whose handler runs on the alternate stack (SA_ONSTACK), call
setjmp inside the handler to save that alternate-stack context, then longjmp
back to your original position. Now you can jump back and forth at will,
everything copacetic. Calling longjmp from a signal handler is perfectly legit
and POSIX is careful to preserve that ability. But for obvious reasons you
have to be very careful how you accomplish it.

Now, once you throw an interval timer into the mix, things get tricky.
Normally you would need to worry about the signal arriving while the C code is
in async-signal-unsafe library routines, but Erlang might be (I don't know for
a fact, though) one of the few large projects that only ever uses signal-safe
syscalls like mmap, read, write, etc. If it's only ever the user library
executing libc code then there shouldn't be a problem.

One thing I really wish POSIX (or at least Linux) supported is per-thread
signal handlers. With per-thread signal handlers you could bundle this magic
into libraries in a clean manner in a multi-threaded process by preserving and
restoring the existing signal mask, sigaltstack, and signal handler(s).
Currently you can only preserve the first two; the third is process-global. I
was working on a project recently where I caught SIGSEGV on a page fault,
longjmp'd back, doubled the relevant memory buffer (or aborted if the address
wasn't managed), and then restarted the computationally expensive operation.
This allowed me to remove all the bounds checking code and resulted in a very
significant speedup. But touching global state makes it very messy and not
suitable for packaging into library form as these days it's best to assume a
multi-threaded process environment.

