
How Erlang does scheduling (2013) - weatherlight
http://jlouisramblings.blogspot.com/2013/01/how-erlang-does-scheduling.html
======
pron
When Quasar was very young we had something very similar to Erlang's
reduction-based preemption; we later disabled it in favor of preemption on
communication only. The reason is this: suppose you have n cores. Now, suppose
you have a single fiber that hogs the core for a while. If this occurrence is
rare, reduction/time-slice preemption doesn't really help, because the
scheduler has n kernel threads, and one of them would steal the tasks
scheduled on the thread running the "runaway" fiber and all is well.

If, on the other hand, this behavior is very common, and the number of fibers
exhibiting it is k, then if k > n you're in trouble no matter what
scheduling or preemption you do. k fibers that constantly ask for CPU they
can't have (because k > n) means that work is getting delayed and you're well
out of soft realtime territory.
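
The argument can be sketched with a toy non-preemptive scheduler simulation (a hypothetical model, not Quasar's actual scheduler; the core counts and task lengths are made up):

```python
from collections import deque

def simulate(n_cores, work_units):
    """Toy non-preemptive scheduler: each core runs its task to completion,
    then steals the next queued task. Returns each task's finish tick."""
    queue = deque(range(len(work_units)))
    remaining = list(work_units)
    finish = [0] * len(work_units)
    running, t = [], 0
    while queue or running:
        while queue and len(running) < n_cores:
            running.append(queue.popleft())
        t += 1
        still_running = []
        for i in running:
            remaining[i] -= 1
            if remaining[i] == 0:
                finish[i] = t
            else:
                still_running.append(i)
        running = still_running
    return finish

n, hog, short = 4, 1000, 1
# One runaway fiber (k = 1 < n): the other cores absorb the short tasks,
# so short-task latency stays low even with no preemption at all.
f1 = simulate(n, [hog] + [short] * 99)
# k = n runaway fibers: every core is hogged, so in this non-preemptive
# model the short tasks starve behind them.
f2 = simulate(n, [hog] * n + [short] * 100)
print(max(f1[1:]), min(f2[n:]))
```

With k < n the short tasks all finish within tens of ticks despite the hog; with k = n the first short task cannot even start until a hog completes.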

So since n is small (say 8-60), and the number of fibers is large (10K-1M or
more), the number of fibers that can reasonably often require a lot of CPU
(and reduction/time-slice preemption) is very small compared to the total
number of fibers, and those would be special, well-known, fibers anyway
(remember that if a fiber needs a lot of CPU only occasionally, the many-
threaded scheduler handles that gracefully even without preemption as long as
most other fibers are well-behaved). In that case, designing a scheduler
meant for ~1M fibers around the behavior of ~10 fibers just doesn't make
sense. It is far more efficient to simply run that CPU-hungry
code in plain-old kernel threads, and let the kernel worry about their
scheduling. In Quasar, unlike in Erlang, it's very simple to choose whether
code should run in a user-mode-scheduled fiber or in a kernel-scheduled
heavyweight thread, and so, after seeing that turning on reduction-based
preemption had no positive effect in practice, we turned that feature off.

------
rdtsc
Yep, always a favorite. By Jesper Louis Andersen.

By the way, Erlang as of the next release (19) will have dirty schedulers.
That will make it even easier to integrate blocking C modules into the Erlang
VM. You can do it now, of course, but to do it right you have to manage your
own thread + queue + locking to avoid blocking the schedulers.

Here is an article from the same author about it:

[https://medium.com/@jlouis666/erlang-dirty-scheduler-overhea...](https://medium.com/@jlouis666/erlang-dirty-scheduler-overhead-6e1219dcc7#.hsn4pc47e)

Check out more of his writings:

[https://medium.com/@jlouis666](https://medium.com/@jlouis666)

------
daveguy
This is interesting. I have briefly looked at Erlang and I wonder a few
things, for any Erlang gurus:

1) A lot of this sounds like OS-style scheduling. Does this reduce complexity
(maybe by Erlang dealing with various OS scheduling quirks?), or does it
increase complexity because now you have to deal with both the Erlang
scheduler and the OS scheduler?

2) It struck me as odd that he said sending to a larger mailbox takes more
resources. Does that mean sending to a mailbox with a larger incoming queue
takes more, as a way to balance queues, or does it mean a mailbox with a
greater capacity takes more?

3) The preemption and soft-realtime sounds most interesting. Are there foreign
function interfaces that could allow you to get Erlang style concurrency with
other languages? Python (and a lot of others, but especially Python) is
absolutely awful at multitasking due to the Global Interpreter Lock (GIL),
with no intention by the developers to fix it. Could you magically add the
ability to preemptively slice Python code by adding it to an Erlang framework?
Would the overhead of Erlang+Python not be worth it? Usually in a language
like Erlang or Python you would call C for speed. Could you call C and get
preemption and scheduling on that too?

~~~
felixgallo
Not a guru, but:

1\. As with any multithreaded application, if your OS scheduler is heavily
loaded by other applications, you can experience issues. Erlang is designed to
degrade fairly slowly and fairly gracefully before it has to give up. From a
complexity standpoint, the programmer doesn't worry about schedulers of either
type except in rare cases.

2\. Sending to a larger mailbox doesn't take more resources; it takes more
reductions. So if a process has a large mailbox, it costs the sender more to
add yet another message into the mailbox. This is a form of backpressure and
helps slow down the system in a gentle(r) way when a single process is
bottlenecking.
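
The backpressure idea can be sketched as a toy model. The 2000-reduction budget below matches BEAM's usual per-slice allotment, but the cost curve is entirely made up for illustration:

```python
REDUCTIONS_PER_SLICE = 2000  # BEAM grants a process roughly this many per slice

class Proc:
    def __init__(self):
        self.mailbox = []

def send(sender_budget, target, msg):
    """Toy model: charge the sender more reductions as the target's mailbox
    grows, so a sender flooding a slow process burns through its slice
    faster and gets descheduled sooner (backpressure)."""
    cost = 1 + len(target.mailbox) // 64  # hypothetical cost curve
    target.mailbox.append(msg)
    return sender_budget - cost

slow = Proc()
budget = REDUCTIONS_PER_SLICE
for i in range(200):
    budget = send(budget, slow, i)
print(budget)  # well below 2000 - 200: the later sends cost extra
```

The effect is that the fastest producers pay the most, which is exactly the gentle slowdown described above.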

3\. You can get Erlang style concurrency out of pony (natively), JVM languages
(scala/akka; quasar), but I doubt you could get it out of Python unless you
reimplemented Python on the BEAM or the JVM. Personally, I find that
working directly in Erlang is fine and I don't need to combine in any
ruby/python/perl, which are all slightly less expressive than Erlang.

Calling C code works two ways; you can call it out-of-process via what's
called a port, which is basically a unix pipe to your C process; or in-process
via a NIF, which is more dangerous because the scheduler has no control or
understanding inside your C code. Carefully written NIFs, and NIFs written to
take advantage of the dirty scheduling system, can thrive, but it's both an
art and a science.
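
The port mechanism can be illustrated by analogy in Python: the VM talks to an external OS process over its stdin/stdout, typically with length-prefixed framing (Erlang's `{packet, 4}` option). A rough sketch, using `cat` as a stand-in for the C program:

```python
import struct
import subprocess

def send_packet(pipe, payload: bytes):
    # {packet, 4}-style framing: 4-byte big-endian length, then the payload
    pipe.write(struct.pack(">I", len(payload)) + payload)
    pipe.flush()

def read_packet(pipe) -> bytes:
    (length,) = struct.unpack(">I", pipe.read(4))
    return pipe.read(length)

# The "port program" is just an external OS process on the end of a pipe;
# `cat` simply echoes our framed packet straight back.
port = subprocess.Popen(["cat"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
send_packet(port.stdin, b"hello")
reply = read_packet(port.stdout)
port.stdin.close()
port.wait()
print(reply)
```

Because the port program lives in its own OS process, it can block or crash without touching the schedulers, which is exactly why ports are the safe option and NIFs the dangerous one.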

~~~
rozap
I don't want to be pedantic, but I'm going to be pedantic (sorry!)

    You can get Erlang style concurrency out of ... JVM languages (scala/akka; quasar)

This is a little misleading. Yeah, akka gives you actors, and on the surface
they look similar to Erlang processes (they have a receive block and a !
function call), but I think the more subtle differences are what make Erlang
interesting, and ultimately really pleasant to work with: 1) akka is not
preemptive multitasking; actors are multiplexed across a thread pool, so it's
possible to do blocking IO and inadvertently screw up a bunch of other actors;
2) GC applies across the whole VM, whereas Erlang collects each process heap
independently.

All sorts of wonderful things fall out of the two properties above, things
you need to work really hard to get in an akka world.
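
The blocking-IO hazard in point 1 is the same one you see in any cooperatively multiplexed runtime. A Python asyncio analogy (not akka itself, but the same failure mode): one coroutine making a blocking call stalls every other coroutine sharing its thread:

```python
import asyncio
import time

async def well_behaved(log):
    # Yields at every await, the way a cooperative actor should.
    for _ in range(3):
        log.append(time.monotonic())
        await asyncio.sleep(0.01)

async def badly_behaved():
    # A blocking call that never yields: everything else multiplexed onto
    # this thread is frozen for the full 0.2 s.
    time.sleep(0.2)

async def main():
    log = []
    await asyncio.gather(well_behaved(log), badly_behaved())
    return log

log = asyncio.run(main())
gaps = [b - a for a, b in zip(log, log[1:])]
print(round(max(gaps), 2))  # roughly 0.2: the blocker stole the shared thread
```

Erlang sidesteps this because the scheduler charges reductions and suspends a process at well-defined points, so no process can monopolize a scheduler thread this way.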

As far as foreign interfaces to Python go, it's certainly possible with
something like jinterface, but I'm not really sure what the point would be.
Generally you leave Erlang behind for similar reasons you might leave Python
behind (things that are CPU-bound). jinterface exists because Java does a
whole bunch of stuff well that Erlang falls down at, so the two can complement
each other nicely. I'm not sure there are as many of those cases with Python.

~~~
phamilton
It is my understanding that quasar gives you something awfully close to
preemptive scheduling. It's not quite as strict as Erlang's reduction-based
approach, but it essentially provides implicit cooperative scheduling. It
allows you to do blocking IO without tying up execution.

On top of quasar, using Azul's Zing JVM (with its C4 collector) will get you
a GC without stop-the-world pauses. It's neither open source nor cheap,
though.

That said, I'm plenty happy with BEAM.

------
mietek
_> The cores may be bound to schedulers, through the +sbt flag, which means
the schedulers will not "jump around" between cores. It only works on modern
operating systems, so OSX can't do it, naturally._

That’s a bit out of date.

[https://developer.apple.com/library/mac/releasenotes/Perform...](https://developer.apple.com/library/mac/releasenotes/Performance/RN-AffinityAPI/index.html)

~~~
cpeterso
Note that OS X's thread affinity APIs, unlike those of other operating
systems, don't allow you to pin threads to specific processors. You can only
define an "affinity
set" of threads that should be scheduled on the same processor. Similarly, you
can spread threads to different processors by assigning them to different
affinity sets.

I wrote some code to do this in Firefox for OS X, Linux, and Windows:
[https://mxr.mozilla.org/mozilla-central/source/xpcom/threads...](https://mxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThread.cpp#302)

------
cpeterso
How does Erlang's reduction-counting scheduler work with native-compiled code?

I believe Go's scheduler only preempts goroutines at function call entry
points. A goroutine in a tight loop that doesn't call any other functions
could block the scheduler.

~~~
felixgallo
Native code can tell the schedulers how many reductions to take, but the
schedulers are actually cooperative and not literally preemptive, so poorly
written native code can lie, crash, consume a scheduler, etc.

~~~
cpeterso
By native code, I meant Erlang bytecode compiled using HiPE or an LLVM-based
JIT like [1], not Erlang calling out to NIFs. I assume the generated code
would need to insert preemption checks.

[1] [http://www.erlang-factory.com/upload/presentations/516/SF-JI...](http://www.erlang-factory.com/upload/presentations/516/SF-JIT-Pres.pdf)

~~~
felixgallo
I believe the JIT work was experimental. Functions compiled with HiPE are
still subject to the reduction laws and calling into a HiPE function still
provides a preemption opportunity, so HiPE native functions are essentially
indistinguishable from Erlang functions.

~~~
cpeterso
Interesting. Thanks!

