
A Solution to CPU-intensive Tasks in IO Loops - pors
http://williamedwardscoder.tumblr.com/post/17112393354/a-solution-to-cpu-intensive-tasks-in-io-loops
======
ot
_A watchdog thread can be running every n milliseconds. This is very low load
on the system. [...] If the loop has not moved onwards since the last sample
or two it can be deemed stalled. [...] But you can move the other events in
the affected loop to a fresh thread; you can go sideways when you’ve detected
a blocking task._

Congratulations, you just invented (a very inefficient version of) pre-emptive
multitasking.
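
For concreteness, the watchdog idea quoted above can be sketched in a few lines. This is a toy Python sketch, not Hellepoll's actual implementation; the `on_stall` callback is where "going sideways" (migrating the loop's other events to a fresh thread) would happen:

```python
import threading
import time

class Watchdog:
    """Samples a loop's progress counter every `interval_ms`; if the
    counter hasn't advanced between two samples, the loop is deemed
    stalled and `on_stall` fires."""
    def __init__(self, interval_ms, on_stall):
        self.interval = interval_ms / 1000.0
        self.on_stall = on_stall
        self.progress = 0            # bumped by the event loop each iteration
        self._last_seen = -1
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def tick(self):                  # called by the event loop as it moves onwards
        self.progress += 1

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()

    def _run(self):
        while not self._stop.wait(self.interval):
            if self.progress == self._last_seen:
                self.on_stall()      # e.g. move pending events to a fresh thread
            self._last_seen = self.progress

# Simulate a loop that ticks a few times, then blocks.
stalls = []
wd = Watchdog(10, lambda: stalls.append(time.monotonic()))
wd.start()
for _ in range(5):
    wd.tick()
    time.sleep(0.002)
time.sleep(0.05)                     # loop "blocked": no ticks arrive
wd.stop()
print(len(stalls))                   # at least one stall was detected
```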

~~~
willvarfar
(article author)

Spot on except for the bit about being inefficient; presumed performance is
very much the thing that makes the complexity worth it.

~~~
yxhuvud
What are your thoughts on the disruptor pattern? It seems to me to be
something that tries to eat the cake of multithreading while still having a
lot of the simplicity of eventing.

( <https://code.google.com/p/disruptor/> )

~~~
jamii
The disruptor is just a very efficient event loop. It doesn't prevent you from
blocking the handling thread.
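
For readers unfamiliar with it, the core idea can be caricatured in a few lines: a preallocated slot array plus monotonic sequence counters. This is a toy single-producer/single-consumer sketch, nothing like the real LMAX code:

```python
class RingBuffer:
    """Toy SPSC ring in the spirit of the disruptor: preallocated
    slots, power-of-two sizing so masking replaces modulo, and
    monotonic sequence counters instead of head/tail pointers."""
    def __init__(self, size):
        assert size & (size - 1) == 0, "size must be a power of two"
        self.size = size
        self.mask = size - 1
        self.slots = [None] * size
        self.write_seq = 0   # next slot the producer will claim
        self.read_seq = 0    # next slot the consumer will read

    def publish(self, event):
        if self.write_seq - self.read_seq == self.size:
            return False     # ring full: producer must wait
        self.slots[self.write_seq & self.mask] = event
        self.write_seq += 1  # "publish": consumer may now read it
        return True

    def consume(self):
        if self.read_seq == self.write_seq:
            return None      # ring empty
        event = self.slots[self.read_seq & self.mask]
        self.read_seq += 1
        return event

ring = RingBuffer(8)
for i in range(5):
    ring.publish(i)
out = []
while (e := ring.consume()) is not None:
    out.append(e)
print(out)  # [0, 1, 2, 3, 4]
```

Note that, as jamii says, nothing here stops a handler from blocking while it holds up `consume`.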

~~~
yxhuvud
True, if you do blocking stuff in the handler instead of in the worker
threads.

------
rektide
The final question/answer revolves around:

 _Hellepoll has the concept of a task tree - tasks can be subtasks of others,
and this simplifies tidy-up when one aborts. This explicit linking of tasks
and callbacks can be used to determine what gets migrated when a task blocks,
to ensure that the cascade of events associated with a request do not
themselves get split, but fire in the originating thread and in the right
order even if at some point the thread triggers the blocking watchdog._

He asks, in closing,

 _I am not a node.js user but I wonder if this approach could be transparent
in node and not actually break any of the API contract there?_

This is the work being done on 0.8, with domains and isolates. It exists explicitly to allow this kind of task/work parenting:
[https://groups.google.com/forum/#!msg/nodejs/eVBOYiI_O_A/-mA...](https://groups.google.com/forum/#!msg/nodejs/eVBOYiI_O_A/-mACjP-CHtsJ)

~~~
equark
Isolates have been removed.

[http://groups.google.com/group/nodejs/browse_thread/thread/c...](http://groups.google.com/group/nodejs/browse_thread/thread/ccbceea36f76857d/6b8b8a487d2ab817)

------
moonchrome
Why not just lock each connection handler to one thread at a time and dispatch events on a thread pool? That way connection-level events are always synchronous, but event handlers are spread over the thread pool; you get optimal load balancing because events fill the pool (no processes), and the thread pool can use its own logic to grow if one channel handler used blocking IO and is tying up a pool thread.

This is pretty much what netty does with OrderedMemoryAwareThreadPoolExecutor?

[http://netty.io/docs/stable/api/org/jboss/netty/handler/exec...](http://netty.io/docs/stable/api/org/jboss/netty/handler/execution/OrderedMemoryAwareThreadPoolExecutor.html)

    
    
               -------------------------------------> Timeline ------------------------------------>
    
     Thread X: --- Channel A (Event A1) --.   .-- Channel B (Event B2) --- Channel B (Event B3) --->
                                          \ /
                                           X
                                          / \
     Thread Y: --- Channel B (Event B1) --'   '-- Channel A (Event A2) --- Channel A (Event A3) --->
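
That ordering guarantee can be sketched roughly like this (a Python toy, not netty's implementation): tasks for the same channel key run one at a time, in submission order, while different channels spread freely across the pool.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
import threading
import time

class OrderedExecutor:
    """Runs tasks on a shared pool while guaranteeing that tasks
    for the same key (channel) execute serially, in order."""
    def __init__(self, workers):
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.lock = threading.Lock()
        self.queues = {}          # key -> deque of tasks waiting behind the running one

    def submit(self, key, fn):
        with self.lock:
            q = self.queues.get(key)
            if q is None:
                self.queues[key] = deque()
                self.pool.submit(self._run, key, fn)   # key idle: run now
            else:
                q.append(fn)                           # key busy: queue behind it

    def _run(self, key, fn):
        fn()
        with self.lock:
            q = self.queues[key]
            if q:
                self.pool.submit(self._run, key, q.popleft())
            else:
                del self.queues[key]                   # key idle again

events = {"A": [], "B": []}
ex = OrderedExecutor(4)
for i in range(3):
    ex.submit("A", lambda i=i: events["A"].append(i))
    ex.submit("B", lambda i=i: events["B"].append(i))
while True:                        # wait until every per-key queue has drained
    with ex.lock:
        if not ex.queues:
            break
    time.sleep(0.01)
ex.pool.shutdown(wait=True)
print(events["A"], events["B"])   # per-channel order preserved: [0, 1, 2] [0, 1, 2]
```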

~~~
willvarfar
It does have advantages for code organisation, but not for all-out
performance.

One of the complexities of this is ensuring that a selector doesn't return a
handle as being read/writable when another thread is still executing a
previous trigger.
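
One common way around that is oneshot-style arming: unregister the handle before handing it to a worker, and re-register it only when the handler finishes. A sketch with Python's `selectors` module (the thread join is there only to keep the demo deterministic; a real server would not join inline):

```python
import selectors
import socket
import threading

sel = selectors.DefaultSelector()
r, w = socket.socketpair()
r.setblocking(False)

results = []

def handle(conn):
    results.append(conn.recv(16))
    sel.register(conn, selectors.EVENT_READ)   # re-arm only once we're done

sel.register(r, selectors.EVENT_READ)
w.send(b"one")
for round_ in range(2):
    for key, _ in sel.select(timeout=1):
        sel.unregister(key.fileobj)            # "oneshot": the selector cannot
        t = threading.Thread(target=handle,    # hand this fd out again while
                             args=(key.fileobj,))  # a worker still holds it
        t.start()
        t.join()                               # for determinism in this demo only
    if round_ == 0:
        w.send(b"two")
print(results)  # [b'one', b'two']
```

With epoll this is what `EPOLLONESHOT` buys you without the unregister/register round trips.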

------
halayli
Check out the lthread_compute_begin() and lthread_compute_end() functions. They
allow you to block inside a coroutine without affecting other coroutines
(there's an example at the end of the page).

I prefer coroutines over IO loops because they result in simpler and cleaner
code. And with lthread_compute feature, you get the advantages of real threads
+ the lightness of coroutines.

<https://github.com/halayli/lthread>
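
For readers who don't want to dig into the C: the shape of the idea (hop off the coroutine scheduler for the blocking part, then resume where you were) is roughly what `run_in_executor` gives you in Python's asyncio. This is an analogy, not lthread's API:

```python
import asyncio
import time

def blocking_work(n):
    time.sleep(0.05)          # stands in for a CPU-bound or blocking call
    return n * n

async def handler(loop, n):
    # Analogue of lthread_compute_begin()/end(): jump to a worker
    # thread for the blocking section, then resume on the event loop.
    return await loop.run_in_executor(None, blocking_work, n)

async def main():
    loop = asyncio.get_running_loop()
    start = time.monotonic()
    # Three handlers run concurrently even though each blocks for 50 ms.
    results = await asyncio.gather(*(handler(loop, n) for n in range(3)))
    return results, time.monotonic() - start

results, elapsed = asyncio.run(main())
print(results, round(elapsed, 2))  # well under the 0.15 s a serial run would take
```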

------
rektide
_So the server has multiple threads. If a handler blocks in one thread,
another thread can pick up incoming requests. So far, so good. In return for
needing to carefully synchronise access to shared state, we get to efficiently
share that state (even if its just a hot cache of secure session cookies -
things you don’t want to be validating every incoming request etc) between
many threads and multiplex incoming requests between them._

Sharing state is bad, m'kay? Allow Node to do its thing ( _So the server has
multiple threads. If a handler blocks in one thread, another thread can pick
up incoming requests._ ).

It's aggrieving that this model requires any given handler to be able to
service any given request, tbh. Shared state is a folly. A serializing-token
scheme might work well: if a request fails to find the data local to its
core, it passes a serializing token for the request around the ring of
handlers, asking each either (a) for the required data, or (b) to take the
token and run with the data.

Serializing tokens are a concept Matt Dillon spoke of often at the inception
of DragonFly BSD. They are much like locks, except that ownership is never
relinquished (someone always holds the token); instead it is phase-changed,
yielded to another. It's responsive rather than stateful ownership.

Sadly, that token-ownership negotiation requires some kind of interruption of
the currently occupied worker thread: if that thread could be interrupted to
do other things, this serializing-token negotiation might be an acceptable
bargain (ending with (a) no, I'm busy using that set; (b) sorry, I had the
data and was free, so I completed it; or (c) here's the data, I'm busy and
not using it). But it does still require thread interruption. If the worker
thread can yield frequently and resume, finding the answer might be a small
enough, invisible enough calculation to plaster over there being
interruptions at all; that's essentially the hope. The result would be the
mating of green threading with location-aware, latency-aware multiprocessing.
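
A toy sketch of the token-circulation idea as I understand it (hypothetical names, single-threaded for clarity; the real scheme would of course involve the cross-thread negotiation described above):

```python
class Handler:
    """One node in a ring of handlers; owns one shard of the data."""
    def __init__(self, name, data):
        self.name = name
        self.data = data     # key -> value owned by this handler
        self.next = None     # next handler in the ring

    def handle(self, key):
        # Pass a serializing token around the ring until some handler
        # either answers with the data or takes over the request.
        token = {"key": key, "origin": self.name}
        node = self
        while True:
            if token["key"] in node.data:
                return node.name, node.data[token["key"]]
            node = node.next           # yield the token onward
            if node is self:
                return None            # full circle: nobody owns the key

a = Handler("a", {"x": 1})
b = Handler("b", {"y": 2})
c = Handler("c", {"z": 3})
a.next, b.next, c.next = b, c, a

print(a.handle("y"))   # ('b', 2): the token travelled to the owner
print(a.handle("q"))   # None: the token went full circle
```

The key property is that exactly one party holds the token at every moment, so access to the data is serialized without a lock ever being released.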

~~~
willvarfar
Tokens are very interesting.

A key thing I've learned on my travels is how critical shared state is to
performance.

At its crudest, lots of web frameworks read a request, wait to receive it all,
dispatch it to a handler, gather all the output from the handler, buffered,
and then write it.

This is great from the code organization perspective.

Hellepoll is so very much faster because it streams the requests.

Without a green-threading approach, this necessarily makes the code slightly
uglier, and leaves much more for the coder to juggle.
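
The buffering-versus-streaming contrast in miniature (a toy sketch; `chunks` stands in for data arriving from the socket, and uppercasing stands in for the handler's work):

```python
def buffered_handler(chunks):
    """Wait for the whole request, then process it in one go."""
    body = b"".join(chunks)          # the entire request held in memory first
    return body.upper()

def streaming_handler(chunks):
    """Process each chunk as it arrives; output can begin before
    the request has even finished arriving."""
    for chunk in chunks:
        yield chunk.upper()          # no full-request buffer anywhere

chunks = [b"hel", b"lo ", b"world"]
print(buffered_handler(chunks))                 # b'HELLO WORLD'
print(b"".join(streaming_handler(chunks)))      # b'HELLO WORLD'
```

Same result, but the streaming version's memory footprint is one chunk, and its first byte of output goes out after the first chunk lands.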

Of course there are at least three layers in an event loop framework:

* the programmer making the event loop itself who has to juggle everything;

* the programmer making the various protocol handlers, who has to understand everything too but fits the framework conventions rather than creates them;

* finally the programmer who is using the framework who gets to use the utility and decoration and wrappings made by the protocol programmer and who hopefully doesn't have to understand in anything but the broadest terms the way the framework ticks.

I'm a big fan of Clojure and STM, but that's correctness over performance. I
would hope whoever makes some other platform puts as much effort into hiding
and protecting the inner shared state that is so critical to performance.

------
jconley
This is almost exactly what ASP.NET and IIS do in recent iterations.

[http://blogs.msdn.com/b/tmarq/archive/2007/07/21/asp-net-thr...](http://blogs.msdn.com/b/tmarq/archive/2007/07/21/asp-net-thread-usage-on-iis-7-0-and-6-0.aspx)

~~~
rektide
I like how they admit right out that they pass off requests to a ThreadPool
and that it's non-optimal for application performance.

 _Similar to IIS 6.0 (classic mode, a.k.a. ISAPI mode), the request is still
handed over to ASP.NET on an IIS I/O thread. And ASP.NET immediately posts the
request to the CLR Threadpool and returns pending. We found this thread switch
was still necessary to maintain optimal performance for static file requests._

Author goes on to say it was done because static file serving is blocking,
which ate away too heavily at a unified threadpool between IIS and ASP.NET.

I'd call out SEDA here again. Passing work items around between executors is
inadvisable. That said, the typical answer to "you need a responsive request
handler" is "offload the work item asap and yield," and done well there are
many, many right ways for that effort to end up looking like SEDA.

~~~
jconley
It's a pretty good trade-off with a heterogeneous application workload.

There is a "to the metal" mode in ASP.NET/IIS now, mentioned in that article,
that lets you use the IOCP thread directly in your applications and avoid that
context switch. But, of course, if you do blocking things there it will
completely block your web server.

------
superrad
Doesn't Erlang effectively do this, allowing its processes to execute only a
certain number of VM instructions before switching to another process?

~~~
willvarfar
Yes, it is one of the neat languages with the green threads that I alluded to
towards the end of the article.

Your description also fits CPython, though ;)
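
The reduction-counting idea can be sketched with generators standing in for processes (a toy, hypothetical scheduler; Erlang's real scheduler counts "reductions", roughly function calls, rather than loop steps):

```python
def run_round_robin(tasks, budget):
    """Toy reduction-counting scheduler: each task (a generator) may
    take at most `budget` steps before being preempted in favour of
    the next runnable task."""
    trace = []
    while tasks:
        task = tasks.pop(0)
        for _ in range(budget):
            try:
                trace.append(next(task))
            except StopIteration:
                break                # task finished: drop it
        else:
            tasks.append(task)       # budget exhausted: requeue at the back
    return trace

def proc(name, steps):
    for i in range(steps):
        yield f"{name}{i}"

trace = run_round_robin([proc("a", 5), proc("b", 5)], budget=2)
print(trace)  # ['a0', 'a1', 'b0', 'b1', 'a2', 'a3', 'b2', 'b3', 'a4', 'b4']
```

No process can hog the scheduler for more than `budget` steps, which is exactly what defuses a CPU-intensive task in the loop.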

