

What it means to be non-blocking in node.js - Raynos
http://raynos.org/blog/13/What-it-means-to-be-non-blocking-in-node.

======
ender7
I don't understand why there seems to be so much confusion as to how node
works. Here's my mental model:

1\. The basic part of the system is a single thread running an event loop. It
has a queue of waiting events. Every time the event loop starts, it pops off
all of the waiting events and then gives your code a chance to process them,
one after the other. Once your code is done processing this batch of events, a
new event loop starts. Any events that occurred while your code was running
will now be processed.[1]

2\. Some kinds of work take place outside the event loop. I/O tasks such as
sockets and file reading/writing are good examples. In Node.js, I/O calls
automatically spawn a new thread, which goes off and does whatever task you
wanted it to do. When it finishes, it pushes an event onto the event loop's
event queue, which will be handled by the event loop the next time it starts a
new loop. Other things, such as TCP servers, run perpetually in their own
thread, but will push new events onto the event queue whenever something
interesting happens (receiving new data, etc.).

3\. You, the user, cannot do work outside the event loop. You cannot spawn
threads. That is something that the framework does automatically, and only for
certain, built-in tasks like file I/O. Node might add support for arbitrary
asynchronous work later on, but for the moment this is not possible.

4\. Remember from #1 that a new event loop only starts when the previous event
loop has completed. Since an event loop is essentially "the code you have
written to process events that you care about", if any of that code takes a
very long time to complete, then there won't be another event loop for a long
time. Any new events that occur during that time (such as new HTTP requests)
will not be processed until a new event loop occurs.

[1] Exception: events that _your code_ emits are processed immediately.
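
A minimal sketch of points 1 and 4 and of footnote [1], using only Node's built-in `events` module and timers. The event name `ping` and the 50 ms busy loop are made up for illustration; a real handler would be doing actual CPU work instead of spinning.

```javascript
const { EventEmitter } = require('events');

const order = [];
const emitter = new EventEmitter();

// A timer callback is an event that has to wait for the loop to pick it up.
setTimeout(() => order.push('timer fired'), 0);

// A listener on our own emitter is called synchronously from emit().
emitter.on('ping', () => order.push('ping handled'));
emitter.emit('ping');

// Simulate CPU-heavy work that holds the single thread for about 50 ms.
const start = Date.now();
while (Date.now() - start < 50) { /* busy wait */ }
order.push('busy work done');

// The 0 ms timer has been due for a while, but it has not run yet.
console.log(order); // → [ 'ping handled', 'busy work done' ]
```

The timer callback only gets its turn once the synchronous code returns control to the event loop, which is the same delay a new HTTP request would see.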

~~~
Raynos
3\. child_process API and bolting C++ extensions onto node allow you to do
this.
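
A rough sketch of the child_process route, assuming the heavy work can be expressed as a separate script. The inline script passed with `-e` is a stand-in for a real worker file, and the loop inside it is arbitrary.

```javascript
const { spawn } = require('child_process');

// Hypothetical CPU-heavy job, run in a separate Node process.
const job = 'let sum = 0; for (let i = 0; i < 1e7; i++) sum += i; console.log(sum);';
const child = spawn(process.execPath, ['-e', job]);

// Collect the child's stdout as it arrives; each chunk is an ordinary event.
let output = '';
child.stdout.on('data', (chunk) => { output += chunk; });

// 'close' fires after the child has exited and its streams have ended.
child.on('close', (code) => {
  console.log('child exited with code', code, 'and printed', output.trim());
});

// This line runs right away: the parent's event loop is not blocked by the job.
console.log('parent is free to handle other events');
```

The result comes back as events on the parent's loop, so the parent pays only for process startup and for passing the data across.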

------
tedsuo
I think there is some confusion/conflation with "blocking IO" and "blocking
the event loop". If some requests are more computationally intensive than
others, you will see slower response times for the less computationally
intensive requests. This is similar to the classic "queueing" problem with
some load balancing algorithms, where a fast request can get queued up behind
a slow request on one worker process, only to miss out on another worker
process becoming available. Even if you load balance with node, each node is
processing multiple requests so you can still have this problem.

One thing to note is that it's not inefficient. You will be maximizing your
CPUs and will be processing work as efficiently as possible, but users will
experience some requests having a variable response time. This can happen any
time you have a pool of generalized worker processes btw, it's not just a
"node thing".

Solutions? If some requests do significantly more computation than others, you
can split the computation in half using process.nextTick(). This will let
smaller requests go faster in exchange for the slower request going slower
(remember, we're already at max efficiency, so it's not like everything is
going to get faster). This is better than sending the work out to a separate
process unless the job is so big that the serialization/IPC cost is
negligible. Otherwise you're adding extra work and making everything slower.

But really, having to manually chop up your algorithms is kind of a special
case. Most requests in your app probably take the same order of magnitude in
terms of computation time, or they are obvious outliers (pdf generation, video
encoding, etc) that you would want to move to a separate process.
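
A hedged sketch of the "chop up the computation" idea from this comment. The helper `sumInChunks` and its parameters are hypothetical. It uses `setImmediate` rather than `process.nextTick()`: in current Node releases `process.nextTick()` callbacks run before pending I/O is serviced, so `setImmediate` is the call that actually lets other requests in between slices. Treat that substitution as an assumption about the Node version in use.

```javascript
// Sums `values` one slice at a time, yielding to the event loop between slices.
// The first slice runs immediately; later slices are scheduled with setImmediate.
function sumInChunks(values, chunkSize, done) {
  let index = 0;
  let total = 0;

  function step() {
    const end = Math.min(index + chunkSize, values.length);
    for (; index < end; index++) {
      total += values[index];
    }
    if (index < values.length) {
      setImmediate(step); // let queued I/O callbacks and timers run first
    } else {
      done(total);
    }
  }

  step();
}

// Usage: a large input is processed in ten slices instead of one long block.
const values = Array.from({ length: 1000000 }, (_, i) => i % 10);
sumInChunks(values, 100000, (total) => console.log('total:', total));
```

Smaller slices mean shorter stalls for other requests but more scheduling overhead for the big one, which is the trade-off described above.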

~~~
Raynos
That's the whole point. Most requests have the same order of magnitude.

If you have two sets of requests that have different orders of magnitude then
put these two sets on their own node worker process behind your load balancer.

As long as you send your requests to the worker process that handles requests
of a similar order of magnitude you will never have the faster requests being
stuck behind slower requests problem.

------
karterk
Well written post. I guess a lot of people don't understand the contexts under
which node.js is useful. They treat it like a magic solution that will cure
all their scaling issues. Node.js is useful for I/O bound apps which are not
CPU intensive because then your event loop is nothing but a "manager" which
manages multiple incoming requests, and hooks them up to the I/O in an
asynchronous manner and when the I/O finishes, sends the result back. For such
use cases, node.js is very efficient and scales well (in the sense that with
just one node.js process, you can handle many requests).

~~~
ILoveNode
Disagree that this is a well-written post.

Agree people don't understand the contexts under which a node.js type system
is useful.

Personally, I think there are a million better options than server-side
JavaScript.

~~~
karterk
If it boils down to a matter of personal preference, that's true for almost
anything.

The author of the actual post that the OP is responding to did not quite get
what async behavior meant in node.js - i.e. it's less to do with using
process.nextTick() to buy you parallelism, and more to do with async I/O.

And, there are other server-side JavaScript engines than just node.js
(although it's only node which is getting all the attention), so I wouldn't be
making such a generic statement.

~~~
ILoveNode
Oh I'm completely comfortable making such a generic statement.

Every time you use JavaScript God kills a bunny.

~~~
Raynos
Please take your troll attacks of JavaScript elsewhere.

Thank you.

------
fedd
how do experienced people compare one (or 8, by the number of cores) threaded
node.js performance with java's asynchronous IO servlets?

<http://tomcat.apache.org/tomcat-7.0-doc/aio.html>

~~~
Raynos
Is Java faster than JavaScript? Yes.

Is it easier to write performant code with node.js vs Java Async IO?
Subjective.

~~~
grannyg00se
Java is faster than Javascript on V8? This is accepted fact? Can you provide a
reference?

~~~
Raynos
It's pretty well known that Java hovers around 1.1x the speed of C whereas JS
hovers around 2-3x the speed of C.

So yes Java is faster.

~~~
mappu
{ 'offtopic' : "

I totally agree with you, but i just want to draw your attention to the phrase
'the speed of C'. You almost make it seem like it's a global constant, like
the real C-for- _celeritas_ speed of light, some unobtainable blazingly fast
mirage accessible only to quantum physicists and unix greybeards.. but we
don't need a particle accelerator to beat the performance of C, just better
JITters. There's not really anything in the language stopping the performance
from being reached (okay, well for JS there's the type system.)

" }

~~~
Raynos
When I say the speed of C, I'm really comparing various language X compilers
to the GCC compiler under the assumption that the GCC compiler is the best.

No it's not a magical constant but I think for a baseline comparison "how
close your compiler X is to GCC" is a fair thing to compare.

I also doubt V8 or SpiderMonkey can get better than GCC _on average_.

------
skybrian
It would be more interesting to compare it to goroutines, which are much
cheaper than threads.

------
MostAwesomeDude
Or you could examine the Twisted answer: Use Ampoule and defer long-running
CPU-bound work to external processes which are bootstrapped into Python using
your already-existing worker code. Node _does_ have subprocess controls; I
just checked the documentation. If somebody hasn't built an Ampoule for Node,
then maybe somebody should get on that.

(Or just use Twisted and Ampoule. Seriously, guys.)

~~~
ender7
No offense to the Twisted guys, but after developing a project using Twisted,
you could not pay me to do so again.

Their documentation is a joke, they managed to completely gunk up their
architecture with a crudely bootstrapped attempt to solder interfaces onto
python, and their development community is not particularly...welcoming...to
newcomers.

Suffice it to say that when the only real way to get anything done with a
framework is to read the source and then ask questions in their IRC channel,
then something is very, very wrong.

~~~
MostAwesomeDude
Hi, I'm currently doing Twisted documentation, as part of a project to recover
divmod.org's contents. How could I improve Twisted's documentation for you?

I'm sorry you don't like zope.interface. Do you prefer ABCs? What would make
interfaces better for you? Did you prefer Twisted's builtin interfaces to
zope.interface?

How was the community unfriendly? What can we do to improve the community
experience?

(Anecdote: The only way I've been productive with Django is through reading
their source; their IRC channel and documentation are useless.)

~~~
ender7
1\. Change this page [1] to have a single, canonical tutorial. Just one link.
You can have links to other resources and third-party guides further down, but
there should be one awesome entry point into "how to get shit done with
Twisted".

2\. Make it short. As short as you can _possibly_ make it and still have it be
a good tutorial. If someone has to spend 3 hours reading docs before they can
even begin to start writing something, they're just going to go off and use
some other framework. For example, the "introduction" section consists of
three pages [2][3][4] that should all be cut. They're either fluff that no one
except the developers is interested in reading, or provide information that
is way too advanced for a newcomer to understand.

3\. Throw away the finger tutorial. Completely. It's about 5 times longer than
it should be. It also makes the old mistake of "here's how to do it, no turns
out that's the wrong way, here's the right way, no turns out THAT was the
wrong way, HERE's the right way". You need a tutorial that's short, and that
addresses the most common use-cases for your product. This will probably
involve a TCP server or an HTTP server.

4\. Make sure that after reading your tutorial _and nothing else_ , someone
could actually go build a small project, organized _in the correct style_.
That means they're going to have to know about twistd, services, factories,
protocols, reactors, zope interfaces, and more.

5\. If it seems like fulfilling #4 is going to be impossible, then there's a
problem with the design of the system, i.e. it's too complex, and needs to
hide more of its implementation details from the user.

6\. Pages like these [5][6] are going to scare most of your users away. They
seem to exemplify the worst of the Enterprise Java interface-for-a-factory-of-
factories code bloat that most web developers live in fear of. If this is
really the "right way" to develop Twisted apps, then that needs to be
explained sooner, and more concisely.

7\. Look at your competition! The documentation for Node.js is sparse (a
little too sparse) but it's not bad. Check out their synopsis page [7]
(useful!). The entire API docs are fairly well-commented, have lots of
examples, and (and this is important) never nest more than one level. The
front page is itself the minimum viable documentation. If they had a couple
extra pages on how to use modules correctly and how to run a node as a system
service, it would be doing pretty great.

8\. My issue with Twisted's use of interfaces is that it makes the API
documentation really hard to read. Every function or class I want to use
requires an ISomething that I've never heard of and don't know how to create.
Then it turns out that my service or factory or whatever actually _is_ an
ISomething, but I didn't know that because the thing that created it returns
an ISomethingElse BUT because I asked for it to be a TCP server it's actually
an instance of SomeClass, which also happens to implement ISomething. Which I
only find out by asking in the IRC channel.

9\. In my experience, asking questions in the IRC channel makes you feel
stupid. The people there tend to give off the feeling of "oh _god_ , this is
so _simple_ , why don't you know that to start a Foo you need to pass it an
IBar, which your particular service counts as because ..., DUH". Maybe the IRC
channel is not supposed to be for support, in which case...you need better
documentation :p

Good luck! Documentation writers are the unsung heroes of many a project,
forging success and popularity with little recognition. Twisted also has a lot
of cool features that Node does not; I would be happy to see them succeed.

[1] <http://twistedmatrix.com/trac/wiki/Documentation>

[2] <http://twistedmatrix.com/documents/current/core/howto/vision.html>

[3] <http://twistedmatrix.com/documents/current/core/howto/overview.html>

[4] <http://twistedmatrix.com/documents/current/core/howto/internet-overview.html>

[5] <http://twistedmatrix.com/documents/current/core/howto/tutorial/components.html>

[6] <http://twistedmatrix.com/documents/current/core/howto/tutorial/backends.html>

[7] <http://nodejs.org/docs/v0.5.10/api/synopsis.html>

~~~
glyph
Thanks for the thorough critique! Obviously doing all of this is a ton of
work, so it's not going to get done right away, but none of this is
inconsistent with our existing long-term plans for Twisted's documentation. I
hope that if you do get forced to work with Twisted again we will have made
the learning experience considerably smoother :-).

