
On problems with threads in Node.js - xylem
http://www.future-processing.pl/blog/on-problems-with-threads-in-node-js/
======
spion
Not sure if it was intentional, but the article is quite misleading.

The thread pool in node is only used for a limited number of APIs. Pretty much
all networking uses native async IO and is unaffected by the size of the
thread pool. Things like Oracle's driver are rare exceptions: the typical
MySQL/PostgreSQL/redis etc drivers all use native async IO and are unaffected
by this.

The author only mentions this in passing. As a result, the article leaves the
impression that the problem it describes is the norm, which is not the case.

~~~
suprememoocow
It's a completely unscientific method, but searching through one of our large
applications (`npm ls|wc -l` -> ~2000 dependencies), the only modules I can
find using `uv_queue_work` are:

* kerberos, unused (dependency of mongodb)

* protobuf, for serializing data

* snappy, for compression

kerberos isn't actually used in our app, so it doesn't matter, but we send a
lot of data through protobuf and snappy, so it may be worth us profiling this
a little more.

~~~
spion
You can also experiment with different values for the env variable
`UV_THREADPOOL_SIZE`; last time I checked this can even be set from the JS
code via `process.env['UV_THREADPOOL_SIZE']` if you make sure to do it before
you call something that uses the threadpool.

------
laggyluke
It's worth noting that DB drivers that actually integrate with libuv are a
minority - most DB drivers use Node-level network APIs and are unaffected by
such thread pool limits.

~~~
eknkc
Why would a DB driver need to integrate with libuv in the first place?

~~~
paraboul
In the case where the driver (for instance the official libmysql) doesn't
expose its network interface (e.g. its sockets can't be used with an external
event loop). In that case, it would block on IO operations, and there is no
choice but to push it to another thread.

That said, using a driver in another thread leads to various complexities that
can be avoided when running in the same application (javascript/v8)
thread/event loop.

------
cddotdotslash
I actually ran into an issue recently with CPU-intensive tasks blocking my web
server. It turns out that "querystring" (used to parse request bodies in web
applications) is a synchronous, blocking call. You'd never notice much
slowness until your request bodies are massive (think 50 nested JSON objects
and some base64 image data for good measure) and you have multiple per second.
Now every request is blocked until the previous one is processed. I'm still
trying to figure out a solution, after looking into worker threads, etc.

~~~
spion
I always use the following replacements written by petkaantonov, the author
of bluebird :)

[https://www.npmjs.com/package/querystringparser](https://www.npmjs.com/package/querystringparser)
for query string parsing. Depending on content, you may get massive
improvements (5x-20x)

[https://www.npmjs.com/package/fast-url-parser](https://www.npmjs.com/package/fast-url-parser)
for url parsing (the built-in url parser is the main reason why node is so far
behind on the TechEmpower benchmark - with this replacement the benchmark
shows about 60-80% improvement in served req/s)

[https://www.npmjs.com/package/cookieparser](https://www.npmjs.com/package/cookieparser)

For very large post bodies, I use JSON in conjunction with OboeJS -
[http://oboejs.com/](http://oboejs.com/) . It's not too much slower than
native JSON.parse (about 5-7 times), but it's non-blocking. I still haven't
found a solution that comes close to native JSON.parse in speed.
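
If you control the wire format, a stdlib-only sketch of the same idea is newline-delimited JSON parsed one record per event-loop turn, so timers and I/O keep running in between (the function name here is made up for illustration):

```javascript
// Parse an array of JSON lines without monopolizing the event loop:
// one JSON.parse per turn, yielding with setImmediate in between.
function parseLinesNonBlocking(lines, onRecord, onDone) {
  var i = 0;
  (function next() {
    if (i >= lines.length) return onDone(null);
    var record;
    try {
      record = JSON.parse(lines[i++]);
    } catch (err) {
      return onDone(err);
    }
    onRecord(record);
    setImmediate(next); // yield to the event loop between records
  })();
}

// Usage: three small records, summed as they arrive.
var body = '{"a":1}\n{"a":2}\n{"a":3}';
var sum = 0;
parseLinesNonBlocking(body.split('\n'), function (rec) {
  sum += rec.a;
}, function (err) {
  if (err) throw err;
  console.log('sum =', sum);
});
```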

~~~
kapv89
This comment in itself is probably more valuable than the OP

------
GordyMD
Thank you for sharing your findings here. Very pertinent to me right now -
could actually drastically minimise the amount of research I have to do today.
I love HN.

------
albertzeyer
Is there a detailed overview of which functions in libuv (Node.js) rely on
blocking primitives and thus use the thread pool to work asynchronously?

From [here](http://docs.libuv.org/en/latest/design.html), it sounds like all
file IO is always based on blocking primitives, and native async file IO
primitives are not used, although such async file IO primitives do exist and
were tried out in libtorrent
([http://blog.libtorrent.org/2012/10/asynchronous-disk-io/](http://blog.libtorrent.org/2012/10/asynchronous-disk-io/)).
The result of that experiment, however, was mostly that the thread pool
solution was simpler to code (I guess).

From the libuv design doc, the overview is:

* Filesystem operations

* DNS functions (getaddrinfo and getnameinfo)

* User specified code via uv_queue_work()

I wonder whether this is really the best solution, or if some combination of a
thread pool and native async disk IO primitives could perform better.

~~~
masklinn
> The result of that experiment however was mostly that the thread pool
> solution was simpler to code (I guess).

and uniformly asynchronous (native async operations may not cover e.g. file
copy or filesystem operations, furthermore filesystems may block during
submission of IO ops which makes the operations effectively synchronous) and
have higher throughput (they support read/write vectors).

~~~
albertzeyer
We could use the thread pool for blocking primitives and otherwise use the
native AIO primitives, couldn't we?

And the higher throughput seemed to be a problem only on MacOSX, so we could
fall back to the thread pool there, but use the async IO on Windows and Linux.

~~~
laggyluke
I thought that's basically what libuv does: it uses AIO on platforms that
support it, falling back to thread pool on platforms that don't.

Edit: apparently I was wrong.

------
mjpa
Is this another case of "here's the code I ran" when in fact they didn't?
There should be 3 lines of output, not 6!

Also, the code says it will print the time taken since the start of the
program, which again doesn't go with the output and the conclusion being made!

Anyway, how come the output isn't in order?

~~~
xylem
Oops, thanks for that - seems the results from the first run of the example
somehow got lost in the final version and I didn't notice.

The order of the output is dependent on when each call finished - they run in
parallel, so it's not guaranteed that functions will end in the order they
were invoked.

~~~
mjpa
Ah yes, so they do. My lack of sleep is showing!

For some reason I was thinking the readdir would run in series so output would
go up by ~1s each time.

------
z3t4
Use named functions to avoid closures!

    
    
      var fs = require('fs');
      var util = require('util');
      var start = process.hrtime();

      for (var i = 0; i < 3; ++i) {
        namedFunction(i);
      }

      function namedFunction(id) {
        fs.readdir('.', function () {
          var end = process.hrtime(start);
          console.log(util.format('readdir %d finished in %ds', id, end[0] + end[1] / 1e9));
        });
      }

------
vinceyuan
Does this mean the async file system functions are not really asynchronous in
Node.js? The whole Node.js server will be blocked if the number of file system
operations is bigger than the thread pool size. :-(

~~~
laggyluke
Not the whole server - most of the networking IO will work just fine; only the
operations that use the libuv thread pool will be queued.

~~~
vinceyuan
Got it. Queued != blocked. Thanks!

------
imaginenore
4 seems like a ridiculously low default size for the thread pool.

~~~
justincormack
It is only for file system operations, and mostly these do not actually block,
as the results are available in cache. So it is probably reasonable for casual
use.

Now if you want to get good performance on an SSD (i.e. the rated iops) you
will need a decent queue depth, like 32 or so, so it won't work - but that's a
specialist use case.

~~~
jrochkind1
SSD's are a specialist use case these days? Or getting good performance on
them is? What do you mean?

~~~
justincormack
Getting full performance from them is a specialist requirement. Most people
are not disk IO bound on SSD.

~~~
imaginenore
Isn't that pretty much every DB that doesn't fit into RAM?

------
ryankshaw
(noob question) why would the libuv threadpool choose to use a static 4
instead of something like matching the number of processor cores available by
default?

~~~
zwily
For things like filesystem access, you'd want more threads than CPUs because
that's not CPU heavy. It still seems like they could choose a saner default
though.

------
jaysoncena
Is there an easy way to identify the number of tasks queued in uv's work
queue?

------
babuskov
To sum it up: node.js runs only 4 background threads. Be aware of this.

No big deal really. I have multiple servers running node.js under load for 3+
years and never had an issue with this.

In fact, I found it helpful. If your database is under load and is already
running 4 heavy queries, not giving it any more jobs is actually a good thing.

~~~
pdpi
On the other hand, not being able to hit the filesystem because you have those
four queries running is... not so helpful.

In general, this looks like it could be a performance bottleneck for highly
concurrent applications.

~~~
simpleigh
I agree with this - the right place to restrict the number of concurrent
database queries is the database, not the whole IO layer!

------
exo762
Piece of badly engineered software conquering the world. As if more JavaScript
was really something worth pursuing.

~~~
lordbusiness
Do you realize that comments like this are what is breaking the community here
at HN?

Are you aware that people like you are destroying something that was once
brilliant?

HN is going through an Eternal September.

~~~
eklavya
Calm down man, nobody's breaking anything. Everyone is entitled to his/her
opinion. You can express your thinking on the matter by up/down voting. Relax
:)

~~~
lordbusiness
Heh - maybe you're right, and I wish the world were as you describe. :-)

There is however a deeper underlying issue; decorum is important and
communities that exhibit genuine 'niceness' are nice. Communities that allow,
or worse, overlook dark behaviour degenerate.

Flagging and down voting is one part of the solution, but when the nastiness
reaches a level that the nice people start to disengage and go elsewhere, it's
clear to me that we need another element of control. Perhaps algorithmically
detecting repeat offenders? Perhaps more granularity with down votes?

There are differences between a down vote because one disagrees with the
author, and a down vote because one believes the author is ill-informed and
spreading misinformation, and a down vote because the author is being
downright juvenile.

A number of hits on the third case against a given author on multiple comments
could conceivably constitute an automatic warning and / or banning system.

I don't want people to be unable to express their views, but when the mean-
spirited people who contribute nothing but nonsense start to represent a large
percentage of a community, it's reasonable to see if anything can be done.

~~~
jasonlotito
> There are differences between a down vote because one disagrees with the
> author, and a down vote because one believes the author is ill-informed and
> spreading misinformation, and a down vote because the author is being
> downright juvenile.

The difference is that the first two should not be voted down. If you vote
down, you should not comment. If you comment, it means at the very least the
comment added to the conversation, unless your comment is also not worth
posting and you should be voted down as well.

It's fairly simple: does the comment bring value to the conversation? If it
does so directly, vote up. If only indirectly, then don't. If it does not,
vote down.

Whether you disagree or not is irrelevant. And someone being ill-informed
should be corrected. At the very least, by writing an incorrect comment, they
are presenting an opportunity to be corrected.

> I don't want people to be unable to express their views, but when the mean-
> spirited people who contribute nothing but nonsense start to represent a
> large percentage of a community, it's reasonable to see if anything can be
> done.

Things can already be done. Vote down and don't reply. That is the best way.
Vote down and ignore.

