

Multi-core HTTP Server with Node.js - sh1mmer
http://developer.yahoo.net/blog/archives/2010/07/multicore_http_server_with_nodejs.html

======
axod
>> "While single-process performance is quite good, eventually one CPU is not
going to be enough;"

Every article on this sort of thing seems to just gloss over this part. Why
isn't one CPU enough? What is using it? Serving static files certainly won't.
Doing simple things won't...

Does anyone have any use cases / experience for when this was the case? :/

edit: Fine downmodding fanboys. I get it. Use whatever you like. Meh

~~~
sh1mmer
Since Node.js is still fairly new technology, people are starting out with
'hello world' examples such as static file servers. Obviously, specialised
servers like Traffic Server or nginx handle these cases faster.

That said, Node is a programming environment, so the question is: on a
multi-core machine (which all data-center machines are), how can we scale to
use all the cores so we can do much harder stuff?

What about a node system to deal with 100k concurrent long-poll connections?
When some of those are active they could be really active, requiring all the
cores, etc. There are lots of scenarios in which more compute power is useful.

~~~
axod
I agree there are cases where more CPU power is useful, but I'm just not sure
it's a good idea to, firstly, assume you need it before it's an issue, and
secondly, to split the whole thing (networking IO) over multiple cores rather
than just shelling out the CPU-heavy stuff to multiple cores.

Networking IO isn't CPU-heavy. There's no reason to increase complexity and
slow throughput in the hope that more CPUs will help...

~~~
silentbicycle
Part of node.js's appeal comes from writing all the server code in
Javascript, even when it would be more efficient to break pieces out into
separate programs. In that case, worrying about CPU usage for the server
itself makes some sense.

Not saying I agree with the design choices (I'm more of a multiple language /
"hard and soft layers" person, and I don't care for Javascript), but I think
that's the reason.

------
sedachv
"However, rather than accepting connections using this socket, it is passed
off to some number of child processes using net.Stream.write() (under the
covers this uses sendmsg(2) and FDs are delivered using recvmsg(2)). Each of
these processes in turn inserts the received file descriptor into its event
loop and accepts incoming connections as they become available. The OS kernel
itself is responsible for load balancing connections across processes."

Racing (i.e. thread-safe) accept() is a really good way to improve server
throughput. Epoll is also awesome for being thread-safe.

------
postfuturist
From the day node.js was released you could run multiple instances on
different ports and stick a load balancer in front of it. Even now, I think
that is a healthier option than baking the number of processes into the script
itself.
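The setup postfuturist describes, several independent Node processes behind a reverse proxy, could look roughly like this in nginx; the ports and upstream name are illustrative, not from the article:

```nginx
# One node.js instance per core, each bound to its own port.
upstream node_backends {
    server 127.0.0.1:8001;
    server 127.0.0.1:8002;
    server 127.0.0.1:8003;
    server 127.0.0.1:8004;
}

server {
    listen 80;
    location / {
        proxy_pass http://node_backends;
    }
}
```

By default nginx round-robins requests across the upstream servers, so no process count is baked into the Node script itself.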

~~~
sh1mmer
Doing it in Node allows you to balance using application logic rather than
just raw traffic.

This entirely depends on the use cases you have.

------
mathias_10gen
Why are they using multiple processes rather than multiple worker threads? IPC
is much costlier than using shared memory, even if it's just passing the
initial state.

~~~
Goosey
Multiprocess is going to be more robust against failures, for one. While it is
a bit of an apples-to-oranges comparison, I thought Chrome had shown pretty
conclusively the benefits of adopting a multi-process model over a
multi-threaded one.

~~~
mathias_10gen
I don't think there is a browser that uses a full thread-per-tab model, so
there's really nothing to compare it against. The problem with other browsers
is that slow JS, or in some cases Flash, in one tab will slow down other tabs.

Also, the tabs in a browser running many sites are much closer to the
traditional use of processes. Web application servers running the same
codebase on each request seem like a better fit for threads. For one thing,
the security model for the two uses of JS is very different.

As for robustness, if doing X will cause a crash and your code does X, then
you will just have a bunch of crashing processes rather than just one. How is
that better? Wouldn't the real solution be to either stop doing X or fix X so
that it doesn't cause a crash?

~~~
Goosey
<i>As for robustness, if doing X will cause a crash and your code does X, then
you will just have a bunch of crashing processes rather than just one. How is
that better? Wouldn't the real solution be to either stop doing X or fix X so
that it doesn't cause a crash?</i>

Sure, but bug-free code doesn't exist, and not all crashes happen 100% of the
time. If you have a difficult-to-track-down crash bug that happens for some
mysterious reason once every 100,000 requests, would you rather have that
result in the entire server blowing up, or in one request-session blowing up?

------
hackermom
Isn't this tremendously inefficient compared to just running Apache, nginx or
whatever floats your boat? Both of them thread perfectly across SMP systems.
While it's an interesting implementation, I completely fail to see the point
of even using it. Does anyone have any sane usage scenarios they could share?

~~~
mmaunder
Node.js lets you write server applications, in a server container that can
handle tens of thousands of concurrent connections, in a loosely typed
language like Javascript, which lets you code faster. It uses the same
event-driven design as nginx, which is why it can handle so many connections
without a huge amount of memory or CPU usage.

If you were to do this on Nginx you'd have to write the module in C.

You can't do it on Apache because of Apache's multi-process/thread model.

The fact that you can write a web server, in a few lines of
easy-to-understand and maintainable Javascript, that can handle over 10,000
concurrent connections without breaking a sweat is a breakthrough.

Node.js may do for server applications what Perl did for the Web in the '90s.

~~~
alecco

      > If you were to do this on nginx you'd have to write the
      > module in C.
    

You can write a module to glue nginx and V8; many people have done it. It
takes less than 400 lines of code, and a lot of it is typical nginx module
code. (The problem is more the lack of nginx online documentation, perhaps.)

    
    
      > The fact that you can write a web server in a few lines of easy to
      > understand and maintain Javascript that can handle over 10,000
      > concurrent connections without breaking a sweat is a
      > breakthrough.
    

Yes. But the big performance issue is still hitting the database and disks.
There's no point in having a super-fast web server if the DB is dog slow,
like the vast majority of databases out there, including the NoSQL bunch.
They are not fixing the issue of latency vs. scalability vs. reliability; for
that they need to address many uncomfortable problems of current hardware
architectures. This is the elephant in the room.

~~~
IsaacSchlueter
Well, at least it's an elephant we're all talking about a lot. It's not as if
anyone's ignoring that.

If your DB is dog slow, you can still handle wildly high concurrency with
Node. It's just that the user will feel the slowness, and your DB will
struggle. But at least your server won't be stuck, unable to serve other
requests while it's waiting for IO.

Of _course_ it's still on the developer to architect their system for success.
Node just takes one unnecessary bottleneck out of the equation.
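The point about not blocking on IO can be sketched in a few lines; the `slowDbQuery` helper and its delay are made up to stand in for a slow database call:

```javascript
// A "slow DB" call in Node is just a pending callback; while it is
// outstanding, the event loop keeps servicing other work. The 200 ms
// setTimeout stands in for a slow query (illustrative numbers).
function slowDbQuery(cb) {
  setTimeout(() => cb(null, { rows: 42 }), 200);
}

const order = [];
slowDbQuery((err, result) => {
  order.push(`db done: ${result.rows} rows`);
  console.log(order.join(' | '));
});

// These run immediately, before the DB "responds" -- the server is
// free to handle other requests while the query is in flight.
order.push('served request A');
order.push('served request B');
```

When the query completes it prints `served request A | served request B | db done: 42 rows`: the other requests went out first.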

------
c00p3r
Why not nginx? ^_^

It is so inadequate to use things like the JVM or V8 to serve static content.

Btw, is there any cool web-server project for Flash, yet another artificial
blob (or, to be more correct, tumour) in an OS? ^_^

And of course there should be some dynamic web server written in PHP! (Yes,
you can run it standalone.)

