

Clojure, Node.js and Concurrency Fail - swannodette
http://dosync.posterous.com/clojure-nodejs-and-why-messaging-can-be-lame

======
lonestar
In what way is pushing 20K requests/second "concurrency fail"?

Node.js doesn't make it easy to share state between processes, but once you're
scaling to multiple processes, the jump to multiple machines probably isn't
far behind. You'll need to design a distributed algorithm, or just centralize
your shared state in something like Redis anyway.
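
[Editor's sketch of the trade-off lonestar describes, with plain objects standing in for worker processes and a Redis-style central store; all names here are illustrative, not from the benchmark.]

```javascript
// Two "processes" counting requests. With per-process state, each
// sees only its own share of the traffic; a shared store (standing
// in for something like Redis) keeps one consistent total.

function makeWorker(store) {
  return {
    local: 0,
    handle() {
      this.local += 1;    // per-process view: diverges across workers
      store.total += 1;   // centralized view (e.g. a Redis INCR)
    },
  };
}

const store = { total: 0 }; // stand-in for a central Redis counter
const a = makeWorker(store);
const b = makeWorker(store);

// requests load-balanced across the two workers
for (let i = 0; i < 100; i++) (i % 2 ? a : b).handle();

console.log(a.local, b.local, store.total); // 50 50 100
```

Each worker's local count covers only half the traffic; only the centralized counter reflects global state, which is the point of pushing shared state out of the processes.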

~~~
dedward
"The processes end up with incorrect states pretty much immediately."

Concurrency failure.

~~~
pjscott
If you consider the multi-server case, this is a consensus problem. To solve
it, you need to either use an algorithm like Paxos, or use some central
synchronization point like a Redis or memcached server. Yes, node.js fails
compared to Clojure when it comes to taking advantage of multiple cores on
this example. Clojure's concurrency primitives are really slick. But that's
not the whole story.

~~~
jules
It would be interesting to see Clojure's transaction support extended across
multiple machines.

------
axod
Seems like a rare edge case. Also a bad design, IMHO.

Expanding your network IO handling code to use multiple cores sounds like fun,
but it's the wrong way to do things.

Network handling code is not CPU heavy. Have a single process handling the IO,
and for CPU heavy tasks hand off to other threads, handled with callbacks.
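
[Editor's note: the pattern described above, a single IO loop dispatching CPU-heavy jobs and getting results back via callback, can be sketched as follows. `fakePool` is a hypothetical stand-in for a real worker pool such as Node's worker threads or child processes.]

```javascript
// The IO loop does no heavy lifting itself: it hands the job to a
// pool and continues, replying when the callback fires.

function fakePool(job, done) {
  // In a real server this would run on another thread or process;
  // setImmediate just defers it off the current call stack here.
  setImmediate(() => done(job()));
}

function handleRequest(n, reply) {
  fakePool(() => {
    let sum = 0;                        // pretend CPU-heavy task
    for (let i = 1; i <= n; i++) sum += i;
    return sum;
  }, reply);
}

handleRequest(100, (result) => console.log(result)); // prints 5050
```

The IO process stays free to accept new connections while the pool grinds through the expensive work.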

I also wish we'd stop benchmarking these sorts of things in NNk
requests/second. As if most use cases will get anywhere near that in
production.

~~~
swannodette
Seems like a generalization. People are interested in evented web programming
for applications/services that are less traditional in design - where
memcached doesn't buy you anything and NNk requests/second do matter. If
you're putting your evented application on an 8-core box, you most certainly
want to leverage a good number of those cores.

~~~
axod
>> "If you're putting your evented application on an 8-core box, you most
certainly want to leverage a good number of those cores."

Unless you're CPU bound, there is absolutely no point.

------
davidw
Some enterprising hacker want to add an Erlang version and compare it with the
Clojure version?

------
andrewvc
This is fascinating, but for a lot of us building web apps we have to scale
beyond a single box, which means shared state needs to be in something like
memcached.

I love Clojure, but its concurrency constructs aren't that relevant to most
web programming. Which is fine, but let's keep in mind that developing a web
app with Clojure is painful now due to a lack of libraries / a large enough
community. I'm eagerly awaiting the day that changes.

~~~
cageface
I'm still trying to figure out exactly where the Clojure concurrency model is
useful. How many problem domains require extremely high throughput in a single
shared memory space? It seems to me that you're usually either well within the
bounds of what a single cpu can handle or you're going to need multiple
machines and can just keep throwing procs at the problem.

~~~
pjscott
It's great for doing things like k-means clustering or Delaunay triangulation.
Those might not come up a lot for you, but _some_ people need to do them, and
would love to be able to speed that up by throwing several processor cores at
the problem. Solving them on a cluster is a lot harder, and often unnecessary.

(On k-means clustering, in particular, Clojure-style transactional memory
gives an almost linear speedup with the number of processor cores, without
significantly changing the code. That's worth something.)

~~~
kanak
Could you please provide a link to the implementation? I'm really interested
in learning how this is done. Thanks.

~~~
pjscott
Sorry, I can't point to any code, but in both cases the approach is the same:
you have some shared data structures, like an array of regions containing
points, or a red-black tree, or a priority queue, and several threads all read
and write these data structures. Since transactional memory lets you run
concurrently unless there are actual memory conflicts, this often gives better
concurrency than locks.

Of course, for examples like these, the tricky part is designing data
structures that don't have many inherent memory conflicts. Red-black trees,
for example, are tricky because the rebalancing transformations tend to step
on other threads' toes.
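
[Editor's sketch: JavaScript has no STM, but the optimistic read-validate-commit cycle pjscott describes can be illustrated with a version-stamped ref. This is an analogy only; real STM coordinates multiple refs across real threads and handles rollback.]

```javascript
// Optimistic concurrency in the STM style: read a snapshot, compute
// off to the side, commit only if no conflicting write happened in
// the meantime; otherwise retry the whole transaction.

function makeRef(value) {
  return { value, version: 0 };
}

function transact(ref, fn) {
  for (;;) {
    const seenVersion = ref.version;
    const next = fn(ref.value);         // compute against a snapshot
    if (ref.version === seenVersion) {
      ref.value = next;                 // no conflict: commit
      ref.version += 1;
      return next;
    }
    // conflict: another transaction committed first, so retry
    // (never reached in this single-threaded sketch)
  }
}

const counter = makeRef(0);
transact(counter, (n) => n + 1);
transact(counter, (n) => n + 1);
console.log(counter.value); // 2
```

When conflicts are rare, transactions proceed without ever blocking each other, which is why data structures with few inherent memory conflicts matter so much.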

------
jshen
How much memory were each using?

~~~
swannodette
Amazon Compute Clusters have 23 GB of RAM. If top is to be believed, Node.js
by the time it's done is around 1.6% memory and well above 420% CPU. Aleph is
at 9% memory and steady around 380% CPU the whole time. No surprises, really.

------
chewbranca
So wait, in his first post, "aleph (~8.5K req/s) edges out Node.js v.0.1.100
(~7.0K req/s)" which was with one node.js process on one core. Node.js scales
fairly linearly (from what I've seen) as you add additional processors and
processes, so I find it very hard to believe that running an additional
node.js process would still be slower than aleph. I say testing fail.

~~~
felixge
I think Amazon EC2 cannot be trusted for these benchmarks.

On the _physical_ hardware I tested node scales very linearly with the amount
of cores added.

On EC2 I have seen very strange results / bad performance. There is something
seriously odd here, and I suspect it has to do with Amazon's virtualization.

On the box the author tested with, I would expect node (or nginx) to easily
serve 50k req / sec.

I'll have to do some more research to figure this out.

------
c00p3r
Please, try to use some external database/key-value storage lookup instead of
printing "hello world" and show us memory and CPU usage. ^_^ Especially for
Clojure.

------
wmoss
This is the best Hacker News story in years. Finally a post about both Clojure
and Node.js.

