
C10K 2012: Erlang Wins, Go a close second. Java, Haskell, Node & Python fail - nirvana
https://github.com/ericmoritz/wsdemo/blob/results-v1/results.md
======
mrj
This doesn't seem to investigate tuning at all.

I see the Tornado script forks processes, for example, but it doesn't
experiment with different process counts (or PyPy?). The Java code appears to
be run with default VM settings, which pretty much always deserve some tweaks
to run in a server environment.
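
For instance (just an illustrative starting point; the class name and heap
sizes here are made up, not what the benchmark uses), a typical server
invocation pins the heap and picks a GC deliberately:

    # Fixed heap avoids resize pauses; CMS trades throughput for shorter
    # GC pauses, which matters with thousands of open sockets.
    java -server -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC EchoServer

None of that is exotic, and it can change results dramatically.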

Those are just the two I'm most familiar with. The rest seem to have similar
faults. Plus, it would be a more interesting test if the server had to perform
some kind of work. Simply echoing the request is not a typical usage and could
really bias the results in favor of setups that would fail on a real project.

A better title might be, "Erlang wins unrealistic test over other VMs in their
default configuration."

~~~
huggyface
_This doesn't seem to investigate tuning at all._

This is always the response to any benchmark where one's pet technologies
don't win. If you have a magic quadrant of tuning, put it forth. If not, you
have said nothing that counters the results.

~~~
SkyMarshal
He doesn't have to, it's enough just to point out gaping flaws in the 'study'
to call it into question. The onus is on the study's author to make it
comprehensive and thorough, not the critics.

This one in particular is clearly incomplete, especially when things like
Haskell, Java, and Python are represented by a single framework, and not even
the best or most optimized ones. For example, why not use Yesod/Warp,
BlueEyes, and Tornado on PyPy, respectively, for those instead?

Also, the submission title is editorializing flame bait. The study's author
didn't make any claims whatsoever about 'Haskell', 'Java', and 'Python', etc.
but rather about Snap, Webbit, and ws4py.

~~~
davidw
> The onus is on the study's author to make it comprehensive and thorough, not
> the critics.

It's always possible for the critics to go out there and do something better,
rather than limiting themselves to pointing out flaws.

~~~
calibraxis
The problem is that it takes effort to make good comparative benchmarks — for
a platform you're not totally immersed in, you'd ideally speak with the
community about your methods. Furthermore, good benchmarks require real
investigation explaining the differences; otherwise you could be measuring
something silly. And you should graph some of the stats over time.

Maybe in an ideal world, quality benchmarks would be commonplace. But in this
world, doing good benchmarks in response to everyone's bad benchmarks is a
heavy burden.

------
KirinDave
For Haskell, there is no doubt it could do better. The culprit is probably the
relatively new Websocket server. For evidence, I cite
<http://www.yesodweb.com/blog/2011/03/preliminary-warp-cross-language-benchmarks>.
But "Not bad for 3 lines of code" is a pretty fair sentiment.

Erlang and Go doing great is no surprise. Java didn't do too well, but the
implementation didn't really use the most performant tools available.

The real loser here is Node.js. Nearly as long as the Java example, but the
worst performer in the real metrics. So much for "making concurrency easy."

------
eps
Is this a joke? Where is a simple epoll-based C server for a baseline
comparison? :)

~~~
willvarfar
yeah, I'd love to see him enter a hellepoll-based one

------
jrockway
My conclusion is that he managed to write the fastest code in the language he
works with most.

------
kogir
Benchmarking anything on EC2 instances will yield statistically suspect
results. You know nothing about the other workloads on the box, the underlying
hardware, or the network connectivity.

~~~
gchpaco
That doesn't mean it can't be statistically valid; the noise just means you
need to run more tests.

~~~
klodolph
That assumes the noise is uncorrelated. It's not hard to imagine scenarios
where it's correlated.

------
invisible
This test is lacking some due diligence that I feel is skewing the results
significantly (and, I assume, there are more issues that I haven't even
noticed in Python/Java). (While m1.medium only has one CPU, the network IO
cost still exists.)

Erlang is automatically threaded by nature, so it has some inherent scaling
built in (the code benefits from threading as long as it runs correctly).

The Go code is set up to spin up goroutines inside of ListenAndServe, thus
gaining the benefit of splitting up IO.

The Haskell code has a thread specifically for garbage collection, thus
utilizing the second core.

The Node.js code could be using cluster (or threads/fibers more directly) but
isn't for some reason. It also seems this was using the websocket npm package
(some unknown code running on the stack!). For a valid test the websocket code
should be written in JavaScript directly in the test itself.

Edit: Researched this, and Go does use ListenAndServe, which creates
goroutines on the fly, but the m1.medium is bound to 1 CPU (it still benefits
from concurrency due to IO).
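
To make that concrete, here is a minimal sketch of the goroutine-per-
connection pattern (a bare TCP echo loop, not the benchmark's actual
websocket code):

    package main

    import (
        "log"
        "net"
    )

    // handle echoes everything it reads back to the client, then
    // returns (closing the connection) on the first error or EOF.
    func handle(c net.Conn) {
        defer c.Close()
        buf := make([]byte, 4096)
        for {
            n, err := c.Read(buf)
            if err != nil {
                return
            }
            if _, err := c.Write(buf[:n]); err != nil {
                return
            }
        }
    }

    func main() {
        ln, err := net.Listen("tcp", ":8080")
        if err != nil {
            log.Fatal(err)
        }
        for {
            conn, err := ln.Accept()
            if err != nil {
                continue
            }
            // One goroutine per connection: idle or slow clients don't
            // block the accept loop or each other, even on a single CPU,
            // because a blocked read just yields to other goroutines.
            go handle(conn)
        }
    }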

~~~
stock_toaster

      > The Go code is set to spin up threads to accommodate the number of CPUs available on the system (2 logical cores on m1.medium).
    

m1.medium is a single virtual CPU (it is a VM, so no 'cores' to speak of).
c1.medium has 2.

~~~
invisible
Corrected in my post to reflect that - it was kind of difficult to find a real
answer when I was looking that up.

------
dchest
Discussed 3 days ago <http://news.ycombinator.com/item?id=4105317>

------
simfoo
Try again and increase GOMAXPROCS to 10-20. See for example here:
<https://groups.google.com/d/msg/golang-nuts/fjBft6qeMo0/kYSiFTX2j1sJ>
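
The change is tiny (a sketch; the binary name is invented and the right value
depends on the instance):

    package main

    import "runtime"

    func main() {
        // As of Go 1 the default is 1, so all goroutines multiplex onto
        // a single OS thread; raising this allows real parallelism on
        // multi-core machines.
        runtime.GOMAXPROCS(10)

        // ... start the websocket server as before ...
    }

or, without recompiling, via the environment:

    GOMAXPROCS=10 ./wsdemo-server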

------
gorset
I think he risks dropped connections because the listen queue can overflow.

He doesn't mention increasing somaxconn, which means it probably has the
default of 128 (I don't think increasing tcp_max_syn_backlog is needed since
he's using syncookies). When creating a new connection every 1ms, it only
takes 128ms for the backlog to fill up, which is not that improbable with GC
and JIT pauses. Java should be faster than Erlang most of the time, but the
pauses can kill you if you don't handle them gracefully.
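
Something like this is what I have in mind (illustrative value):

    # Raise the kernel's cap on the accept backlog (default 128):
    sysctl -w net.core.somaxconn=4096

Note the application also has to pass a larger backlog to its own listen()
call; the kernel takes the minimum of the two.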

If the benchmark had been about "normal" HTTP requests, I would suggest
putting HAProxy in front with a reasonable maxconn - then HAProxy will hold on
to the connections until the application starts accepting again.
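
Roughly (all names, ports, and limits invented for illustration):

    frontend ws
        bind :8000
        maxconn 10000       # queue excess connections here
        default_backend app

    backend app
        server app1 127.0.0.1:8080 maxconn 256   # feed the app only what it can handle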

------
lnanek2
Title doesn't agree with the article... the article says Java beats Go, orders
Java above it in the ranking chart, and the raw numbers say Java dropped fewer
connections...

~~~
tuxychandru
But it returned only half as many messages as Go.

------
rvirding
I am not competent to judge the tests Eric used. But the obvious reply to
those who complain about his test for some language is to fix it and submit an
improvement which you feel better represents that language. It's all on
GitHub, so fork it and send a pull request.

To say it bluntly, "put your money where your mouth is".

------
KirinDave
Really. Bad. Title.

------
DrJosiah
There are at least 2 better HTTP servers for Python, uWSGI and gevent. Tornado
is known to be slower: <http://nichol.as/benchmark-of-python-web-servers> .
More specifically, uWSGI has been shown to respond in under 25ms with 15k+
concurrent connections.

Throw uWSGI behind Nginx (with its new websocket support), tune it a bit, and
I wouldn't be surprised to see it "pass" and perhaps even be competitive.
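
The Nginx side would look something like this (host and port invented; the
proxy_set_header lines are what lets the websocket Upgrade handshake survive
the proxy):

    location /ws {
        proxy_pass http://127.0.0.1:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }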

------
adamtulinius
The numbers are confusing and don't seem to add up at all. The section "stat
definitions" describes data not available in the table below.

~~~
KirinDave
Part of the problem is that EC2 is such a whacky environment. Notice Erlang's
median connection time was tiny, but its average was dragged up by huge
outliers.

Read the timing numbers with a reasonable portion of salt. It's EC2 we're
talking about here.

~~~
j2labs
I totally agree, but I'm not sure of any other places that offer such
flexibility in pricing and hardware.

If you were to conduct a test like this, where would you go for more reliable
performance from hardware?

~~~
KirinDave
I think EC2 is great. It's just whacky.

------
bcx
I agree, this is really a test of various websocket implementations (which is
still cool).

------
keymone
people get so butthurt when they don't see "<stuff i chose to work with> wins
this benchmark"

any framework, any language, hell any hardware is either generalized to deal
with many problems or specifically designed to battle one.

you sure can write a fast C program that does exactly this benchmark well -
what will that prove? nothing at all..

why is it so hard to accept the fact that some tools can provide good enough
results without too much tweaking, while other tools may provide better
results with more time spent achieving them?

------
mokus
Could anyone explain to me why some of the rows don't add up to 10k attempted
connections?

For example, looking at the raw data for Haskell and Java, and adding up
connections, disconnects, crashes, and timeouts doesn't give anywhere near
10k. What happened to the rest of the connections? Were they simply not
attempted? Was the port closed? Or am I just missing a relevant field in the
output? It's not clear to me from the description.

------
Weltschmerz
Any reason you haven't tested the 'ws' node version I submitted? I think it
should work fine...

~~~
stock_toaster
This appears to be pointing to the old results. I don't think the author of
the benchmark has rerun them yet.

