
The Great Ruby Shootout (July 2010) - acangiano
http://programmingzen.com/2010/07/19/the-great-ruby-shootout-july-2010/
======
headius
I can comment on the JRuby results.

For the base perf numbers, I'm not surprised. We've known that we're roughly
on par with 1.9.2 for some time, and many of the benchmarks in question have
started to reach a point of irreducible complexity (e.g. you can only slice
and dice strings so fast). It's good to see JRuby remains at or near the front
of the pack as far as performance, especially considering we've made no major
performance-related changes in almost two years.

The Linux versus Windows numbers are a bit surprising to me. If I were to make
a guess, I'd guess that the JVMs used were not identical (perhaps like Isaac
Gouy mentions, one platform was 64-bit and the other wasn't) or some other
detail altered the performance characteristics of the test. But the
performance drop does seem to be in line with other implementations, so
perhaps Windows really does suck and there's not much we can do about it.

On the memory issue, I have a few recommendations.

JRuby by default allows the JVM to use up to a 512MB heap (the JVM's own
default is usually 32-64MB, which is rarely enough for nontrivial apps). The JVM
likes to use as much memory as you're willing to give it, to keep GC times low
(nearly free) and to give it lots of room to breathe. Almost all these
benchmarks could run in far less memory (maybe 1/5 as much or lower) if the
JVM were choked down to that level. So it's not surprising to me that the
memory sizes for these very object and CPU-intensive benchmarks start to
approach that 512MB limit; the JVM is just stretching its legs.

Expect to see a lot more performance work coming in JRuby 1.6. I've blogged
about it here: [http://blog.headius.com/2010/05/kicking-jruby-performance-
up...](http://blog.headius.com/2010/05/kicking-jruby-performance-up-
notch.html)

Also expect to see more work on picking a "winner" as far as lightweight
servers go. Something that works as seamlessly as Passenger could be the "last
mile" we need to get folks to make a move.

And watch our two Ruby Summer of Code projects: Ruboto, bringing JRuby to
Android; and C extension support.

We're working very hard to bring JRuby to everyone and everyone to JRuby. The
reasons not to use JRuby are rapidly disappearing.

~~~
acangiano
> The Linux versus Windows numbers are a bit surprising to me. If I were to
> make a guess, I'd guess that the JVMs used were not identical (perhaps like
> Isaac Gouy mentions, one platform was 64-bit and the other wasn't) or some
> other detail altered the performance characteristics of the test.

The same identical version was used: Java HotSpot(TM) 64-Bit Server VM
1.6.0_20.

> But the performance drop does seem to be in line with other implementations,
> so perhaps Windows really does suck and there's not much we can do about it.

The leading theory. ;-)

~~~
xpaulbettsx
I don't know that this theory totally holds water (Windows OS developer here,
announcing obvious bias) - looking at the results, the one place Windows
really gets nailed is the I/O test; I suspect there are some significant
optimizations that could be done there, as I/O on Windows OS in general
certainly isn't 4x slower than Linux.

~~~
headius
If there's something we or the JVM could do to improve these numbers, I would
love to talk to you about it. At this point, if the JVM developers haven't
found the magic sauce, we JRuby guys probably won't either...but I'd really
love for JRuby performance on Windows to match JRuby performance on Linux.

~~~
xpaulbettsx
I don't think that I can actively hack on the JVM, but I can tell you that
XPerf is a great way to determine where you're spending your time.
<http://www.microsoftpdc.com/2009/CL16> is a really good intro video on the
topic, it's a very powerful tool (though doesn't provide as much analysis as
Instruments for example)

------
rbranson
JRuby would be the future if the startup times weren't so horrendous. It makes
it very unpalatable for scripting. Nailgun is not robust enough. I have a
feeling this is going to be more of a JVM problem than something Nutter and
team are going to tackle. So is the future Rubinius and JRuby?

~~~
jon_dahl
For scripting, that's definitely a problem. But it's not a big deal for long-
running processes (e.g. a web application server). But Ruby definitely
benefits from both: a fast Ruby with ~zero startup time, and a fast Ruby on
the JVM.

3 years ago, Ruby implementation problems became the hot topic at Ruby
conferences. Ruby was a beautiful language with an ugly implementation. It's
great to see that the community has executed on this concern.

~~~
rbranson
While I agree that it's not important for long-running applications, there is
still a lot of scripting done in Ruby, and it's very frustrating to get a
5-to-10 second wait just to tell someone their arguments were invalid. Waiting
that long for tests is also such a drag on iteration.

I think once JRuby has stabilized the native gem support and has some sort of
Passenger-like deployment option (Glassfish and Warbler aren't quite there
yet), we'll see some big players built on MRI start to move their Web
applications to the platform.

~~~
rue
Use MRI for short-lived and JRuby for long-lived. Problem solved?

------
dminor
It will be interesting to see what sort of difference Java 7 makes for JRuby.

~~~
alnayyir
Java != JVM

~~~
jimbokun
But invokedynamic is part of Java 7.

<http://java.sun.com/developer/technicalArticles/DynTypeLang/>

~~~
headius
InvokeDynamic and other Java 7 features (method handles, NIO2) are definitely
going to improve JRuby's situation, but maybe not in the way you expect. Indy
and method handles will largely allow us to delete (or not load) code we
currently generate to get the same effect. Smaller runtime, possibly a smaller
distribution. NIO2 will give us more direct access to streams, process
handling, and so on, so we can delete hacks we've written to do all that
ourselves.

But perhaps the most interesting aspect of any new major Java release is that
they're usually 10-20% faster across the board, due to new and better
optimizations. As the saying goes, if your Java app isn't fast enough, upgrade
to a newer JVM.

I think invokedynamic combined with more runtime profile-driven optimizations
in JRuby could easily double JRuby's Ruby-execution performance (or better),
and in many cases reduce memory churn too. Lots of good things coming...it's
nice to have an army of VM engineers on your side.

------
Roboprog
Why no IronRuby test on Windows? Understand, I'm no fan of M$, but it seems
very odd to run the IronRuby test on Mono, then omit running IronRuby on the
"native" .NET environment.

~~~
acangiano
IronRuby was tested on Windows a few weeks ago:
[http://programmingzen.com/2010/06/28/the-great-ruby-
shootout...](http://programmingzen.com/2010/06/28/the-great-ruby-shootout-
windows-edition/)

------
leif
Means and (especially) medians mean little to nothing when you're doing them
over entirely different benchmarks. Why bother? A 3D plot would be a lot
clearer.

~~~
acangiano
> Means and (especially) medians mean little to nothing

It's the other way around. Given the skewed nature of the data, the median is
a better measure of the central tendency of the data set.

But note that I'm not plotting the median only. I use a box plot which gives
you a much better statistical picture than using a simple median (or mean).
Nevertheless, _summarizing_ different benchmarks has its limitations and box
plots give you a rough, general idea of what's going on.
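
As a rough illustration of what a box plot summarizes, here is a minimal Python sketch that computes the five values it draws. The run times are made up for the example and are not taken from the shootout, and `five_number_summary` is a hypothetical helper name.

```python
import statistics


def five_number_summary(times):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    q1, q2, q3 = statistics.quantiles(times, n=4, method="inclusive")
    return min(times), q1, q2, q3, max(times)


# Hypothetical run times in seconds, with one slow outlier (skewed data).
times = [1.0, 1.2, 1.5, 2.0, 9.0]

summary = five_number_summary(times)
mean = statistics.mean(times)

print("min, Q1, median, Q3, max:", summary)
print("mean:", mean)
```

With data skewed like this, the single slow run pulls the mean above the third quartile, while the median stays with the bulk of the runs.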

~~~
leif
I disagree. If you write ten benchmarks, and they finish in the same order on
all platforms, the median really doesn't give you any information apart from
how the middle one did. At least the mean will be changed if the slowest
benchmark for some reason is a lot slower on one platform. In this case, there
was some variation, but for the most part it looks like the median is just one
of 4 or 5 middling benchmarks in all cases, so you're really just removing
data, if anything.
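
A minimal Python sketch of that point, using made-up run times rather than the article's data: two platforms that differ only in their slowest benchmark have the same median but very different means.

```python
import statistics

# Hypothetical run times (seconds) for the same five benchmarks.
platform_a = [1.0, 2.0, 3.0, 4.0, 5.0]
platform_b = [1.0, 2.0, 3.0, 4.0, 50.0]  # only the slowest benchmark differs

median_a = statistics.median(platform_a)
median_b = statistics.median(platform_b)
mean_a = statistics.mean(platform_a)
mean_b = statistics.mean(platform_b)

print("medians:", median_a, median_b)
print("means:", mean_a, mean_b)
```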

The box plot is a good choice here. What I would like to see is a graph
plotting time against {VM} x {benchmark size} to see how each platform scales
within a benchmark.

With the data you've got, apart from confusing me a little by including the
mean and median, you did a good job presenting it. I'd love to see _why_ some
platforms do so much better than others in some cases (except JRuby which
everyone knows is a memory hog), but this seems way out of scope for this
article. :)

------
necubi
Too bad there's no MacRuby, which is probably the fastest ruby interpreter
(though it only runs on OS X).

~~~
acangiano
It's right there in the first paragraph of the article:
[http://programmingzen.com/2010/05/16/benchmarking-
macruby-0-...](http://programmingzen.com/2010/05/16/benchmarking-macruby-0-6/)

~~~
aarongough
And it looks like it's slower than either 1.8 or 1.9...

------
messel
The curve I dream of seeing: C++ slower than any good coverage flavor of Ruby.

~~~
headius
I'd like to see that too. In JRuby, we may be able to get Ruby-to-Ruby calls
to perform as well as Java calls, which would at least get that bottleneck out
of the way. The remaining performance issues, however, usually come down to the
rate at which objects can be allocated. To reduce allocation we may need a little
JVM help (escape analysis that works well enough to actually eliminate
allocations) and we may start to explore optional static typing, to allow
_really_ reducing numeric operations to raw primitive math.

At this point, we realize that sometimes you really do need native
performance, and we're not taking any options off the table to get there.

~~~
messel
Charles, is there any chance virtual machines and JIT compilers can get "smart
enough" to introduce native typing without requiring a hard decision by a
developer? I'd prefer not to have to worry about types ever, and let the
interpreter/compiler/optimizer slap them onto objects as needed to really
crank numeric throughput.

Maybe I could tune software with profiling tools after the fact. But it feels
like as soon as you start locking specific objects down it's a slippery slope.

I'm not very familiar with the depth you've gone to, but can a dynamic object
have numeric features or be capable of substituting numeric handling for a
limited time (virtual numerics) and then revert back to a sloppy untyped
object?

