The Great Ruby Shootout (December 2008) (antoniocangiano.com)
41 points by acangiano on Dec 9, 2008 | 32 comments

I can't stand Windows, but I wanted to point something out to all the blind Windows haters out there: look at the Vista vs. Ubuntu numbers. Windows beats Linux running MRI by 10-50% in nearly all tests [when they pass], which aligns perfectly with my own observations.

Certainly such a big difference can't be explained by more efficient system calls alone. I attribute it to Microsoft's superior C/C++ compiler. Back in my C++-loving days I played with various compilers regularly, and MSVC was always the most aggressively optimizing, and it kept advancing rapidly from version to version. It also compiles 2-3 times faster than GCC, if I remember correctly... uhhh, I miss it. :-)

Also, his JRuby startup time is inaccurate. No way the JVM starts in under a second: it's more like 2 full seconds on his hardware. More likely he had some of the JVM in the filesystem cache. Vista/OS X will pre-fetch JVM modules even after you reboot (not sure about Linux) if you run Java software, especially if you have 8GB of RAM...

MSVC, the "most advanced" compiler, even though it:

1. Doesn't support C99.

2. Can't correctly align its stack, making SSE optimizations of any sort a complete nightmare and forcing us to disable a number of optimizations when compiling under MSVC.

3. Has the worst 32-bit libc I have ever seen in my life, with a memcpy that goes at less than 1 byte per clock cycle for small copies, a massive performance killer for large apps.

4. Is completely incompatible with almost all features of the more widely used C compilers, e.g. AT&T-syntax inline assembly. ICC, on the other hand, supports nearly everything.

5. Loves to mangle its entire project file every time you change one small thing, resulting in enormous numbers of false deltas in version control.

Now, I'm not big on GCC either and can come up with dozens of regressions and miscompilation problems with GCC, along with who-knows-how-many cases of suboptimal compilation. But at least it works and doesn't force us to mangle our entire codebase just to get it to compile!

If you want to make an argument about compilers superior to GCC, pick one that is actually good, like ICC, rather than a relic like MSVC. What I have to do just to keep supporting this late-'80s-technology beast makes me want to gouge my eyes out. I mean, seriously: a compiler that doesn't even support variable declarations in the middle of a code block? It's 2008, folks.

He tested Ruby 1.8.7 on Ubuntu and Ruby 1.8.6 on Windows.

Ruby 1.8.7 backports some features from Ruby 1.9.1 in order to provide a forward path for those wishing to start adopting them. For whatever reason, Ruby 1.8.7 has become a little slower than Ruby 1.8.6 in general, at least in some testing I have done. So there's more than meets the eye in those benchmark numbers.
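The slowdown is easy to probe with Ruby's stdlib Benchmark library. A minimal sketch (the workloads and iteration counts here are arbitrary choices, not from the shootout) that you would run unchanged under each interpreter being compared:

```ruby
require 'benchmark'

# A small synthetic workload; run this same script under 1.8.6 and 1.8.7
# (or any two interpreters) and compare the reported real times.
t_string = Benchmark.realtime { 100_000.times { "hacker" + "news".upcase } }
t_array  = Benchmark.realtime { 100_000.times { [1, 2, 3].map { |n| n * 2 } } }
printf("string ops: %.4fs\narray ops:  %.4fs\n", t_string, t_array)
```

The absolute numbers mean little on their own; the interesting figure is the ratio between the two interpreters on the same machine.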

Also, while Microsoft's C and C++ compilers provide lots of cool optimizations, Ruby is often compiled with just -O2 on Unix-compatible systems because that's usually fast enough; Ruby is not known for trying to extract maximum performance from the bare metal anyway. So at the end of the day things tend toward a certain normalization, which kills any compiler advantage.

For that reason, MinGW and even Cygwin can work quite well on Windows machines with GCC where Ruby is concerned, though there might be little incompatibilities here and there with things like threading, sockets and so on.

I personally regard JRuby as a great Ruby for Windows machines, because Java has great support there. More than that, though, I am excited by the work of the IronRuby guys, because during my tests IronRuby has started performing within decent expectations, though it still lags behind the other Ruby implementations and has stability problems at this point.

Understand that unless the C/C++ code being compiled is actually tuned for GCC's optimizer, you won't get more than a ~2-5% speedup even with -O3 or higher. In fact, go high enough, like -O9, and you may even get incorrect code. The only time settings like -O6 help is in cases like XaoS, which is specifically tuned for GCC optimization.

Furthermore, MSVC is not the only compiler better than gcc. As of now, LLVM also produces significantly better code: http://news.ycombinator.com/item?id=286408

Why is it that, more often than not, whenever someone says something positive about a Microsoft product or the company itself, they have to start with:

- I can't stand Windows but...
- I'm 100% GNU/Linux but...
- Microsoft and/or BillG can burn in hell but...
- etc.

Why the need for justification? We're all informed people here. Can we state positive facts about Microsoft without fear of backlash? Can I post this reply without fear of it being buried?

Why the need for justification? We're all informed people here. Can we state positive facts about Microsoft without fear of backlash?

This wasn't the first positive post I've ever made about Microsoft tech. The answer to your question is: no, we can't. :-) Just recently I got beaten to death by OS X and Linux fanboys on reddit for mentioning Windows's superior threading/process switching.

Sorry, didn't mean to single you out!

Yes, we're all informed people, but we're also mere mortals, susceptible to the same psychology as everyone else, including instinctively downvoting something that doesn't fit our worldview.

No way JVM starts in under a second: it's more like 2 full seconds on his hardware.

My machine has lower hardware specs than what the article mentions. This is what I get:

    $ java -version
    java version "1.6.0_10"
    Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
    Java HotSpot(TM) 64-Bit Server VM (build 11.0-b15, mixed mode)
    $ uname -a
    Linux lneves-t61p 2.6.27-9-generic #1 SMP Thu Nov 20 22:15:32 UTC 2008 x86_64 GNU/Linux
    $ echo "public class Test{public static void main(String[] args) {System.out.println(\"Hello World\");}}" > Test.java
    $ javac Test.java
    $ time java Test
    Hello World
    real    0m0.174s
    user    0m0.068s
    sys     0m0.016s

It seems that you are wrong.

More likely he had some of JVM in the filesystem cache. Vista/OSX will pre-fetch JVM modules even after you reboot (not sure about Linux) if you run Java software and especially if you have 8GB of RAM...

Do you mean Class Data Sharing?


Let's disable it:

    $ time java -Xshare:off Test
    Hello World
    real    0m0.165s
    user    0m0.060s
    sys     0m0.040s

Not much of a difference.

Just repeated your experiment on a 2.2GHz MBP. Got 0.424s real for the first attempt, then ~0.180s for all subsequent runs. So even on a far slower machine with no caching whatsoever, it's still under half a second :)

No caching? http://java.sun.com/javase/6/docs/technotes/guides/jweb/othe...

OSX employs its own pre-caching similar to Vista.

Think for a second: how many different files need to be read off your hard drive for the JVM to start? There are tools (depending on your platform) that will help you come up with an exact list.

That's A LOT of I/O. No hard drive will give you sub-second time.

Does it matter if there's caching, though? All modern operating systems cache, prelink, etc. Would you say C has a "30-second lag" because a basic "Hello world" program took 30 seconds to load, counting the time the computer takes to start up from cold to run that program? After all, the operating system provides the APIs that C uses... just as the JVM provides Java's.

If a computer starts an app in, say, about 0.2 seconds every time, it's disingenuous to keep referring to some "2 second lag" since that 2 second lag never really occurs in real use under production conditions. At most, it happens once.
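The cold-vs-warm distinction is easy to see for yourself. A minimal sketch (it assumes a `ruby` binary on your PATH; substitute any launcher you like) that times several back-to-back launches of an empty program:

```ruby
require 'benchmark'

# Time five back-to-back launches of an empty program. On a machine where
# the interpreter isn't in the filesystem cache yet, the first figure is
# typically the largest; the rest reflect warm-cache startup, which is
# what day-to-day use actually sees.
runs = 5.times.map { Benchmark.realtime { system('ruby', '-e', '') } }
runs.each_with_index { |t, i| printf("run %d: %.3fs\n", i + 1, t) }
```

If the first run is markedly slower than the rest, that gap is the one-off cost being argued about above.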

Well... I haven't touched any JVM-based software in a few days on my 2.4GHz MBP with 4GB of RAM. The caches got cleared and the pre-fetch "forgot" about it; I guess it figured the heavy Adobe software I've been running needed the memory more.

So I just timed hello world (in Java) and it gave me 1.89sec.

That's pretty lame if you ask me. In fact, that's incredibly lame. Imagine if it wasn't just "hello world" but an actual piece of usable code! I can even tell you how long my Excel parsing tool (Java, command-line based, uses Apache POI) takes to start up on a 100% "cold" machine: about 4 seconds.

I wonder how this simple tool would fare on Linux-based netbooks with 512MB of RAM total and slow CPUs... or on something like an iPhone.

That's a one-off. Startup times are rarely included in benchmarks since they're not representative of true day-to-day performance, which is the sort of performance most people care about.

An extra couple of seconds for booting up an entire framework you haven't used for a few days is nothing to worry about. If you were doing performance intensive work, it'd already be all cached and ready to go.

Now I might be being stupid here, but as far as I can see from the graph of total time, Vista and Ubuntu are roughly equivalent.

> Also his JRuby startup time is inaccurate.

That comment is inaccurate ;)

There are other faster options on Linux than gcc, you know. Doesn't icc benchmark faster than vc++?

I'm not sure why you don't hear more about gcc vs icc & borland in the open source world. They are free for non-commercial use after all.

Because icc's performance isn't great on non-intel cpus? And it isn't particularly portable (only available for Windows, Linux, OS X).

Plus, the open source world got burned pretty badly by BitKeeper. Now they want to use free (as in speech) tools.

> Because icc's performance isn't great on non-intel cpus?

I'm not sure that's true. I think it still out-performs gcc on AMD chips.

It's definitely possible. I haven't seen any benchmarks myself. I just remember hearing that ICC purposely produced crippled code on AMD CPUs. (http://www.amd.com/us-en/assets/content_type/DownloadableAss...).

The relevant section: Intel has designed its compiler purposely to degrade performance when a program is run on an AMD platform. To achieve this, Intel designed the compiler to compile code along several alternate code paths. Some paths are executed when the program runs on an Intel platform and others are executed when the program is operated on a computer with an AMD microprocessor. (The choice of code path is determined when the program is started, using a feature known as "CPUID" which identifies the computer’s microprocessor.) By design, the code paths were not created equally. If the program detects a "Genuine Intel" microprocessor, it executes a fully optimized code path and operates with the maximum efficiency. However, if the program detects an "Authentic AMD" microprocessor, it executes a different code path that will degrade the program’s performance or cause it to crash.

I think the main reasons are ideological (icc isn't free as in beer) and practical (ICC doesn't work on a huge number of architectures or os's that gcc does).

I'm glad that we're starting to see realistic and absolute comparisons.

And I wonder how many people are surprised to learn that JRuby is doing so well right now.

The JRuby guys have definitely been doing lots of good work, it's just impressive to see it pay off like that.

I don't like the JVM: I hate the mandatory 2-second startup and "warming up" time. And the standard set of libraries doesn't look particularly well engineered to me (comparing Java SE to .NET or Python's libs). Also, despite being enterprise-tested and good at memory management with its "best in class" generational GC, Java programs continue to be memory hogs, in my personal experience.

But you can't beat a nice piece of software (JRuby) and the huge ecosystem of high-quality Java libraries. Moreover, the JRuby guys are awesome at listening and talking to their community. You can go and ask any question about JRuby internals on #jruby; someone will always be there.

I am definitely considering switching my projects to JRuby, just need to find time.

JRuby startup is under 1 second on my box. They did a bunch of tweaks earlier this year to solve the issue.

No it isn't. I am seeing an even lower number: 0.24 sec on my box, right after reboot.

That's called the filesystem cache and pre-fetch daemon (present on Vista/OSX).

I hate the mandatory 2-second startup time and "warming up" time.

I got about 0.45s for a basic Java app on first start, then 0.2s subsequently. For a JRuby one-liner, I get 0.618s consistently (no first start to count). So while JRuby appears to add about 0.4s to startup, it's no 2 seconds.
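A reusable way to get these figures is to time the interpreter running an empty program, so everything measured is pure startup cost. A sketch (the `jruby` substitution is an assumption about what's installed on your machine; only `ruby` is exercised here):

```ruby
require 'benchmark'

# Launch an empty program several times and take the fastest sample;
# the minimum approximates warm-cache startup. Swap 'ruby' for 'jruby'
# to reproduce the JRuby overhead figures quoted above.
def startup_time(cmd, runs = 3)
  samples = Array.new(runs) { Benchmark.realtime { system(cmd, '-e', '') } }
  samples.min
end

printf("ruby warm startup: %.3fs\n", startup_time('ruby'))
```

Taking the minimum rather than the mean deliberately discards cold-cache outliers, matching the "subsequent runs" numbers reported in this subthread.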

JRuby is great; the only issue with existing projects is converting all the gem dependencies that use native code over to Java. Popular ones like Hpricot have JRuby equivalents, but unpopular ones like ruby-gsl or image_science have no hope.

To me, one of the more interesting aspects of that was that the ruby version optimised for reduced memory usage got a 2x speedup.

That makes sense in terms of CPU cache misses. Has anyone run any numbers to see how gcc -Os compares with -O2 on recent hardware?

By my reckoning, that makes Ruby 1.9/YARV about the same speed as Python. (At the slow end of the scale, but nothing to be ashamed of any more.)

The most interesting thing about the results, IMHO, is JRuby's performance. It's now faster than CRuby 1.8 and, from what I have read, it also solves the scalability problems associated with it (threads, GC, etc.).

And there's a lot of room for improvement, especially with the advent of invokedynamic. I wasn't eager to learn Ruby because of its alleged performance and scalability problems, but it seems JRuby is changing that.

(If I'm mistaken, and things aren't so bright, please, enlighten me).

Please read this important update: http://news.ycombinator.com/item?id=393988
