

PyPy faster than C (on a carefully crafted example) - prog
http://morepypy.blogspot.com/2011/02/pypy-faster-than-c-on-carefully-crafted.html

======
grav1tas
For fairness' sake, I wish the author had noted the version of the C compiler
they used. Otherwise saying PyPy is faster than C is like saying "my 2010
Toyota Prius is faster than a car in a carefully crafted example".

Sorry to sound like a language curmudgeon, but if you're going to make a claim
that language x is faster than language y you're treading on thin ice.
Languages in and of themselves don't have a whole lot of grounding in reality
(especially Python; C has some closer analogues to the machine). That is to
say that the devil is always in the details, or in this case the
implementation of the compiler. Though my guess is the author of this blog
post may not have intended for this small post to be flung across various nerd
news sources, provoking the ire of language zealots everywhere.

The fact that the compiler can't inline across file boundaries (which is why
PyPy is 'faster' in this case), I would think, is not a limitation of the C
language, but rather a limitation of a C compiler (which I assume is GCC).

~~~
ElliotH
The GCC version has been added to the post. It was "GCC 4.4.5 shipped with
Ubuntu Maverick"

~~~
grav1tas
Awesome, thanks!

------
jwatzman
The PyPy version is faster, as the article and commenters point out, because
the PyPy JIT can inline a call across a module boundary. GCC can do something
like this with -fwhole-program, but that doesn't work on shared libraries.

Has any work gone into inlining (probably only very simple) functions when
linking against a shared library? Something like the "add" function clearly
has no side effects if you were to look at the asm in the shared library, but
it might be hard for the compiler to figure that out in any more complicated
cases... perhaps by adding an annotation to the shared object file? It seems
doable, at least.

~~~
stonemetal
But by definition shared libraries are shared code you are going to load at
run time. What if between the time your app was compiled and the time your app
was run the add function in the shared library changed? You would have the
wrong version of the function inlined.

~~~
jwatzman
Ah, true... what I said makes no sense. Thanks.

~~~
stonemetal
Of course that's not to say a smarter dynamic loading capability couldn't
inline at runtime. That is what they are doing, after all.

------
burgerbrain
_"I added a printf("%f\n",a) to the end of the file so the compiler wouldn't
optimize the whole thing away. On my Core 2 Duo 2.33GHz, I got for gcc -O3:

1000000000.000000

real 0m4.396s user 0m4.386s sys 0m0.007s

and for gcc -O3 -flto -fwhole-program:

1000000000.000000

real 0m1.312s user 0m1.308s sys 0m0.003s"_

Yet another case of somebody thinking they're making a clever comparison by
forgetting to set their compiler flags properly.

~~~
ehsanul
Please elaborate on what the correct flags would be, for those of us who
wouldn't know.

~~~
burgerbrain
Try reading the comment I just quoted.

~~~
ehsanul
Oh, oops. Sorry, me and at least 3 others had a collective brain fart it
seems. I had first thought you meant that both of those sets of compiler flags
weren't quite right.

------
malkia
How does it fare in the shootout benchmark game?

Here is a comparison of PyPy and LuaJIT:
[http://shootout.alioth.debian.org/u32/benchmark.php?test=all...](http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=pypy&lang2=luajit)

------
minimax
I would have liked to see the timing information for PyPy without dynamic
inlining.

