

Python patch to speed things up to 20% - vaksel
http://bugs.python.org/issue4753

======
listic
I saw this technique, namely using gcc's labels-as-values extension to
efficiently implement threaded code, in gForth. The gForth developers
originally chose gcc back in 1992 precisely because this feature allowed
threaded code to be implemented efficiently. Unfortunately for gForth,
somewhere along the way the gcc developers changed something and gForth took
a massive performance hit (a 2-3x slowdown). The gForth developers tried to
negotiate with the gcc team, but the gcc team didn't seem interested in doing
extra work that seemingly benefited only one application (gForth). The gForth
developers couldn't afford to fork gcc either (gcc is too big; too much work).

I guess the moral of the story should be: beware of such speedups that come
from nonstandard technology you cannot control.

The whole story is from my memory, as I cannot find the source right now.

~~~
anewaccountname
seems like a candidate for inline assembly.

~~~
listic
Sure, but you might not want to use inline assembly, e.g. for easier
portability across platforms. Earlier Forths used it all the way, but gForth
didn't want to.

------
amix
While I really love Python as a language, I really hate its primitive
implementation. 20% is nothing compared to what would be gained if Python ran
on a virtual machine that was JIT'ed.

Java, for example, got a 10 to 20 times performance gain from moving to
HotSpot (its JIT'ed VM). Similar gains have been seen in Chrome's V8
JavaScript VM and in Strongtalk.

Anyway, it's kind of sad that such a great language has such a braindead
implementation :-(

And unfortunately, the same thing can be said about Ruby as well.

~~~
jd
Python's implementation is focused on code readability. One of Guido's initial
goals was that no matter how insane the input, the interpreter must never
crash. His emphasis on simple and clear code (and an intuitive bytecode) gets
in the way of performance.

The Ruby interpreter is quite terrible, but a lot of people are trying to make
it better. Give it a couple of years.

Also, you imply that both implementations are really naive. They're not. With
languages such as Python and Ruby there is much less to gain with JIT
compilation than with Java. Suppose you have this Python program:

    def trivial(a, b):
        return a + b

What can the JIT optimize? The function call? Not always, because the function
can be redefined at runtime (unlike in Java and .NET). Can the function be
inlined? No. Can the function's call frame be stack-allocated? Well, maybe, at
a significant complexity cost: most scopes in Python are essentially closures,
so you have to tread carefully there too (the upward funarg problem).
Obviously the "+" function call can't be optimized much, because you don't
know the types of "a" and "b". And even if you did, it wouldn't help much.
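A minimal sketch of the redefinition problem (the `caller` wrapper is my own
illustration, not from the thread): the global name `trivial` is looked up at
call time, so an inlined copy could silently go stale.

```python
def trivial(a, b):
    return a + b

def caller():
    # A JIT cannot bind this call site once and for all: the global name
    # "trivial" is resolved anew on every call.
    return trivial(2, 3)

print(caller())  # 5

# Any code can rebind the name at runtime, invalidating an inlined copy.
def trivial(a, b):
    return a * b

print(caller())  # 6
```

This is why real Python JITs guard such call sites and deoptimize when the
binding changes, rather than inlining unconditionally.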

Not done yet!

Even IF you optimize everything using the NBOCK (Non-Braindead Omniscient
Compiler Kit), so that you know you're dealing with integers, you know where
the function is defined, you know the function's definition is not going to
change, you know all function arguments can be allocated locally, and you
know only integers will be passed to and returned from the function: can you
now transform it into a few MOV instructions? Alas, no.

Maybe the integers will overflow. And if they overflow you either want to
throw an exception or allocate additional space and transform the integer to a
heap-allocated large integer. (And you have to do all the necessary cache
invalidation.)
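That promotion can be seen directly in modern CPython (where machine-sized
and heap-allocated integers share the single `int` type, but additions past
the machine-word range must still take the slow, allocating path):

```python
import sys

a = sys.maxsize  # the largest value that fits in a machine word (Py_ssize_t)
b = a + 1        # no wraparound: CPython allocates a larger integer instead

print(b > a)     # True - arithmetic stays exact past the word size
```

In Python 2 the promotion was even visible in the type system: `sys.maxint +
1` produced a `long` rather than an `int`. Either way, the addition fast path
has to check for this case, which is exactly the per-opcode overhead the
comment describes.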

So even if you are omniscient there is no low-hanging fruit. In the end, even
the logic of adding two integers is sufficiently complex that it warrants its
own C function, and inlining becomes only marginally useful.

When you take mundane issues such as maintainability, flexibility, and
portability into account there are very few reasons left for building a JIT
compiler for Python. Note that all these optimizations are viable (to an
extent) with languages such as Haskell, C# and Java.

~~~
marcher
The PyPy folks are working on a JIT, apparently with great speedups. They've
been detailing their work on their blog: <http://morepypy.blogspot.com/>

Also, are JavaScript and Python really that dissimilar? Everyone's working on
tracing JITs for JavaScript now, with great results.

~~~
jd
Take a look at the PyPy FAQ. They're using annotations, type-inference
assumptions, and so on. PyPy seems to be focused on a JIT-able subset of
Python. It's not a project that can one day be transparently included in a
new Python release.

JavaScript and Python are not that dissimilar, so we see similar results with
JavaScript. People want performance, so a lot of projects are started in
which people attempt to JIT JavaScript - but after the 14 or so years of
JavaScript's existence, it is still several orders of magnitude slower than
less dynamic languages. Java and C# have never been as slow as JavaScript is
today. Will JIT-ing JavaScript help? Sure. But -great- results? I wouldn't go
that far.

~~~
thorax
Don't forget psyco's JIT, which gives some impressive performance gains on a
lot of different code. We use it while embedding Python in Counter-Strike:
Source and it performs admirably.

~~~
tocomment
Someone needs to upgrade psyco for 64-bit, IIRC.

------
SirWart
The best explanation for how the optimization works is actually in the patch
itself: <http://bugs.python.org/file12524/threadedceval5.patch>.

------
cool-RR
Are these the 20% we lost with Python 3?

~~~
neilc
Not the "same" 20%, no: the same optimization could be applied to Python 2.x,
presumably yielding a comparable relative speedup.

