

How fast can we make interpreted Python? - heydenberk
http://arxiv.org/abs/1306.6047

======
jballanc
It seems like the biggest point here is that compatibility with the extension
API in Python is the anchor dragging down performance. Similarly, Ruby suffers
from continued compatibility with its extension API. In the paper they compare
with Lua and JavaScript, but it's worth noting that JavaScript doesn't have an
extension API (well, JavaScript proper...not saying anything about node.js),
and the team behind Lua are notorious for not caring about breaking API
compatibility between versions.

I guess what I'm trying to say is: while this is a laudable effort, it seems
that what Python (and Ruby) really need is a way to break free from the chains
of extension API compatibility.

~~~
ProblemFactory
On the other hand, having a stable C extension API is what allows Numpy and
Cython
([http://docs.cython.org/src/quickstart/cythonize.html](http://docs.cython.org/src/quickstart/cythonize.html))
to be as good and fast as they are.

My experience with numerical algorithms in Python is that using Numpy arrays
and C/Fortran functions gives a 10000x speedup and a 10x memory reduction. And
when Numpy doesn't have a necessary function, a few extra lines of static C
types and on-import compilation with Cython have still yielded a 1000x
speedup.
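
A minimal sketch of the kind of difference being described (assuming Numpy is
installed; the array size and function are made up for illustration): the
pure-Python loop pays interpreter overhead on every element, while the Numpy
call runs a single compiled loop over a contiguous buffer.

```python
import numpy as np

def dot_pure(xs, ys):
    # Pure Python: the interpreter executes several bytecodes per element.
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total

n = 100_000
xs = list(range(n))
a = np.arange(n, dtype=np.float64)

# The same reduction as one Numpy call: a single C loop, no per-element
# interpreter overhead.
pure = dot_pure(xs, xs)
fast = float(a.dot(a))

# Every intermediate value here is an integer below 2**53, so both
# results are exact in float64 and compare equal.
assert pure == fast
```

The gap only grows with problem size, since the interpreter overhead is paid
per element in the pure-Python version but once per call in the Numpy one.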

"Optimizing Python" needs to look at the whole existing ecosystem of C
libraries - to be useful the end result has to be faster than what's possible
now, not just faster than pure Python.

~~~
tomrod
What exactly is Cython? I've not had a reason to use it yet; are the speed
improvements worth it?

~~~
ProblemFactory
Cython is a Python-to-C compiler; with pyximport it can compile modules on the
fly at import time.

The speed improvements are worth it given how easy it is to use: I have seen a
literal 1000x speedup on a time-series data analysis script from adding about
10 lines of static type annotations.

The features I like best about it are:

* It can be used transparently without any makefiles or compilation steps. Add "import pyximport; pyximport.install()" at the top of your main script, and now every imported Cython-capable module is compiled on the fly at runtime.

* You can start out with _no_ changes to your python modules. All libraries and features still work within compiled modules. Then you can slowly start adding static types to a few variables at a time. The annotated variables become very fast native C integers/doubles/functions, instead of Python objects.

~~~
tomrod
Sounds intriguing. I know a fair bit of C, Python, and Fortran, and I've used
f2py a bit. Do you have a good Cython tutorial you'd like to recommend?

~~~
ProblemFactory
For my use cases, the standard documentation and tutorials
([http://docs.cython.org/index.html](http://docs.cython.org/index.html)) have
been enough.

* Install it with pip (the packages in your OS distribution may be out of date),

* Rename your numeric code module's .py file to .pyx,

* Use pyximport from the main script ( [http://docs.cython.org/src/userguide/source_files_and_compil...](http://docs.cython.org/src/userguide/source_files_and_compilation.html#pyximport)) to have it compiled at runtime without any extra build steps, and then

* Start experimenting with adding a few "cdef"s ([http://docs.cython.org/src/quickstart/cythonize.html](http://docs.cython.org/src/quickstart/cythonize.html) and [http://docs.cython.org/src/tutorial/numpy.html](http://docs.cython.org/src/tutorial/numpy.html))
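
Putting those steps together, a hypothetical `fast_stats.pyx` might look like
the following (the module and function names are invented for illustration;
this is a sketch of the incremental-typing workflow, not a tested build, since
a .pyx file needs Cython to compile):

```cython
# fast_stats.pyx -- hypothetical module. Plain Python is already valid
# Cython; the cdef/cpdef type annotations below are the optional additions
# that turn the hot loop into plain C arithmetic.
cpdef double sum_of_squares(double[:] xs):
    cdef Py_ssize_t i, n = xs.shape[0]
    cdef double total = 0.0
    for i in range(n):   # with i and total typed, this compiles to a C for-loop
        total += xs[i] * xs[i]
    return total
```

With `import pyximport; pyximport.install()` in the main script, `import
fast_stats` compiles this on first import; a Numpy float64 array (or any
buffer of doubles) can be passed directly to the `double[:]` memoryview
argument.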

------
albertzeyer
Previous discussion:

[http://www.phi-node.com/2013/06/how-fast-can-we-make-
interpr...](http://www.phi-node.com/2013/06/how-fast-can-we-make-interpreted-
python.html)

[https://news.ycombinator.com/item?id=5943258](https://news.ycombinator.com/item?id=5943258)

------
kghose
From the paper:
[http://github.com/rjpower/falcon/](http://github.com/rjpower/falcon/)

~~~
ExpiredLink
> _Does Falcon support all of Python?_

> _Not yet! Lots of constructs (like catching exceptions, constructing
> objects, etc...) aren't implemented in the Falcon virtual machine._

~~~
phonon
From the project page,
[https://github.com/rjpower/falcon/](https://github.com/rjpower/falcon/)

"... However, this doesn't mean that programs which use these constructs won't
run. Any missing functionality is routed through the Python C API, foregoing
any potential performance benefit you might have gotten from Falcon. So,
though Falcon isn't a complete Python implementation, it should still run all
of your code."

------
mlubin
If the goal is to get performance without giving up on Python's existing
libraries, why not use Steven Johnson's PyCall
([https://github.com/stevengj/PyCall.jl](https://github.com/stevengj/PyCall.jl))
package for Julia?

~~~
ngoldbaum
No need to learn a new language, perhaps.

------
wyuenho
While this paper is quite easy and occasionally funny to read, I'm a total
math idiot and need somebody to help me read the benchmarks at the end. The
author claims that converting stack-based bytecode to register-based bytecode
results in an average 25% performance improvement, but I have trouble finding
where that number comes from. The charts are said to compare optimized code
relative to unoptimized code, so my question is: why is the unit on the y-axis
a percentage, and why isn't the unoptimized code used as the baseline, either
labeled as 100% or given an absolute average running time? The tiny gaps
between unoptimized and optimized code are confusing me.

~~~
maxerickson
The baseline in the charts is standard CPython. The unoptimized is their
register machine without optimizing the bytecode.

------
lettucecrisper
It would be good to compare Falcon with Numba: "Numba’s job is to make Python
+ NumPy code as fast as its C and Fortran equivalents without sacrificing any
of the power and flexibility of Python." Like Falcon, Numba is compatible with
CPython and whatever extensions you want to use with CPython.

[https://github.com/numba/numba](https://github.com/numba/numba)

Intro to Numba, parts 1 and 2:

[http://continuum.io/blog/numba_growcut](http://continuum.io/blog/numba_growcut)

[http://continuum.io/blog/numba_performance](http://continuum.io/blog/numba_performance)

------
Demiurge
Since it's compatible with PyObject, it sounds like it could be folded into
CPython? Are there any arguments against that?

~~~
maxerickson
The maintenance effort.

The big performance win is nice (this benchmark:

[https://github.com/rjpower/falcon/blob/master/benchmarks/mat...](https://github.com/rjpower/falcon/blob/master/benchmarks/matmult.py)

), but it is still going to be a lot slower than doing it the horrible way
(calling out to Numpy).

The benchmark is more aimed at demonstrating the reduction in loop overhead
than it is at numerical speed, but numerical stuff is usually where you end up
with a lot of loop overhead...

------
willvarfar
I wish Psyco were ported to 64-bit Python 2.7 :(

~~~
erdewit
I totally long for Psyco too. PyPy turned out not to be a real replacement
for it.

~~~
chubot
Oh really, why? Because it's not as compatible?

~~~
erdewit
Yes. With Psyco it was possible to use the entire Python ecosystem, in my case
PyQt and the science libraries.

There was something magical about Psyco: just adding two lines to your script
and seeing it run 100x faster. It was really great at speeding up floating
point math.

------
bayesianhorse
The GIL must go!

~~~
DanWaterworth
I'll get rid of the GIL for you. You're not too attached to mutating state,
are you?

