
Pythran: a Python to C++ compiler with a focus on scientific computing - vmorgulis
http://pythonhosted.org/pythran/
======
m_mueller
What is the advantage of doing this compared to writing Fortran (90+) kernels,
using F2Py bindings, and combining them with NumPy? For HPC you generally need
full control over compiler flags. Plus, Fortran already has way better
multi-dim array handling built in - NumPy is basically just a wrapper around
that.

~~~
shoyer
> Plus, Fortran already has way better multi-dim array handling built-in -
> NumPy is basically just a wrapper around that.

I don't think either of these statements is true.

Fortran is hard to beat for performance, but from a usability perspective
NumPy (and other Python libraries) has some nice features for array-oriented
computation that you can't find even in modern Fortran. The most obvious
examples are broadcasting [1] and advanced indexing [2].

NumPy itself contains only C, though some SciPy routines wrap Fortran code.
Yes, NumPy drew heavy inspiration from Fortran, but it's hardly a wrapper.

[1]
[http://docs.scipy.org/doc/numpy/user/basics.broadcasting.htm...](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
[2]
[http://docs.scipy.org/doc/numpy/reference/arrays.indexing.ht...](http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)

~~~
m_mueller
I guess by "indexing" you mean what NumPy calls "advanced indexing". All the
basic stuff seems lifted straight from Fortran. But yes, I should probably
have been more precise with my statement - NumPy does have some helpful extra
syntax that saves some LOC.

~~~
shoyer
Yes, I did mean "advanced indexing" -- I edited my post to clarify.

------
akandiah
The problem with a lot of these optimizing tools for Python is that they only
support Python 2.7, not Python 3.

~~~
njbooher
Not really a problem here, as most of science still seems to be using Python 2.

edit: Cython was always great for Python 2 and apparently supports 3.
[http://cython.org](http://cython.org)

~~~
ericfrederich
Yeah. I came across a C library that used Cython to create Python bindings. It
was funny: they could easily support both Python 2 and Python 3 at the same
time, since all their binding code was in Cython; it was only their
installation script which used an incompatibility (a print statement).

~~~
brian_herman
What was the library?

------
hairy_man674
So we have CPython, which has an FFI and a native implementation of the
interpreter.

We have PyPy, a JIT for Python that emits assembly at runtime, though it does
not enjoy the same level of support as CPython.

The problem here without type annotations - namely, completely transforming an
arbitrary, dynamic Python program into a statically, partially inferred typed
language (C++) - is essentially Hindley-Milner type inference:
[https://en.m.wikipedia.org/wiki/Hindley–Milner_type_system](https://en.m.wikipedia.org/wiki/Hindley–Milner_type_system)

The transformations are also not guaranteed to improve runtime performance;
they might even do the opposite, have side effects for precision, or yield a
program that is slower due to the generalizations the algorithm makes.
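As a toy illustration (my example, nothing to do with Pythran's internals): a single Python function can defeat static inference entirely, because its return type depends on a runtime value, forcing any compiler that wants C++-style `int`s to fall back to a general boxed representation:

```python
# The return type depends on a runtime value, so one call site
# has two possible static types -- a compiler mapping Python ints
# to C++ ints must generalize to a boxed "any" type here.
def widen(n):
    if n > 0:
        return n * 2   # int
    return str(n)      # str

values = [widen(3), widen(-1)]
# one call site, two static types: [6, '-1']
```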

With type annotations... why, just why? Use C++ or C if you want types and the
performance gains thereof. The potential for bugs and failures is lower than
relying on an algorithm, once you realise that the fear and laziness around
compiled languages is unfounded.

------
sitkack
Compare to
[https://github.com/shedskin/shedskin](https://github.com/shedskin/shedskin)
which does whole program type flow analysis. Shedskin offers excellent
performance on numerical workloads.

------
guyzmo
Well, too bad they did not implement this with Python 3, and actually gave
some meaning to the type annotations… I mean

    def dprod(l0: list[int], l1: list[int]):
        return sum(x * y for x, y in zip(l0, l1))

Wouldn't it be much nicer?
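For comparison, Pythran's own spelling - as far as I understand its docs - keeps the signature in a special comment, so the file stays valid Python 2 (and plain Python in general):

```python
# Pythran reads the signature from this comment; the function itself
# remains ordinary, annotation-free Python.
#pythran export dprod(int list, int list)
def dprod(l0, l1):
    return sum(x * y for x, y in zip(l0, l1))
```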

~~~
srean
To be brutally honest, they probably couldn't give a flying fuck; I couldn't
when I tried. (sorry)

To the number crunchers, Python 3.* offers no compelling reason to move
(except for threats: "you are on your own now, 2.7.* will not get updates"). I
see two compelling advantages that 3.* offers: (i) better abstractions for
asynchrony, (ii) UTF strings (which exact a cost even if you don't need them).
If you don't need these two, I see no technical reason to move
(political/social reasons, yes; technical, not really).

~~~
module0000
For what it's worth, Python 3 does offer at least one compelling reason to
move: the multiprocessing library in Python 3 is significantly faster than in
Python 2. My use case is spawning 30-35 worker processes managed by a pool,
and recycling them after 5 workloads pass through. Switching from 2.x to 3.x
without any code changes increased my throughput by ~150%. Your mileage may
vary.

edit: typo
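A minimal sketch of that pool setup (names and sizes illustrative, not my actual code):

```python
from multiprocessing import Pool

def work(n):
    # stand-in for a real workload
    return n * n

if __name__ == "__main__":
    # maxtasksperchild=5 recycles each worker process after 5 workloads
    # pass through, as described above (pool sized down for the sketch)
    with Pool(processes=4, maxtasksperchild=5) as pool:
        results = pool.map(work, range(20))
    assert results == [n * n for n in range(20)]
```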

~~~
srean
Oh! thanks for this, I did not know.

------
nerdponx
What does this do that I can't already do with some combination of Numpy,
Cython, and Numba?

~~~
bmer
I use numba quite a bit, and with every new version, the developers have notes
like:

PR #1970: Optimize the power operator with a static exponent.
PR #1961: Support printing constant strings.
PR #1927: Support np.linalg.lstsq.

My point being that the developers have to go about re-implementing (in LLVM
IR?) features that are already very mature in C libraries (e.g. NumPy) and,
similarly, in C++ libraries.

Could it be that one gain of going from Python to C++ will be that one could
take advantage of all the mature libraries that C++ has access to, without
having to do this re-inventing of the wheel?

I am not sure, and regardless, I'll be using numba for a long time to come,
because... I like the devs, they work hard on it, and numba works like a charm
for what I need to do, so I'm not looking for overkill.

~~~
nerdponx
That makes sense.

Numba aside, though, it's strange to me that I see no mention of Cython on
their project page. They ought to at least acknowledge that other fairly
mature products exist in this space.

Also, maybe it's more of a curiosity for you than a practical tool, but Stan
(which is becoming the de facto standard DSL for Bayesian statistics) compiles
to C++, where it takes advantage of libraries like Eigen and Boost.

~~~
bmer
That's a really good point: Cython doesn't have the same issues, and it's
basically Pythran except it compiles to C. Yeah, no clue.

------
madsohm
We do something similar with Bohrium [1]. Bohrium intercepts your NumPy calls,
compiles a kernel for either CPU, multicore CPU, or GPU, and runs that
instead. It will also fuse together all NumPy calls until one has a side
effect that the runtime needs.

We are currently building smart filters into it as well, so that we can do
byte-code optimizations.

[1] [https://github.com/bh107/bohrium](https://github.com/bh107/bohrium)
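A sketch of the kind of call chain that fusion targets (plain NumPy shown here; with Bohrium installed, `import bohrium as np` is the drop-in replacement, per their README):

```python
import numpy as np  # with Bohrium: `import bohrium as np`

a = np.ones(1_000_000)
b = np.ones(1_000_000)

# Plain NumPy materializes a temporary array for every step below;
# a lazy runtime like Bohrium can fuse the whole chain into one
# kernel and only evaluate when a side effect needs a value.
c = a + b
d = c * 2.0
total = d.sum()   # the reduction forces evaluation
```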

------
leaningtower
As I once discussed with a friend of mine: once you remove the semicolons and
give spacing a syntactic and semantic value, you almost have FORTRAN... this
is just the icing on the cake :¬)

------
toolslive
(updated, see comment below) There used to be another Python to C++ compiler
called "shed skin". It did quite a literal translation - e.g. Python classes
would turn into C++ classes - and the output was quite readable.....

but utterly useless: a Python integer would be translated into a C++ int.
Where Python automatically starts using a wider type, this one would just
wrap.

~~~
dagw
Are you sure you're not thinking of some other project? As far as I recall,
unladen swallow always targeted LLVM and was a JIT compiler rather than an AoT
compiler.

~~~
toolslive
you are right! it was "shed skin".

------
nhamausi
Wondering how it compares to Cython?

------
_ZeD_
What's the difference between this and nuitka?

~~~
dagw
The big one seems to be that Pythran targets only a subset of Python, with the
goal of getting faster performance, while Nuitka aims at being able to compile
all valid Python programs (currently only up to Python 3.4) - but in doing so
it has fewer opportunities for optimization, and thus gives you slower
performance than Pythran. Pythran also introduces its own type annotation
system that you have to use if you want good performance.

So if you write a Python program that is compatible with Pythran, then Pythran
will almost certainly be much faster than Nuitka. If however you take a random
Python program, it will almost certainly compile with Nuitka and almost
certainly not compile with Pythran.

~~~
jsjsjsjsjsjs
No idea why they couldn't contribute some patches to Nuitka to make it do
exactly that, because further down the road that is the goal of Nuitka - to
natively compile as much as possible.

------
vonnik
Reminds me of JavaCPP:
[https://github.com/bytedeco/javacpp](https://github.com/bytedeco/javacpp)

