Pythran: a Python to C++ compiler with a focus on scientific computing (pythonhosted.org)
107 points by vmorgulis on Aug 29, 2016 | 64 comments

What is the advantage of doing this compared to writing Fortran (90+) kernels, using F2Py bindings, and combining them with NumPy? For HPC you generally need full control over compiler flags. Plus, Fortran already has way better multi-dim array handling built-in - NumPy is basically just a wrapper around that.

> Plus, Fortran already has way better multi-dim array handling built-in - NumPy is basically just a wrapper around that.

I don't think either of these statements is true.

Fortran is hard to beat for performance, but from a usability perspective NumPy (and other Python libraries) has some nice features for array-oriented computation that you can't find even in modern Fortran. The most obvious examples are broadcasting [1] and advanced indexing [2].

NumPy itself contains only C, though some SciPy routines wrap Fortran code. Yes, NumPy drew heavy inspiration from Fortran, but it's hardly a wrapper.

[1] http://docs.scipy.org/doc/numpy/user/basics.broadcasting.htm...
[2] http://docs.scipy.org/doc/numpy/reference/arrays.indexing.ht...
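
For readers who have not met the two features named in this comment, here is a minimal sketch. The array values and variable names are made up for illustration and are not from the thread.

```python
import numpy as np

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) grid
# without either operand being tiled by hand.
col = np.arange(3).reshape(3, 1)
row = np.arange(4)
grid = col * 10 + row

# Advanced (integer-array) indexing: pick elements (0, 1) and (2, 3) in one call.
picked = grid[[0, 2], [1, 3]]

# Advanced (boolean-mask) indexing: select elements by a condition.
evens = grid[grid % 2 == 0]
```

Here `picked` holds the two requested elements and `evens` is a flat array of every even entry, in row-major order.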

I guess by "indexing" you mean what NumPy calls "advanced indexing". All the basic stuff seems lifted straight from Fortran. But yes, I should probably have been more exact with my statement - NumPy sure has some more helpful syntax that saves some LOC.

Yes, I did mean "advanced indexing" -- I edited my post to clarify.

I'm a huge Fortran fan, but even I can see that reducing the number of languages required to get high performance can be a good thing. Not all of my science colleagues are interested in learning a ton of languages.

Well, most of the scientific computing in Python happens on NumPy, which uses Fortran-compatible data structures. I doubt that a C++-backed Python extension will be compatible with that. For that reason, if anything, I'd advocate transpiling to Fortran rather than C++ if you want more comfort for scientific users.

NumPy natively supports both row-major (C/C++) and column-major (Fortran) arrays, so from the perspective of NumPy inter-operability there would be no advantage either way.

I didn't know that. How does NumPy use its Fortran kernels with row-major arrays? Reordering would have quite significant performance implications.

NumPy certainly does not have anything like "Fortran kernels".

Elementwise arithmetic is coded in C (and not very fast...). Linear algebra uses the BLAS and LAPACK interfaces, of these only LAPACK is usually written in Fortran these days... but anyway, a Fortran matrix is a C transposed matrix and vice versa and these interfaces all support transpose as an argument, so there is no need for reordering. And if you do np.dot(A.T, A) there is no reordering either, just a change of flags to DGEMM.
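
The "no reordering" point can be checked from Python. This is only a sketch of the view semantics: it shows that `A.T` shares memory with `A` and presents itself as column-major. Which flags NumPy passes to the BLAS underneath is the commenter's claim and is not something this snippet demonstrates.

```python
import numpy as np

A = np.arange(6, dtype=np.float64).reshape(2, 3)  # C-ordered (row-major)
At = A.T                                          # a view; no data is copied

shares = np.shares_memory(A, At)      # same underlying buffer
c_order = A.flags['C_CONTIGUOUS']     # A is row-major contiguous
f_order = At.flags['F_CONTIGUOUS']    # the view is column-major contiguous
gram = np.dot(At, A)                  # 3x3 product A^T A
```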

There may not be an absolute need in terms of ability to perform operations, but performance is heavily dependent on accessing in the correct order - otherwise your accessible memory bandwidth goes down the drain, as do cache hit rates.

I'm guessing LAPACK might not be too dependent on order though, but I haven't studied its performance tbh.

Algorithms to handle what you're talking about efficiently are one of the primary reasons these libraries even exist. OF COURSE they access memory in the right order; that is their job.

But it is not really about right vs wrong order, but using an algorithm to tile the data efficiently. Google "Anatomy of High Performance Matrix Multiplication" for examples from OpenBLAS.

This is where NumPy falls through -- "a.T + a" will be very slow no matter what you do, but it can be done efficiently with tiling.
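
To make the tiling idea concrete, here is a rough sketch. The function name `tiled_transpose_add` and the default block size are hypothetical, and a Python-level loop like this illustrates the access pattern only; it is not a claim about speed. A real implementation would live in compiled code.

```python
import numpy as np

def tiled_transpose_add(a, block=64):
    # Compute a.T + a for a square array, one block x block tile at a time.
    n = a.shape[0]
    assert a.shape == (n, n), "sketch assumes a square matrix"
    out = np.empty_like(a)
    for i in range(0, n, block):
        for j in range(0, n, block):
            # Each output tile needs the tile at (i, j) and the transpose of
            # the mirrored tile at (j, i); small tiles are more likely to
            # stay in cache while both are being read.
            out[i:i + block, j:j + block] = (
                a[i:i + block, j:j + block] + a[j:j + block, i:i + block].T
            )
    return out
```

The result matches `a.T + a` for any block size, including sizes that do not divide the matrix dimension evenly.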

NumPy has an abstract iterator interface so you don't need to write a kernel with a specific memory layout in mind: http://docs.scipy.org/doc/numpy-1.11.0/reference/c-api.itera...

For operations like adding two arrays or summing an array along an axis, this means that NumPy automatically iterates in the fastest order.

Interesting. How does it decide the storage order when initializing an array? I'd assume due to python's dynamic nature it can't optimize for later operations there, right?

It defaults to row major but you can specify column major on creation.
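
A small sketch of what that looks like, using the `order` argument of `np.zeros`. The shapes are arbitrary; strides are reported in bytes.

```python
import numpy as np

c_arr = np.zeros((3, 4))             # default: row-major ("C" order)
f_arr = np.zeros((3, 4), order='F')  # column-major ("Fortran" order)

# float64 items are 8 bytes each, so the strides reveal the layout:
# row-major steps 4 items per row, column-major steps 3 items per column.
c_strides = c_arr.strides
f_strides = f_arr.strides
```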

The problem with a lot of these optimizing tools for Python is that they only support Python 2.7 not Python 3.

Not really a problem here as most of science still seems to be using Python 2.

edit: Cython was always great for Python 2 and apparently supports 3. http://cython.org

Yeah. I came across a C library that used Cython to create Python bindings. It was funny: they could easily support both Python 2 and Python 3 at the same time since all their binding code was in Cython; it was only their installation script which used an incompatibility (a print statement).

What was the library?

Numba also supports Python 3.

I never looked into it, but I wonder if there's something in Python 3 that makes it harder to do this kind of thing than in Python 2.7.

Python 3 is the platform I use for almost everything, going back to 2.7 is always a major pain. 10 minutes into some 2.7 work and you have to deal with encoding and you start wondering if it would be quicker to just upgrade to 3.5.

Of course, if you mainly do numbers, there's perhaps little to gain from Python 3.

Can't speak for the Pythran devs and others, but in the case of shedskin, one of the issues is the removal of the “compiler” module in Python 3, on which shedskin relies to, well, parse the Python code. This, plus some syntax changes that are not really bigger than some that were introduced between, say, 2.6 and 2.7, but that are mandatory and break backward compatibility.

So, nothing “hard”, really, you just have to do the work again (and prepare to support two versions for a long time, instead of virtually only one).
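
For context, Python 3's standard library ships the `ast` module for parsing source code. The sketch below shows the kind of parsing step being discussed; it is not how shedskin itself is implemented, and the `source` string is just an example.

```python
import ast

source = "def dprod(l0, l1):\n    return sum(x * y for x, y in zip(l0, l1))\n"

# Parse the source into a syntax tree and collect the function names in it.
tree = ast.parse(source)
function_names = [node.name for node in ast.walk(tree)
                  if isinstance(node, ast.FunctionDef)]
```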

I use Python for my research work...and this has never been a problem for me, I suppose because I use Python 2.7, and I guess I am "part of the problem"?

Same - grad student. I use python / numpy / scipy / matplotlib / sklearn daily for one-offs, proofs-of-concept, etc. There hasn't been any real incentive for me to switch.

I am similar. I did decide to switch, though, as for just about all my numeric scripts all I needed to do was change the print statement syntax.

Yes you are.

You have the problem, not bmer.

The problem is not upgrading after years of 3 being out.

What if you don't care about the version of python, but you have a bunch of code that runs on 2.7? You don't get anything from moving to 3, you have to learn the new syntax features (friction), you need to upgrade all your code (friction) and some of the libraries you used before don't work and need fixing (friction). If you don't use asyncio, and you already had venv working, what do you gain?

For someone to whom python is just a fancy pipe-wrench, why keep up with the new shiny?

You are right. Just like we should all just be using Word 97 on Windows 95 because, you know, word processing is just word processing in the end, isn't it?

I take it you don't remember the Office 2000 -> 2003 upgrade debacle and people's unwillingness to just get on board. Most people I know simply skipped 2003 and only reluctantly hopped to 2007 after much gnashing of teeth. I still know a handful of people using Word 2000.

Use whatever you know works for your usecase. Look at George R.R. Martin for one example.

As a happy Python user for 10 years, I didn't even wonder whether it supports python3. Maybe the problem is somewhere else, and not with the library I'm looking forward to trying?

Are you insinuating that the problem is with Python 3? What is actually so much more amazing about Python 2 that you would rather freeze the language than upgrade?

It ain't broken. Why fix it?

Python 3 has many new features and fixes. Maybe 5 years ago this was true, but not today.

What is there in 3 that would benefit scientific computing which 2.7 doesn't have? TBH I don't see the value proposition.

For web stuff you have asyncio, and for hackers there are the venv improvements, but what is there for scientific users who have work to get done?

And how many of them can't be backported to 2?

You don't need to freeze the language, but if the alternative is breaking backwards compatibility after all that time...

It's been seven years. Get over it already. That attitude keeps Python a fractured community.

This argument is just as valid as "You've been trying for 7 years. Give up already."

You could've said the same thing about black-and-white TV, AM radio, horse drawn carriages, et al.

And there's still a lot of use for AM radio, for one.

In niche uses, yes. But nobody uses it because "it ain't broke"; they have a specific purpose in mind and can actually answer when someone asks what that purpose is.

So far I have not heard anything like a real justification for why Py2 is still better.

> So far I have not heard anything like a real justification for why Py2 is still better

I think it needs to be the other way around. If you want me to move from something that works to something that you want me to move to, the onus is on you to show what the new thing does better (or would likely do better in the future). For scientific loads the Python3 story is mostly "not worth the trouble".

This comment branch starts with "tools that only support Py2". This ecosystem is what makes Py2 better for many.

So we have CPython which has an ffi and a native implementation of the source interpreter.

We have PyPy, which is a JIT for Python that emits assembly at runtime, though it is not as successful as CPython in terms of support.

The complexity class of this problem, without type annotations, namely completely transforming an arbitrary dynamic Python program to a static, partially inferred typed language (C++) is O(N): https://en.m.wikipedia.org/wiki/Hindley–Milner_type_system

The transformations are also not guaranteed to improve runtime performance; they might even do the opposite, affect numerical precision, or yield a slower program because of generalizations that the algorithm makes.

With type annotations... why, just why? Use C++ or C if you want types and the performance gains thereof. The potential for bugs and failures is lower than when relying on an algorithm, if you realise the fear and laziness of compiled languages is unfounded.

Compare to https://github.com/shedskin/shedskin which does whole program type flow analysis. Shedskin offers excellent performance on numerical workloads.

Well, too bad they did not implement this with python 3, and actually gave some meaning to the type annotations… I mean

    def dprod(l0: list[int], l1: list[int]):
        return sum(x * y for x, y in zip(l0, l1))

Wouldn't it be much nicer?

To be brutally honest, they probably couldn't give a flying fuck, I couldn't when I tried. (sorry)

To the number crunchers, Python 3.* offers no compelling reason to move (except for threats, "you are on your own now, 2.7.* will not get updates"). I see two compelling advantages that 3.* offers: (i) better abstractions for asynchrony, (ii) UTF strings (that exact a cost even if you don't need them). If you don't need these two, then I see no technical reason to move (political/social reasons yes, technical not really).

For what it's worth, Python 3 does offer at least 1 compelling reason to move. Multiprocessing lib in python 3 is significantly faster than python 2. My use case is spawning 30-35 worker processes managed by a pool, and recycling them after 5 workloads pass through. The switch from 2.x to 3.x without any code changes, increased my throughput ~150%. Your mileage may vary.
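
A pool that recycles its workers can be set up with the `maxtasksperchild` argument of `multiprocessing.Pool`. The sketch below is only a guess at the shape of such a setup; the commenter's actual code is not shown in the thread, and `work`, `run_pool`, and the default numbers are hypothetical stand-ins.

```python
import multiprocessing

def work(x):
    # Stand-in for a real workload (hypothetical).
    return x * x

def run_pool(items, workers=4, recycle_after=5):
    # maxtasksperchild replaces each worker process with a fresh one
    # after it has completed that many tasks.
    with multiprocessing.Pool(processes=workers,
                              maxtasksperchild=recycle_after) as pool:
        return pool.map(work, items)
```

Call `run_pool(range(20))` from under an `if __name__ == "__main__":` guard so that it also works on platforms that start workers by spawning a new interpreter.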

edit: typo

Oh! thanks for this, I did not know.

What does this do that I can't already do with some combination of Numpy, Cython, and Numba?

I use numba quite a bit, and with every new version, the developers have notes like:

PR #1970: Optimize the power operator with a static exponent.
PR #1961: Support printing constant strings.
PR #1927: Support np.linalg.lstsq.

My point being that the developers have to go about re-implementing (in LLVM IR?) features that are very mature in C libraries (e.g. numpy) and similarly C++ libraries.

Could it be that one gain of going from Python to C++ will be that one could take advantage of all the mature libraries that C++ has access to, without having to do this re-inventing of the wheel?

I am not sure, and regardless, I'll be using numba for a long time coming, because...I like the devs, they work hard on it, and numba works like a charm for what I need to do, so I'm not looking for an overkill.

That makes sense.

Numba aside, though, it's strange to me that I see no mention of Cython on their project page. They ought to at least acknowledge that other fairly mature products exist in this space.

Also, maybe it's more of a curiosity for you than a practical tool, but Stan (which is becoming the de-facto standard DSL for Bayesian statistics) compiles to C++, where it takes advantage of libraries like Eigen and Boost.

That's a really good point---Cython doesn't have the same issues, it's basically Pythran except it compiles to C. Yeah, no clue.

We do something similar with Bohrium[1]. Bohrium intercepts your NumPy calls, compiles a kernel for either CPU, multicore CPU, or GPU, and runs that instead. It will also fuse together all NumPy calls until one has a side effect that the run-time needs.

We are currently building smart filters into it as well, so that we can do byte-code optimizations.

[1] https://github.com/bh107/bohrium

As I once discussed with a friend of mine, once you remove semicolons, and give spacing a syntactic and semantic value, you almost have FORTRAN... this is just the icing on the cake :¬)

(updated, see comment below) There used to be another python to c++ compiler called "shed skin". It did quite a literal translation, e.g. python classes would turn into c++ classes, and it was quite readable.....

but utterly useless: a python integer would be translated into a c++ int. Where python automatically starts using a wider type this one would just wrap.
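
The difference can be simulated in pure Python. The helper `wrap_int32` below is hypothetical and assumes 32-bit two's-complement wraparound; in C++ signed overflow is formally undefined behaviour, so wrapping is only the commonly observed result, not a guarantee.

```python
def wrap_int32(value):
    # Reduce a Python int to what a wrapping 32-bit signed int would hold.
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value

big = 2 ** 31 - 1                      # largest 32-bit signed value
python_result = big + 1                # Python widens the integer automatically
wrapped_result = wrap_int32(big + 1)   # simulated fixed-width wraparound
```

`python_result` stays positive, while `wrapped_result` flips to the most negative 32-bit value.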

Are you sure you're not thinking of some other project? As far as I recall unladen swallow always targeted LLVM and was a JIT compiler rather than an AoT compiler.

you are right! it was "shed skin".

Wondering how it compares to Cython?

What's the difference between this and nuitka?

The big one seems to be that pythran is targeting only a subset of python with the goal of getting faster performance, while Nuitka aims at being able to compile all valid python programs (currently only up to python 3.4), but in doing so it has fewer opportunities for optimization and thus gives you slower performance than pythran. Pythran also introduces its own type annotation system you have to use if you want to get good performance.

So if you write a python program that is compatible with pythran, then pythran will almost certainly be much faster than nuitka. If however you take a random python program it will almost certainly compile with Nuitka and almost certainly not compile with pythran.
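
For reference, Pythran's annotations are written as special comments, so an annotated module remains valid plain Python. The sketch below follows the export syntax as I understand it from Pythran's documentation and reuses the `dprod` function quoted elsewhere in this thread; treat the details as an assumption and check the docs for your version.

```python
#pythran export dprod(int list, int list)
def dprod(l0, l1):
    # Dot product of two lists of ints; the export comment above tells
    # Pythran which argument types to compile for.
    return sum(x * y for x, y in zip(l0, l1))
```

Because the annotation is only a comment, the same file runs unchanged under the regular interpreter, e.g. `dprod([1, 2, 3], [4, 5, 6])`.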

No idea why they could not contribute some patches to nuitka to make it do exactly that. Because further down the road it is the goal of nuitka - to natively compile as much as possible.

Reminds me of JavaCPP: https://github.com/bytedeco/javacpp
