Obviously, it's often not better, as the author mentions.
In my experience with Python, a lot of functions map directly to C code, and those are nearly as fast as well-written C code.
Others call Python code in between, and those are obviously much slower.
That's the main thing to know when you want speed. So if you start using fancy Python classes and other crap on top, it's going to get slower and slower, and if it's code that's called a lot, it's going to hurt hard.
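The gap between the two paths is easy to measure yourself. A minimal sketch: summing a list with the builtin `sum()` (whose loop runs in C) versus an explicit Python-level loop (which pays interpreter overhead on every element):

```python
import timeit

# One million floats to sum.
data = [float(i) for i in range(1_000_000)]

def py_sum(xs):
    total = 0.0
    for x in xs:  # every iteration goes through the bytecode interpreter
        total += x
    return total

# Builtin sum() does the same loop, but in C.
t_c = timeit.timeit(lambda: sum(data), number=10)
t_py = timeit.timeit(lambda: py_sum(data), number=10)
print(f"builtin sum: {t_c:.3f}s  python loop: {t_py:.3f}s")
```

On a typical CPython the builtin wins by a large factor, for exactly the reason above: same algorithm, but one loop is dispatched in C and the other through the interpreter.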
When some code needs to be fast and there are no functions that call C code directly, ctypes works just fine.
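For the curious, here is what the ctypes route looks like in its simplest form — calling `strlen` straight out of the C standard library. This is a sketch assuming a standard Linux/macOS system where `ctypes.util.find_library` can locate libc:

```python
import ctypes
import ctypes.util

# Load the C standard library by name (platform-dependent lookup).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts arguments correctly:
#   size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"hello"))  # → 5, computed in libc, no Python loop
```

Setting `argtypes`/`restype` is the part people forget; without it ctypes guesses, and guesses wrong for anything that isn't an int.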
Now, there are some libraries which are well made performance-wise, but that's not the norm. It's actually pretty damn rare in my experience.
The main other contenders here are R and Mathematica, both of which will fail you when you need to do something that isn't strictly statistical/mathematical. Python gives you predictable, decent performance, and the NumPy ecosystem is awesome for numerical libraries. I've never come across a machine learning library nearly as well designed as scikit-learn, and pandas dataframes are a lot snappier than R's equivalents. My only gripe is the paucity of good plotting libraries (matplotlib is impoverished and ugly compared with R's sexy plotting routines).
Now, I haven't said a word about the faster statically compiled languages: C, C++, Java, C#, F#, OCaml, Haskell, etc...
The trouble with static languages is that they either lack essential libraries or don't allow for rapid prototyping (or in some cases, both).
Now, if you're implementing the heart of a numerically intensive algorithm and your code can't be decomposed into a few already-implemented primitives, it makes sense to write it in C. The first thing to do, though, is to wrap that native code with a Python interface and test it from Python.
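The "wrap it, then test it from Python" step can be sketched with any native function — here I use `sqrt` from libm as a stand-in for your own C kernel, and check it against a slow-but-trusted Python reference over random inputs (assuming a Linux/macOS system where `find_library` can locate libm):

```python
import ctypes
import ctypes.util
import math
import random

# Wrap double sqrt(double) from the C math library.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

# Test the wrapper from Python against a reference implementation.
random.seed(0)
for _ in range(1000):
    x = random.uniform(0.0, 1e6)
    assert abs(libm.sqrt(x) - math.sqrt(x)) <= 1e-9 * (1.0 + math.sqrt(x))
print("native wrapper agrees with the Python reference")
```

The same pattern — wrap your C kernel, then hammer it with randomized comparisons against a naive Python version — catches most marshalling and off-by-one bugs before they hide inside a big pipeline.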
(I can't elaborate much because I'm busy writing them presently, but stay tuned and I think y'all will like what you'll see when the public release lands)
(one sexy hint though: the value add of these works in progress is enough that I'll be able to hire folks full time to work on it with me starting mid-September or October.)
Cython is very nice for numerical computation because it offers a nice syntax for dealing with array operations very efficiently (e.g. see typed memory views: http://docs.cython.org/src/userguide/memoryviews.html) and has good integration with numpy data structures and the C API.
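To give a feel for that syntax, here is a minimal (untested here — it needs cythonize-ing) sketch of a typed-memoryview loop, computing row sums of a 2-D float64 array; the function name and shape are just illustrative:

```cython
# cython: boundscheck=False, wraparound=False
import numpy as np

def row_sums(double[:, :] a):
    # `double[:, :]` is a typed memoryview: indexing below compiles
    # to plain C array arithmetic, with no Python objects in the loop.
    cdef Py_ssize_t i, j
    cdef double[:] out = np.zeros(a.shape[0])
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            out[i] += a[i, j]
    return np.asarray(out)
```

The nice part is that it accepts anything exporting the buffer protocol (numpy arrays included) while the hot loop never touches the interpreter.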
I agree. Where Python really shines, I think, is where you have these sorts of bottlenecks + something extra that MATLAB performs terribly at, either as a consequence of language/VM design or lack of libraries.
What is this bad-mouthing of scipy.weave? I am a big fan of this approach: NumPy for everything plus a simple weave.inline for the innermost loop.
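For readers who never saw it: this is roughly what the weave pattern looked like. A Python 2-era sketch from memory — scipy.weave was later removed from SciPy entirely, so this won't run on a modern install, and the exact converter conventions (`Na` for the dimensions of `a`) may differ by version:

```python
import numpy as np
from scipy import weave  # gone from modern SciPy; historical sketch only

a = np.arange(1e6)
code = """
double total = 0.0;
for (int i = 0; i < Na[0]; i++)  /* Na[0]: length of `a` */
    total += a[i];
return_val = total;
"""
total = weave.inline(code, ['a'])  # compiles and caches the C snippet
```

You kept all the outer logic in NumPy and dropped to C only for the one loop that mattered — which is exactly the appeal being described here.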
Maybe it is not maintained simply because it works?
The weave source hasn't seen development in ages, whereas Blitz++, the C++ array library it is based upon, has moved on quite a bit. Blitz++ has added SIMD support, or rather restructured its code so that compilers find it easy to vectorize. The new version of Blitz++ holds its own against Intel Fortran in terms of vectorization. These are some of the advantages you could have enjoyed had weave been kept up to date. I don't blame the numpy community for this, though: while Blitz++ sees continuous development, there has not been a formal release in many years, so it does become difficult to incorporate such a library. But I don't think that is the main reason weave has languished.
I am sure Cython is great, but what I like about weave is the syntactic sugar it brings: I do not have to write raw loops or do pointer arithmetic. If you want this kind of syntactic sugar in Cython now, you call back into the numpy API. If the default API does not give you the speed you want, you have to expose the raw pointers of the arrays and do the messy pointer arithmetic and operations yourself. Nothing wrong with that, just that it can be error prone.
Cython, however, has other good things going for it; for instance, it coordinates easily with OpenMP, so it is easy to parallelize array updates without incurring multiple-process overhead.
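Concretely, that's Cython's `prange`, which maps a loop onto OpenMP threads with the GIL released. A minimal sketch (again untested here — it needs compiling with `-fopenmp`), with the function name purely illustrative:

```cython
# cython: boundscheck=False, wraparound=False
from cython.parallel import prange

def scale_inplace(double[:] a, double k):
    cdef Py_ssize_t i
    # prange releases the GIL and splits iterations across OpenMP threads;
    # the body must be pure C-level code (no Python objects).
    for i in prange(a.shape[0], nogil=True):
        a[i] = a[i] * k
```

Because the workers are threads sharing one address space, there's no pickling or process spawn cost — the "multiple-process overhead" this comment is contrasting against.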