The fact of the matter is that a vast array of systems-level and high-performance software is available only through a C API. New or old, that's still true today.
I've used ctypes and I've used cffi; I'm glad they exist, but authors of popular packages often don't have the luxury of supporting both of them in addition to the CPython API.
If, by "legacy", they're attempting to paint the CPython API as the legacy interface, I find that a misleading argument at best. PyPy is not yet the official future of Python, and attempting to paint themselves as the successor, without any clear indicator of support for that future from the Python leadership, seems odd.
I'm all for them getting the financial support they need to do the work, but attempting to justify that work by claiming only "legacy" applications need it seems uninformed at best.
We're past that point. PyPy is the future. CPython is the past. The PyPy team succeeded in making an optimizing compiler for a language that fought them every step of the way with gratuitous hidden dynamism. That's a considerable achievement. It extends the life of Python by making it competitive on speed.
Go exists mostly because Python was too slow. Google used to use Python quite a bit internally, but their effort to speed it up, Unladen Swallow, was a disaster. That provided some of the motivation for Go.
CPython is the past and the present; the future, you can't predict. However, if you try to extrapolate, you should consider as much historical context as possible. For instance, why is Python what it is? Is it speed? I think it is the accessibility of the language's syntax, design, and features. CPython is the base of this language's evolution, and PyPy improves only speed. So I would extrapolate that CPython will always be more popular.
Go exists because someone at Google wanted to build a better, statically typed C and C++. It doesn't have much to do with Python. Google always preferred C++ and Java to Python because of static typing, not just because of speed.
Overall, I think it's a mistake to fixate so much on speed of execution when, oftentimes, speed of development is considered more important. This niche is never going away, despite how hard some people hammer square pegs into round holes.
That's the point of PyPy. You get fast development speed and fast CPU performance. The best of both worlds. That's why we use it and not CPython. It's already bigger than you think.
What do you use it for, where you need the speed and don't have a C-based module to rely on? I don't mind having a bit more speed for 'free', but every time I've tried, PyPy hasn't been hassle-free due to some module-compatibility issue, and yet pure Python code has never been a bottleneck for me. I have been using Python for 10+ years.
Anyway, the above question is for curiosity's sake; it doesn't change the point that CPython is where new language features are added. If Guido adopted PyPy tomorrow, I would be happy, but otherwise it will always be trying to catch up, so I don't see how it can be the future.
So far. Everything changes the moment the PyPy team announces they're an official continuation of Python 2. They could easily integrate more third-party features backported from 3 to 2 into PyPy4. Guido doesn't really matter there.
That said, it doesn't take that to get me to use PyPy. The speed improvement is a dream come true. I much prefer writing pure Python to writing a C extension, so PyPy is the present and future for most of us.
I'm still quite curious what your use case for PyPy is, where it's such a godsend. I don't enjoy C, but it's been so rare that I've had to Cython or C anything; things are quite well optimized in the ecosystem.
PyPy didn't create that situation; CPython did. They're just filling a market need. It may be opportunistic, but it's hardly hostile. If anything was hostile it was Python 3, but I'd say neither is a "hostile" move.
Everyone should always act in their own best interests. Especially users like myself. The CPython team did what they felt was in theirs. I do what I feel is mine. PyPy should do the same, not be fearful of some toxic, "hostile" accusation.
The godsend is that I don't have to worry about CPU performance being a limiting factor at all. That's a big deal and reduces my hardware needs to do the same amount of work. My VPS is hardly overpowered, and I like it that way because it's cheap for my projects. :) CPython only exacerbates hardware issues.
It's very nice to have a simple interpreter to fall back on, but at this point in time I think most dynamic languages need to be on a JIT like PyPy. It's just too good and I can't wait for the STM branch to be merged into PyPy4. No more GIL for those of us using it.
They might be the future of Python 2.7 though, a very fast Python 2.7, but still with all the annoyances and quirks that Python 2.7 brings. And without a lot of the nice features that are in Python 3.
But, I still think PyPy has a long way to go.
Fact is, PyPy may be "fast", but in science CPython is in effect much faster, because you can use the massive, optimized scientific stack: NumPy, pandas, HDF5, and so on.
Of course, there are a vast number of applications that have C APIs of their own that one might want to access. You can already interact with those through cffi in a way that's simple and performant.
In fact, lots of old graphics and sound APIs are the same... basically anything that is not hooked up to the web.
Distributing cffi libraries that depend on each other can still be somewhat of a snarl, but last I looked, a number of remedies for this were being discussed.
Chapeau to the team!
Fingers crossed, but so far my home business is running off pure PyPy4. No C-extensions, no interpreters. :)
 PyPy is on average 7x faster than CPython: http://speed.pypy.org/
But those in the Python community who are serious about scientific computing (or image processing, like my startup) are already using Numpy & Scipy, which provide hand-coded C implementations of most matrix-related operations. Everyone knows that Python for-loops are "slow", and storing a large 2-D matrix as a list-of-lists-of-Python-int-objects would require a huge amount of memory & indirection. So, Numpy offers an N-dimensional array type, implemented in C: C arrays of densely-packed C primitive types, with for-loops in C to iterate over the matrix elements. Then Scipy builds a lot of Matlab-like functionality as modules on top of this fundamental Numpy array type.
 Python for-loops are slow: https://jakevdp.github.io/blog/2014/05/09/why-python-is-slow...
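To make the memory point concrete, here is a small stdlib-only sketch; the `array` module stands in for NumPy's contiguous C buffer, since NumPy itself isn't assumed here:

```python
import sys
from array import array

# A 1000-element row as a list of Python int objects: each element is a
# full heap object reached through a pointer, with its own object header.
row_as_list = list(range(1000))
list_bytes = sys.getsizeof(row_as_list) + sum(sys.getsizeof(x) for x in row_as_list)

# The same row densely packed as C doubles in one contiguous buffer
# (this is the storage strategy a NumPy array uses).
row_as_array = array("d", range(1000))
packed_bytes = sys.getsizeof(row_as_array)

# The packed buffer is typically several times smaller.
print(list_bytes, packed_bytes)
```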
So the most expensive operations in a Python number-crunching program are likely already implemented using Numpy & Scipy operations, which run in compiled C (and additionally, often make use of Blas/Atlas/LAPACK/etc, for even greater speedups in sustained number-crunching).
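The same principle can be seen in miniature with nothing but the standard library: the built-in `sum` runs its loop in C, just as NumPy does for array operations, and typically beats an equivalent interpreted Python loop. A rough sketch (exact timings will vary by machine):

```python
import timeit

data = list(range(100_000))

def python_loop_sum(xs):
    # the interpreted loop that NumPy-style code pushes down into C
    total = 0
    for x in xs:
        total += x
    return total

# Time both; on CPython the built-in is usually several times faster.
t_loop = timeit.timeit(lambda: python_loop_sum(data), number=20)
t_builtin = timeit.timeit(lambda: sum(data), number=20)
print(f"interpreted loop: {t_loop:.4f}s, C loop via sum(): {t_builtin:.4f}s")
```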
But unfortunately, Numpy & PyPy do not naturally work together. Being written in C, Numpy makes substantial use of the CPython C-API -- and in fact, Numpy provides its own C-API! The official Numpy package doesn't work with PyPy; the PyPy project very thoughtfully provides its own PyPy-compatible Numpy package.
 Numpy C-API: http://docs.scipy.org/doc/numpy-1.10.0/reference/c-api.html
 PyPy-compatible Numpy package: http://pypy.org/download.html#installing-numpy
Furthermore, Numpy is fantastic, but it can't offer all possible permutations of matrix operations. In particular, there are certain image-processing operations that are awkward (and thus, computationally-inefficient) to express using Numpy operations. So you might ultimately need to go to the Numpy C-API anyway.
This is why we created (and, just a few days ago, open-sourced) Pymod: https://github.com/jboy/nim-pymod
Pymod is a Nim+Python project that auto-generates all the Python C-API boilerplate & auto-compiles a Python C extension module that wraps the functions in a Nim module. Pymod enables us to write our Numpy array-processing code in Nim, then compile it (for C++-like speeds) as a well-behaved Python module. Nim made this very easy, because it compiles to C.
After considering our Python-integration options (CPython C-API, `ctypes` and `cffi`), we decided to go with the CPython C-API & Numpy C-API. We explained this decision in greater detail in the "Implementation details" section of the Pymod README; the executive summary is that `ctypes` seems better suited to wrapping C types in Python, rather than exposing existing Python types in C, while the CPython C-API code could be generated & compiled with the C code that Nim was going to produce anyway.
 Pymod implementation details: https://github.com/jboy/nim-pymod#implementation-details
That said, we would be delighted for Pymod-produced Python modules to be able to run under PyPy. We've been strongly considering implementing a `cffi` back-end for Pymod, but this won't necessarily solve the Numpy issue. It would be even better if PyPy could support all the CPython C-API extension modules in the world in one fell swoop.
Cython is a separate programming language that allows you to write C code with a special Python-like syntax, or write Python with Python syntax (including NumPy), and has a few extra type annotation bits here and there (like array syntax or pointer syntax). At the end, it creates the equivalent C code (with all needed CPython API boilerplate) and compiles it into an importable shared object file.
Just from the description above, it sounds superficially the same as Pymod; is it fair to say Pymod is to Nim what Cython is to C/C++? I'd be very interested to know more about how the two tools compare and contrast.
"""Actually, Pymod was designed to be almost an anti-Cython. :)
My issue with Cython is that it's a limited sub-language within Python, where you add Cython elements incrementally & iteratively (diverging from Python in the process) until the code runs "fast enough". I'd rather work directly in a full-featured, internally-self-consistent language from the start. Nim has a clean Pythonic syntax, with all the best parts of C++ (including its runtime speed).
Hence, Pymod takes the form of an `exportpy` annotation (a user-defined Nim pragma) onto existing Nim functions, which are then auto-compiled into a Python extension module.
So there's no gradual divergence of my Python code (as it becomes more "Cythonized"); rather, the high-performance code is written directly in pure Nim. :)
There are a few more details in that thread comparing the wrapping of existing C libraries in Cython vs Pymod. (It doesn't seem right to copy-paste an entire thread...)
What this lets me do is to be extremely fine-grained about my optimizations, or conversely, to also see when some optimizations are not worth it because they don't help much but they do hurt the readability, or cause too much Python-to-Cython divergence as you put it.
In a lot of cases, I prefer that this is left up to me to do, rather than if Cython had already made hand-mapped choices about which Python things compile to which C or machine code things. If taken to the limit, a Cython that did that would just become what PyPy is, except it would be ahead of time compiled instead of JIT compiled.
But I do see the benefit of both approaches. Sometimes you don't want the burden of choosing your annotations to induce the desired compilation effect, and you don't want to allow for similar but not identical Cython source files to result in dramatically different C code, as can often happen currently.
(1) Write Python.
(2) Compile with Cython.
(3) Run compiled Cython, profile & review.
(4) Consider what Cython annotations to make to code; make code changes.
(5) Goto 2.
The emphasis is always on iteration & incremental additions.
Of course I practice iterative & incremental development, and of course I'll prototype a quick proof-of-concept implementation first (often in Python+Numpy) before profiling & algorithmic optimization. But the Cython workflow seems to me to add more iteration (of incremental Cython syntax additions) than is really necessary. When I'm working to implement some algorithm, I don't really want to iteratively learn how Cython or CPython implement various Python functions; I'd rather write my "proper version" code just once, properly the first time.
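For what it's worth, step (3) of that workflow ("profile & review") can be sketched with the stdlib profiler; `hotspot` here is a hypothetical stand-in for the numeric kernel you would consider annotating:

```python
import cProfile
import io
import pstats

def hotspot(n):
    # stand-in for the kernel that profiling would flag for Cython annotation
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
hotspot(50_000)
profiler.disable()

# Review the top entries by cumulative time, as in the "profile & review" step.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```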
So why not take it to the logical extreme and write it all in Cython-lang from the start? If we're writing for-keeps code in a language with Pythonic syntax & static types, I find Nim-lang a more expressive, more full-featured language for general-purpose use (with features such as generics & type-safe Lisp-like macros in particular; I note that Cython does support pointers & operator overloading), without being very different at all in simple uses.
For example, there is an example `primes` function in the Cython tutorial: http://docs.cython.org/src/tutorial/cython_tutorial.html#pri...
Here is an equivalent implementation in Nim. As you can see, it's really almost identical in syntax:
proc primes(kmax: int): seq[int] =
  var kmax = kmax
  var n, k, i: int
  var p: array[1000, int]
  result = @[]
  if kmax > 1000:
    kmax = 1000
  k = 0
  n = 2
  while k < kmax:
    i = 0
    while i < k and n mod p[i] != 0:
      i = i + 1
    if i == k:
      p[k] = n
      k = k + 1
      result.add(n)
    n = n + 1
When doing scientific research, I'd prefer to start with Python and then iteratively diverge. The aim is to get my task done as quickly as possible (so I can go write those 15 journal articles due last Monday) and go just as far as is needed. It's the scientist/engineer vs programmer mindset difference.
However, if the code was done and I knew I'd need to use it again, I like the idea of rewriting it from scratch in Nim - not that I or any scientific colleagues know Nim (only one has heard of it).
Anyway, I look forward to keeping an eye on your work and maybe using it in the future :)
Granted, generics and macros are a nice advantage of Nim. What I think is the essential difference here is that Nim is higher level than Cython. So it seems to boil down to a matter of preference. However, when Python is not fast enough, Cython lets you go all the way to generating C code without garbage collection, inline assembly etc. With Nim there's yet another layer between Python and C.
I agree with you that Cython is a mature, very respectable project, and it evidently has more widespread adoption at this time.
I'm betting on Nim because I think it has a larger positive gradient, even though its y-coordinate is lower right now.
Now, a core principle in the design of Cython is to be as source-compatible with Python as possible, and Python is a language without static types. This is not a very good principle for me; I would rather work in a language that is not literally identical to Python but has its own design. Adding static types often requires doing things slightly differently. I find Nim a very principled and well-designed language, and Pymod seems a great tool to make interoperability with Python easy.
No, I think you are arguing that you want it to be designed differently. It's like the contrast between having a high-level and a low-level language (Python, C/Cython) versus having a unified middle-level language such as Julia (see http://graydon2.dreamwidth.org/189377.html ). Which is better is a matter of preference, and which approach will be more successful in the end is an empirical matter.
> Just grafting types onto Python is not enough
Clearly it is already successful though...
CPython for-loops are slow. PyPy for-loops are not slow. A loop that just counts is about 25x faster in PyPy than in CPython, for large numbers of iterations.
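A minimal sketch of such a counting loop, to be run under both interpreters for comparison (the exact speedup will vary by machine and version):

```python
# Save as count_loop.py and run under each interpreter, e.g.:
#   python3 count_loop.py
#   pypy3 count_loop.py
import time

def count(n):
    # a pure-Python loop that just counts; PyPy's JIT compiles this down
    # to a tight native loop, while CPython interprets every iteration
    i = 0
    while i < n:
        i += 1
    return i

start = time.perf_counter()
count(10_000_000)
elapsed = time.perf_counter() - start
print(f"counted to 10,000,000 in {elapsed:.3f}s")
```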
I fear you might have missed the point of that paragraph. The point was that multiple reasons contributed to the need for Numpy, so now it exists and is widely used by those who are serious about their scientific computing. The vast majority of those reasons are still valid, even though PyPy speeds up some general-purpose programming tasks in Python.
"A loop that just counts" is not anywhere near a substitute for Numpy. Nor does a 25x speed increase in a particular operation (from CPython to PyPy) hold a candle to the number-crunching speedups provided by algorithm-optimized, code-optimized (often to the point of targeting specific vector instruction sets like SSEx) special-purpose numeric libraries.
Is there any write-up or tutorial that expands on this aspect?
That second link describes the `nimcache` directory into which the generated C code is written. If you poke around in that directory (starting with the generated C version of your Nim code, which will have the same filename but a `.c` suffix instead of a `.nim` suffix) then you can see that the generated C code is really quite readable.
Theano's output, on the other hand...
I'm looking for a high-level language that generates really succinct, human-readable code. Essentially, what a human would write to create the exact same program in C. And what I did in Nim in 3 lines really takes fewer than 10 lines in C.
Even APIs such as JNI that do their best to abstract over such details tend to block a JIT's view of what is going on and end up being a real pain.
I think it's an interesting question as to whether PyPy "is the future" or rather something more like Numba, or even a whole ecosystem of specialized variants of Numba that may be tweaked or tuned to take advantage of different hardware optimizations, work better for different domain problems (e.g. image processing vs. network programming).
In a lot of cases where PyPy improves over some standard CPython code, it's exactly the sort of code you really wouldn't care about optimizing in the context of a larger application. In any sort of scientific application, the pure CPython parts of it are usually small and insignificant compared to what's written in NumPy/SciPy/pandas and science libraries written on top of those, and Numba can be seen as the first "specializing compiler" for this sort of application -- with a lot of room for people to write other kinds of specializing compilers that compile different genres of Python code for domain-specific purposes, the way that Numba compiles generally scientific Python for numerical computing purposes.
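As a sketch of the Numba usage pattern being described (assuming `numba` is installed; the shim below falls back to plain Python otherwise, so the example still runs, just without JIT compilation):

```python
# `numba` is an assumption here, not required: if it's absent, the
# decorator becomes a no-op and the kernel runs as ordinary Python.
try:
    from numba import njit
except ImportError:
    def njit(func):
        return func

@njit
def half_sum(n):
    # a simple numeric kernel of the kind Numba specializes in compiling
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total

print(half_sum(4))  # 0.0 + 0.5 + 1.0 + 1.5 = 3.0
```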
Other than "to-LLVM" and "targets like GPUs", jitpy does exactly that for PyPy-in-CPython.
It's not quite as fast for numeric code (though it's close and conformant) but it's also not a language subset nor does it degrade to CPython performance on dynamic parts of the language. It should be a terrific option if you're able to section out discrete workloads that CPython's struggling with, even in non-numeric code.
Few people know of it though.
So even though JitPy won't "degrade to CPython performance", it's slightly moot since you can only handle a limited set of types in JitPy. I guess one area where JitPy should win in theory is if there is a dynamic Python object inside of the function, like a dynamic list. Then JitPy is OK since it's not part of the signature, and when it can't optimize due to indeterminate type inside the function, the fallback will be PyPy instead of CPython, and hence should be faster.
But Numba offers a lot on the array-computing side, which is what it is specialized for. For example, from the Numba docs <http://numba.pydata.org/numba-doc/0.21.0/developer/architect...>:
> Numba implements a user-extensible rewriting pass that reads and possibly rewrites Numba IR. This pass’s purpose is to perform any high-level optimizations that still require, or could at least benefit from, Numba IR type information.
> One example of a problem domain that isn’t as easily optimized once lowered is the domain of multidimensional array operations. When Numba lowers an array operation, Numba treats the operation like a full ufunc kernel. During lowering a single array operation, Numba generates an inline broadcasting loop that creates a new result array. Then Numba generates an application loop that applies the operator over the array inputs. Recognizing and rewriting these loops once they are lowered into LLVM is hard, if not impossible.
> An example pair of optimizations in the domain of array operators is loop fusion and shortcut deforestation. When the optimizer recognizes that the output of one array operator is being fed into another array operator, and only to that array operator, it can fuse the two loops into a single loop. The optimizer can further eliminate the temporary array allocated for the initial operation by directly feeding the result of the first operation into the second, skipping the store and load to the intermediate array. This elimination is known as shortcut deforestation. Numba currently uses the rewrite pass to implement these array optimizations. For more information, please consult the “Case study: Array Expressions” subsection, later in this document.
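The quoted loop-fusion/deforestation idea can be illustrated with plain Python lists: the unfused version materializes a temporary for `a*b`, while the fused version applies both operators in one pass with no intermediate array:

```python
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
d = [0.1, 0.2, 0.3]

# Unfused: two loops and a temporary array for the intermediate a*b,
# which is stored and then immediately re-loaded.
tmp = [x * y for x, y in zip(a, b)]
unfused = [t + z for t, z in zip(tmp, d)]

# Fused: one loop, both operators applied per element, no temporary --
# this is what the rewrite pass produces ("shortcut deforestation").
fused = [x * y + z for x, y, z in zip(a, b, d)]

assert fused == unfused
```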
One reason why I see the Numba route as more effective than the PyPy route is that these kinds of highly-specific optimizations seem likely to occur all over the place, and to be very related to the genre of computing you are doing. One person may not want Numba to perform these optimizations, because loop fusion isn't important to their codebase which doesn't do a lot of array computing. Instead, they might want some kind of domain-specific optimization that assists with a kernel bypass for some low-latency algorithm.
This could all be exposed in Numba, but as separate domain-specific sub-modules or via user-defined IR optimization passes or something. Or it could exist as totally separate JIT compiler projects, of which JitPy/PyPy's JIT compiler and Numba's JIT compiler would each just be genre-specific examples.
I tend to see this as the more likely future, and that apart from cases where someone is really just taking a bunch of pure Python code and running it with the PyPy interpreter instead of the CPython interpreter (which I forecast would be a rare case), the difference between PyPy and CPython won't matter nearly as much as the difference between domain-specific JIT compilers.
You do need to be able to build an abstraction boundary, such that complex classes never cross it, but nothing prevents you from instantiating a PyPy class and communicating with it through function calls.
You'll probably want some way to share references across them, but that should be as simple as having a global mapping and a `Ref` class that cleans it up on destruction. Should be a few dozen lines of code.
Most Python code I write would get zero speedup from that. I think numeric optimization is a very different target than compiling and optimizing general-purpose Python code. I don't see either strategy "winning" in the near future.