Python C API, PyPy and the road into the future (lostinjit.blogspot.com)
178 points by mattip on Nov 16, 2015 | 55 comments

I strongly disagree with the author's assertions that the need for better C API support is strictly a need for integrating with "legacy applications".

The fact of the matter is that a vast array of systems-level software and high-performance software is only available via a C API interface. New or old, that's still true today.

I've used ctypes and I've used cffi; I'm glad they exist, but authors of popular packages often don't have the luxury of supporting both of them and the CPython API.

If by legacy, they're attempting to paint the CPython API as the legacy one, I also find that a misleading argument at best. PyPy is not yet the official future of Python, and attempting to paint themselves as the successor without any clear indicator of support for that future from the Python leadership seems odd.

I'm all for them getting financial support they need to do the work, but attempting to justify that work by claiming only "legacy" applications need it seems uninformed at best.

"PyPy is not yet the official future of Python."

We're past that point. PyPy is the future. CPython is the past. The PyPy team succeeded in making an optimizing compiler for a language that fought them every step of the way with gratuitous hidden dynamism. That's a considerable achievement. It extends the life of Python by making it competitive on speed.

Go exists mostly because Python was too slow. Google used to use Python quite a bit internally, but their effort to speed it up, Unladen Swallow, was a disaster. That provided some of the motivation for Go.

> We're past that point. PyPy is the future. CPython is the past.

CPython is the past and is the present, and the future you can't predict. However, if you try to extrapolate, you should perhaps consider as much historical context as possible. For instance, why is Python what it is? Is it speed? I think it is the accessibility of language syntax, design and features. CPython is the base of this language evolution, and PyPy is improving just speed. So I would extrapolate CPython to always be more popular.

Go exists because someone at Google wanted to make better C and C++, a statically typed language. It doesn't have much to do with Python. Google always preferred C++ and Java to python because of static typing, not just because of speed.

Overall, I think it's a mistake to fixate so much on speed of execution, when oftentimes speed of development is considered more important. This niche is never going away, despite how hard some people hammer square pegs into round holes.

>Overall, I think it's a mistake to fixate so much on speed of execution, when oftentimes speed of development is considered more important. This niche is never going away, despite how hard some people hammer square pegs into round holes.

That's the point to PyPy. You get fast speed of development and fast CPU performance. Best of every world. That's why we use it and not CPython. It's already bigger than you think.

>That's the point to PyPy. You get fast speed of development and fast CPU performance. Best of every world. That's why we use it and not CPython. It's already bigger than you think.

What do you use it for, where you need the speed and don't have a C-based module to rely on? I don't mind having a bit more speed for 'free', but every time I've tried, PyPy hasn't been hassle-free due to some module incompatibility, and yet pure Python code has never been a bottleneck. I have been using Python for 10+ years.

Anyway, the above question is for curiosity's sake; it doesn't change the point that CPython is where new language features are added. If Guido adopts PyPy tomorrow, I would be happy, but otherwise, it will always be trying to catch up, so I don't see how it can be the future.

I think the best thing about PyPy is that it's a modern implementation freed from the stupid architectural choices that plague CPython, and that PyPy doesn't have to get along with Guido's quirks. By that I mean things like Python 3 which did some things right but most things wrong. It's seven years since Python 3 was released and there still is no pressing need to unilaterally switch. Thus, there's no need to catch up, PyPy can continue the 2.x line while importing useful features from 3.x and possibly adding some of their own (I hope they do).

>the above question is for curiosity's sake; it doesn't change the point that CPython is where new language features are added.

So far. Everything changes the moment the PyPy team announces they're an official continuation of Python2. They can easily integrate more 3rd party backported features from 3->2 into PyPy4. Guido doesn't really matter there.

That said, it doesn't take that to get me to use PyPy. The speed improvement is a dream come true. Writing pure Python is much more preferable to me than writing a C extension thus PyPy is the present and future for most of us.

Well, that would be kind of hostile of PyPy, I really don't see that happening.

I'm still quite curious what your use case for PyPy is where it is such a godsend. I don't enjoy C, but it's been so rare that I've had to write anything in Cython or C; things are quite well optimized in the ecosystem.

Why would it be hostile? There's tons of Python2 users out there like me that need support and a secure path forward with Python, and my code.

PyPy didn't create that situation, CPython did. They're just filling a market need. It may be opportunistic, but it's hardly hostile. If anything was hostile it was CPython3, but I'd say neither would be "hostile" moves.

Everyone should always act in their own best interests. Especially users like myself. The CPython team did what they felt was in theirs. I do what I feel is mine. PyPy should do the same, not be fearful of some toxic, "hostile" accusation.

The godsend is that I don't have to worry about CPU performance being a limiting factor at all. That's a big deal and reduces my hardware needs to do the same amount of work. My VPS is hardly overpowered, and I like it that way because it's cheap for my projects. :) CPython only exacerbates hardware issues.

It's very nice to have a simple interpreter to fall back on, but at this point in time I think most dynamic languages need to be on a JIT like PyPy. It's just too good and I can't wait for the STM branch to be merged into PyPy4. No more GIL for those of us using it.

I know more people that use Python 3 in production than ones that use PyPy in production. If PyPy really thinks of itself as being the future, I don't understand why they are not adding more Python 3 support. I understand that they don't get funding for that, which to me only means that the people that use PyPy have lots of legacy code themselves.

They might be the future of Python 2.7 though, a very fast Python 2.7, but still with all the annoyances and quirks that Python 2.7 brings. And without a lot of the nice features that are in Python 3.

I upvoted you on the sheer idea that performance is paramount to the future of Python. I think Python needs to be 5x faster and use 1/5th the memory for the same tasks, in order to remain relevant years from now.

But, I still think PyPy has a long way to go.

CPython is the past only if you conveniently ignore the entire Scientific Python community, which is a huge use case for Python and depends heavily on C-extensions. Frankly saying a transition to PyPy is "done" when NumPy support is not complete is laughable. Huge Python shops (we're talking NASA, CERN, and other institutions on that scale) depend on a Python stack that's heavily reliant on a huge range of things, including the C-API, that aren't even close to mature in PyPy.

NumPy proves that you can do everything that you'd need to do without the CPython C API, no? The fact that other things used in scientific Python only offer a C API interface and not cffi seems like a "legacy"-shaped problem.

Only if you label basically everything as "legacy", which is really stretching the definition. Just because someone invented a New Thing, doesn't mean all the Other Things are suddenly "legacy" overnight.

Fact is, PyPy may be "fast", but in science CPython is in effect much faster because you can use the massive and optimised scientific stack including python, pandas, hdf5, etc etc

The author only claims that better CPython C API support is needed for legacy applications.

Of course there are a vast number of applications that have a C API themselves that one might want to access. You can already interact with those through cffi in a way that's simple and performant.
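
As a rough illustration of calling into a plain C API without writing a CPython extension, here is a minimal sketch. It is not from the thread: it uses the standard-library `ctypes` rather than cffi so that it runs without extra installs, the choice of `strlen` from the platform C library is just an example, and the library lookup assumes a POSIX-like system.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX-like system. find_library may return None, in which
# case CDLL(None) falls back to the symbols already loaded in the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Return the length of a NUL-terminated byte string using C's strlen."""
    return libc.strlen(data)
```

Nothing here touches the CPython C API directly, which is why this style of binding can in principle be shared between interpreters.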

The bigger issue is the larger number of packages using the C API, not the applications themselves.

GUI apps are an area that comes to mind. Gtk3 uses the C API; you can try to use PGI, but it is not fully ready.

In fact, lots of old graphics and sound APIs are the same... basically anything that is not hooked up to the web.

If you can change the extension, then going for cffi seems like a no-brainer. It works on CPython, it's easier to use, safer, etc. Usually the problem is that the application/library is too big or too cumbersome. Then I think it's fair to call it legacy.

C APIs also offer significantly easier distribution, as you don't need to worry about things like name mangling or support for other languages. Much better than alternatives like COM.

cffi works fine on CPython and it's also quite fast. It ends up with pretty much the same result.

I have nothing but good things to say about the pypy/cffi team. I have been maintaining a project using it for about a year and a half, and we've had great experiences reporting bugs to and getting help from Armin (the maintainer of cffi, a pypy guy).

Distributing cffi libraries that depend on each other can still be somewhat of a snarl, but last I looked there are a lot of remedies for this being discussed.

Chapeau to the team!

I've been using PyPy as my main development platform for quite some time now. So for me, if I were to move to Python 3, it would have to be Python 3.2, since that's all PyPy3 supports. I've also been told that PyPy3 is not production-ready, while PyPy4 is, so it's a no-brainer for me to develop for PyPy4 and then fall back to CPython 2.7 if I run into problems.

Fingers crossed, but so far my home business is running off pure PyPy4. No C-extensions, no interpreters. :)

This post highlights an interesting dichotomy in the Python scientific computing community. Everyone knows that PyPy runs faster than CPython for many common tasks [1].

[1] PyPy is on average 7x faster than CPython: http://speed.pypy.org/

But those in the Python community who are serious about scientific computing (or image processing, like my startup) are already using Numpy & Scipy, which provide hand-coded C implementations of most matrix-related operations. Everyone knows that Python for-loops are "slow" [2], and storing a large 2-D matrix as a list-of-list-of-Python-int-object would require a huge amount of memory & indirection. So, Numpy offers an N-dimensional array type, implemented in C: C arrays of densely-packed C primitive types, with for-loops in C to iterate over the matrix elements. Then Scipy builds a lot of Matlab-like functionality as modules on top of this fundamental Numpy array type.
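
To put a rough number on the memory point, here is a small sketch that is not part of the original comment. It uses the standard-library `array` module as a stand-in for a Numpy array (an assumption made only so the example has no dependencies), and `sys.getsizeof` figures are CPython-specific. The per-element sum slightly overcounts because CPython shares small integer objects.

```python
import sys
from array import array

N = 10_000

# A Python list stores pointers to separately allocated int objects.
boxed = list(range(N))
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)

# array('q') stores signed 64-bit integers back to back, like a C array.
packed = array("q", range(N))
packed_bytes = sys.getsizeof(packed)

print(f"list of ints: ~{boxed_bytes} bytes")
print(f"array('q'):   ~{packed_bytes} bytes")
```

A list-of-lists for a 2-D matrix adds one more level of pointers on top of this, which is the indirection the comment refers to.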

[2] Python for-loops are slow: https://jakevdp.github.io/blog/2014/05/09/why-python-is-slow...

So the most expensive operations in a Python number-crunching program are likely already implemented using Numpy & Scipy operations, which run in compiled C (and additionally, often make use of Blas/Atlas/LAPACK/etc, for even greater speedups in sustained number-crunching).

But unfortunately, Numpy & PyPy do not naturally work together. Being written in C, Numpy makes substantial use of the CPython C-API -- and in fact, Numpy provides its own C-API [3]! The official Numpy package doesn't work with PyPy; the PyPy project very thoughtfully provides its own PyPy-compatible Numpy package [4].

[3] Numpy C-API: http://docs.scipy.org/doc/numpy-1.10.0/reference/c-api.html

[4] PyPy-compatible Numpy package: http://pypy.org/download.html#installing-numpy

Furthermore, Numpy is fantastic, but it can't offer all possible permutations of matrix operations. In particular, there are certain image-processing operations that are awkward (and thus, computationally-inefficient) to express using Numpy operations. So you might ultimately need to go to the Numpy C-API anyway.

This is why we created (and, just a few days ago, open-sourced) Pymod: https://github.com/jboy/nim-pymod

Pymod is a Nim+Python project that auto-generates all the Python C-API boilerplate & auto-compiles a Python C extension module that wraps the functions in a Nim module. Pymod enables us to write our Numpy array-processing code in Nim, then compile it (for C++-like speeds) as a well-behaved Python module. Nim made this very easy, because it compiles to C.

After considering our Python-integration options (CPython C-API, `ctypes` and `cffi`), we decided to go with the CPython C-API & Numpy C-API. We explained this decision in greater detail in the "Implementation details" section of the Pymod README [5]; the executive summary is that `ctypes` seems better suited to wrapping C types in Python, rather than exposing existing Python types in C, while the CPython C-API code could be generated & compiled with the C code that Nim was going to produce anyway.

[5] Pymod implementation details: https://github.com/jboy/nim-pymod#implementation-details

That said, we would be delighted for Pymod-produced Python modules to be able to run under PyPy. We've been strongly considering implementing a `cffi` back-end for Pymod, but this won't necessarily solve the Numpy issue. It would be even better if PyPy could support all the CPython C-API extension modules in the world in one fell swoop.

Can you comment on the difference between Pymod and Cython? Cython's pretty mature now and it's straight up amazing how easy it is to generate cross-platform extension modules. It even exposes a lot of C++ too.

Cython is a separate programming language that allows you to write C code with a special Python-like syntax, or write Python with Python syntax (including NumPy), and has a few extra type annotation bits here and there (like array syntax or pointer syntax). At the end, it creates the equivalent C code (with all needed CPython API boilerplate) and compiles it into an importable shared object file.

Just from the description above, it sounds superficially the same as Pymod; is it fair to say Pymod is to Nim what Cython is to C/C++? I'd be very interested to know more about how the two tools compare and contrast.

Sure, this is a question that people ask quite frequently. I most recently answered it here: https://news.ycombinator.com/item?id=10569768

"""Actually, Pymod was designed to be almost an anti-Cython. :)

My issue with Cython is that it's a limited sub-language within Python, where you add Cython elements incrementally & iteratively (diverging from Python in the process) until the code runs "fast enough". I'd rather work directly in a full-featured, internally-self-consistent language from the start. Nim has a clean Pythonic syntax, with all the best parts of C++ (including its runtime speed).

Hence, Pymod takes the form of an `exportpy` annotation (a user-defined Nim pragma) onto existing Nim functions, which are then auto-compiled into a Python extension module. So there's no gradual divergence of my Python code (as it becomes more "Cythonized"); rather, the high-performance code is written directly in pure Nim. :)"""


There are a few more details in that thread comparing the wrapping of existing C libraries in Cython vs Pymod. (It doesn't seem right to copy-paste an entire thread...)

I'm not sure I appreciate that criticism of Cython. A general strategy in Cython is to use the annotation feature (cython -a) and visually inspect the degree of CPython API necessary for the lines of Cython source code. I've generally found this process to be really enlightening. Of course you can use that information to select portions of the code that don't need to be involving the CPython API, add typed constructs to those parts of the Cython source, and iterate. But you can also learn a lot about how CPython works ... for example how using the 'and' keyword can invoke a long chain of Python special functions with lots of type checking overhead.
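
A small pure-Python sketch of what the 'and' example is getting at (my illustration, not the commenter's code; the `Traced` class and `calls` list are hypothetical names). It only makes the `__bool__` dispatch visible. For objects without `__bool__`, Python falls back to `__len__`, which is part of the checking an untyped operand forces.

```python
calls = []


class Traced:
    """Value whose truthiness check is recorded, to make the dispatch visible."""

    def __init__(self, name, truthy):
        self.name = name
        self.truthy = truthy

    def __bool__(self):
        calls.append(self.name)
        return self.truthy


a = Traced("a", True)
b = Traced("b", True)

result = a and b      # calls a.__bool__(); b is returned without a truth test
if a and b:           # here both operands end up being truth-tested
    pass
```

With C-typed operands there is nothing to dispatch on, which is the kind of difference the annotated view makes visible.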

What this lets me do is to be extremely fine-grained about my optimizations, or conversely, to also see when some optimizations are not worth it because they don't help much but they do hurt the readability, or cause too much Python-to-Cython divergence as you put it.

In a lot of cases, I prefer that this is left up to me to do, rather than if Cython had already made hand-mapped choices about which Python things compile to which C or machine code things. If taken to the limit, a Cython that did that would just become what PyPy is, except it would be ahead of time compiled instead of JIT compiled.

But I do see the benefit of both approaches. Sometimes you don't want the burden of choosing your annotations to induce the desired compilation effect, and you don't want to allow for similar but not identical Cython source files to result in dramatically different C code, as can often happen currently.

The preferred workflow that you've described seems to be a (more considered) form of the standard Cython workflow that I see described:

(1) Write Python. (2) Compile with Cython. (3) Run compiled Cython, profile & review. (4) Consider what Cython annotations to make to code; make code changes. (5) Goto 2.

The emphasis is always on iteration & incremental additions.

Of course I practice iterative & incremental development, and of course I'll prototype a quick proof-of-concept implementation first (often in Python+Numpy) before profiling & algorithmic optimization. But the Cython workflow seems to me to add more iteration (of incremental Cython syntax additions) than is really necessary. When I'm working to implement some algorithm, I don't really want to iteratively learn how Cython or CPython implement various Python functions; I'd rather write my "proper version" code just once, properly the first time.

So why not take it to the logical extreme and write it all in Cython-lang from the start? If we're writing for-keeps code in a language with Pythonic syntax & static types, I find Nim-lang a more expressive, more full-featured language (with features such as generics & type-safe Lisp-like macros, in particular; I note that Cython does support pointers & operator overloading) than Cython-lang in general-purpose uses, without being very different at all in simple uses.

For example, there is an example `primes` function in the Cython tutorial: http://docs.cython.org/src/tutorial/cython_tutorial.html#pri...

Here is an equivalent implementation in Nim. As you can see, it's really almost identical in syntax:

  proc primes(kmax: int): seq[int] =
    var kmax = kmax          # shadow the parameter so it can be clamped
    var n, k, i: int
    var p: array[1000, int]  # primes found so far
    result = @[]
    if kmax > 1000:
      kmax = 1000
    k = 0
    n = 2
    while k < kmax:
      i = 0
      while i < k and n mod p[i] != 0:
        i = i + 1
      if i == k:             # no earlier prime divides n
        p[k] = n
        k = k + 1
        result.add(n)
      n = n + 1
    return result

All of this said, I understand that a great deal of this decision comes down to personal preference: Would you rather start with Python & then iteratively diverge? Would you rather start & stay in Nim? And I can also see the benefit of both approaches in different circumstances. :)
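
For readers who want the starting point of the "start with Python" route next to it, here is a plain-Python rendering of the same trial-division algorithm. This is a sketch written for comparison and is not copied from the Cython tutorial: it uses a growing list instead of a fixed 1000-slot array, but keeps the same cap.

```python
def primes(kmax):
    """Return the first kmax primes (capped at 1000) by trial division."""
    kmax = min(kmax, 1000)
    p = []                    # primes found so far
    n = 2
    while len(p) < kmax:
        i = 0
        while i < len(p) and n % p[i] != 0:
            i += 1
        if i == len(p):       # no earlier prime divides n
            p.append(n)
        n += 1
    return p
```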

> The emphasis is always on iteration & incremental additions.

> All of this said, I understand that a great deal of this decision comes down to personal preference: Would you rather start with Python & then iteratively diverge?

When doing scientific research, I'd prefer to start with Python and then iteratively diverge. The aim is to get my task done as quickly as possible (so I can go write those 15 journal articles due last Monday) and go just as far as is needed. It's the scientist/engineer vs programmer mindset difference.

However, if the code was done and I knew I'd need to use it again, I like the idea of rewriting it from scratch in Nim - not that I or any scientific colleagues know Nim (only one has heard of it).

Anyway, I look forward to keeping an eye on your work and maybe using it in the future :)

If you're writing in Cython (not Python) from scratch you don't need to work any more iteratively than in any other language. You can start out with low level memory management with malloc and declare everything with C types.

Granted, generics and macros are a nice advantage of Nim. What I think is the essential difference here is that Nim is higher level than Cython. So it seems to boil down to a matter of preference. However, when Python is not fast enough, Cython lets you go all the way to generating C code without garbage collection, inline assembly etc. With Nim there's yet another layer between Python and C.

Or you can use Numba with pure Python syntax and get really fast code.

It is strange that you characterize Cython as a limited sub-language. Cython supports almost the full Python language, plus adding a superset for producing native C code. In addition, Cython can be used to wrap C code, so what you describe as "exportpy" is possible as well. Cython is a mature project, being in use in various parts of the scientific python ecosystems such as in Pandas and scikit-learn. Nim sure looks interesting but from what I hear, the language and the compiler still have rough edges. In terms of stability, maturity, and ecosystem, I'd say Cython has a clear advantage.

I elaborated on my thoughts about the Cython language in this sibling comment: https://news.ycombinator.com/edit?id=10578834

I agree with you that Cython is a mature, very respectable project, and it evidently has more widespread adoption at this time.

I'm betting on Nim because I think it has a larger positive gradient, even though its y-coordinate is lower right now.

I will try to give my 2 cents. As you correctly state, Cython is in effect another language.

Now, a core principle in the design of Cython is to be as source-compatible with Python as possible, and Python is a language without static types. This is not a very good principle for me: I would rather work in a language that is not literally identical to Python but has its own design. Adding static types often requires doing things slightly differently. In Nim I find a very principled and well-designed language, and Pymod seems a great tool to make interoperability with Python easy.

You have a point that adding static types can mean having to rewrite. But in practice I often start out with Cython and static types, making this a non-issue. If you use Cython in this way it can approach C with Python syntax. See also http://spacy.io/blog/writing-c-in-cython/

I am not only saying that one needs to write things differently in Cython. I am arguing that the whole language needs to be designed differently, to take into consideration the presence of types. This includes adding features to recover the lost dynamism, such as interfaces or macros, and modifying the standard library to a style that best suits a typed language. Just grafting types onto Python is not enough.

> I am arguing that the whole language needs to be designed differently

No, I think you are arguing that you want it to be designed differently. It's like the contrast between having a high and low level language (Python, C/Cython) versus having a unified middle-level language such as Julia (see http://graydon2.dreamwidth.org/189377.html ). It's a matter of preference which is better, and an empirical matter which approach will be more successful in the end.

> Just grafting types onto Python is not enough

Clearly it is already successful though...

"Python for-loops are slow"

CPython for-loops are slow. PyPy for-loops are not slow. A loop that just counts is about 25x faster in PyPy than in CPython, for large numbers of iterations.
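
If you want to check a claim like this on your own machine, here is a minimal timing sketch (mine, not the commenter's benchmark). It makes no claim about the ratio you will see, and the interpreter command names in the comment are assumptions about how they are installed.

```python
import timeit


def count_loop(n: int) -> int:
    """A loop that just counts: the kind of code a tracing JIT handles well."""
    total = 0
    for _ in range(n):
        total += 1
    return total


if __name__ == "__main__":
    # Run the same file under `python` and `pypy` and compare the timings.
    seconds = timeit.timeit(lambda: count_loop(1_000_000), number=10)
    print(f"10 runs of 1,000,000 iterations: {seconds:.3f}s")
```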

> CPython for-loops are slow. PyPy for-loops are not slow.

I fear you might have missed the point of that paragraph. The point was that multiple reasons contributed to the need for Numpy, so now it exists and is widely used by those who are serious about their scientific computing. The vast majority of those reasons are still valid, even though PyPy speeds up some general-purpose programming tasks in Python.

"A loop that just counts" is not anywhere near a substitute for Numpy. Nor does a 25x speed increase in a particular operation (from CPython to PyPy) hold a candle to the number-crunching speedups provided by algorithm-optimized, code-optimized (often to the point of targeting specific vector instruction sets like SSEx) special-purpose numeric libraries.

I've been intrigued by the C code-generation feature of Nim but haven't really gotten around to exploring the language yet.

Is there any write-up or tutorial that expands on this aspect?

There is some documentation on the various Nim backends here: http://nim-lang.org/docs/backends.html & http://nim-lang.org/docs/nimc.html#compiler-usage-generated-...

That second link describes the `nimcache` directory into which the generated C code is written. If you poke around in that directory (starting with the generated C version of your Nim code, which will have the same filename but a `.c` suffix instead of a `.nim` suffix) then you can see that the generated C code is really quite readable.

It's also quite fun digging around the output of other code generation toolkits. Cython code, for example, can look like quite idiomatic C if you add enough type annotations.

Theano's output, on the other hand...

Actually I recalled I played with nim about a month ago. Really only created a three liner (print "what's your name", read a string, print "Hello" + string_read). The generated C code was a couple hundred lines. Honestly that turned me off.

I'm looking for a high level language that generates really succinct human-readable code. Essentially, what a human would do to create the exact program in C. And doing what I did in nim in 3 lines, really only takes less than 10 lines in C.

One of the great things about Python is the C API and how well documented it is.

APIs like that are both good and bad because they let you do almost anything so make many optimisations either hard or impossible, and have a nasty habit of exposing implementation details which constrain future changes to the language implementation.

Even APIs such as JNI that do their best to abstract over such details tend to block a JIT's view of what is going on and end up being a real pain.

Personally I'd be in favor of a more general solution... Some kind of cffi/CPython bridge.

I'm afraid I'm not up to speed on pypy. What is the benefit here?

PyPy is much faster for many workloads. It's a garbage collected just-in-time compiler and mostly compatible with the CPython runtime (the one you get from python.org). It also happens to be written in a subset of Python which is pretty neat. The CPython API exposes a bunch of details specific to the CPython implementation (like object reference counts) and so hasn't historically been supported in PyPy (which supports the whole language but internally is very different). It sounds like it would be a pain to do but the PyPy team thinks it's feasible to build.
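
To make "details specific to the CPython implementation" a bit more tangible, here is a tiny sketch of reference counts as seen from Python. It is an illustration only: `sys.getrefcount` is a CPython detail, and other implementations may not provide it or may report something different.

```python
import sys


def refcount_delta() -> int:
    """Show that holding one more reference bumps CPython's refcount by one."""
    obj = object()
    before = sys.getrefcount(obj)
    holder = [obj]              # the list now owns one extra reference
    after = sys.getrefcount(obj)
    del holder
    return after - before
```

C extensions manage these counts by hand, so an interpreter with a different garbage collector has to emulate them to run those extensions.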

Speed. In many common cases, pypy is significantly faster than the python C runtime. Unfortunately, the python C api was not designed with a jit in mind.

But as Numba has shown, it's achievable to write a CPython-to-LLVM-IR compiler for subsets of the language, then add NumPy-awareness, and add multiple targets like GPUs, and achieve better-than-PyPy speed ups jitting pure Python (CPython) code.

I think it's an interesting question as to whether PyPy "is the future" or rather something more like Numba, or even a whole ecosystem of specialized variants of Numba that may be tweaked or tuned to take advantage of different hardware optimizations, work better for different domain problems (e.g. image processing vs. network programming).

In a lot of cases where PyPy improves over some standard CPython code, it's exactly the sort of code you really wouldn't care about optimizing in the context of a larger application. In any sort of scientific application, the pure CPython parts of it are usually small and insignificant compared to what's written in NumPy/SciPy/pandas and science libraries written on top of those, and Numba can be seen as the first "specializing compiler" for this sort of application -- with a lot of room for people to write other kinds of specializing compilers that compile different genres of Python code for domain-specific purposes, the way that Numba compiles generally scientific Python for numerical computing purposes.

> But as Numba has shown, it's achievable to write a CPython-to-LLVM-IR compiler for subsets of the language, then add NumPy-awareness, and add multiple targets like GPUs, and achieve better-than-PyPy speed ups jitting pure Python (CPython) code.

Other than "to-LLVM" and "targets like GPUs", jitpy does exactly that for PyPy-in-CPython.


It's not quite as fast for numeric code (though it's close and conformant) but it's also not a language subset nor does it degrade to CPython performance on dynamic parts of the language. It should be a terrific option if you're able to section out discrete workloads that CPython's struggling with, even in non-numeric code.

Few people know of it though.

Thanks for the link to JitPy. I will definitely read more about it. Just from the first parts of the docs, though, it looks very limited compared with Numba. In Numba, there is a distinction between 'python mode' and 'nopython mode' -- meaning that even at the final LLVM IR emission, if Numba is forced to punt on type inference (e.g. because an untyped Python list object is an argument or something), Numba can still fall back directly to the CPython API, and in general even this has speed benefits as a lot of intermediate calls or intermediate variables can be lowered or typed.

So even though JitPy won't "degrade to CPython performance", it's slightly moot since you can only handle a limited set of types in JitPy. I guess one area where JitPy should win in theory is if there is a dynamic Python object inside of the function, like a dynamic list. Then JitPy is OK since it's not part of the signature, and when it can't optimize due to indeterminate type inside the function, the fallback will be PyPy instead of CPython, and hence should be faster.

But Numba offers a lot on the array computing side, which is what it is specialized for. For example, from the Numba docs < http://numba.pydata.org/numba-doc/0.21.0/developer/architect... >:

> Numba implements a user-extensible rewriting pass that reads and possibly rewrites Numba IR. This pass’s purpose is to perform any high-level optimizations that still require, or could at least benefit from, Numba IR type information.

> One example of a problem domain that isn’t as easily optimized once lowered is the domain of multidimensional array operations. When Numba lowers an array operation, Numba treats the operation like a full ufunc kernel. During lowering a single array operation, Numba generates an inline broadcasting loop that creates a new result array. Then Numba generates an application loop that applies the operator over the array inputs. Recognizing and rewriting these loops once they are lowered into LLVM is hard, if not impossible.

> An example pair of optimizations in the domain of array operators is loop fusion and shortcut deforestation. When the optimizer recognizes that the output of one array operator is being fed into another array operator, and only to that array operator, it can fuse the two loops into a single loop. The optimizer can further eliminate the temporary array allocated for the initial operation by directly feeding the result of the first operation into the second, skipping the store and load to the intermediate array. This elimination is known as shortcut deforestation. Numba currently uses the rewrite pass to implement these array optimizations. For more information, please consult the “Case study: Array Expressions” subsection, later in this document.
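
The two optimizations in the quoted passage can be sketched by hand in plain Python. This is not Numba's rewrite pass, only the transformation it describes: lists stand in for arrays, and the function names are made up for the example.

```python
def unfused(a, b, c):
    """Evaluate (a + b) * c as two passes with a temporary list."""
    tmp = [x + y for x, y in zip(a, b)]          # first loop, allocates tmp
    return [t * z for t, z in zip(tmp, c)]       # second loop reads tmp back


def fused(a, b, c):
    """Same result in one pass; the intermediate never becomes a list."""
    return [(x + y) * z for x, y, z in zip(a, b, c)]
```

Merging the two loops is the loop fusion; dropping `tmp` is the shortcut deforestation.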

One reason why I see the Numba route as more effective than the PyPy route is that these kinds of highly-specific optimizations seem likely to occur all over the place, and to be very related to the genre of computing you are doing. One person may not want Numba to perform these optimizations, because loop fusion isn't important to their codebase which doesn't do a lot of array computing. Instead, they might want some kind of domain-specific optimization that assists with a kernel bypass for some low-latency algorithm.

This could all be exposed in Numba, but as separate domain-specific sub-modules or via user-defined IR optimization passes or something. Or it could exist as totally separate JIT compiler projects, of which JitPy/PyPy's JIT compiler and Numba's JIT compiler would each just be genre-specific examples.

I tend to see this as the more likely future, and that apart from cases where someone is really just taking a bunch of pure Python code and running it with PyPy interpreter instead of CPython interpreter (which I forecast would be a rare case), the difference between PyPy and CPython won't matter nearly as much as the difference between domain-specific JIT compilers.

I agree with everything you've said. jitpy is not so much competing in the numeric space (although it's not far from it), since Numba is as you say really good there.

You do need to be able to build an abstraction boundary, such that complex classes never cross it, but nothing prevents you from instantiating a PyPy class and communicating with it through function calls.

You'll probably want some way to share references across them, but that should be as simple as having a global mapping and a `Ref` class that cleans it up on destruction. Should be a few dozen lines of code.

Many other solutions implement both loop fusion and shortcut deforestation, for instance Theano. In PyPy we had this for numpy expressions and called it "lazy evaluation"; we have temporarily removed it until we achieve more functional parity with upstream numpy.

> But as Numba has shown, it's achievable to write a CPython-to-LLVM-IR compiler for subsets of the language, then add NumPy-awareness, and add multiple targets like GPUs, and achieve better-than-PyPy speed ups jitting pure Python (CPython) code.

Most Python code I write would get 0 speedup from that. I think numeric optimization is a very different target than compiling and optimizing general-purpose Python code. I don't see either strategy "winning" in the near future.

It tends to considerably outperform CPython on performance benchmarks.
