PyPy 1.6 Released - Full Python 2.7.1 Implementation (morepypy.blogspot.com)
244 points by jparise on Aug 18, 2011 | 54 comments



Can I take advantage of this thread to ask the HN crowd a technical question? Some time ago, I implemented an automatic differentiation tool. Using operator overloading on a special "autodouble" type, the tool would trace the execution of a block of numerical code. Then, some calculus would automatically happen, and it would output and compile fast C code that would compute the original function and derivatives in pure C. This was great, except the C code that was output was freaking gigantic (like hundreds or thousands of megabytes), albeit very simple, and so the C compiler would take forever to run. Sigh.
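
To make the setup concrete, here's a minimal sketch of the kind of taping I mean (the names and tape format are illustrative, not my actual tool's API); each overloaded operation records one assignment that can later be emitted as a line of C:

  # Hypothetical taping sketch: every overloaded op appends one assignment.
  tape = []   # list of (result_slot, op, lhs_slot, rhs_slot)

  class AutoDouble(object):
      _count = 0

      def __init__(self):
          self.slot = AutoDouble._count
          AutoDouble._count += 1

      def __mul__(self, other):
          out = AutoDouble()
          tape.append((out.slot, '*', self.slot, other.slot))
          return out

      def __add__(self, other):
          out = AutoDouble()
          tape.append((out.slot, '+', self.slot, other.slot))
          return out

  # Tracing x*y + x yields straight-line assignments, one per operation.
  x, y = AutoDouble(), AutoDouble()
  z = x * y + x
  for dst, op, a, b in tape:
      print("A[%d]=A[%d] %s A[%d];" % (dst, a, op, b))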

My question is: could I leverage PyPy somehow to avoid this? Can I output RPython? Can I output whatever RPython is compiled down to instead? Can I do this with no more than, say, a 3x penalty compared to C?

(I apologize for asking a question only marginally related to the particular article here...)


Offhand, and guessing about the problem:

You might want to look at approaches that use dual numbers. Likewise, instead of inlining the procedures, generate a differentiated version of each procedure with a new name. If those don't cover your problems, perhaps look at how other autodiff tools for C do it?
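
For reference, a forward-mode dual-number sketch looks something like this (a hypothetical minimal version; a real library overloads far more operations):

  # Dual numbers carry a value and a derivative through every operation.
  class Dual(object):
      def __init__(self, value, deriv=0.0):
          self.value = value   # f(x)
          self.deriv = deriv   # f'(x)

      def __add__(self, other):
          return Dual(self.value + other.value, self.deriv + other.deriv)

      def __mul__(self, other):
          # product rule: (fg)' = f'g + fg'
          return Dual(self.value * other.value,
                      self.deriv * other.value + self.value * other.deriv)

  # d/dx of x*x + x at x = 3: seed dx/dx = 1
  x = Dual(3.0, 1.0)
  y = x * x + x
  print(y.deriv)   # 7.0 = 2*3 + 1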


For being offhand, those are very good guesses! Dual numbers won't be efficient, as I want reverse-mode autodiff.

As to the multiple procedures: Well, as I was doing it, even a single procedure can be hundreds of megabytes large. Even for very, very simple code, however, I noticed that GCC was superlinear in code size. I suppose I could break the code up into somewhat arbitrary functions. I wonder if that would speed things up?

Other autodiff tools: Well, they basically trace execution, but then run an interpreter instead of trying to actually generate compiled code. I wanted to both be faster than that and have the wonderful experience of writing pure Python...


It sounds like you're getting lots of code duplication. 1) Try running a common subexpression elimination process on your code before doing the autodiffing, and create a procedure for each shared expression. 2) For primitive ops, again, have a procedure created for the diffed version instead of inlining, and substitute in the procedure.

Perhaps something like these ideas would help.
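
A tiny hash-consing sketch of idea 1), with made-up names; identical subexpressions become a single shared node, so the codegen only has to emit each one once:

  # Hash-consing: identical (op, args) pairs map to the same node.
  _nodes = {}

  def make(op, *args):
      key = (op, args)
      if key not in _nodes:
          _nodes[key] = ("t%d" % len(_nodes), op, args)
      return _nodes[key]

  xy = make("mul", "x", "y")
  again = make("mul", "x", "y")
  total = make("add", xy, again)
  assert xy is again   # shared, so it gets emitted as C only once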

If you want an example of a high-level autodiff library that works via operator overloading, http://hackage.haskell.org/package/ad seems quite nice, though I've not had the opportunity to use it myself.

Yes, optimizing compilers such as GCC use algorithms that are superlinear in code size when they're optimizing. Perhaps you should instead try out the operator overloading approach?

gl :-)

Aside: When I hear the phrase "execution trace" in the context of program analysis, I think abstract interpretation, though I'm not sure if that's relevant for you.

cheers!


It isn't exactly common subexpressions. Basically the problem is that things like matrix multiplies always get unrolled.

I've used lots of operator-overloading-based autodiff packages for C++. They are great, but the issue is not how the function is recorded (I used operator overloading myself in my Python package) but how it gets executed at runtime. Unless a compiler (or JIT) is called sometime between when the operator overloading happens and when execution happens, the function is basically being interpreted at runtime. This is what happens in, e.g., ADOL-C, Sacado, and CppAD, all of which come with a significant (e.g. 20x) performance penalty as compared with hand-written derivatives.


Theano (http://deeplearning.net/software/theano/introduction.html) might be worth a shot if you have closed-form expressions.

'''Theano is a Python library that lets you define, optimize, and evaluate mathematical expressions, especially ones with multi-dimensional arrays (numpy.ndarray). Using Theano it is possible to attain speeds rivaling hand-crafted C implementations for problems involving large amounts of data. It can also surpass C on a CPU by many orders of magnitude by taking advantage of recent GPUs.'''
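
A small example of what that looks like in practice (T.grad and theano.function are the standard entry points; the expression itself is just for illustration):

  import theano
  import theano.tensor as T

  x = T.dvector('x')                 # symbolic input vector
  y = T.sum(x ** 2 + T.sin(x))       # closed-form scalar expression
  gy = T.grad(y, x)                  # symbolic gradient w.r.t. x

  f = theano.function([x], [y, gy])  # compiled down to C (or GPU) code
  print(f([1.0, 2.0, 3.0]))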


LLVM? I don't have much experience with runtime code generation but you might find it useful.


LLVM would seem to (maybe? possibly!?) be a good solution, but I was never able to convincingly determine if it was likely to solve my problems, and the documentation seemed to be quite sparse for doing what I wanted to do. (This was a couple of years ago, however, and I'd love to be proven wrong.)


Instead of generating C code you might be able to generate LLVM IR and have it executed immediately, saved to disk, or both. I don't know how it will perform for you or how high-level your generated C is, though; it may not be realistic.

Or use it to compile your C as it is pretty quick. It's not going to speed things up by orders of magnitude but every bit helps.


Awesome. Am I right in understanding that I should generate LLVM IR rather than LLVM assembly? The generated code is extremely simplistic. An example would be:

  A[12800]=A[0] * A[6400];
  A[12801]=A[1] * A[6480];
  A[12802]=A[12800] + A[12801];
  A[12803]=A[2] * A[6560];
  A[12804]=A[12802] + A[12803];
  A[12805]=A[3] * A[6640];
Which would be very easy to deal with. The only problem is that I sometimes call out to libraries for exp, log, sin, cos, etc.


That's right, it's a platform-independent representation. I think LLVM assembly is really low-level; I'm pretty sure you want to generate IR. You should be able to link the math lib.
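
For illustration, emitting that sort of straight-line code as IR from Python could look roughly like this (a sketch assuming the llvmlite package's IR builder; the slot numbers come from your example above, and exp/log/sin/cos can be declared as external functions and resolved against libm when you link or JIT):

  # Sketch: build a void eval_tape(double* A) that replays the tape.
  from llvmlite import ir

  dbl = ir.DoubleType()
  i32 = ir.IntType(32)
  mod = ir.Module(name="tape")
  fn = ir.Function(mod, ir.FunctionType(ir.VoidType(), [dbl.as_pointer()]),
                   name="eval_tape")
  b = ir.IRBuilder(fn.append_basic_block("entry"))
  A = fn.args[0]

  def load(i):
      return b.load(b.gep(A, [ir.Constant(i32, i)]))

  def store(i, val):
      b.store(val, b.gep(A, [ir.Constant(i32, i)]))

  store(12800, b.fmul(load(0), load(6400)))
  store(12801, b.fmul(load(1), load(6480)))
  store(12802, b.fadd(load(12800), load(12801)))
  b.ret_void()
  print(mod)   # textual IR, ready for opt/llc or a JIT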

py2llvm looks pretty interesting.

http://code.google.com/p/py2llvm/

http://code.google.com/p/py2llvm/source/browse/trunk/CodeGen...


RPython compiles down to C. Also, do you have an example of the generated code? I'm not sure what sorts of patterns would need it to be hundreds of megabytes.

(Also, if PyPy's compilation time is any guide, it takes massive amounts of time to compile RPython.)


> has beta level support for loading CPython C extensions.

Is this via ctypes, or "real" support that behaves much the same way as CPython would?

I ask because this is one of the features that I've been waiting (impatiently) for - I've run some Flask projects using PyPy and gunicorn, and love how fast it goes, but really want to be able to use the rest of my codebase, which unfortunately does rely on some C (and Cython) extensions. (:


It's "real" support, using your criteria. However, it's slower in PyPy than in CPython and than what the same thing based on ctypes would be on PyPy.

If there's a pure-Python version of the C extension, it might be faster on PyPy than the C extension support that cpyext (PyPy C-API compatibility module) provides.

Cython-based code is currently incompatible (it goes well beyond the public C-API), but a GSoC project to generate ctypes-based pure-Python code from Cython is (was?) going on.


> It's "real" support, using your criteria. However, it's slower in PyPy than in CPython and than what the same thing based on ctypes would be on PyPy. > If there's a pure-Python version of the C extension, it might be faster on PyPy than the C extension support that cpyext (PyPy C-API compatibility module) provides.

That's fine, the mere ability to run them will be nice. (:

> Cython-based code is currently incompatible (it goes well beyond the public C-API), but a GSoC project to generate ctypes-based pure-Python code from Cython is (was?) going on.

I remember that GSoC! Any idea as to what happened to it, if it was scrapped, or whatever?


> I remember that GSoC! Any idea as to what happened to it, if it was scrapped, or whatever?

I know it made a lot of progress in Cython-land (updates at http://rguillebert.blogspot.com/) but no real overview or high level status update has been given yet (AFAIK). Since it's from GSoC 2011, it's about time for that to happen... and for other people to start contributing code <hint, hint> :).


Not via ctypes; via CPyExt, a layer emulating the CPython C extension API (it actually implements a reference-counting GC and everything).


I'm always blown away by the consistent performance gains they reach with each new version. Congrats!


In general I'm very happy with my choice of Ruby/Rails instead of Python/Django, but PyPy is one of the few things I envy Python developers for.

I wish something similar could be developed for Ruby.


What about Rubinius?

Can you also elaborate on why you're happier with Ruby and Rails than Python and Django? I'm completely the other way around: while Ruby paid my bills, if there were a Python job offering I would have jumped on it without much thought.

One of the few things I envy Ruby developers for is the HAML, SCSS, Jammit stuff. Apart from that, not so much, especially the documentation for Ruby modules. It might be because I prefer Flask[2] to Django, but I also like Python's explicitness over Ruby's implicitness, and I find Python's language design better suits my taste (e.g. the Python class[1]).

[1]: http://lucumr.pocoo.org/2011/7/9/python-and-pola/

[2]: http://flask.pocoo.org/


I use SCSS with Django all the time.


As an alternative for scss and jammit, you can try django-css: https://github.com/dziegler/django-css


I use django-mediagenerator[1], I love it to bits.

[1]: http://www.allbuttonspressed.com/projects/django-mediagenera...


If it's mainly the performance you're after, have a look at JRuby. It's already a decent chunk faster than MRI/YARV, and with Java 7 and invokedynamic there's a lot more to look forward to and play with now.

Last week one of the JRuby core devs tweeted: "Ok, another set of tweaks in JRuby and fib(35) is another 10% faster (and only about 10% slower than Java. Do I stop now?"


> but PyPy is one of the few things I envy Python developers for.

Depending on your needs, there are others - numpy, scipy, matplotlib, nltk, gevent.

> I wish something similar could be developed for Ruby.

Isn't http://rubini.us/ supposed to be the PyPy for Ruby? It's not complete, but then neither is PyPy.


Rubinius and PyPy take quite different approaches to solving the same general problem. It's inaccurate to analogize them.

Rubinius specifies a bytecode (see http://rubini.us/doc/en/virtual-machine/instructions/) and implements a VM that executes this bytecode. The VM is written in C++. The main difference between Rubinius and other Ruby implementations is that nearly everything else is written in Ruby. The lexer, parser, and the parts that compile the AST to bytecode are all written in pure Ruby. Further, large parts of the standard library and Ruby language that are implemented in lower-level languages in other implementations are instead written as runtime-level Ruby. However, large and important parts of the infrastructure (including GC and memory allocation) are written in C++.

PyPy takes an altogether different approach. It specifies a restricted subset of Python known as "RPython". Basically, it's statically-typed Python. There is also a "compiler" (also called a "JIT generator"), written in pure Python, that compiles an arbitrary interpreter written in RPython into a fast JIT. (See http://morepypy.blogspot.com/2011/04/tutorial-writing-interp... for an example of writing a very simple, small interpreter in RPython.) A cool feature of RPython is that while there must be a static RPython AST to compile, during the "eval" stage you can leverage the full dynamic capabilities of Python.
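
In miniature, an RPython interpreter with a JIT hint looks something like this (a sketch in the spirit of the linked tutorial; the import path follows the 1.6-era source tree and the toy "bytecode" is made up):

  # The JitDriver hint is what lets the toolchain generate a tracing JIT
  # for this interpreter loop automatically.
  from pypy.rlib.jit import JitDriver

  jitdriver = JitDriver(greens=['pc', 'program'], reds=['acc'])

  def interpret(program):
      pc = 0
      acc = 0
      while pc < len(program):
          jitdriver.jit_merge_point(pc=pc, program=program, acc=acc)
          op = program[pc]
          if op == '+':
              acc += 1
          elif op == '-':
              acc -= 1
          pc += 1
      return acc

  def entry_point(argv):
      print(interpret(argv[1]))
      return 0

  def target(driver, args):   # hook used by the translation toolchain
      return entry_point, None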

The second part of PyPy is a full Python interpreter written in RPython. Since RPython is a subset of Python, you can actually run this interpreter sandboxed inside any other Python implementation, albeit very slowly. The PyPy toolchain compiles this down to a very fast Python interpreter that features JIT compilation, among other cool features.

The neat part of PyPy (in contrast to Rubinius) is that the toolchain part is largely decoupled and generalized away from Python-the-language. Conceivably, one could write ANY language in RPython (or what's compiled down to RPython) and have a fast JIT interpreter. Experimental language features can be added and changed (see the source code for the different garbage collectors in PyPy) with little mucking with the low-level VM.

Rubinius is a neat, pragmatic project that could turn into one of the best Ruby implementations around. However, it's not breaking new ground like PyPy is.


> Rubinius and PyPy take quite different approaches to solving the same general problem.

> However, it's not breaking new ground like PyPy is.

I know about the differences. My response was to the parent post "if there is something like PyPy for Ruby".

Rubinius doesn't need to break new ground, nor provide a toolset for implementing interpreters. All it needs to do is provide a decent JIT with reasonable speed and low memory consumption, with the added benefit of large parts of it being in Ruby, so that the contribution ecosystem is more active. That will be pretty much analogous (and all analogies are, by definition, inexact; otherwise they wouldn't be called analogies) to what PyPy is for Pythonistas.


This is a great overview of the two interpreters! However, I think you're being a bit inaccurate about how flexible the Rubinius VM is. It is certainly more than flexible enough to support a wide variety of non-Ruby languages.

The core Rubinius team has been working towards a language-agnostic core, and toolchains for building your own language on top of the Rubinius VM. See http://rubini.us/2011/02/17/rubinius-what-s-next/#multi-lang... and http://rubini.us/2011/02/23/introduction-to-fancy/

There are already a good number of language projects that build upon it: http://rubini.us/projects/ Fancy is one of the more mature languages built on it (and it's self-hosted): https://github.com/bakkdoor/fancy. Note that Fancy was originally written as an interpreter in C++ - they ported it to the Rubinius VM at a later point.

There's even an (immature) Python that runs on top of it: https://github.com/vic/typhon.



If I wanted to speed up Ruby I would probably just implement a Ruby interpreter with RPython. They could probably get up to PyPy's speed with much less work. The semantics of the two languages are not that different.


Rubinius's parser is C/C++ code derived from the same Bison grammar as regular Ruby. That's not to say it couldn't be pure Ruby, but it isn't right now.


I stand corrected. Thanks.


> I wish something similar could be developed for Ruby.

There's something like PyPy for Ruby already: it is PyPy itself. PyPy is a framework for implementing JITted dynamic languages (any language, not just Python). You can generate interpreters for any language you want, as long as you write them in RPython (restricted Python). So, if you want a PyPy for Ruby, just write a Ruby interpreter in RPython, and then use the PyPy translation toolchain to compile it down to C, while automatically generating a just-in-time compiler for free.


matplotlib and numpy make Python really awesome. If only there was a way to get matplotlib to work on Lion.


I'm using the Scipy superpack (http://stronginference.com/scipy-superpack/) on Lion and Matplotlib has given me no trouble so far.


The solution with Homebrew is shown here: http://sourceforge.net/mailarchive/message.php?msg_id=278883...


Until the patched version is released...

pip install -e git+https://github.com/matplotlib/matplotlib#egg=matplotlib-dev


But they don't run on PyPy yet. Now that will be awesome!


Maybe rubinius will fill this need? http://rubini.us/

Also, this could be an interesting read: http://www.engineyard.com/blog/2010/making-ruby-fast-the-rub...


But what is the status of Rubinius? Is it nearly as far along as PyPy?

I would think that the metaprogramming features of Ruby, which make it so much fun, would also make it n times harder to build a JIT for.


I think the second link I posted in my previous comment could help you get a grasp of what they are doing with respect to the dynamic nature of the language and the JIT.

Here is the current status towards 1.9 implementation: http://status.rubini.us/

From the frontpage:

"How compatible is Rubinius? From the start, compatibility has been critical to us. To that end, we created the RubySpec to ensure that we maintained parity with official Ruby. We are currently at a 93% RubySpec pass rate and growing everyday. For now Rubinius is targeting MRI 1.8.7 (1.9 is on the post 1.0 list). Most Gems, Rails plugins and C-Extensions work right out of the box. If you find a bug, let us know and we'll get on top of it."


The dynamic features of Ruby are not so different from those of Python, Lua, or JavaScript. Actually, in terms of reasonable memory consumption I think Rubinius is quite a bit better than PyPy.



I'm still bummed at being stuck with the dilemma of having to choose between CPython 3.x and PyPy. PyPy with Py3k support would rock.


Are there production users of PyPy?

I feel like PyPy has always been the most academically interesting Python implementation. But has it taken away mindshare from CPython?


Yes, there are.

Recently Quora announced it was running on PyPy[1]. Some other disclosures were made (a Django project[2], LWN internal processing[3], tweets about speedups in production, etc.), but the PyPy team is thinking about officially asking for success stories in the near future[4].

[1]: http://www.quora.com/Alex-Gaynor/Quora-product/Quora-is-now-...

[2]: https://convore.com/python/whos-using-pypy-in-production/

[3]: http://lwn.net/Articles/442268/

[4]: https://bitbucket.org/pypy/extradoc/src/tip/blog/draft/succe...


To me, this list seems to suggest the opposite actually.

[1] is the company where one of the main developers of PyPy works. [2] appears to be a discussion where they are looking for anyone using PyPy in production. [3] seems to be an article about someone experimenting with PyPy for git processing. [4] is the developers looking for non-toy examples of production use so that they can get more funding.


It suggests that there aren't many production users, true. But it does prove that there are some.


Indeed. Though two-ish production users after 7 years of development hardly seems like a success.

By way of contrast, (1) in the Ruby world, YARV went from an alternative implementation to the official implementation within two years, and (2) in the JavaScript world, node.js is similarly being used in production in tons of places after only two years. (Though I realize that those examples aren't exact parallels.)

This isn't meant to criticize your response, rather, I find it interesting that the Python world seems to have so many alternative implementations (PyPy, IronPython, Jython, Cython) despite what appears to be really minority mindshare compared to CPython.


* Jython is a really nice alternative if you want to use Python-the-language in a Java environment. Its suboptimal speed and the fact that it's really easy to write Java code that can be used from Jython mean that you'll probably use it more as a scripting language.

* The relation between IronPython and .NET is probably very similar.

* For CPython, people have always wrapped their favorite C library and started using it - with SWIG in the old times and nowadays with Cython, which is a compiler for a subset of Python extended with types. In fact, quite a lot of useful Python libraries use Cython (whether as a glue language or as an implementation language), and it's probably in production use at a lot of places.

One of the problems of PyPy is that, while targeting Python-the-language, it also gives up CPython compatibility by using its own PyPy VM (in the sense of a GC'd execution environment with a JIT that's not really compatible with other C/C++ code or even CPython). Because they're targeting their own execution environment, they cannot profit from either the plethora of C/C++ libraries that exist or even the very decent library support that exists on the JVM.

The PyPy people have recognized this and started to take care of the affair with a JNI-type version of CPython's ctypes library. The problem is that very few library wrappers (or even non-Python CPython extensions) are based on ctypes, so it's not useful as a replacement for CPython for most people (especially those that need the speed improvements).
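
For reference, this is what a ctypes-based wrapper looks like - plain Python, so it runs unchanged on CPython and PyPy (a minimal sketch; the library lookup is platform-dependent):

  # Wrap libm's cos without writing a C extension.
  import ctypes
  import ctypes.util

  libm = ctypes.CDLL(ctypes.util.find_library("m"))
  libm.cos.argtypes = [ctypes.c_double]
  libm.cos.restype = ctypes.c_double

  print(libm.cos(0.0))   # 1.0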

The reluctance of PyPy developers to start thinking about a solution for all the existing CPython extensions has severely limited the appeal for using PyPy in production use - having 10x gains for new code is not all that appealing once you realize that you need to rewrite the other couple thousand lines of (C++ or Cython) code and hope that you can optimize the PyPy code as well as you optimized the C++ or Cython code.

Having said that, I'm really excited by the idea of a Cython variant that can be used to wrap C/C++ code (or use Cython code) in PyPy's execution environment. (This is the GSoC project that was referred to earlier.) Or even if I could use Cython to compile my C/C++ Python extensions for the JVM or .NET.


That's the problem all scripting languages have. They start slow, then people want them to be faster and start writing C code, and that makes it really, really hard to have alternative implementations. The guy implementing Erjang had the same problem: he had to rewrite a lot of C functions in Java. I think the Python community should recognize what an awesome technology PyPy is and start making a widespread effort to discourage C extensions. PyPy makes Python code so much faster that a lot of C extensions are not really worth it anymore.


Rewriting things in C, or Cython, also makes things much, much faster, without the hassle of PyPy. And because you can do your own memory management (in the places where it makes sense), Cython code is quite a lot better for well-defined numerical applications than what you can get out of a GC-based environment.

Realistically, you always want to be able to take advantage of one of the two big ecosystems - namely the C world and the JVM world - because there are so many libraries out there doing nontrivial things you do not have to reimplement. Right now, writing C code that works well with generational garbage collection (or really any kind of garbage collection that moves objects around - i.e. all the well-performing ones) is either very tedious (when you try to take account of objects being moved) or slow and possibly error-prone (if you rely on JNI-style locking and unlocking of object references).

As a result, it may actually be more attractive to build a Java bytecode JIT into PyPy (and be able to have PyPy use Java classes within its more powerful representation scheme) and get mindshare among the people currently using Jython than trying to get the diehard C extension users to switch.

So much for the "the X community should recognize what an awesome technology Y is" kind of talk - it's a surefire recipe for building sucky software. People will do whatever they do, and calling them idiots because they don't do what you think is awesome doesn't lead anybody anywhere. (Though having a decent installer and usable documentation may actually lead to more people discovering the advantages of PyPy.)


I don't know about others but this article finally convinced me to take a serious look at it. It appears to finally be up to date and much faster.

As with everything, a little bit better yet behind the curve isn't good enough to get people to switch.


Node's analogy in Python isn't PyPy, but Twisted, which has been in production since, oh, 2002 or so.



