PyPy 1.5 Released: Catching Up (morepypy.blogspot.com)
147 points by jnoller on Apr 30, 2011 | 70 comments

I did a benchmark of Unladen Swallow 1.5 years ago, and at that time PyPy looked like a very academic project. It could not really run any interesting code. At that time Unladen Swallow looked like a great project, since improving CPython rather than starting from scratch seemed much easier, especially since Unladen Swallow based its JIT optimization on LLVM's JIT engine.
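(For the curious: the kind of micro-benchmark involved is nothing fancy — you time the same pure-Python workload under each interpreter binary and compare. A minimal sketch, with an illustrative workload; the function name and loop are made up for the example:)

```python
import timeit

def workload():
    # A tight pure-Python numeric loop -- the kind of code where JIT'd
    # interpreters diverge most sharply from CPython.
    total = 0
    for i in range(100000):
        total += i * i
    return total

# Run the same script under each interpreter (e.g. python vs. pypy)
# and compare the reported times.
elapsed = timeit.timeit(workload, number=100)
print("100 runs took %.3f seconds" % elapsed)
```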

I also recall reading PyPy's blog, especially a blog post about their Prolog based JIT prototype and I thought "wow, this sure seems like a complicated way to implement a JIT engine, I wonder if they will ever implement something that can run real code".

Fast forward to today: Unladen Swallow is dead, and PyPy has delivered a Python implementation that's compatible with Python 2.7 and beats CPython 2.6 on various benchmarks. Pretty impressive, and kudos to the PyPy team.

Exactly my thoughts -- it seemed like a crazy research project, generating an interpreter written in C, and ones for the JVM and CLR. And it has 3 different pluggable garbage collectors, apparently.

But I'm impressed that it actually runs my code, in contrast to the other alternative Python implementations I've tried.

Look forward to seeing more of sandboxed mode, which has been a much desired feature of CPython for many years now. And stackless mode. I just hope they can get the executable size and build times to a reasonable level.

Many people seem to ignore that PyPy is not a new project. It's been in the works for more than 8 years now.

It's also had a lot of funding, and more full-time workers than CPython (which has zero full-time coders).

Unladen Swallow isn't dead, actually -- it just merged into the main CPython code base. The first few quarters of optimizations are in Python 2.7 (it's noticeably faster than Python 2.6), and the more adventurous bits are on separate branches in SVN.

somewhere on the net are comments i made about european idealists v american pragmatists (the whole "worse is better" thing).

i have never been so glad to be proved completely wrong :o)

I'm impressed that they support the features of CPython 2.7 already. PyPy seems to be developing much faster than other alternate python implementations.

IronPython actually beat us to 2.7; however, I will note that our pace was pretty fast. We went from 2.5 to 2.7 support in under 12 months.

I will love it if in a couple of years the mainstream implementation of Python is PyPy. People are going to be upgrading their Python installations anyway when they move to Python 3. If y'all can get Python 3 support in, maybe people will switch to PyPy at the same time.

I just wish Google would throw some support your way.

Indeed, what are your current plans for Python 3.x?

Future cloudy, ask again later :) Because of the way our JIT is written, it would be entirely possible for us to branch for py3k and put JIT improvements on both with relatively little effort; it's likely that at least our next release will still be 2.x and just feature optimizations, though.

Here's one vote for focussing on py3k sooner rather than later, fwiw :). I understand you're anxious to reap some rewards and see (more) production use at this point, but as a big fan of much of what was done to the language in py3k and having made the switch already, PyPy vs. CPython 3.x is an unhappy dilemma for me :).

For me (and I suspect quite a lot of other people), PyPy and CPython 3 both suffer from the same problem: lack of support for third-party libraries. Moving to Py3k would only exacerbate that problem for PyPy, removing their current support for the likes of Django and Twisted.

I'd assume they'd very likely keep up support for Python 2.7.x for quite some time past the first py3k-compatible release, similar to what CPython is doing.

Although personally I actually wouldn't mind giving up one for the other, as that would allow PyPy to act as an extra incentive to drive py3k adoption in third-party libraries. Of course I understand if the PyPy developers don't just want to use their hard work as a gaming piece to drive the py3k adoption agenda, though :).

I wonder how hard the integration of Stackless and the JIT features is. That would be killer.

There's actually a sprint going on right now (where this release was done), and that was a topic of discussion. I don't think anyone has written up the conclusions yet, but the estimate I heard was that it'd be about a month of person-work.

Fantastic... Christian worked hard on Stackless, and I think it's a great feature. On PyPy it would be easier to maintain and to add functionality. Unfortunately, I struggle with Stackless internals and its low-level C gimmicks, so I can't help with that. It would be great to have something like a good framework in the future to make its adoption easier. We have the Stackless Examples project to address this initial learning-curve thing: http://code.google.com/p/stacklessexamples/

Nice, the CCP guys are great at maintaining Stackless... Kristjan and Richard do a great job on it. If there are any clues or tips on how I can help out, it would be a pleasure.

Have you asked in the Stackless mailing list?

Yes, unfortunately Christian couldn't make it to the sprint; we would have liked to have him there. However, Kristjian (sp?) from CCP, who maintains Stackless, was there.


Any chance you could update your Ubuntu PPAs soon?

My sentiments exactly: I would love to have a reasonably up-to-date PyPy around to try things in, but I'm not yet so invested in the project that I'm willing to watch for release announcements and download source-tarballs (or binary tarballs). Having a reliably updated PPA is pretty much exactly what I'm looking for.

Since you can't see votes, pretty please yes, this.

Maybe this is a dumb question but why not just work on making CPython faster?

Edit: I guess this provides insights:


I wish I could find the video of the talk where we explained this in great detail, but basically: it's hard, and it's been tried and has failed (notably by Armin Rigo, one of the leads of PyPy). Adding a new GC to CPython is insanely hard, and manually writing a JIT for Python even more so.

Unladen Swallow seems to indicate that making CPython substantially faster in the general way is not feasible as of now.

I'm really interested in this project. The abundance of C as "the new machine language" for language implementation, as well as the JVM-based languages, makes me wonder what it means for the programming language landscape. I would be very interested in learning whether this un-C implementation made any difference to Python, and what can be learned from it by way of other language implementations.

I'm not really interested in the Python interpreter that PyPy is, but the other part of PyPy (the toolchain) is maybe the best thing since sliced bread.

I plan to start working on the Scheme implementation.

The thing I don't really like about the project is that it has one name for two things. I can understand how that came about historically, but I think they should make two names out of it.

We've started referring to it as the "RPython translation toolchain" which will hopefully reduce confusion.

You should think of a snappier name. Other than that keep up the good work. I'm atm reading all the pypy papers, very cool.

you should maybe have a look at the Scheme interpreter that was already written: https://bitbucket.org/pypy/lang-scheme/overview

The benchmark here http://attractivechaos.github.com/plb/ suggests they are doing much better on performance than they were a while back. It's a straight numeric benchmark, so not relevant for everything, but it's better than V8, which is good work.

1.5 should be a lot better on these; the Loop Invariant Code Motion really helps on very tight loops, which these benchmarks often have.
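(For anyone unfamiliar with the optimization, here's a hand-written sketch of the idea behind loop-invariant code motion: work that doesn't change between iterations gets computed once, outside the loop. PyPy's JIT performs an analogous transformation on its traces automatically, hoisting guards, boxing, and repeated lookups. Function names here are just illustrative:)

```python
def dot_naive(xs, ys):
    # len(xs) is conceptually re-evaluated on every pass through the loop
    total = 0.0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total

def dot_hoisted(xs, ys):
    # The same loop with the invariant hoisted out by hand -- roughly
    # what the JIT does for you on a hot trace.
    n = len(xs)
    total = 0.0
    for i in range(n):
        total += xs[i] * ys[i]
    return total
```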

That's great. Dynamic languages at, say, Java-like speeds are a real game changer over PHP/Ruby-level speeds. At the moment we only have LuaJIT proving it's possible, but PyPy is the next contender.

While this is great news for Python, there are quite a few dynamic languages now beyond LuaJIT showing that it's possible to get stellar perf - JavaScript, Racket, Clojure.

Last I heard, no JS engine was close to LuaJIT's performance; they were all worse by a factor of five or so. Clojure being in the LuaJIT ballpark would surprise me (do you have benchmarks?) but Racket wouldn't. (But I wouldn't call Scheme a dynamic language!)


Clojure 1.2.0 is already faster than Racket. Version 1.3.0 will probably bring it close to Go territory on those benchmarks.

For my own work I've found that getting Clojure in the Java ballpark is certainly possible.

You're right; it looks like Racket is not in the C/LuaJIT ballpark, either. It's too bad the LuaJIT results are no longer on the web site.

Last Alioth results I saw LuaJIT was slower than Java, and certainly not in the C ballpark at all.

http://lua-users.org/lists/lua-l/2009-10/msg01098.html shows LuaJIT beating GCC 4.3.2 on some parts of SciMark, and no more than 3× worse on any part. I'm importing the shootout CVS repository into Git, so hopefully we can make some more definitive comparisons soon.

That post is ancient ... here are newer SciMark results for LuaJIT:


Thank you very much! Looks like LuaJIT's beating GCC on almost everything now, and the JVM on most things. But that's with LuaJIT-specialized code that won't run on Lua, which is a little less exciting.

Those overall results show that LuaJIT doesn't really compete much with the JVM running in server mode.

On the contrary, although the JVM is always faster, LuaJIT is within about 15% of it on SOR, and never worse by even a factor of 2.

> But I wouldn't call Scheme a dynamic language!

Care to elaborate?

Scheme operations are mostly early-bound, even more so than Common Lisp; you standardly use vector-ref or string-ref, for example, instead of nth. (There are Schemes where this is not true, such as RScheme, but Racket is not among them.) Vectors aren't resizable; for that kind of thing, you must use lists. You can't attach arbitrary properties to some arbitrary object, the way you can in JS or Ruby. You can't introduce new variable bindings in the middle of a block. (What you do instead is make a new nested block.) Finite maps (assoc, make-hash-table) are second-class citizens. Standard Scheme is nonreflective; you're supposed to do your metaprogramming with macros, not reflection and interposition. (I don't know how much reflection Racket supports, but I'd guess almost none.) The object-literal syntax is quasiquote, which is clumsy.

It does have dynamic typing, but that's not what it means to be a "dynamic language". Dynamic languages are late-bound; everything is up for grabs at runtime. In standard Scheme, this is true in the fairly useless sense that everything depends on the bindings in the global namespace, so you could in theory (set! car somethingweird), but this is very rarely practical; it almost serves only to make optimizing Scheme compilers difficult to write. Aside from that, the language leans much further toward early binding, doing things at compile-time, and using efficient data structures at the cost of flexibility.
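(To make the contrast concrete, here is the sort of runtime mutability meant by "late-bound" -- ordinary in Python, JS, or Ruby, but mostly absent from standard Scheme. The class and names are just illustrative:)

```python
class Thing:
    pass

t = Thing()
t.color = "red"          # attach an arbitrary property to an object at runtime

def shout(self):
    return "HELLO"

Thing.greet = shout      # add a method to an existing class at runtime
print(t.greet())         # the new method is dispatched late, at call time
```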

I don't think the meaning you give to "dynamic" here has much bearing on the difficulty of just-in-time compilation, except for reflection.

In Scheme you can define and create new functions and macros at runtime. There is not much else that is dynamic about the Scheme standard[1] because... there's not much else to the Scheme standard.

As for reflection, I see that mainly as a tools/debugging aid, and therefore very implementation-specific stuff. "Dynamic" languages have reflection in their definition because they are mostly defined by implementation.

That said, the Scheme standard doesn't preclude reflectivity at all, nor seems particularly compile-oriented to me. Nowhere it is assumed that implementations be compilers at all.

I think most of the things you find missing there were not left out for efficiency, but because it was rightly considered that they don't belong in the language spec at all.

[1] By the "Scheme standard" I mean R5RS.

I agree, none of these features have much to do with the difficulty of just-in-time compilation. The stellar performance results Mike Pall is getting with LuaJIT, which has all of those features, kind of prove that. They do, however, have a lot to do with the usability and dynamic feeling of a language. This makes their absence from most Schemes less understandable.

> As for reflection, I see that mainly as a tools/debugging aid

You can use it that way, but you can also use it for metaprogramming.

> "Dynamic" languages have reflection in their definition because they are mostly defined by implementation.


> That said, the Scheme standard doesn't preclude reflectivity at all, nor seems particularly compile-oriented to me. Nowhere it is assumed that implementations be compilers at all.


Most of the things you've said, I imagine, don't apply to Racket; I didn't say anything about Scheme in my original post.

Well, I imagine they do, but I'd be interested in finding out whether my imagination is misleading me. Which things I said don't apply?

Indeed. The LuaJIT interpreter with JIT disabled is a similar speed to V8 in most benchmarks.

Don't leave Common Lisp out - SBCL is no slouch and hasn't been for quite some time.

Totally forgot, Common Lisp's performance w/o sacrificing dynamism is what got me excited about Lisp in the first place.

My very limited experience with SBCL has been that, although you can get performance that's considerably better than CPython, you don't get close to C performance without sacrificing dynamism and safety.

Wow, it looks like Python (PyPy) is within an order of magnitude of C on every benchmark. Nice work!

Can numpy be used with pypy? Or are there plans for that?

There are plans, but nothing yet, this reddit thread has some decent information: http://www.reddit.com/r/Python/comments/h0uuv/pypy_15_releas...

I'd like to second the request for a very detailed blog post written by one of the PyPy devs on how interested parties can contribute to the development of Numpy on PyPy. Lack of Numpy support is the showstopper for me and a lot of other people in terms of switching over -- almost everybody who wants faster Python is using Numpy in some way or another :-) This is something I would be really interested in working on.

Not really; numpy depends too much on CPython internals. There is interest in rewriting numpy for PyPy, but I am not sure whether this is the best way of doing it. From what I understand, the choice is between improving PyPy's compatibility with C extensions (easier, but slower because of the CPython emulation in PyPy) vs. rewriting it for PyPy (harder, but it would benefit more from PyPy).
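(One concrete illustration of the "CPython internals" problem, with made-up file names: CPython uses reference counting, so objects are finalized the moment their last reference dies, and C extensions like numpy manipulate those counts directly through the C API. PyPy's GC behaves differently, so idioms that lean on prompt finalization are unreliable there:)

```python
import os
import sys
import tempfile

path = os.path.join(tempfile.gettempdir(), "pypy_demo.txt")

# CPython-only introspection: sys.getrefcount exists because the
# interpreter is refcounted. C extensions poke at these same counts
# via Py_INCREF/Py_DECREF, which is part of why numpy is hard to port.
if hasattr(sys, "getrefcount"):
    print(sys.getrefcount(object()))

# Relying on refcounting for prompt cleanup -- fine on CPython, but on
# PyPy the file may stay open (and unflushed) until the GC gets to it:
#   open(path, "w").write("data")
# The portable idiom closes deterministically on any implementation:
with open(path, "w") as f:
    f.write("data")
```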

Why did they decide on a name that can easily be (mis)pronounced as a synonym for urine? That just seems like a poor branding decision, up there with "blur-ray".

I've always pronounced it like "pie pie". (But then again, maybe that's because I already knew the "py" was the first syllable of the word "python"?)

Worked out ok for Nintendo on the Wii...

Fair enough.

I wouldn't say Blu-ray has much to worry about on the "blur-ray" front. Plays on words like that are damaging when they call attention to real or perceived weaknesses in a product, but while I can think of a lot of mean things to accuse Blu-ray of, blurry simply isn't one of them.

What language uses an "ee" sound for "y"?

And who pronounces "blue" as "blur"?

Spanish does.

Not agreeing with grandparent, though. Lots of people admit to playing with their Wiis for hours a day.

It's not pronouncing "blue" as "blur", but extending or duplicating the r.

