

PyPy 2.5.0 released - cyber1
http://morepypy.blogspot.com/2015/02/pypy-250-released.html

======
aidos
_" We would like to thank our donors for the continued support of the PyPy
project, and for those who donate to our three sub-projects, as well as our
volunteers and contributors (10 new committers joined PyPy since the last
release). We’ve shown quite a bit of progress, but we’re slowly running out of
funds. Please consider donating more, or even better convince your employer to
donate, so we can finish those projects! The three sub-projects are: "_

PyPy are doing such incredible work and they seem to ask for very little in
the way of funding to make it happen (and they're very explicit about where
they're going to spend the money).

Why is it that there aren't more donations from the industry? Is it just a
marketing issue? Do they need to do a snazzy kickstarter video to build the
hype?

~~~
fijal
PyPy dev here. I would be interested in exploring ideas for what can be done
to improve our funding situation. We seem to have moderate success with these
fundraisers, but nothing REALLY huge.

~~~
aidos
Unfortunately, it's not something I have answers to. I guess I just look at
the success of something like the Micro Python campaign [0] (also a cool
project) and wonder if something similar could be achieved.

There are plenty of companies that could benefit from PyPy doing well, though.
How many Python servers do YouTube / Reddit etc. run? With some investment in
PyPy they could reduce their ongoing costs. YouTube could easily dump $100k
into the PyPy project, so what's stopping them? (I suspect the answer in that
case is that Google just aren't interested.)

I'm in a startup that doesn't have money to invest in PyPy, but if we had
more cash we'd definitely look at putting some into PyPy (though our use case
is the hairy NumPy side of things).

I guess in the case of PyPy it's maybe less interesting to individuals, but
there may be some way of reframing it to make it compelling.

In a way, I'm sort of the target market here, because I use python all the
time and I haven't donated to PyPy. So why haven't I donated (other than being
quite broke because I'm starting a business)?

I think part of it is the urgency. There are 3 specific campaigns you're
running [1] to get funding, along with funding for the general project itself.
And the campaigns themselves are great - you have good overviews of the issues
that you're solving, a realistic approach and a proven track record of being
able to execute. But I don't know when those campaigns started and what the
cut-off date is. They sort of feel ongoing and it feels like there's no loss
in not donating right now. Compare that with something like Kickstarter, where
there's a date by which, if there hasn't been enough funding, you may end up
losing out on something really cool.

Sorry for the rather rambling brain-dump, but maybe there's some useful
perspective here.

[0] [https://www.kickstarter.com/projects/214379695/micro-python-...](https://www.kickstarter.com/projects/214379695/micro-python-python-for-microcontrollers)

[1] [http://pypy.org/tmdonate2.html](http://pypy.org/tmdonate2.html)
[http://pypy.org/py3donate.html](http://pypy.org/py3donate.html)
[http://pypy.org/numpydonate.html](http://pypy.org/numpydonate.html)

~~~
gwys
I think the number one thing is that you have to give people a reason to pay.
Being useful is of course the first step, but people also want to feel they're
getting something "extra"; they want to feel like it's a transaction, however
bad the deal.

One of those things can be to provide a very clear roadmap tied to donations
and then follow up on it. That's what's so powerful about these crowdfunding
campaigns: people feel like they are making a difference, even though at times
they are essentially buying something.

I don't think you necessarily have to "compromise your values" either. You're
probably already limited by money, so add some planning and just be honest
about it. "With this amount, this will happen" etc. As always of course
"under-promise and over-deliver".

That's mainly for individuals though. Companies are somewhat the same, but
they want things that fit into their budget.

------
kbd
Congrats to the PyPy team on what sounds like a pretty big release!

Something in the release notes caught my eye:

> The past months have seen pypy mature and grow, as rpython becomes the goto
> solution for writing fast dynamic language interpreters.

I asked this question[1] on the Perl 6 thread from a few days ago but didn't
get an answer. Does anyone know why on earth the Perl 6 folks created yet
another dynamic language VM+JIT with MoarVM instead of taking advantage of all
the great work done with PyPy? Does anyone know whether PyPy was even
considered as a target before writing MoarVM?

[1]
[https://news.ycombinator.com/item?id=8982229](https://news.ycombinator.com/item?id=8982229)

~~~
bkeroack
Perl6 started long before PyPy (2000 vs ~2007).

~~~
scott_karana
And yet, work began on MoarVM only in 2012.

[http://en.wikipedia.org/wiki/MoarVM](http://en.wikipedia.org/wiki/MoarVM)

~~~
simula67
I think MOAR was designed specifically to run NQP efficiently.

NQP is a subset of Perl 6 which the Perl 6 compiler is written in. That
compiler takes a program written in Perl 6 and converts it into NQP. That NQP
program is then compiled to different targets such as JVM, Parrot etc. This is
my understanding, of course.

~~~
rurban
I don't want to hijack a pypy thread with perl stuff, but this needs to be
corrected:

> I think MOAR was designed specifically to run NQP efficiently.

yes

> NQP is a subset of Perl 6 which the Perl 6 compiler is written in.

yes

> That compiler takes a program written in Perl 6 and converts it into NQP.

No. moar is a vm backend for nqp, with a traditional bytecode interpreter, a
jit, gc, ffi, and binary format.

>That NQP program is then compiled to different targets such as JVM, Parrot
etc. This is my understanding, of course.

No. Of the current three perl6 backends (moar, parrot and jvm), moar is the
fastest, but it has problems with its traditional threading model. It does not
scale linearly with the number of physical cores; it needs locks on all data
structures, though this is still better than with perl5 or pypy. parrot does
scale linearly and has locks only in the scheduler, but it needs compiler
support to create writer functions: only the owner can write, and access goes
through automatically created proxies. The jvm threading model is also faster
and well known, but comes with a huge startup overhead.

perl5 has to clone all active data on thread init and create writer hooks to
copy updates back to the owner. This is a huge startup overhead, similar to
the jvm's.

Overall, would you expect any C/perl compiler dev to switch over to rpython to
back your perl6 vm? Writing a jit for a good vm was a matter of a summer, and
the compiler optimizations need to be written in the high-level language of
choice to be efficient and maintainable.

A better comparison for rpython is rperl, which takes a similar approach: a
fast, restricted, optimizable language subset which compiles to typed C++,
with fallback to the slow perl5 backend for unoptimizable types. There is no
need for a jit with its compilation and memory overhead; a jit only makes
sense for dynamic profile-guided optimizations, as in pypy or v8.

Even so, simple static jits as in p2 or luajit, without much in the way of
compiler optimizations, still run circles around those huge battleships: the
jvm, clr, v8, pypy or moar. Optimized data structures still lead the game, not
optimizing compilers.

~~~
simula67
rurban, I just want to point out that I was not calling for an rpython target
for nqp nor was I claiming nqp is the rpython of the perl world. Not really
sure what you are correcting.

>No. moar is a vm backend for nqp, with a traditional bytecode interpreter, a
jit, gc, ffi, and binary format.

But surely NQP is a "language" and Moar is a VM?

------
Fede_V
I'm really looking forward to the numpy-specific announcements. Numpy is THE
basic building block for every single scientific library; if pypy can get a
high-performance numpy, that will go a long way towards allowing scientific
users to use pypy (there is still the detail of libraries that use the C API
to wrap C libraries, but cffi is pretty neat).

~~~
rcthompson
If most of your code is numpy stuff, will you actually see a speedup from
PyPy? (I mean hypothetically, once NumPyPy is properly optimized.)

~~~
dagss
Yes. Ignoring that most people need a loop now and then, even pure NumPy-using
code has a lot of potential.

Consider "a + b + c + d". For large arrays, the problem is that it creates
many temporary results of the same size as the original arrays, all of which
must be streamed over the memory bus. And since FLOPs are free and your
computation is limited by memory bandwidth, you pay a large penalty for using
NumPy (which gets worse as the expression gets more complex).

Or "a + a.T"... here you can get lots of speedup using basic tiling
techniques, to fully use cache lines rather than read a cache line only to get
one number and discard it.

And so on. For numeric computation, there are large gains (like 2-10x) from
properly using the memory bus that NumPy can't take advantage of. So you have
projects like numexpr and Theano that mainly focus on speeding up non-loop
NumPy-using code.
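
As a rough illustration of what numexpr buys you (assuming numexpr is
installed; exact numbers depend on array sizes and hardware), it evaluates the
whole expression blockwise in one pass over memory instead of materializing
one temporary per operator:

    import numpy as np
    import numexpr as ne

    n = 5 * 1000 * 1000
    a, b, c, d = (np.random.rand(n) for _ in range(4))

    # Plain NumPy: each "+" allocates a full-size temporary and
    # streams all operands over the memory bus again.
    r1 = a + b + c + d

    # numexpr: parses the expression string and evaluates it in
    # cache-sized blocks, reading each chunk of a/b/c/d only once.
    r2 = ne.evaluate("a + b + c + d")

    assert np.allclose(r1, r2)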

------
mrmagooey
Assuming trunk and 2.5.0 are roughly the same thing, it seems like a decent
performance increase: [http://speed.pypy.org/](http://speed.pypy.org/)

~~~
mikeash
I just tried it on a heavy Python workload I run regularly and it looks like a
substantial improvement for my purposes. I'm just measuring by eye, but for
this particular code I'd say it's around a factor of two faster. This code is
really dict-heavy, so I'd guess I'm benefiting a lot from those improvements.
I love this project and the continual improvements it puts out.

~~~
mattip
PyPy dev here. Could you give us some info on how we can help you more, for
instance a small benchmark of where we are still slow? We love to hear how
PyPy is being used "in the real world".

~~~
mikeash
I was too optimistic in my eyeballed assessment, but a real measurement shows
there's still a good improvement. I gave my program a typical workload and
tested with both PyPy 2.4.0 and 2.5.0, and the result was 13:31 on 2.4.0 and
11:22 on 2.5.0. That's great!

What would be the best way to profile this thing under PyPy to see where it's
spending time? I'm barely familiar with Python profiling in general, and
totally unfamiliar with PyPy specifically.

As for how I'm using it, I belong to a glider club. For every weekend day we
operate (which is every weekend from around the end of February through mid-
December) we need to assign four club members to carry out various tasks for
the day. (Specifically, we need a designated tow pilot, an instructor, a duty
officer, and an assistant.) I'm the scheduling guy, and I wrote a Python
program to generate the schedules for me. I put in various constraints (when
people are unavailable, what days they prefer, etc.) and then the program uses
a hill-climbing algorithm along with a bunch of scoring routines to find a
good schedule. The actual workload operates on Schedule objects, which just
contain a list of dates and the members assigned to each position on each
date. Then I make a ton of duplicates, each with one adjustment, score them
all, pick out the best, repeat. I can also optionally go for two adjustments,
which takes much longer but gives better results, and that's what ends up
taking 10+ minutes as above.
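
To give a feel for the shape of it, here's a minimal sketch of that
hill-climbing loop (the names are made up, not from the actual program; the
scoring routines are where all the time goes):

    def hill_climb(schedule, score, neighbors, max_iters=10000):
        # Greedy hill climbing: repeatedly take the best
        # single-adjustment neighbor until nothing improves.
        best, best_score = schedule, score(schedule)
        for _ in range(max_iters):
            # Make a ton of duplicates, each with one adjustment,
            # and score them all.
            candidates = [(score(s), s) for s in neighbors(best)]
            if not candidates:
                break
            top_score, top = max(candidates, key=lambda c: c[0])
            if top_score <= best_score:
                break  # local optimum: no single change helps
            best, best_score = top, top_score
        return best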

~~~
fijal
Hey, if your code is Open Source, then I would be willing to look at your code
and help you optimize it. Hit me on #pypy on IRC or just send me a mail at
fijall at gmail.

------
rcarmo
FYI, you'll still need to compile a specific gevent branch if you want to use
it with this. lxml built fine, uWSGI seems OK too (except for the lack of
gevent workers in my build).

Things seem adequately speedy, haven't investigated the network throughput
tweaks yet.

~~~
kapilvt
which branch is that?

[update] [https://travis-ci.org/gevent/gevent](https://travis-ci.org/gevent/gevent) passes w/ pypy fwiw.

~~~
rcarmo
Just tried the current pip package and it doesn't build for me on OSX or
Ubuntu. I have a local snapshot of the version I used with 2.4.0, but can't
remember where I got it from now.

------
ngoldbaum
Does anyone have any experience with numpypy? Is it useful for real work yet?

~~~
Derbasti
In my experience, almost everything works. Also, workarounds are usually easy.

[http://buildbot.pypy.org/numpy-status/latest.html](http://buildbot.pypy.org/numpy-status/latest.html)

------
bmoresbest55
This is very exciting. I will have to look into using PyPy more regularly.

------
ldng
Now if only someone could fix Swig to be compatible with PyPy...

~~~
sitkack
What packages are you using that use Swig? Can cffi or cppyy work for you?

~~~
ldng
I'm thinking of the Mapscript bindings that allow interacting with Mapserver.
They generate bindings for several languages through Swig.

Some projects use Swig precisely because of that characteristic: projects with
"low bandwidth" for whom maintaining a different binding generator for every
language they support would be painful.

In my case, I could probably hack a minimal cffi binding. But solving the
problem at the root would be a better solution for everyone.

Swig apparently already uses some form of cffi for Common Lisp, so maybe
python-cffi support could be derived from that. I don't know for sure, just
thinking out loud.
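
For reference, the ABI-mode cffi pattern such a minimal binding would use
looks roughly like this (the library name and functions below are hypothetical
stand-ins, not the real Mapserver API):

    from cffi import FFI

    ffi = FFI()
    # Declare only the C entry points actually needed.
    # (Hypothetical names, not the real Mapserver headers.)
    ffi.cdef("""
        void *msNewMapObj(void);
        int msDrawMap(void *map);
    """)
    # ABI mode: load the shared library at runtime, no C compiler needed.
    lib = ffi.dlopen("libmapserver.so")  # hypothetical library name

    m = lib.msNewMapObj()
    status = lib.msDrawMap(m)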

~~~
fijal
The problem with SWIG is that current SWIG bindings would not be reusable. Any
major project using SWIG for Python already has a bit of glue, and this glue
tends to use the CPython C API. SWIG has other modes, but that has not been
the common case in the past.

~~~
ldng
Sure, but still, having a way within SWIG to generate both standard Python
bindings and more cffi-friendly Python could help the transition.

That way, new projects can still benefit from the strength of SWIG, which lets
them offer bindings for multiple languages with a single tool. Because, let's
not kid ourselves, using a specific binding lib for every single language you
want to support has a cost. And in that case backward compatibility does not
matter much.

All the while giving a transition path to projects still using what I would
call the legacy way, or to those who want or need to poke into the CPython
API.

~~~
sitkack
I haven't used SWIG in 15 years, but having it emit a cffi shim would be a
great way to get existing projects ported over to a more portable interface
without requiring them to change anything.

------
tbrock
I wish you could compile scons with PyPy.

~~~
mattip
You mean "I wish you could run scons with PyPY". It looks to me like scons
does

contents = open(file).read()

where they should be doing

with open(file) as fid: contents = fid.read()

~~~
anon4
What is the difference between the two?

~~~
objectified
The latter uses a Python context manager to open the file, the former doesn't.
By using open() as a context manager like this, you don't need to worry about
closing the file yourself: the file handle is yielded to the code inside the
'with' block, and the context manager closes it when the block exits. The
former version does not store a reference to the file handle, and thus cannot
close it explicitly. On CPython that usually works out anyway, since
refcounting closes the file as soon as the expression is done with it; under
PyPy's garbage collector the file can stay open until the next collection, so
a program that opens many files this way can run out of file descriptors. Of
course you can close file handles without using a context manager too:

    
    
      f = open('foo.txt')
      contents = f.read()
      f.close()
    

But I'd prefer this way:

    
    
      with open('foo.txt') as f:
        contents = f.read()
    

Context managers can of course be used for all kinds of things. For more
information, see
[https://docs.python.org/2/library/contextlib.html](https://docs.python.org/2/library/contextlib.html)
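
To make that concrete, here's a hand-rolled equivalent of open()'s
context-manager behaviour, written with contextlib (just a sketch; the
built-in file object already does this for you):

    from contextlib import contextmanager

    @contextmanager
    def opened(path):
        f = open(path)
        try:
            yield f      # the body of the "with" block runs here
        finally:
            f.close()    # always runs, even if the block raises

    with opened('foo.txt') as f:
        contents = f.read()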

~~~
codefisher
The equivalent code without the context manager is actually this:

    
    
      f = open('foo.txt')
      try:
         contents = f.read()
      finally:
         f.close()
    

So yes, extra reason to prefer the context manager.

