
PyPy v5.8 released - pettou
https://morepypy.blogspot.com/2017/06/pypy-v58-released.html
======
boultonmark
I have a question and then a general vent

1\. Does anyone know the latest update on NumPyPy? PyPy for me is just not a
usable proposition because I heavily use Numpy (and Scipy et al). So I am
forced to use slow Python + fast Numpy or slow Numpy + fast Python. Very
saddening. The C-Extension is just so off the pace, NumPyPy was meant to solve
that quandary.

And I know some smart Alec will trot out the usual 'downshift into C' line
that everyone (including Guido) uses as the final go-to solution for
performance, but that is simply a disgrace in 2017. Even JavaScript is fast.
Why can I not choose to write Python and have it be fast? And yet Python 3 is
getting slower.
Don't agree? Look at these benchmarks of Python heaps written in Python (not
using the C based builtin heapq):
[https://github.com/MikeMirzayanov/binary-heap-benchmark](https://github.com/MikeMirzayanov/binary-heap-benchmark)
Python generally is off the pace, but Python 3 is about twice as slow as 2 and
miles off JavaScript.
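
To make that comparison concrete, here is a minimal sketch of the kind of test being described: a binary min-heap written in plain Python, timed against the C-backed `heapq` on the same data. It is not the code from the linked benchmark; the `BinaryHeap` class, the helper names and the input size are illustrative assumptions, and the timings will differ by interpreter and machine.

```python
import heapq
import random
import time


class BinaryHeap:
    """Minimal pure-Python binary min-heap (illustrative, not the benchmark's code)."""

    def __init__(self):
        self._items = []

    def __len__(self):
        return len(self._items)

    def push(self, value):
        items = self._items
        items.append(value)
        i = len(items) - 1
        # Sift up: swap with the parent while the parent is larger.
        while i > 0:
            parent = (i - 1) // 2
            if items[parent] <= items[i]:
                break
            items[parent], items[i] = items[i], items[parent]
            i = parent

    def pop(self):
        items = self._items
        last = items.pop()
        if not items:
            return last
        smallest, items[0] = items[0], last
        i, n = 0, len(items)
        # Sift down: swap with the smaller child while that child is smaller.
        while True:
            child = 2 * i + 1
            if child >= n:
                break
            if child + 1 < n and items[child + 1] < items[child]:
                child += 1
            if items[i] <= items[child]:
                break
            items[i], items[child] = items[child], items[i]
            i = child
        return smallest


def heapsort_pure(values):
    """Sort by pushing everything onto the pure-Python heap and popping it off."""
    heap = BinaryHeap()
    for v in values:
        heap.push(v)
    return [heap.pop() for _ in range(len(heap))]


def heapsort_heapq(values):
    """Same workload, but using the standard library's heapq."""
    heap = []
    for v in values:
        heapq.heappush(heap, v)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def time_call(func, values):
    start = time.perf_counter()
    func(values)
    return time.perf_counter() - start


if __name__ == "__main__":
    data = [random.random() for _ in range(100000)]
    print("pure Python heap:", time_call(heapsort_pure, data))
    print("heapq:           ", time_call(heapsort_heapq, data))
```

Running the same file under CPython and under PyPy is the simplest way to see how much of the gap between the two functions comes from the interpreter.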

But PyPy is proof that Python can be fast. It makes quote/unquote "Pure
Python" within striking distance of Go, and when I run that test suite on
PyPy, it's similar to the Node.js score. Why does this matter? Because I want
to write bloody Python not C.

And it is so tantalisingly close - look at a blog post like:
[https://dnshane.wordpress.com/2017/02/14/benchmarking-
python...](https://dnshane.wordpress.com/2017/02/14/benchmarking-python-
heaps/) The performance of the Fibonacci Heap that someone wrote in
quote/unquote "Pure Python", when run in CPython can never compete with heapq
(the C based builtin lib), but on PyPy it can. Fast code written in Python. So
what are the problems holding back PyPy? I think possibly money and number of
devs working on stuff. JavaScript had Mozilla, Google, Microsoft and Apple in
a browser war + loads of open source input.

But is the biggest stumbling block not Guido himself and the core Python devs?
Do they just philosophically not agree with PyPy or is it just disinterest?

Well whatever it is, it is heart-breaking to want to write fast code in my
favourite language and leverage all its power including Numpy/Scipy etc and
not be able to. And yes my use-case is perhaps quite unique, a very CPU
intensive service that ideally computes and returns a real-time calculation
(that includes 500k function calls) in 10-50ms.

But getting fast Numpy in the PyPy mix (i.e. all the speed of the JIT + no
worse Numpy) would be a HUGE step forward for me in PyPy adoption. What is the
latest? How can I help?

~~~
fijal
In short: funding. If we can find someone who wants fast numpy AND fast
python under the same hood, we can combine the approaches of cpyext and
numpypy and make it fast. The project is just too big to do on spare time.
I've been trying to find some funding for that for quite a while, but I
haven't been able to find any sizable backer just yet.

Cheers, Maciej Fijalkowski

~~~
boultonmark
Maciej, how much would that require, ballpark? I think this is something there
would be massive support for. My company would support it.

~~~
bastawhiz
How much does it cost to pay a knowledgeable engineer for a few years?
Probably the better part of a million dollars, at least.

~~~
boultonmark
Is your name Maciej?

~~~
surye
Is this a private correspondence?

------
pjmlp
Awesome work, congratulations on bringing Python forward.

Still wishing one day PyPy might become the canonical implementation.

------
mattbillenstein
PyPy is great -- while I still use CPython for our more complex webapp and
associated tools that have heavy dependencies on C-extensions; I increasingly
use PyPy for the more mundane cpu/data heavy lifting I do. It's typical to get
2X the performance (comparable to some compiled languages) and still use much
of our utility code, configs, etc.

~~~
rubber_duck
>comparable to some compiled languages

Given that Python programs usually run an order of magnitude slower than
compiled languages, even a 2x performance increase doesn't put it in the
"comparable" range in my experience. Not bashing Python (I use it regularly),
but for computational stuff it's a hog unless you're just passing stuff to C
libs. For example, I have a resource build pipeline that does some Blender 3D
model transformations. The code is written in Python and takes forever.
Equivalent code in C++ would take roughly 1/100 of the time and performance
would be irrelevant. At the moment we're seriously considering rewriting
parts in C++ to reduce build times.

~~~
lqdc13
Blender Python lib by default is not optimized much. It has nothing to do with
Python as a language.

Use numpy for matrices. If you have to implement an algo with a hot inner
loop, use cython or numba.

I've never seen a 100x difference in a Python-to-C++ rewrite if the Python
was already optimized.

Here is a good article about some of the options: [https://rare-
technologies.com/word2vec-in-python-part-two-op...](https://rare-
technologies.com/word2vec-in-python-part-two-optimizing/)

~~~
dr_zoidberg
The one time I saw a 100x increase in performance going from Python to C
(which was done through Cython) was in code that calculated a machine-learning
related distance between two strings. The code accessed particular positions
in the strings a lot, which in pure Python meant slow retrieval of every
character (lots of .__getitem__ calls). The optimized version used 2
predefined empty arrays (in heap, not stack, plus their corresponding counters
of valid items), then walked the strings and stored the "hot" values in them.

So it was a very specific case where we could get that 100x speedup at work.
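
The comment does not say which distance was used, so the following is only a sketch of the general pattern, with Levenshtein edit distance as a stand-in: an inner loop that indexes into both strings on every step and keeps its working state in two preallocated rows. The function name and the two-row layout are assumptions for illustration, not the original code.

```python
def levenshtein(a, b):
    """Edit distance between two strings using two reusable rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the rows as short as the shorter string
    previous = list(range(len(b) + 1))
    current = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current[0] = i
        ca = a[i - 1]  # one indexing operation per outer step
        for j in range(1, len(b) + 1):
            # b[j - 1] is the per-character lookup that dominates in CPython.
            cost = 0 if ca == b[j - 1] else 1
            current[j] = min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            )
        # Reuse the two rows instead of allocating a new list per iteration.
        previous, current = current, previous
    return previous[len(b)]
```

Loops of this shape, with many tiny indexing and comparison operations, are where typed Cython code or a JIT tends to help most, because the per-operation interpreter overhead is large relative to the work done.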

~~~
dr_zoidberg
Just noticed after a while: it was "in the stack, not heap" (about the
arrays).

------
robocaptain
Coming from someone who uses python but doesn't really follow alternative
compilers, PyPy sounds great. What are some of the downsides, if any? Are you
sacrificing library compatibility for faster core+standard libs?

~~~
bdarnell
In addition to being incompatible with (some) third-party libraries, pypy
tends to use significantly more memory than cpython. It's also slower than
cpython for scripts that don't run long enough to warm up the JIT, so you
probably wouldn't want to use it by default. (Disclaimer: I'm basing this on
experience with older versions of pypy and haven't verified it recently)
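
A rough way to check the warm-up point on a given interpreter is to time the same function over several consecutive rounds and compare the first rounds with the later ones. This is a minimal sketch; `work`, `time_rounds` and the loop size are hypothetical names and values chosen for illustration, and no particular timings are implied.

```python
import platform
import time


def work(n):
    """A small CPU-bound loop; any hot pure-Python function would do."""
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def time_rounds(func, arg, rounds=10):
    """Return the wall-clock duration of each successive call to func(arg)."""
    durations = []
    for _ in range(rounds):
        start = time.perf_counter()
        func(arg)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    print("implementation:", platform.python_implementation())
    for index, seconds in enumerate(time_rounds(work, 200000)):
        print("round %d: %.4f s" % (index, seconds))
```

If the whole script finishes before the later rounds get noticeably faster than the first, the program is probably too short-lived to benefit from a JIT.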

~~~
lanstin
The memory thing is still an issue. I had to go through a lot of tuning on
max GC size to keep it runnable for long periods. Too low and it is slow, and
too high and it kills the box.

------
make3
Awesome news, congrats to the team :) On an unrelated note, I wish Google gave
them money to make it work with Tensorflow.

------
dr_zoidberg
Why are they still comparing to Python 2.7.2? I couldn't find benchmarks
against Python 3.5 for their Py3 interpreter.

All the times I tried PyPy I ran into a hurdle where one of the libraries I
need doesn't work (or underperforms) in PyPy, the most important ones being
Numpy and OpenCV.

So in the end I just gave up with them, and stuck with Python 2/3 and Cython,
which solved my speed problems without having to do all the work of
C-extensions from the ground up.

Edit: the one benchmark I found covering PyPy3 is this:
[https://pybenchmarks.org/u64q/benchmark.php?test=all&lang=py...](https://pybenchmarks.org/u64q/benchmark.php?test=all&lang=pypy3&lang2=python3&data=u64q)

It shows PyPy3 5.7.1 being about 8x faster to 100x slower than CPython 3.6.1.

For comparison, PyPy2 5.7.1 ranges from 10x faster to a bit over 30x slower
than CPython 2.7.13.

~~~
BlackFingolfin
The benchmark showing a factor 100x compares a pure python implementation of
"pidigits" running in PyPy3 vs one that uses GMP (via gmpy2) in python. I am
actually impressed it's only a factor 100. And with progress made by cffi, I'm
hopeful that a GMP-using PyPy version could be written that matches the speed
of the code using gmpy2.

The next benchmark "only" runs 6x slower in PyPy; still bad, but that paints
quite a different picture.

~~~
dr_zoidberg
If it's running pidigits inside PyPy then it shows how much slower the C
interface (not PyPy's cffi, but its CPython-compatible interface) is compared
to CPython's.

For the record, I'm not claiming PyPy3 to be 100x slower, these are benchmarks
and they're hairy beasts that we have to shave and try to see what they tell
us about performance.

My point is that PyPy3 is still behind in language features (in CPython we
have a lot of nice things from 3.6 and 3.7 coming soon, while PyPy still lags
behind on having the complete 3.5 feature set), and they haven't optimized it
as much as their 2.7 branch. But the PyPy people are always showing those "7x
faster than CPython" claims (which come from an average of benchmarks, which
seem to have been cherry picked to avoid the ones in which they're actually
slower than CPython).

On the other hand, with Cython, moving over to Py3 was never an issue (it
actually helped in some cases), and it always helped to deliver better
performance, and in just the right spot where it's needed. True, you have to
know about profiling and identifying where to use Cython, but at my workplace
it's been a far better tool to solve our performance needs.

------
Tobu
Any word on the "single codebase" aspect of supporting both major Python
versions? I remember suggesting it years ago at a time when the team wanted to
do Mercurial backporting instead. What changed their mind?

That looks like it could fix the lag on CPython releases, so it's a big
feature.

------
oblio
I guess the next release is the one that should support Python 3. At least as
a non-beta feature.

~~~
wyldfire
FYI at least for everything I've thrown at it, PyPy 3.x works really well.

------
ipunchghosts
Here is the dumbest question in the world: our application has a GUI which is
PyQt, can we use PyPy? Aside from PyQt, it's completely vanilla Python.

~~~
flavio81
Most code that is vanilla Python will work just fine with PyPy. However, in
real life, your code most likely also uses (imports) other Python libraries;
you just need to make sure they work OK with PyPy.

Many, many, many libraries already work fine with PyPy, so give it a shot.
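
One low-effort way to "give it a shot" is to run a small script under the PyPy interpreter that tries to import each dependency and reports which ones fail. This is only a sketch: `check_imports` is a hypothetical helper, the module list is a placeholder to be replaced with the project's real imports, and a successful import does not prove that a library behaves correctly or performs well.

```python
import importlib
import platform


def check_imports(module_names):
    """Try to import each module; map its name to None or the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # C extensions can fail with more than ImportError
            results[name] = "%s: %s" % (type(exc).__name__, exc)
    return results


if __name__ == "__main__":
    # Hypothetical dependency list; replace with the project's real imports.
    wanted = ["json", "sqlite3", "PyQt5"]
    print("running on", platform.python_implementation())
    for name, error in sorted(check_imports(wanted).items()):
        status = "ok" if error is None else "FAILED (" + error + ")"
        print("%-10s %s" % (name, status))
```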

