
PyPy 2.0 Released - craigkerstiens
http://morepypy.blogspot.com/2013/05/pypy-20-einstein-sandwich.html
======
adamt
Whilst I applaud the efforts of PyPy, I've always been disappointed with the
performance benefit it's given me on real-world code.

Specifically:

* PyPy dict performance is very slow (slower than CPython). A lot of the CPU-intensive Python code I write processes data (from sources like logs) into data structures built on dicts. PyPy is rarely more than 10% faster and is sometimes slower.

* The memory management/GC can be bad. I've seen the same code that runs fine on CPython end up using excessive amounts of memory (causing out-of-memory issues etc.) with PyPy. Again, this normally involves complicated data structures.
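
To make the first point concrete, the dict-heavy workload I mean looks roughly like this (a made-up sketch with hypothetical data, not my actual code):

```python
import timeit

# Hypothetical stand-in for pre-split log records: (ip, status) pairs.
lines = [("10.0.0.%d" % (i % 50), "200" if i % 7 else "404")
         for i in range(10000)]

def aggregate(records):
    # Count hits per (ip, status) pair -- a dict-bound inner loop,
    # exactly the kind of code where I see little PyPy speedup.
    counts = {}
    for ip, status in records:
        key = (ip, status)
        counts[key] = counts.get(key, 0) + 1
    return counts

elapsed = timeit.timeit(lambda: aggregate(lines), number=10)
print(len(aggregate(lines)))  # → 100 distinct (ip, status) keys
```

Run the same snippet under CPython and PyPy and compare `elapsed`; in my experience the gap is far smaller than the headline benchmarks suggest.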

On about 10 occasions now I've had CPU-bound Python tasks, tried running them
with PyPy, and never seen more than a 20% performance improvement. That's a
big contrast with the benchmarks.

Is it just me? Or have other people had similar experiences?

[edit: typos]

~~~
CJefferson
I've had exactly the same problem. I have tree-manipulation AI problems
which run for > 30 minutes (which seems like an ideal thing for PyPy, before I
rewrite them in C++). PyPy is almost always slower, and never more than about
15% faster, whereas a simple line-by-line C++ rewrite can be 20x faster.

~~~
fijal
Python is a vast language. It's very hard to know upfront what sort of
patterns people use - if you don't talk to us, don't post stuff on the bug
tracker, don't do anything - it's your own fault. PyPy is known to speed up
real world code to various degrees - sometimes 10x sometimes not at all, but
it all really depends. We would be happy to help you with _your_ problem, but
if the only thing you do is to complain on hackernews, well, too bad, we can't
help you.

~~~
pekk
I am sympathetic because PyPy is very good and is improving fast. But...

That doesn't change how the PyPy project tends to represent itself, which
almost always comes across as something like "6x speedup for everything
(excluding JIT warmup)". If you want everyone to adopt PyPy instead of CPython
then it is part of your job to find the cases where PyPy is not actually
faster rather than saying it is the user's fault. And it is not your job to
select only benchmarks which tell the story you want.

If the difference between interpreters is nuanced then that nuance should be
expressed so that people can make intelligent decisions rather than dismissing
one or another interpreter as "slow".

~~~
gsnedders
FWIW, in my experience it's trivial to get benchmarks into their comparisons,
provided they aren't microbenchmarks. Saying they pick favourable benchmarks
is untrue: the majority were added because PyPy performed badly on them, and
they've improved as a result of being included, which is what makes them
mostly benchmarks PyPy does well at.

------
amix
If you like this remember to donate some money to the project on
<http://pypy.org/> [I am not in any way affiliated with PyPy, only supporting
the project].

------
melling
Does anyone have any performance comparisons to Perl? I've used Perl for over
a decade. Just started using Python on a project. I've got a lot to learn but
Python makes you feel like you've got it down pretty quickly.

One of the things that kept me on Perl was its great performance.

<http://benchmarksgame.alioth.debian.org/u64q/perl.php>

I'd like to go for the win, win. Is that 'FTWW?'

~~~
jholman
Interesting datum you used. That chart shows that, on those compilers, on
those programs, with those test data, Perl and CPython are VERY similar in
speed. If you compare your link (Perl / CPython) with the inverse (CPython /
Perl), you almost can't tell the graphs apart.

[http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?t...](http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?test=all&lang=python3&lang2=perl&data=u64q)

So CPython is about the same speed as Perl. PyPy's speed comparison suggests
that PyPy is 2-20x faster than CPython. What else do you want to know?

Perl, btw, is not noted for its "great performance" in any context I've ever
heard of.

~~~
melling
Perl is famous for its regex performance. For many years people have compared
Python, Ruby, and Perl performance, and it has usually been Perl the fastest,
followed by Python, then Ruby. I've seen many posts like this:

[http://stackoverflow.com/questions/12793562/text-processing-...](http://stackoverflow.com/questions/12793562/text-processing-python-vs-perl-performance)

These days, however, it appears that Python has really come into its own. I
started using Sublime Text so that got me into trying out Python.

~~~
srean
I think the regex performance of Perl is nothing extraordinary. What it is
famous for is pushing the scope of matching beyond regular languages.

Of all the common scripting languages, I think Tcl uses a different algorithm
for regex matching and is considerably faster than Perl, especially on longer
strings. Let me find some corroborating docs.

Ok this has some info <http://swtch.com/~rsc/regexp/regexp1.html>
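
The difference that article describes is easy to provoke from Python, whose `re` engine backtracks like Perl's. This hypothetical pattern with nested quantifiers has no possible match, but a backtracking engine only discovers that after exploring exponentially many paths (a DFA/NFA engine like Tcl's or RE2 rejects it in linear time):

```python
import re
import time

# Nested quantifiers + guaranteed failure = catastrophic backtracking
# in Perl-style engines (Python's re included).
pattern = re.compile(r'(a+)+$')

text = 'a' * 18 + 'b'    # kept small; the time roughly doubles
start = time.time()      # with each extra 'a'
result = pattern.match(text)
elapsed = time.time() - start

print(result)  # → None, but only after exponential backtracking
```

With `'a' * 30` the same call takes minutes under a backtracking engine, which is why "Perl-compatible" and "fast on all inputs" are different claims.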

~~~
jholman
Exactly. Perl regexes have _poor_ performance in terms of running time, but
good usability, including by squeezing non-regular features into their
allegedly-regular expressions, and also things like numerous and flexible
character classes, consistent behaviour for escapes, etc.

I love vim, but jesus I can never remember which vim regex metacharacters need
escaping to get their meta-meaning, and which need escaping to get their non-
meta meaning.

~~~
Leszek
\v is your friend.

<http://vimdoc.sourceforge.net/htmldoc/pattern.html#/magic>

~~~
jholman
I sort of prefer \V, but same difference.

I dunno, I haven't used it in the past, because...

1) "It is recommended to always keep the 'magic' option at the default
setting, which is 'magic'. This avoids portability problems."

2) Nor do I want to type two extra characters in every damn regex!

So, it only occurs to me right now that the right thing for me to do is this:
the first instant I'm confused about whether \\( means grouping or literal
paren, I should immediately start my pattern with \v or \V. No unportability
of plugins, no marginal cost on regexes that don't care, and a marginal
benefit when it does matter.

~~~
Leszek
Yeah, that's basically what I do too. No \v on trivial regexes, \v on ones
where there's more than one or two groups.

------
gtaylor
The gevent/eventlet part of this has me pretty excited. We needed this in
order to do some experimenting with PyPy without investing a good deal of time
on a test conversion. I'm also interested to see if cffi is as good as I've
heard it is (relative to ctypes).
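
For anyone who hasn't compared the two: here's what the stdlib `ctypes` side looks like, i.e. the interface cffi aims to improve on. This is a minimal sketch assuming a Unix system where libm can be located; cffi's equivalent (`ffi.cdef` with actual C declarations) is what PyPy recommends because it JITs much better than ctypes there.

```python
import ctypes
import ctypes.util

# Load the C math library; falls back to a common Linux soname if
# find_library can't locate it (platform-dependent assumption).
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# ctypes needs the signature spelled out attribute-by-attribute,
# where cffi takes a plain C declaration string instead.
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(2.0))  # → 1.4142135623730951
```
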

Excellent work, fijal and team!

~~~
rdtsc
This is exciting for eventlet. Coupled with the recent resurgence of work on
it, this might bring it back into the spotlight.

------
MichailP
Can someone knowledgeable compare PyPy, Numba and Cython? I mostly use Cython.
I also tried Numba; it has a very nice workflow when autojit works (when it
doesn't, the error messages are pretty cryptic). With PyPy, I don't understand
why, for example, all the type information obtained by the JIT couldn't be
used to generate something like Numba's specialized functions or Cython
modules (noob question, but please answer).

~~~
Thrymr
PyPy still doesn't work with Numpy, though they're working on it.

~~~
apendleton
That's not quite right. Stock numpy doesn't install, and nobody's working on
that as far as I know, but they've reimplemented part of the numpy API (at
least the Python API, not the C one), and the subset they've implemented works
fine. I have an application in production that uses it.

------
philfreo
What are the largest sites using PyPy in production?

~~~
riffraff
Quora did at some point, but it seems they replaced it (or some usage of it)
with Scala. I can't find a pointer at the moment, but Google could help you.

------
pjmlp
Congratulations!

Looking forward to the day PyPy becomes the reference implementation.

~~~
keeperofdakeys
In many cases, you want to be careful about using JIT'd platforms. Many people
forget that the memory requirements skyrocket. When you want to use a language
on many different platforms, including embedded, you start to see how having a
JIT interpreter as your reference platform can be disadvantageous. CPython
isn't exactly slow either, especially when you consider many 'intensive'
modules are written directly in C.

Pypy should stay as it is, an experiment that can be used for people who
require more performance for certain workloads. Of course, having part of your
language written in C for CPython can hurt sometimes, when you can't easily
use the functionality on other interpreters.

~~~
baq
almost agree, except for the 'experiment' part. i don't want it to be an
experiment, i want it to be a production-ready interpreter that i can use
instead of cpython with minimal effort when i know i can trade memory for
speed.

------
jmelloy
I get about 2x speedup on my CSV reader & processing app, which is nice.
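
The hot loop in question is roughly this shape (a hypothetical sketch, not the actual app): row-at-a-time parsing and aggregation in pure Python, which is exactly what a tracing JIT can speed up.

```python
import csv
import io

# Hypothetical CSV input standing in for a real file on disk.
data = io.StringIO("category,amount\n" + "\n".join(
    "cat%d,%d" % (i % 3, i) for i in range(1000)))

# Per-row processing loop: parse, dispatch on a field, accumulate.
totals = {}
for row in csv.DictReader(data):
    totals[row["category"]] = totals.get(row["category"], 0) + int(row["amount"])

print(sorted(totals))  # → ['cat0', 'cat1', 'cat2']
```
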

When I tried the beta, it was crashy, but 2.0 seems pretty good so far.

------
snaky
Is it still slower than LuaJIT?

------
xyproto
pypy is cool, but why no love for python 3?

~~~
ak217
<http://pypy.org/py3donate.html>

[http://morepypy.blogspot.com/2013/03/py3k-status-update-10.h...](http://morepypy.blogspot.com/2013/03/py3k-status-update-10.html)

