PyPy Status Blog: Multicore Programming in PyPy and CPython (morepypy.blogspot.de)
83 points by spdy on Aug 9, 2012 | 12 comments



First off, I really, really like pypy. It's been slowly taking over my data-massaging-type tasks, and consistently yields 5-10x speedups over cpython.

That being said, I'm a little nervous about all the energy being exerted on novel multicore stuff. PyPy has always been somewhat of a research project, but it's so close to being usable in a mainstream setting. I'd love to see them get it over the final few hurdles (an out-of-the-box json module that is as fast as simplejson, and better docs for performance tuning, being the two I've noticed).


You don't get to pick what other people work on, unless you pay them.

PyPy is quite friendly to new contributors, so if you want to help speed up JSON or improve docs, visit #pypy on freenode and offer to pitch in, and I'm sure some core developers will do their best to help you help them.

Or if you really want a feature but don't have time to work on it, see if any PyPy devs are accepting contract work. I suspect speeding up JSON wouldn't take one of the experts long, so it probably wouldn't cost you much.

Docs are more of a moving target: no matter how much you write, users will always want more (and then some of those complaining users will fail to actually read what you do write). There was a tutorial session at PyCon US this year about using PyPy to speed up Python code, and the video is on YouTube.


A bit on topic, this is my tweet about that: https://twitter.com/fijall/status/230597001056768000

That said, not all PyPy devs are working on STM, so there is definitely work going in other directions. I should probably write a blog post detailing stuff that's actively being worked on.

STM is also funded - which means there is enough community interest to sponsor at least some of the work (and thanks to all the donors)


"You don't get to pick what other people work on, unless you pay them."

Tangent: I donated a small amount of money to two PyPy projects (STM and py3k). But I didn't see the donation bars move at all -- at least in the case of STM, I'm almost positive that it didn't move a single dollar after my donation. Do they update those things? If not, it's kind of hard to tell what is sufficiently funded to succeed and what is not.


They do update them, but manually and probably not very often. Looking at commit logs, all three features that have asked for donations have had a lot of work done, but that's different from success.


We don't have direct access to the funding buttons (it would require quite a bit of work, probably doable though), so we update them roughly every 2 weeks, when the money lands in the bank account. That way we don't have bounced payments to deal with either (and it's easier). Sorry if you're used to real-time donations, but banking is not real time.


Hmm... I figured that at the beginning, but I made the donation around June 10 and I could swear the number for STM is exactly the same as back then.

Maybe I should try to contact them and make sure the donation wasn't lost? But that seems strange, surely others have donated during that time as well.


updated today. sorry for the lag


As I understand it, this wouldn't automatically protect you from race conditions. If you have a shared variable x then a statement like

  x += 5

may be fine depending on whether it is implemented as a single bytecode instruction or not. However, more complicated updates are still subject to races:

  x = somefunction(x)

It is only safe if you use:

  with thread.atomic: x = somefunction(x)

Having a serialisation doesn't mean it's the right serialisation.

That said, the fundamental problem is shared mutable state, and I don't see an easy way around that in Python. In that sense, this is probably easiest to work with.
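For anyone curious about the "single bytecode instruction" question, here is a small sketch using the standard `dis` module. The function name `bump` is just an illustration. On CPython, the augmented assignment to a global compiles to a separate load and store, so it is not one indivisible step there; other implementations may well differ.

```python
import dis

x = 0

def bump():
    # The statement under discussion: a read-modify-write on shared state.
    global x
    x += 5

# Collect the opcode names CPython generated for bump().
ops = [ins.opname for ins in dis.get_instructions(bump)]
print(ops)
```

The exact opcode names vary between CPython versions, but the list contains a `LOAD_GLOBAL` followed later by a `STORE_GLOBAL`, and a thread switch can in principle happen between the two.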


Right, this is no different than it would be programming on current Python with the GIL. There are two different levels of locking in current Python implementations: locks within the implementation of the interpreter (the GIL for CPython and current PyPy, or all the micro-locks in Jython), and the locks accessible from the Python environment by Python programmers (e.g., the locks in the threading module: http://docs.python.org/library/threading.html#lock-objects ). The PyPy STM project attempts to replace the first set to allow for simultaneous multi-core use while making it seem to the programmer like the GIL is still there, but if the programmer needs things like a specific execution order, it will need to be managed by hand, the same way it always has.
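To make the second level concrete, here is a minimal sketch of managing it by hand with a `threading.Lock`. The names `counter`, `counter_lock`, `somefunction` and `worker` are made up for the example; `somefunction` stands in for whatever update the earlier comment had in mind.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def somefunction(value):
    # Placeholder for an arbitrary update of the shared value.
    return value + 1

def worker(iterations):
    global counter
    for _ in range(iterations):
        # Hold the lock across the whole read-modify-write so that no other
        # thread can run between reading counter and writing it back.
        with counter_lock:
            counter = somefunction(counter)

threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

With the lock held around the update, every increment is applied, regardless of how the interpreter schedules the threads. Without it, the final count is not guaranteed.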


> We would have to consider all possible combinations of code paths and timings, and we cannot hope to write tests that cover all combinations.

Not to take away from this fantastic article, but I've felt for a while that this particular point is bullshit. Surely it's possible to write tools that do, indeed, test all possible code paths. I'm not up on the literature, but don't there exist proofs for unsynchronized lockless algorithms in certain specific cases? If so, it would be interesting to think about how this could be generalized. I imagine that much of the literature on atomic transactions must touch on this kind of stuff, not to mention networking protocols that may receive messages out of order or deal with unpredictable timing, etc.

There should be tools that we automatically reach for, like gdb, as soon as we introduce threading and need to check if we did it reliably. Moreover, such tools should help you find problems that may not occur on your system, but might be theoretically a problem on other systems, just like compiler warnings that warn of dependencies on pointer size.
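As a toy version of the idea, and not a real tool, here is a sketch that enumerates every interleaving of two simulated threads that each do `x = x + 1` as a separate load and store. Everything here (`interleavings`, `run`, the two-step thread model) is a simplification made up for illustration.

```python
from itertools import combinations

def interleavings(n_a, n_b):
    """Yield every ordering of n_a steps of thread A and n_b steps of thread B."""
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        a_slots = set(a_slots)
        yield ["A" if i in a_slots else "B" for i in range(total)]

def run(schedule):
    """Simulate two threads, each doing x = x + 1 as a load step then a store step."""
    shared = {"x": 0}
    local = {"A": None, "B": None}  # value each thread has loaded
    pc = {"A": 0, "B": 0}           # how many steps each thread has done
    for tid in schedule:
        if pc[tid] == 0:
            local[tid] = shared["x"]       # step 0: load
        else:
            shared["x"] = local[tid] + 1   # step 1: store
        pc[tid] += 1
    return shared["x"]

# Group the schedules by the final value of x they produce.
results = {}
for schedule in interleavings(2, 2):
    results.setdefault(run(schedule), []).append("".join(schedule))

for final_value, schedules in sorted(results.items()):
    print(final_value, schedules)
```

For two threads with two steps each there are only six schedules, and four of them lose an update. The number of schedules grows very quickly as threads and steps are added, which is the practical obstacle to doing this exhaustively on real programs.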


Surely there will be a huge benefit in using the hardware TM support if the implementation does not suck. It is integrated with the cache so should work much better. Structuring stuff to make this work seems a win.



