
PyPy Status Blog - STM update: back to threads? - craigkerstiens
http://morepypy.blogspot.com/2012/05/stm-update-back-to-threads.html
======
StavrosK
How is this at all different from simple locking?

~~~
pjjohnson
If you use a lock, then only one thread can enter the locked region at a
time.

If you use a sufficiently advanced STM and have an atomic region, then two
threads can enter the same region at once, as long as they don't touch each
other's memory. You may ask: why synchronise at all if they're not using each
other's memory? Because it can be extremely hard to work out a priori whether
two threads will touch the same memory. For some problems, such as many graph
problems, working out what memory they're going to use is as hard as the
problem you're trying to solve in the first place.

STM is like having a separate lock for every single word in your memory, and
then magically acquiring every lock you will need at the start of the
transaction, in a deterministic order, so that you never deadlock.

If two threads enter an atomic region but do access each other's memory, then
one of the transactions is silently aborted and restarted. Ideally these
aborts are rare and the program runs largely in parallel.
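The abort-and-restart behaviour can be mimicked with a toy optimistic-concurrency scheme: snapshot a version number, do the work speculatively, and retry from scratch if another writer committed in the meantime. This is only a sketch of the idea (the `VersionedCell` class is made up; real STM tracks reads and writes at much finer granularity):

```python
import threading

class VersionedCell:
    def __init__(self, value):
        self.value = value
        self.version = 0
        self._commit_lock = threading.Lock()

    def update(self, fn):
        while True:                      # aborted attempts restart here
            seen_version = self.version  # snapshot before doing any work
            new_value = fn(self.value)   # speculative work, no lock held
            with self._commit_lock:
                if self.version == seen_version:
                    # Nobody committed since our snapshot: commit succeeds.
                    self.value = new_value
                    self.version += 1
                    return
                # Else: conflict detected -- silently abort and retry.

cell = VersionedCell(0)
threads = [threading.Thread(target=cell.update, args=(lambda v: v + 1,))
           for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(cell.value)
```

Every one of the 50 increments lands exactly once: any attempt that raced with a commit simply retries, which is the "silently aborted and restarted" behaviour in miniature.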

A more concrete example: in Java, if you have a hash table, you will usually
lock around all reads and writes. With STM, if two writes access different
buckets of the table, there's no need for either to wait. The STM works this
out for you.
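To make the bucket point concrete, here is a hand-rolled sketch (illustrative, not pypy-stm's API): per-bucket locks stand in for what STM figures out automatically, so writers touching different buckets never wait for each other, whereas a single table-wide lock would serialize all of them.

```python
import threading

N_BUCKETS = 8

class BucketedTable:
    """Toy hash table with one lock per bucket instead of one per table."""

    def __init__(self):
        self.buckets = [dict() for _ in range(N_BUCKETS)]
        self.locks = [threading.Lock() for _ in range(N_BUCKETS)]

    def put(self, key, value):
        i = hash(key) % N_BUCKETS
        # Only the bucket this key hashes to is locked, so two writers
        # hitting different buckets proceed independently -- the effect
        # STM would give you without hand-partitioning the locks.
        with self.locks[i]:
            self.buckets[i][key] = value

    def get(self, key):
        i = hash(key) % N_BUCKETS
        with self.locks[i]:
            return self.buckets[i].get(key)

table = BucketedTable()
threads = [threading.Thread(target=table.put, args=(k, k * 10))
           for k in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(table.get(7))
```

The catch, of course, is that here a human chose the locking granularity; the STM claim is that you get this kind of parallelism from a single coarse atomic region.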

~~~
StavrosK
Ah, thanks for the clarification. From the post I gathered that the atomic
region locked the entire data structure and that any thread would wait; I
didn't realize it was a specific data region.

That sounds pretty nice, then. I'm looking forward to seeing the sort of
speedups possible with this.

~~~
pjjohnson
Well, take a look at any of the copious literature on the subject:

This paper shows some example speedups, though it generally doesn't give
anything to compare against. Find papers that cite it to see more examples.

C. C. Minh, J. Chung, C. Kozyrakis, and K. Olukotun. STAMP: Stanford
transactional applications for multi-processing. In IEEE International
Symposium on Workload Characterization, 2008.

------
IgorPartola
The fact that threads are one of the most widely supported and most
frequently chosen models for concurrent programming should be a clue that
they are at least not a bad choice. I applaud the PyPy project for trying
something different, but also for doing it in a way where they haven't
painted themselves into a corner with a less-than-ideal solution.

