

Multithreaded data structures for parallel computing - AndreyKarpov
http://www.ibm.com/developerworks/aix/library/au-multithreaded_structures1/index.html?ca=drs-&utm_source=dlvr.it&utm_medium=twitter

======
jey
The code in listings 3 and 5 is not thread safe. The list could easily become
empty between the call to empty() and acquiring the mutex, leading to a race
condition. I see two other subtle race conditions as well. I haven't looked at
the rest but this density of bugs is enough to convince me that this author
doesn't understand multithreaded programming well enough to give advice on the
topic.

Other bugs:

1\. Assumes that std::list<T>::empty() is an atomic operation.

2\. Assumes it's safe to use separate locks for reading and writing, which
std::list says nothing about. It probably screws up even in practice for at
least the case where you have only one element in the list and two threads
execute push() and pop() at the same time.

~~~
mentat
I agree. It looks like their read implementation means that every thread pops
the same item off as well, since they clone to a temp variable.

~~~
jey
Threads have separate stacks and the call to front() and pop_front() is
protected by the mutex, so that part is fine as far as I can see.

~~~
mentat
I'm not familiar with the STL implementation, but it seems the pop would need
to be coherent with the push in an empty stack. Like if a thread was trying to
pop the only entry while another was trying to push, 'intuitively' there'd be
a race condition there if STL isn't thread safe (which is the whole point of
the article I think).

------
pjscott
And for priority queues, there's a wonderfully clever scheme which involves
using a skiplist:

<http://www-cs-students.stanford.edu/~itayl/ipdps.pdf>

The probabilistic nature of the data structure lets them get those nice O(lg
n) and O(1) expected times without the wide-ranging memory conflicts that slow
down concurrent min-heaps. Some more basic info on skiplists for people who
aren't familiar with them:

<http://en.wikipedia.org/wiki/Skip_list>

~~~
marshray
Yes I'm working on an implementation of those myself.

There are even designs for lock-free skiplist priorty queues, but they tend to
be a little slower than the locking versions.

~~~
pjscott
Lock-free skiplist priority queues, incidentally, become very elegant if your
processor supports even the most basic hardware transactional memory.

------
hamidpalo
There are much better ways to implement a concurrent queue than the ones
shown. Using locks for every single queue/dequeue is a very good way to
completely kill performance. A much better alternative is to use hand-over-
hand locking, or use CAS for a lock-free implementation.

TAOMP ( [http://www.amazon.com/The-Multiprocessor-Programming-
Maurice...](http://www.amazon.com/The-Multiprocessor-Programming-Maurice-
Herlihy/dp/0123705916) ) goes over this in great detail and is overall an
_excellent_ book.

~~~
atechie
Following is a great resource for Concurrent programming without locks and
software transaction memory :
<http://www.cl.cam.ac.uk/research/srg/netos/lock-free/>

------
strlen
There are rather basic (producer/consume queues) and as some have pointed out,
are buggy. I'd highly suggest looking at Boost's threading library (for an
example of an object oriented approach to threading, taking advantage of RAII
-- much of it is now standard in C++11), Intel's Thread Building Blocks,
Java's built in Concurrency Utilities (java.util.concurrent package), Doug
Lea's Fork Join Framework. The a great book on the subject is Maurice
Herlihy's The Art of Multiprocessor Programming:

[http://www.amazon.com/The-Multiprocessor-Programming-
Maurice...](http://www.amazon.com/The-Multiprocessor-Programming-Maurice-
Herlihy/dp/0123705916/ref=sr_1_1?ie=UTF8&qid=1335394746&sr=8-1)

The book is in Java, but C++11 has the required primitives (cross platform
compare and set for integral types, a defined memory model) so you could
follow allong in C++11.

------
adunsmoor
Intel's Threaded Building Blocks (<http://threadingbuildingblocks.org>) has a
very nice collection of threaded data structures and algorithms.

I've found the performance to be quite good and the interfaces are compatible
with STL in most cases which can be useful.

------
ZephyrP
As Jey has so eruditely pointed out, this advice borders on useless. I've
always implemented LL queues as a segment of a list, where each element is
protected by it's own mutex wherein each thread may add or remove a node by
taking a lock on the first node, then moving to the second node, letting go of
the first lock and so on. Although you'll end up holding more than one lock
while traversing and three while deleting, this is at the core of powerful
ideas like hand-over-hand locking.

------
krosaen
Another approach: use a persistent data structure ala clojure's:

[http://blog.higher-order.net/2009/02/01/understanding-
clojur...](http://blog.higher-order.net/2009/02/01/understanding-clojures-
persistentvector-implementation)

------
darkstalker
If you're using C++ (like the article) you should be using boost::thread
instead of low level pthreads, or std::thread if you're using C++11

