Hacker News new | comments | show | ask | jobs | submit login
Multithreaded data structures for parallel computing (ibm.com)
79 points by AndreyKarpov on Apr 25, 2012 | hide | past | web | favorite | 20 comments

The code in listings 3 and 5 is not thread safe. The list could easily become empty between the call to empty() and acquiring the mutex, leading to a race condition. I see two other subtle race conditions as well. I haven't looked at the rest but this density of bugs is enough to convince me that this author doesn't understand multithreaded programming well enough to give advice on the topic.

Other bugs:

1. Assumes that std::list<T>::empty() is an atomic operation.

2. Assumes it's safe to use separate locks for reading and writing, which std::list says nothing about. It probably screws up even in practice for at least the case where you have only one element in the list and two threads execute push() and pop() at the same time.

Agreed. It's better idea to use mutex + 2 semaphores to implement queue with blocking: http://en.wikipedia.org/wiki/Producer-consumer_problem#Using...

(it's a classical computer science problem with classical solution)

In fairness to the rest of the article, in Listing 9 the author does protect the call to empty() with (the same) lock and also moves the condition inside of a while loop.

I agree. It looks like their read implementation means that every thread pops the same item off as well, since they clone to a temp variable.

Threads have separate stacks and the call to front() and pop_front() is protected by the mutex, so that part is fine as far as I can see.

I'm not familiar with the STL implementation, but it seems the pop would need to be coherent with the push in an empty stack. Like if a thread was trying to pop the only entry while another was trying to push, 'intuitively' there'd be a race condition there if STL isn't thread safe (which is the whole point of the article I think).

I agree. Listing 11 won't even compile as was_empty is declared const.

That use of const is fine, as it's never modified after the initial value is assigned.

No, Jabbles is correct: look at listing 11. It's reassigned five lines down.

Oops, I was looking at listing 8. Thanks.

And for priority queues, there's a wonderfully clever scheme which involves using a skiplist:


The probabilistic nature of the data structure lets them get those nice O(lg n) and O(1) expected times without the wide-ranging memory conflicts that slow down concurrent min-heaps. Some more basic info on skiplists for people who aren't familiar with them:


Yes I'm working on an implementation of those myself.

There are even designs for lock-free skiplist priorty queues, but they tend to be a little slower than the locking versions.

Lock-free skiplist priority queues, incidentally, become very elegant if your processor supports even the most basic hardware transactional memory.

There are much better ways to implement a concurrent queue than the ones shown. Using locks for every single queue/dequeue is a very good way to completely kill performance. A much better alternative is to use hand-over-hand locking, or use CAS for a lock-free implementation.

TAOMP ( http://www.amazon.com/The-Multiprocessor-Programming-Maurice... ) goes over this in great detail and is overall an excellent book.

Following is a great resource for Concurrent programming without locks and software transaction memory : http://www.cl.cam.ac.uk/research/srg/netos/lock-free/

There are rather basic (producer/consume queues) and as some have pointed out, are buggy. I'd highly suggest looking at Boost's threading library (for an example of an object oriented approach to threading, taking advantage of RAII -- much of it is now standard in C++11), Intel's Thread Building Blocks, Java's built in Concurrency Utilities (java.util.concurrent package), Doug Lea's Fork Join Framework. The a great book on the subject is Maurice Herlihy's The Art of Multiprocessor Programming:


The book is in Java, but C++11 has the required primitives (cross platform compare and set for integral types, a defined memory model) so you could follow allong in C++11.

Intel's Threaded Building Blocks (http://threadingbuildingblocks.org) has a very nice collection of threaded data structures and algorithms.

I've found the performance to be quite good and the interfaces are compatible with STL in most cases which can be useful.

As Jey has so eruditely pointed out, this advice borders on useless. I've always implemented LL queues as a segment of a list, where each element is protected by it's own mutex wherein each thread may add or remove a node by taking a lock on the first node, then moving to the second node, letting go of the first lock and so on. Although you'll end up holding more than one lock while traversing and three while deleting, this is at the core of powerful ideas like hand-over-hand locking.

Another approach: use a persistent data structure ala clojure's:


If you're using C++ (like the article) you should be using boost::thread instead of low level pthreads, or std::thread if you're using C++11

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact