

Beyond Locks and Messages: The Future of Concurrent Programming - rbranson
http://bartoszmilewski.wordpress.com/2010/08/02/beyond-locks-and-messages-the-future-of-concurrent-programming/

======
cageface
Interesting that he's so gung-ho about STM when the Microsoft paper sounds so
pessimistic:
[http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetros...](http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetrospectiveOnTransactionalMemory.aspx)

The more I read about these new approaches to concurrency the more I suspect
that there's going to be no free lunch. Concurrent programs are going to
require careful, bottom-up design with concurrency in mind and concurrent
design is going to remain a difficult thing.

------
noss
For me, this article introduces some useful terminology in relation to
concurrency that I had not read about before. I am thinking about "SPMD,
Single Program Multiple Data" and data-driven vs control-driven parallelism.

As an Erlang user and forum/chat hang-around, I frequently see people come in
with a problem that is very data-driven. Large matrix inversions, AI back-
tracking, then dismiss Erlang as academic hot-air because spawning an Erlang
process per element and taking care of calculation results was too much work
and no speedup. Obviously they were having data-driven problems and Erlang is
made for control-driven problems.

I would love to find well-written and succinct articles on the difference
between these two kinds of problem domains to refer people to. I find that it
is very difficult to describe this well myself.

~~~
anonymousDan
I'm still not sure I understand the difference between data- and control-
driven parallelism, primarily because I don't have a concrete example of the
latter. Also, he distinguishes between threads and tasks, but I'm not really
sure what the difference is... are the latter simply not preemptable?

~~~
noss
The goal, the scarce resource one optimize for, tend to be different. (To use
the language from [http://freakonomics.blogs.nytimes.com/2010/07/30/know-
your-s...](http://freakonomics.blogs.nytimes.com/2010/07/30/know-your-
scarcity/), posted here earlier ).

Control-driven concurrency tends to be io-bound if being bound at all, and the
major issue tends to be dealing with the complexity of the rules. I.e. if-
that-happens then-start-that-thing which-tells-that-system and-then-continues-
with-that-task which-notices-the-registered-plugins. These rules should not
block for other concurrent but independent sessions. Scalability tends to be
an issue here. A relevant scalability example for Erlang would be a telecom
system where each machine-added make you able to handle X number of more
subscribers.

Data-driven concurrency tends to be more about making use of the concurrency
available in the current hardware generation to speed up very cpu-bound
computations. Using more threads than cores available just mean more context-
switches and less performance. These are problems that likely also want to
make use of SIMD instructions.

------
signa11
obligatory acm piece: "Software Transactional Memory: Why Is It Only a
Research Toy?" [ <http://queue.acm.org/detail.cfm?id=1454466> ]

------
loewenskind
Message passing doesn't have to be hard to reason about at all. Each Erlang
process is just that: a process that receives inputs and sends outputs. This
is its "interface". Why would reasoning about this be any harder than
reasoning about e.g. a REST service your application depends on?

The advantage, at least with Erlang is that the system can do things under the
hood to make this as efficient as possible. Today, Erlang might actually run
all the processes in one thread of execution and basically just share the data
between threads. It can do this since the data is read only. Later on, when we
have too many cores for this to be the fastest way anymore, it can do
something else without a change to your code.

STM is still locking under the hood and when we're talking hundreds of cores,
that little bit of locking is going to end up putting all your threads into a
nice wait queue.

Further, the _real_ reason that manual thread management is untenable is the
same reason that manual memory management is untenable: composability. You
can't compose functions arbitrarily when you every function has it's own
expectations about who owns the data and you can't compose functions when you
don't know what locking they have. I'm not sure STM solves this completely.
Does it ensure that one transaction block can't depend on another in a way
that causes an unresolvable deadlock? If I write a massive framework, do I
need to worry about how many layers of isolation blocks I'm building up? I
know they say this is supported, but what if it's 20 layers deep and the
layers cross compilation boundaries?

Message passing, on the other hand (like unix processes) doesn't have these
composability concerns (though personally I think futures probably have the
most potential since they're the most high level).

------
megaman821
I don't quite understand what these three big companies did that was beyond
what Haskell already offers their sparks and STM implementations.

I am all for inventing new stuff but perhaps they would have been better
served by contributing to Haskell anything that they invented that was novel.

