
Why Is Concurrent Programming Hard? - josch_m
http://www.stefan-marr.de/2014/07/why-is-concurrent-programming-hard/
======
kazinator
Concurrent programming is hard because "concurrent programming" usually means
"concurrent programming with mutable state and imperative control
constructs". So concurrent programming is hard because imperative programming
is already hard, and then multiple threads with unpredictable scheduling order
are let loose on it.

Also, due to variation in scheduling order, a concurrent program's bugs may be
only sometimes reproducible, even when the inputs in the working and failing
cases are identical. It's as if every concurrent program has an implicit
real-time input source that perturbs its behavior.

The bugs in concurrent programs are an entirely new class, like "race
condition", "lost wakeup" and "deadlock". Even a great programmer who
understands N levels of pointer dereferencing and "leaps complicated
algorithms in a single bound", but has no concurrency experience, may well be
defeated by concurrency bugs.
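
To make that implicit input source concrete, here is a minimal sketch (in
Haskell, since that's the language used elsewhere in this thread; compile with
-threaded, and note the same race exists in any imperative language): two
threads perform an unsynchronized read-modify-write on the same variable, and
the printed total depends on how the scheduler interleaves them.

    import Control.Concurrent
    import Control.Monad (replicateM_)
    import Data.IORef

    -- Two threads bump a shared counter with no synchronization. The read
    -- and the write are separate steps, so updates can be lost depending on
    -- scheduling; the total can differ from run to run even though the
    -- program's input never changes.
    main :: IO ()
    main = do
        ref  <- newIORef (0 :: Int)
        done <- newEmptyMVar
        let bump = do
                n <- readIORef ref        -- read the current value ...
                writeIORef ref (n + 1)    -- ... then write it back: not atomic
        _ <- forkIO (replicateM_ 100000 bump >> putMVar done ())
        _ <- forkIO (replicateM_ 100000 bump >> putMVar done ())
        takeMVar done >> takeMVar done
        readIORef ref >>= print           -- often less than 200000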

~~~
kazinator
I'd like to expand on this just a little; indulge me.

When we write programs, we normally keep them tidy by taking advantage of
scoping. Data is localized using local variables. We use parameter passing to
communicate among modules and avoid globals. However, imperative concurrent
programming is all about globals: the "shared variables" that have to do with
the concurrency of the program violate scoping, necessarily so, because to be
shared, they have to be accessible from multiple scopes.

In imperative programming, there is an important reassurance: namely, that
variables retain their last stored value. If you put 42 into x, then x keeps
holding 42 until you put something else in x. If x is globally visible, and
you call some function, forgetting that it has a side effect of changing x,
that wrecks this assumption. Suddenly, x has mysteriously changed: you could
swear you put in 42, but now it somehow has 43. Why? All you did was call foo;
that has nothing to do with x. Oh, but someone committed a change, and foo now
calls bar, and bar makes several other function calls which end up back in
this same unit, in another function that frobs x.

In concurrent programming, you are opening the door to this type of problem,
and what is worse, shared variable x can change at any time, not just upon
some predictable action like calling a function. When you assign 42 to x, x
could turn into 43 even before control passes out of the assignment statement
and to the next statement.

All the surprises arising from multiple scopes having visibility over the same
shared data become worse thanks to the asynchronous nature and unpredictable
scheduling order.

If you keep programming in the same way that you are used to, you are
_guaranteed_ to have a problem. Simply to have the same assurance that "x is
holding the value I put there in the current thread of control", you need to
take special steps, like locking a mutex (consistently, from every user of x),
whereas in non-concurrent programming, the assurance comes from good design,
like making x local, or simply not branching into any code which might change
x for as long as you need x not to change. I.e., you can achieve the basic
assumption quite _passively_ in the absence of concurrency (based on what you
don't do), whereas you have to be proactively cunning in concurrent
programming.
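
To make "proactively cunning" concrete, a minimal sketch in Haskell, with an
MVar standing in for the mutex (the discipline, not the particular API, is the
point): every user of x must go through the lock, or the assurance evaporates.

    import Control.Concurrent

    -- The shared variable lives inside an MVar, which doubles as the lock:
    -- modifyMVar_ takes the value out (blocking any other user), applies the
    -- update, and puts the result back. The assurance only holds if *every*
    -- access to x goes through the MVar.
    main :: IO ()
    main = do
        x    <- newMVar (42 :: Int)
        done <- newEmptyMVar
        let frob = modifyMVar_ x (return . (+ 1)) >> putMVar done ()
        _ <- forkIO frob
        _ <- forkIO frob
        takeMVar done >> takeMVar done
        readMVar x >>= print  -- always 44; no update can be lost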

~~~
boomlinde
_> imperative concurrent programming is all about globals: the "shared
variables" that have to do with the concurrency of the program violate
scoping, necessarily so, because to be shared, they have to be accessible from
multiple scopes._

I may be nitpicking here, but "accessible from multiple scopes" is not the
same as being "all about globals". Data can usually still be localized to the
routines that use it without throwing it into the global scope. The inherent
problem of two threads sharing data is also something that will _have to_ be
dealt with at some layer of the implementation if you want the threads to
communicate, regardless of programming paradigm. Some languages provide really
nice abstractions for dealing with this in ways that don't require the
programmer to explicitly invoke locks. Among imperative languages, Go provides
a really useful and sensible abstraction of this kind with channels.
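
A rough sketch of the same message-passing idea, using Haskell's
Control.Concurrent.Chan as a stand-in for a Go channel (Haskell being the
language used elsewhere in this thread): the threads communicate by passing
values, and no lock is visible to either side.

    import Control.Concurrent
    import Control.Concurrent.Chan
    import Control.Monad (replicateM_)

    -- A producer thread writes values into a channel; the main thread blocks
    -- on readChan until each value arrives. The channel is the only point of
    -- contact, so there is no shared variable to guard.
    main :: IO ()
    main = do
        ch <- newChan
        _ <- forkIO (mapM_ (writeChan ch) [1 .. 5 :: Int])
        replicateM_ 5 (readChan ch >>= print)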

 _> In concurrent programming, you are opening the door to this type of
problem, and what is worse, shared variable x can change at any time, not just
upon some predictable action like calling a function. When you assign 42 to x,
x could turn into 43 even before control passes out of the assignment
statement and to the next statement._

Still nitpicking, but you should separate the concerns of parallel programming
that don't necessarily have anything to do with concurrent programming.
Imperative concurrent programming can be implemented by having routines
explicitly yield control to other routines (e.g. protothreads, greenlets),
making the execution order entirely predictable.
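
A toy sketch of that, purely illustrative (in Haskell, though nothing here is
language-specific): each routine is a list of small steps, and a scheduler
interleaves them round-robin, so the "yield" is implicit after every step and
the ordering is fully deterministic. Concurrency, no parallelism.

    -- Each "thread" is a list of steps; the scheduler runs one step of each
    -- live thread, then recurses on what remains. Control transfer happens
    -- only at step boundaries, so the interleaving is fixed.
    roundRobin :: [[IO ()]] -> IO ()
    roundRobin threads = case [t | t@(_:_) <- threads] of
        []   -> return ()
        live -> do
            mapM_ head live            -- run one step of every live thread
            roundRobin (map tail live)

    main :: IO ()
    main = roundRobin
        [ map (putStrLn . ("A" ++) . show) [1 .. 3 :: Int]
        , map (putStrLn . ("B" ++) . show) [1 .. 2 :: Int]
        ]
    -- Always prints: A1 B1 A2 B2 A3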

~~~
kazinator
Explicit yielding of control is not concurrent programming, almost by
definition. The execution order is entirely predictable precisely because it
is serialized by the yield calls. The software specifically relies on these
cooperative threads not being concurrent, in ways such as "if I don't call
anything which can yield, then nothing changes in a surprising way". Also,
because it is not concurrency, it won't run on multiple processors.

(There can be a loose order in cooperative threading, in not knowing which
thread will be run next when yield is called. The execution is still
serialized, though.)

~~~
boomlinde
Again, note the difference between parallel execution and concurrent
execution. The software specifically relies on these cooperative threads not
being _parallel_ , which is not an inherent property of being concurrent, most
certainly by definition.

Reading the first part of the article is a good start on understanding the
distinction. Maybe this sort of lack of a basic understanding of the concepts
is why concurrent programming is hard.

~~~
kazinator
There is no substantial difference between "parallel" and "concurrent". Both
mean "running side by side", except that by "parallel processing" we usually
exclude _simulated_ concurrency.

However, simulated concurrency requires asynchronous behavior: an external
event, such as a timer interrupt or the arrival of input, can come in at any
time and transfer control among the tasks (preemption). This looks a lot like
true parallelism and must be programmed with the same care as real
parallelism, which is why we dare call it concurrency.

Tasking without preemption is not concurrency. The tasks know that they cannot
be preempted and rely on that assumption for their correctness. This approach
falls into the same domain as the use of coroutines, or dispatch queues of
lexical closures, first class continuations, generators and such: none of
which represent concurrency.

I do not believe I require any correction in terminology or concepts; I've
been programming for 33 years, and I'm the person who invented and first
implemented glibc's PTHREAD_MUTEX_ADAPTIVE_NP mutex back in 2000.

------
seanflyon
There are only 2 hard problems in Computer Science: naming things,
concurrency, and off by one errors.

~~~
eldelshell
And date/time handling... actually, those three you mention are quite easy for
me; I just hate working with date/time stuff.

------
tel
Concurrent programming would be hard even if we had great _abstractions_
because concurrent programming means maintaining exponentially many possible
states. If you go down that route then there are few good tools for handling
or even _conceiving_ of such problems[0].

The other route is, I think, less well described as "abstraction" and more as
_restriction_. For instance, Erlang's model simply ensures that while you're
dealing with exponentially many program states, the kinds of interaction
between those states mean that you can basically pretend they don't exist for
most problems. (This is also the mechanism of things like Bloom and LVars,
which are whole models of commutative/convergent programming that _happen_ to
be executable in parallel.)

[0] Here's a good one: combinatorial topology!
https://www.elsevier.com/books/distributed-computing-through-combinatorial-topology/herlihy/978-0-12-404578-1

------
ahelwer
I was lucky enough to ask Leslie Lamport some questions about TLA+, a formal
specification language of his creation based on the Temporal Logic of Actions:
basically predicate logic with some reasoning about time thrown in. With it,
you specify concurrent algorithms in terms of possible state transitions and
assertions which should hold in all program states (or subsets of those
states). A model checker then runs through all the states, checking your
assertions (this can understandably take quite a while, so there's a
distributed model checker for large systems).

A really good takeaway from the talk is a point he made on software testing:
for sequential algorithms, conventional software testing is likely to find
most of the bugs you really care about. For concurrent algorithms, however,
finding those bugs through software testing is extremely unlikely; an Amazon
paper on TLA+ usage[0] details a bug found which required a 35-step trace to
reproduce. Given the speed of computers today, it's only a matter of time
before these bugs are hit and your system implodes in a confusing and
impossible-to-reproduce way. Software testing by itself just isn't good enough
anymore.

[0] http://research.microsoft.com/en-us/um/people/lamport/tla/amazon.html

------
taeric
My personal gripe is the fight between segmenting out our domain abstractions
and segmenting out our concurrency abstractions.

That is, what makes your domain most understandable is not necessarily what
makes it most amenable to concurrency. Indeed, it may be directly opposed to
this.

This is particularly obnoxious in models where folks try to have a single Foo
object that is used everywhere anything representing even part of a Foo is
used.

------
MCRed
Answer: Because you're not using Erlang/Elixir.

This sounds flippant, but it's not. The erlang BEAM VM is the only system
designed to do concurrency correctly. Unfortunately, I think that many people
think that concurrency can be done in a library, or on non-concurrent systems
like the JVM. I understand fully why Go would like to claim to be concurrent,
and all the other languages would like to claim it, because the reality is
we're in a multi-core multi-node distributed world. But don't fall for it.

Worse, there are some simple low-hanging "concurrent" fruit, like goroutines,
that make it seem like you've solved the problem.

I would have no problem if someone made a language and VM, or a compiled-to-
binary language like Go, that was truly concurrent and that copied the Erlang
solution (no mutable state, supervision trees, let it fail, etc.).

Seriously, steal from the best if you're making a new language.

So far, nobody has. And since we live in a multi-node, multi-CPU world, you
really should learn Erlang (or Elixir, which adds some features and has an
easier syntax).

Seriously, I know it's much easier to go with the flavor-of-the-month and
pretend that there is no downside. But sooner or later-- hopefully sooner,
because it will be much more painful later-- you have to pay the piper for not
being concurrent. We've seen massive failures and re-architecting of systems
as a result: Twitter endured years of fail whales because of its poor initial
engineering. Can you count on enough VC money to do the same?

PS: not to pick on Go. Go has several features that are really bright choices.
I am just using it as an example here, but I've heard everything up to and
including node.js claimed to make Erlang unnecessary.

~~~
wyager
> The erlang BEAM VM is the only system designed to do concurrency correctly.

That's a fairly bold claim. Erlang is good at the actor model. It's not an
all-around winner in concurrency.

Another shining example is the Glasgow Haskell Compiler runtime system. It has
an exceptionally fast and efficient green thread system that (in the latest
version of GHC) can scale almost-linearly up to 32 cores, even with some
resource contention, and can easily handle millions of concurrent threads. I
suspect you could emulate Erlang actors with very little loss of efficiency.
(And, of course, you're also free to use a thread-based model.)

And since Haskell is a language with type-system-encapsulated mutability (and
very nice resource contention management mechanisms like STM), concurrency is
often "free" and very safe. An example of how concurrency is "free":

    
    
        import Control.Concurrent

        main = do
            forkIO (putStrLn "thing1")
            forkIO (putStrLn "thing2")
            threadDelay 1000
    

Anything that is IO typed can get free concurrency like this. I need make no
change to my putStrLn code: as long as it returns IO, I can use it
concurrently.

A few interesting things to look at:

http://benchmarksgame.alioth.debian.org/u32/performance.php?test=threadring

http://haskell.cs.yale.edu/wp-content/uploads/2013/08/hask035-voellmy.pdf

edit: Added thread delay so code example works out of the box.

~~~
kaoD
Not a user of Haskell but interested in it.

What makes my spider-sense tingle in Haskell is laziness. It's all cool until
you unexpectedly hit that lazy unroll (not sure what the correct term is) in a
time-critical system.

Are there ways to control laziness in Haskell, just like there are ways to
control mutable state?

~~~
wyager
Laziness is just the default (and most likely to terminate) evaluation
strategy.

You can use a number of techniques (the simplest being Bang Patterns) to
enforce non-lazy evaluation strategies (including strict evaluation, parallel
evaluation, sequential evaluation, etc.).
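
For example, a minimal sketch of a Bang Pattern forcing an accumulator at each
step, instead of letting it build up a chain of thunks that would only be
evaluated when the final result is demanded:

    {-# LANGUAGE BangPatterns #-}

    -- Without the bang, `acc` would accumulate a million-deep chain of (+)
    -- thunks; the bang forces it to a value on every iteration.
    sumStrict :: [Int] -> Int
    sumStrict = go 0
      where
        go !acc []       = acc
        go !acc (x : xs) = go (acc + x) xs

    main :: IO ()
    main = print (sumStrict [1 .. 1000000])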

However, there are _very_ few activities where Haskell's laziness will cause
disruptive timing variations. Any language that you might use instead of
Haskell is probably heavily heap-oriented, and likely garbage collected, which
means that you're already dealing with unpredictable timing variance.

------
dragonwriter
Concurrent programming is hard because:

1. Reasoning about systems with multiple concurrent series of events which
can interact with each other is harder than reasoning about a single, serial
chain of events, and

2. For a long time most programming environments (languages and libraries) --
and this is still true of many that are still widely used -- have had only
rudimentary support for concurrency.

Progress in language facilities and libraries is mitigating #2, but #1 is
fairly fundamental.

~~~
tjradcliffe
Agreed. All the talk about language constructs and better abstractions ignores
the reality that reasoning about time is one of the hardest things that humans
can do. It is one of our most fragile abilities, and we routinely fail to do
it correctly when thinking about a single temporal sequence, much less
multiple temporal sequences interacting with each other.

No language or tool will ever entirely fix this. The promoters of Erlang and
Haskell in this discussion are optimists. I'm particularly amused by the
notion that Haskell threads only yield on memory allocation, which is the
worst kind of imperative programming heuristic, forcing the programmer to
worry a great deal about the ordered sequence of operations and the way in
which it will change the state of the system, in exactly the way Haskell is
supposed to avoid.

I don't think this is a fault in Haskell: it is just a pragmatic language
designer bowing to the inevitable (although I'd argue that having the
potential to yield as a side effect of allocation is a sub-optimal choice for
a side-effect-free language... I presume there must also be an explicit
instruction that permits a yield to take place?)

~~~
dragonwriter
Strictly speaking, the issue of Haskell threads only being preempted on memory
allocation isn't a "language designer bowing to the inevitable", since it's
not a language feature; it's a feature of the GHC implementation of Concurrent
Haskell [1]. (You have to have some heuristic like this if you are using green
threads, but there's no reason Concurrent Haskell _couldn't_ be implemented
with native threads; M:N green threads simply have advantages in practice,
even with pathological cases that rarely occur in real programs.)

> although I'd argue that having the potential to yield as a side-effect of
> allocation is a sub-optimal choice for a side-effect-free language...

Preemption isn't a side effect in the sense in which Haskell is side-effect
free, even in the single-threaded case, since timing isn't guaranteed to begin
with (in particular, a single-threaded Haskell program can be preempted by the
host OS at any time, based on any heuristic applied by the host OS, and for
any amount of time), and to the extent that preemption affects semantics, it
only affects the semantics of things like IO actions, which are _not_, by
design, side-effect free.

> I presume there must be an explicit instruction also that permits a yield to
> take place?

Yes, Concurrent Haskell includes an explicit yield.
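
A minimal sketch: `yield` from Control.Concurrent explicitly offers the
scheduler a chance to switch threads, independent of the allocation heuristic.

    import Control.Concurrent

    -- Each loop iteration ends with an explicit yield, giving any other
    -- runnable green thread a chance to be scheduled at that point.
    main :: IO ()
    main = do
        _ <- forkIO (mapM_ (\i -> print i >> yield) [1 .. 3 :: Int])
        _ <- forkIO (mapM_ (\c -> print c >> yield) "abc")
        threadDelay 100000  -- crude: give both threads time to finish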

[1]
https://hackage.haskell.org/package/base-4.7.0.0/docs/Control-Concurrent.html

------
aurelianito
We have a great, widely used, highly successful model to handle concurrency.
Processes and pipes! It has been used for over 40 years, and it is the base of
UNIX. Why do we look for something else? I don't know.
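
A minimal sketch of the model, here driven from Haskell's System.Process
(`readCreateProcess` and `shell` are from the process package): two
independent processes composed with a pipe, each with its own private state,
the pipe being the only channel between them.

    import System.Process

    -- Runs the shell pipeline "ls | wc -l" and prints its output. Neither
    -- process can touch the other's memory; they only share the pipe.
    main :: IO ()
    main = do
        out <- readCreateProcess (shell "ls | wc -l") ""
        putStr out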

------
fleitz
Because concurrency in imperative languages is generally a euphemism for non-
determinism.

Now the problem can be correctly rephrased as:

Why is it hard to get non-deterministic programs to do deterministic things?

