
Structured Concurrency in High-Level Languages - rumcajz
http://250bpm.com/blog:124
======
Animats
The control flow issues mentioned there have been addressed many times.
Coroutines and their discontents are well known. But the author has some good
points about cancellation.

Cancellation isn't well-studied. It has two main cases - canceling a wait, and
interrupting actual computation. The first case comes up more often.

If anything that can wait can be aborted, then things unwind better. Language
support for this needs to be block-structured, as in Python's

    
    
        lock = threading.Lock()
        with lock:
            pass  # (do something)
    

Python uses exceptions, so the appropriate mechanism there would be to
indicate lock failure with an exception, and not execute the "with" block. In
error code languages, the locking block should return an error code.
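As a minimal Python sketch of that idea (the `TimeoutLock` name and the timeout-based failure mode are my own assumptions, not anything from the post): a context manager whose acquisition can fail with an exception, so the `with` body never runs.

```python
import threading

class TimeoutLock:
    """Hypothetical lock wrapper: acquisition failure is reported
    with an exception, so the `with` body never executes."""
    def __init__(self, timeout=1.0):
        self._lock = threading.Lock()
        self._timeout = timeout

    def __enter__(self):
        if not self._lock.acquire(timeout=self._timeout):
            raise TimeoutError("could not acquire lock")
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False  # let exceptions from the body propagate

lok = TimeoutLock(timeout=0.1)
with lok:
    pass  # (do something) -- runs only if the lock was acquired
```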

Once a thread/coroutine/whatever is in cancellation, all waits should abort.
So you rapidly fall through the code until the concurrent construct reaches
its exit point. Unwinding still takes place. Non-blocking concurrency
primitives should be allowed in this mode, so you can add things to a list,
but can't wait on I/O. RAII things like file-closing on block exit have to be
done that way, because the canceled construct can't wait for them.
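A rough Python sketch of that mode (the `CancelScope` name is made up here, loosely echoing Trio's terminology, not a real API): once `cancel()` is called, every blocking wait routed through the scope aborts with an exception, so the task falls through to its cleanup code and unwinds.

```python
import threading
import time

class Cancelled(Exception):
    """Raised inside a task once its scope has been cancelled."""

class CancelScope:
    """Hypothetical helper: all waits routed through wait_for()
    abort as soon as cancel() is called, so the task rapidly
    unwinds toward its exit point (finally blocks still run)."""
    def __init__(self):
        self._cancelled = threading.Event()

    def cancel(self):
        self._cancelled.set()

    def wait_for(self, event, timeout=None):
        deadline = None if timeout is None else time.monotonic() + timeout
        while True:
            if self._cancelled.is_set():
                raise Cancelled()        # abort the wait, start unwinding
            if event.wait(0.01):         # poll the real event briefly
                return
            if deadline is not None and time.monotonic() >= deadline:
                raise TimeoutError()

scope = CancelScope()
data_ready = threading.Event()
log = []

def worker():
    try:
        scope.wait_for(data_ready)       # would block forever...
    except Cancelled:
        log.append("unwound")            # ...but aborts on cancellation
    finally:
        log.append("cleanup ran")        # RAII-style cleanup still happens

t = threading.Thread(target=worker)
t.start()
scope.cancel()
t.join()
print(log)                               # prints ['unwound', 'cleanup ran']
```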

QNX real-time messaging works something like this. QNX's basic interprocess
communication primitive is MsgSend, which sends a message and waits for a
reply. MsgSend can be canceled, or can time out; an error code is returned.
All I/O requests are MsgSend operations and thus can be canceled. In QNX,
everything that can block can be timed out or canceled. This is used to place
upper bounds on how long things can take.

Canceling a compute loop is harder. If the compute loop is operating only on
local data, it's not so bad. You need some data flow analysis to determine
that the function being forced to error out doesn't have non-local side
effects. If, say, you're inverting a matrix, and that needs to be aborted, you
need to be sure that the matrix is never looked at again. This is probably
easier in functional languages.

Once wait cancellation is under control, so the thing you're canceling can't
block, having to wait for the other thread/coroutine/whatever to reach its
join point isn't so bad. You don't need to kill it; you can wait for it to
unwind. So solving cancellation moots the problem of how to handle thread
joins for failed threads.

------
sparkie
The idea of looking at things as a call tree (or graph) rather than a plain
old stack is how Schemers think about things. We specifically avoid
terminology like "stack frame" and refer to them as "continuation frames"
because we don't tie the abstract feature to how it is implemented. A branch
in the tree is an invocation of call/cc.

These recent complaints about structured concurrency in popular languages
appear to echo the complaints about call/cc vs. delimited continuations. More
info here: [http://okmij.org/ftp/continuations/against-
callcc.html](http://okmij.org/ftp/continuations/against-callcc.html).

------
CodeArtisan
Oz's declarative concurrency is worth taking a look at. In Oz, a thread
implicitly goes to sleep or wakes up based on data dependencies. For example:

    
    
        declare 
          FOO BAR % FOO and BAR are not bound to any value yet.
        in 
        thread
          {print FOO} % the thread blocks here until FOO is bound to a value.
    
          BAR = 456   % Binds BAR, which wakes up any thread waiting for it.
        end
    
        thread
          FOO = 132   % Binds FOO, which wakes up any thread waiting for it.
    
          {print BAR} % the thread blocks here until BAR is bound to a value.
        end
    

Because of this, the ordering of operations falls out of the data dependencies.
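For readers without Oz handy, the single-assignment dataflow behavior can be sketched in Python (a rough analogue for illustration; this is not how Oz actually implements it):

```python
import threading

class DataflowVar:
    """Rough analogue of an Oz dataflow variable: single-assignment,
    and get() blocks until bind() has been called."""
    def __init__(self):
        self._bound = threading.Event()
        self._lock = threading.Lock()
        self._value = None

    def bind(self, value):
        with self._lock:
            if self._bound.is_set():
                raise ValueError("already bound")
            self._value = value
            self._bound.set()    # wakes up every thread blocked in get()

    def get(self):
        self._bound.wait()       # blocks until the variable is bound
        return self._value

foo, bar = DataflowVar(), DataflowVar()
out = []

def t1():
    out.append(foo.get())        # blocks until foo is bound
    bar.bind(456)

def t2():
    foo.bind(132)
    out.append(bar.get())        # blocks until bar is bound

a, b = threading.Thread(target=t1), threading.Thread(target=t2)
a.start(); b.start(); a.join(); b.join()
print(out)                       # prints [132, 456]
```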

~~~
TeMPOraL
This sounds like an easy recipe for race conditions. Consider:

    
    
        declare 
          FOO BAR % FOO and BAR are not bound to any value yet.
        in 
        thread
          {print FOO} % the thread blocks here until FOO is bound to a value.
    
          BAR = 456   % Binds BAR, which wakes up any thread waiting for it.
        end
    
        thread
          {print BAR} % the thread blocks here until BAR is bound to a value.
    
          FOO = 132   % Binds FOO, which wakes up any thread waiting for it.
        end
    

I swapped the instructions in the second thread. Does Oz detect that, or
otherwise protect the programmer from such mistakes?

~~~
CodeArtisan
It would deadlock but there is a debugger that shows the state of any thread
and where it is blocked.

~~~
smaddox
So it's race-free... if you test every single possible system state and
remove all the race conditions. No thanks.

~~~
ghkbrew
A race condition is a situation where the result depends on the
ordering/interleaving of threads, and that example isn't actually one.

It's a deadlock, which is also bad, but much more deterministic and easier to
debug.

------
yxhuvud
Great follow up to the original argument. I certainly hope some language
designers start to look really hard at these arguments.

    
    
        Off the top of my head:
    
        * Cancel all threads in the bundle.
        * Block until all threads in the bundle exit.
        * Block until at least one thread in the bundle exits. Cancel all the other threads.
    

I think there is also a 'restart any exiting thread' policy, and a variant of
the block-until-all rule where only failed threads are restarted. Hmm,
actually, perhaps there should be variants specifying the behavior of failed
threads in all cases?

~~~
zzzcpan
> perhaps there should be variants specifying the behavior

It feels like Erlang/OTP is being reinvented right here again.

Once you get deep enough into concurrency, the ideas start to converge on
killable isolated processes with asynchronous message passing and programmable
behaviors. With shared memory, killing a thread has unpredictable
consequences, so you need to either enforce isolation or avoid the
consequences in some other way, like by not having multithreading and using an
event loop. Isolated processes need to communicate with other processes
somehow, so you need message passing. But it has to be asynchronous to make it
possible to implement various waiting behaviors. With event loops you get
there via cancellable contexts and higher-order programming, but only on a
single CPU. Getting into multi-core or distributed systems yet again requires
asynchronous message passing. In the end, it's actors everywhere.

~~~
pacala
What would be some use cases for which synchronous messaging is insufficient,
so that one has to reach for hard-to-reason-about asynchronous messaging?

~~~
toast0
Synchronous messaging is insufficient for making multiple requests at once and
waiting for the responses. Also asynchronous messaging allows for optimistic
notifications -- for example to clear a cache when coherency isn't required.

With asynchronous messaging it's simple to build synchronous interactions if
desired --- send a message and wait for a response; but you can't build
asynchronous interactions on top of synchronous messages.
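The "synchronous on top of asynchronous" direction can be sketched with plain queues in Python (an illustrative toy; the doubling server is made up): each request carries its own reply mailbox, and a synchronous call is just an async send followed by a blocking receive.

```python
import queue
import threading

def server(inbox):
    """Toy server: replies to each request asynchronously."""
    while True:
        msg = inbox.get()
        if msg is None:              # shutdown sentinel
            return
        payload, reply_to = msg
        reply_to.put(payload * 2)    # async reply to the caller's mailbox

def call(inbox, payload, timeout=1.0):
    """Synchronous request/response built on async message passing."""
    reply_to = queue.Queue(maxsize=1)
    inbox.put((payload, reply_to))          # asynchronous send...
    return reply_to.get(timeout=timeout)    # ...then block for the reply

inbox = queue.Queue()
t = threading.Thread(target=server, args=(inbox,))
t.start()
print(call(inbox, 21))               # prints 42
inbox.put(None)
t.join()
```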

------
samatman
A proposal: what's being referred to as a 'bundle' or a 'nursery' should be
called a 'loom'.

A loom would bear the same relation to software (green) threads as an actual
loom does to warp threads.

~~~
irq-1
How about a "knot"? It implies both a structured organization of threads (like
a knot in rope) and a messy tangle of threads (like a knot in hair). Plus,
it's funny that it's a homophone of "not".

------
KirinDave
For those interested in various approaches in this space, may I recommend:

* Haskell and in particular Marlow's async library.

* The Erlang language's Actor system along with the OTP framework as a different way of thinking about orphaned processes (via supervision trees).

* The Pony language, which can safely pass ownership of objects (and thusly coroutines) around.

------
noelwelsh
Isn't this really well trodden ground? Coroutines were known by the 60s,
according to Wikipedia, and I have a really hard time believing that just
about everything worth saying about them hasn't been said. I feel a little bit
of scholarship would help avoid creating problems that have already been
solved.

~~~
chubot
This comment doesn't add anything to the discussion. You could say this about
anything, like say parsing, which was also "known by the 60's".

It's a "middlebrow dismissal": [http://www.byrnehobart.com/blog/why-are-
middlebrow-dismissal...](http://www.byrnehobart.com/blog/why-are-middlebrow-
dismissals-so-tempting/)

It ignores the difference between theory and practice. Whether you do
coroutines in C, Python, or another language matters. Language features
interact, and those interactions affect how you write specific programs, so
treating it abstractly has limits. Python actually has two completely
different notions of coroutine, both of which were implemented after the 60's.

The ground isn't really well trodden, because most languages don't have
coroutines. Probably 99% of programs don't use coroutines (nearly every
program in C, C++, Java). Pre-emptive threads are more common.

You also didn't mention any literature which would have "said all the things".

That said, here is a good paper from 2004 from the authors of Lua: _Revisiting
Coroutines_. Among other things, it points out that "coroutine" doesn't mean
one thing.

[https://scholar.google.com/scholar?cluster=14280883920152261...](https://scholar.google.com/scholar?cluster=14280883920152261782&hl=en&as_sdt=0,5&sciodt=0,5)

~~~
noelwelsh
It's hard to give constructive criticism of the article, beyond suggesting
background reading, because it seems both ignorant of existing research and
naive as to the issues that arise.

Here are some of the relevant ideas and papers:

\- "A coroutine is like a control construct"\---a coroutine is a control
construct. There is no "like". But there is a fundamental difference from a
goto because it captures execution state (the stack and so on, also known as
the continuation) which a goto does not. A continuation can only jump to
program states that make sense (for some definition of "sense"\---you can at
least count on variables being initialised, for example). A goto can jump
anywhere.

\- "But coroutine is also like a variable"\---a coroutine closes over an
environment. I.e. it is a closure. This doesn't really have much to do with
being "like a variable". Representing closures is well studied in programming
language research and there are many techniques for handling it E.g.
[http://matt.might.net/articles/closure-
conversion/](http://matt.might.net/articles/closure-conversion/)

\- "A coroutine can't be orphaned"\---those are your continuations. You now
need to worry about either heap allocating continuations or copying the stack.
E.g.
[http://lampwww.epfl.ch/teaching/archive/advanced_compiler/20...](http://lampwww.epfl.ch/teaching/archive/advanced_compiler/2006/assignments/part5/continuations.pdf)
And when do you free continuations? You probably need to GC them, as your
control flow is now more complicated.

\- "A coroutine can be orphaned"\---this is where you need to think about
semantics. What does it mean for a coroutine to be abruptly exited? It
probably means leaking resources and hard to track down bugs. A lot of thought
needs to go into this. (Somewhat relevant is this paper:
[https://www.cs.indiana.edu/~dyb/pubs/LaSC-7-1-pp83-110.pdf](https://www.cs.indiana.edu/~dyb/pubs/LaSC-7-1-pp83-110.pdf))

The rest of the article isn't so interesting to me, because I don't think
thread cancellation should be possible at arbitrary times.

So that's some of the existing work in the field, but the biggest issue is the
need to think about the semantics, or programming model, of the concurrent
tools you're introducing. The article does very little of this. Can you
actually understand programs written using some proposed concurrency
primitives? That's where it really gets interesting, and where recent work
like
[http://www.csc.kth.se/~phaller/doc/haller16-scala.pdf](http://www.csc.kth.se/~phaller/doc/haller16-scala.pdf)
is making some thought provoking contributions.

~~~
coldtea
> _- "A coroutine is like a control construct"\---a coroutine is a control
> construct. There is no "like"._

Pedantic. A coroutine is a control construct in the same way an Exception is a
control construct, or in the way a tomato or a cucumber is a fruit.

Adopting a particular naming depends on the categorization and the intention,
not merely on the capabilities of a construct.

Even more so in a post, where the notion needs to be introduced to people who
only associate the concept of a "control construct" with the more explicit and
tamer constructs usually referred to as such: if and for loops, goto, etc.

> _- "But coroutine is also like a variable"\---a coroutine closes over an
> environment. I.e. it is a closure. This doesn't really have much to do with
> being "like a variable". Representing closures is well studied in
> programming language research and there are many techniques for handling it
> E.g. [http://matt.might.net/articles/closure-
> conversion/*](http://matt.might.net/articles/closure-conversion/*)

The first part is just a formal definition, which doesn't negate the
statement, and the latter part is just a generic reference to research on
closures.

One could use the same kind of saying-nothing responses to all kinds of posts
or papers.

~~~
noelwelsh
"One could use the same kind of saying-nothing responses to all kinds of posts
or papers."

You misunderstand.

If you know the name of the concept you can search for more information.

If you read the section on "But a coroutine is also like a variable" you'll
see the author is struggling with how to represent coroutines. Closure
conversion deals with representation.

------
UncleEntity
Isn't this the same as _Notes on structured concurrency, or: Go statement
considered harmful_ [0], except using C rather than Python?

One has a "nursery" while the other has a "bundle" with both hoping to control
the lifetime of coroutines.

[0][https://news.ycombinator.com/item?id=16921761](https://news.ycombinator.com/item?id=16921761)

~~~
yxhuvud
Did you read the preface? It is a direct follow-up to that post, written by
one of the people who inspired it in the first place. It goes through some
parts in a bit more depth than the post you are referring to did.

But yes, the author has a very similar approach to the problem.

