
Structured Concurrency - vmorgulis
http://250bpm.com/blog:71
======
chadnickbok
Great series of articles, I very much enjoy reading Martin Sustrik's thoughts
on concurrency, mainly because he has the creds to back up his criticisms.
Reminds me a lot of reading Poul-Henning Kamp's writing (the author of Varnish
cache).

What's most interesting to me personally is that if you squint, Martin is
essentially making an argument for a low-level parallel to Java's Thread.stop
method (although in much more detail). I always thought that Java's
deprecation of Thread.stop was a huge cop-out from actually solving the
underlying causes of its issues, and very disappointing. But at a high level,
there is often a real difference between asking a thread to stop when it's
ready and terminating a thread immediately because it's no longer useful.

It's also interesting that Go's solution to this is to pass a context with a
'done' channel through every function call dealing with a specific session,
and every time a channel is waited on you also wait on the done channel. While
it's an effective method, it's a bit of a shame that the concept wasn't built
into Go more deeply.

Some relevant articles:

\- Why Thread.stop was deprecated:
[http://docs.oracle.com/javase/1.5.0/docs/guide/misc/threadPr...](http://docs.oracle.com/javase/1.5.0/docs/guide/misc/threadPrimitiveDeprecation.html)

\- Go concurrency patterns:
[https://blog.golang.org/context](https://blog.golang.org/context)

------
bmurphy1976
I'm not sold on Martin's proposal; however, I recently spent some time getting
clean cancellation into some code I was working on, so I can sympathize with
what he is going through. Here are a few tips that will hopefully help others
who face a similar situation:

1\. The simplest way to cancel something is to have a "cancelled" channel and
_close_ it, don't send a cancel message! This is important because closing the
channel acts as a broadcast. Multiple goroutines can listen for the same
signal and you don't have to spend time managing who's sending what message to
whom and when. Close the channel and everybody is immediately notified.

2\. You can select on both send and receive. You should check if the cancelled
channel has been closed on sends so that your goroutines don't block forever.
It's incredibly difficult to guarantee that somebody will be listening on the
channel to unblock your goroutine after cancellation happens, so you should
check on send as well.

Example receive:

    
    
      select {
        case message := <-c:
          // do something with message
        case <-cancelled:
          // cleanup and exit goroutine
      }
    

Example send:

    
    
      select {
        case c <- message:
        case <-cancelled:
          // cleanup and exit goroutine
      }
    

3\. One commenter asked Martin why he doesn't use the standardized context
package. Martin replied stating there is no way to distinguish between hard
and soft cancels. While this is true, given a single context, there's nothing
saying you can't use more than one context or cancellation channel. I'm not
saying this is a particularly good idea, but these are not singletons and you
can have as many as you need.

Doing cancellation cleanly is still a tough problem, but with some focus
(especially #1 and #2) it's doable. You do have to roll up your sleeves and
put some effort into it.

~~~
rumcajz
Thanks for making the above clear. I should probably have made that explicit
in the article but I was so focused on hard cancellation that all I've said
about soft cancellation was, basically, "do it by hand", without going into
details.

In fact, being able to share a single channel between multiple goroutines and
use "close" as a broadcast mechanism is one of the best -- if rarely
mentioned -- ideas in Go.

------
eps
> _The protocol will be a simple sequence of messages, each message consisting
> of 32-bit size field in network byte order and the data. How hard it can be
> to implement such a simple specification? It turns out it can take months._

It could also take years, if not decades. It doesn't mean it should. This is a
good article but this sort of straw-man shenanigans only serves to hurt its
credibility.

~~~
rumcajz
It comes from experience. I've tried to implement such a simple protocol and
it literally takes months to get all the details right. Also see the quote
from Joe Armstrong in the article.

~~~
eps
You aren't the only person who's well-versed in network development, and that
statement is patently absurd. If it did in fact take you several months to get
a framing layer right, then it only means that it was an issue stemming from
the specifics (and complexity) of your implementation approach. Hence my
original credibility remark - you are clearly over-extrapolating and over-
generalizing your anecdotal data.

------
lkrubner
Does anyone know what the 2nd "processes" is referring to, in the 2nd
paragraph here?

\---------------------------

In a sane world, specification 30 words long, like the one above, would be
implementable in 100 lines of C code. If it requires several thousand lines of
code, there's clearly something broken with our programming model.

As I've argued in part I of this essay, to get a simple programming model you
have to spawn a thread per TCP connection. But given that threads are
expensive (and processes even more so) and that sharing state between threads
is insane, bordering on suicidal, what you need are lightweight green threads
and a simple communication mechanism to send messages between them. In UNIX of
old they were called processes and pipes. In Go they are called goroutines and
channels. In Erlang they are known as processes and mailboxes.

\-----------------------------

I understand this:

"But given that threads are expensive (and processes even more so)"

but I am confused by this 2nd use of "processes":

"In UNIX of old they were called processes"

Was there an old version of UNIX that had lightweight threads? What were they?
How were they implemented? How were they different from the processes we have
now?

When I think of processes on Unix I think "threads are expensive (and
processes even more so)". Was this different at some point in the past?

As to implementing new communication protocols, none of us should
underestimate the complexities. Jon Postel was not stupid, but even he made
mistakes. Recall that by 1973 they felt they had TCP 3 in a mostly final form,
but then in August of 1977 Postel said they had made a terrible mistake by
lumping too much together in TCP. He then spent a year pulling TCP and IP
apart into the 2 separate protocols that we have now.

That example is well known, but if you read through the old RFCs, you realize
there are some ideas that were explored and then abandoned. Among the really
wild ideas:

"we are contemplating allowing sender and receiver to specify different byte
sizes and consider bits as the basic unit "

This is from:

[https://tools.ietf.org/html/rfc128](https://tools.ietf.org/html/rfc128)

And if you try to create your own communication protocol, you will also go
through a similar period of trial and error.

~~~
to3m
The 2nd type of process is just an ordinary Unix-style process. It refers to
the way that, on older hardware, process switches were cheaper, on account of
the smaller amount of state that needed to be shuffled about and the cheaper
system calls. So there was no need to (as you might today) prefer green
threads over OS threads, or OS threads over processes, on account of the
overheads.

This is mentioned in part 1 near the bottom:

    
    
        When UNIX was still young ... you were supposed to fork a
        new instance of the process for each TCP connection
    
        I guess it made sense from performance point of view back
        then. All the stuff that makes process switching slow today
        haven't yet existed. ... In such an environment process
        switching was closer to what we call green threads today.

------
PaulHoule
I blame C more than I would praise Erlang.

Java has a threads implementation that actually works, which is a lot more
than you can say about C/pthreads, although lately they've been tuning up the
C memory model.

It may not be easy to write multithreaded code in Java but it is definitely
possible. It's actually pretty easy in a lot of cases if you use the Executor,
I think most of the carping you will hear about Java these days is because
people think they are being clever using Fork-Join or work stealing or actors
when in fact they are just screwing up.

~~~
jkot
And Java has async network IO. The single-threaded example from the article
would be fairly simple to implement with Grizzly or Netty.

------
raiph
cf "It's time to embrace structured approaches to parallel, asynchronous, and
concurrent programming." ~~ [http://jnthn.net/papers/2015-yapcasia-
concurrency.pdf#p=76](http://jnthn.net/papers/2015-yapcasia-
concurrency.pdf#p=76)

~~~
PaulHoule
More structured is probably going to look a bit like UML, Petri Nets, or RETE
networks.

~~~
yxhuvud
Indeed. See, for example, the Disruptor pattern for something that looks like
a (cycle-free) Petri net.

~~~
fizwhiz
Could you expound a little further on why the Disruptor pattern looks like a
cycle-free Petri Net?

~~~
yxhuvud
Well, the disruptor has a set of requests that are processed in parallel.
Each request has a counter associated with it. Each request also has a step
associated with it. A step can depend on an arbitrary number of earlier steps.
This dependency-based setup of steps can easily be mapped to Petri form.

Ok, maybe it isn't that close, but both are some kind of state machine that
allows an arbitrary number of dependencies for each state, and the mapping
from a cycle-free Petri net to a disruptor is fairly trivial. Of course, the
lack of cycles hobbles the expressive power of the nets.

------
halayli
In lthread-cpp (wrapper around lthread) it's even easier because of RAII.
Example:

[http://lthread-
cpp.readthedocs.org/en/latest/listener.html#e...](http://lthread-
cpp.readthedocs.org/en/latest/listener.html#examples)

TCPListener closes the socket once it's done and cleans up. It also allows
you to pass instance methods as the coroutine, which has the advantage of
giving access to the instance context.

