
The Beauty of Concurrency in Go - iand
http://pragprog.com/magazines/2012-06/the-beauty-of-concurrency-in-go
======
Jabbles
This article isn't bad... but it misses several important points of Go. I also
note the article is 9 months old. In the hope that my criticism will be taken
as constructive, with apologies for not writing detailed explanations:

1\. Goroutines are not threads

2\. type inference allows you to elide types in var declarations: var host =
flag.String(...

3\. Go's convention is to use camel case, not underscores.

4\. Calling os.Exit all over the place is unusual imo - it may be better to
panic().

5\. fmt.Fprintf exists, so there is no need for os.Stderr.WriteString(fmt.Sprintf...

6\. An explanation of why the standard log package isn't suitable would be
nice, although I see the format used is slightly different.

7\. Ignored errors when writing. Why do you wish to sync at every packet?

8\. Massive race condition by re-using b. Line 62 overwrites b. b is then
passed to c.logger (68) and c.binary_logger (70) for them to process
asynchronously. c.logger is then passed another byte slice (72), which forces
it to finish using b. c.binary_logger is not passed anything else, allowing it
to delay until the next time something is sent on that channel, which would be
after b is overwritten by the next packet. I think that simply moving line 58
to between 61,62 would fix this.

The author has not quite grokked the concept of "don't share memory". b is
shared, and undefined behaviour results :(
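
A minimal sketch of the fix suggested in point 8, assuming a reader loop roughly like the one the article describes (the article's code is not reproduced here, and `forward`, `collect` and the channel are hypothetical names). Allocating a fresh buffer inside the loop means every slice sent on the channel belongs to the receiver alone, however late it gets processed:

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// forward reads from r in chunks of at most size bytes and sends each chunk on
// out. A new buffer is allocated per read, so the receiver owns every slice it
// gets and the next read cannot overwrite it.
func forward(r io.Reader, size int, out chan<- []byte) error {
	defer close(out)
	for {
		b := make([]byte, size) // fresh buffer each iteration: nothing is shared
		n, err := r.Read(b)
		if n > 0 {
			out <- b[:n]
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// collect runs forward in a goroutine and gathers the chunks as strings.
// The channel is buffered, so chunks may sit unprocessed while later reads run.
func collect(input string, size int) []string {
	out := make(chan []byte, 16)
	go forward(strings.NewReader(input), size, out)
	var chunks []string
	for b := range out {
		chunks = append(chunks, string(b))
	}
	return chunks
}

func main() {
	fmt.Println(collect("abcdefgh", 3)) // [abc def gh]
}
```

Hoisting the `make` call out of the loop would reintroduce the problem: a chunk still waiting in the channel would be overwritten by the next read.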

~~~
abecedarius
How are goroutines not threads? Do you mean because it's _possible_ for them
to communicate without shared mutable state?

Edit: Oh, apparently you all mean _OS_ threads. So say so. (For example, in
Haskell they're called threads without any implication that each one is an OS
thread. Haskell's not unusual that way.)

~~~
AYBABTME
Go schedules its own 'threads' (goroutines) to run on one or many (OS)
threads.

In this case, threads refer to not-a-process units of execution in an
operating system. For whatever it's worth, Wikipedia defines threads as:

    
    
      the smallest sequence of programmed instructions that 
      can be managed independently by an operating system
      scheduler.
    

I think that's also the commonly accepted definition: that threads are related
to the OS scheduler.
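
A small sketch of that distinction, under the assumption that a counter is enough to show it (`runMany` is a hypothetical helper). It starts far more goroutines than there are OS threads running Go code, and the runtime multiplexes them:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runMany starts n goroutines, waits for all of them, and returns how many ran.
func runMany(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

func main() {
	// GOMAXPROCS(0) reports the current setting without changing it.
	fmt.Println("OS threads running Go code at once:", runtime.GOMAXPROCS(0))
	fmt.Println("goroutines completed:", runMany(10000))
}
```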

~~~
tptacek
The notion that goroutines aren't threads seems to exist primarily because
Golang _also_ makes use of OS-native threads in order to schedule goroutines,
and so needs to draw a distinction in order to explain how the runtime works.

But that's just a detail. In reality, goroutines are threads; they're just
userland, non-preemptive threads. Similar constructions have been available,
even to C programmers, for well over a decade (and probably much longer).

~~~
furyofantares
> Similar constructions have been available, even to C programmers, for well
> over a decade (and probably much longer).

Of course they aren't called threads there, either (probably for the same
reason as they aren't called threads in Go)

~~~
tptacek
Sure they are. For instance, Gnu Pth.

Programming language support for threading predates direct operating system
support (at least in mainstream operating systems) by a lot of years, from
what I can tell.

~~~
furyofantares
The ones I've used have just been called coroutines, though I suppose I
haven't used any that function as drop-in replacements for threads like Pth

------
codewright
The beauty of concurrency in Clojure:

; Rough sketch: def defines a var (pretend it's a reference)

; @ is used to dereference the future and block to wait for the result.

    
    
      (def f 
        (future 
          (Thread/sleep 10000) (println "done") 100))
    
      user=> @f
      done
      100
    
      ;; Dereferencing again will return the already calculated value.
      => @f
      100
    

<http://clojuredocs.org/clojure_core/clojure.core/future>

Edit: And more importantly, there are wrappers for the standard data
structures designed around different concurrency use-cases (sync, async,
coordinated, uncoordinated)

Refs are for Coordinated Synchronous access to "Many Identities".

Atoms are for Uncoordinated synchronous access to a single Identity.

Agents are for Uncoordinated asynchronous access to a single Identity.

Vars are for thread local isolated identities with a shared default value.

<http://stackoverflow.com/questions/9132346/clojure-differences-between-ref-var-agent-atom-with-examples>

And you can use all (all!) of the Java concurrency tooling as desired,
including raw threads (for which Clojure has a wrapper as well).

Part of the reason I use Clojure rather than Go is because it doesn't try to
force you into a one-size-fits-all method for handling concurrency. I have no
problem with CSP but it doesn't fit everything I do. Sometimes I just want to
defer work or wrap it in a future. Or I want to use an intelligent coordinated
data structure rather than trying to meld flesh and bone to steel in order to
make a concurrency-naive data structure behave how I want in a concurrent
environ.

If I can avoid those unnecessary battles, I will.

So - Clojure.

~~~
kkowalczyk
Go doesn't "force" you to use one method of concurrency.

It has more than one (you can do erlang-style share-nothing style or Java/C++
style of using mutexes to protect shared state from concurrent access).
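
To make the two styles concrete, here is a hedged sketch that computes the same sum both ways. The function names are made up for illustration. The first version guards shared state with a `sync.Mutex`; the second shares nothing and passes values over a channel:

```go
package main

import (
	"fmt"
	"sync"
)

// sumWithMutex adds 1..n from n goroutines that share a total guarded by a mutex.
func sumWithMutex(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			total += v
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return total
}

// sumWithChannel does the same work, but each goroutine sends its value on a
// channel and only the receiving goroutine ever touches total.
func sumWithChannel(n int) int {
	values := make(chan int)
	for i := 1; i <= n; i++ {
		go func(v int) { values <- v }(i)
	}
	total := 0
	for i := 0; i < n; i++ {
		total += <-values
	}
	return total
}

func main() {
	fmt.Println(sumWithMutex(100), sumWithChannel(100)) // 5050 5050
}
```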

I know nothing about Clojure so it's possible it has more features but it's
not necessarily a good thing. Is the complexity of 4 different solutions worth
it? (by "it" I mean: a programmer has to learn all of them and when to use
what; the implementor has to implement them; write wrappers for all standard
data structures (what about third party libraries?) etc.).

Feature bloat has a cost.

Go gives you all you need to easily write concurrent programs and it does it
with refreshingly simple design (both for people to learn and to implement).

~~~
mr_luc
So why use Go instead of node.js? You can probably take that reasoning and
substitute "Go" for "Clojure", and "node.js" for "Go".

Go people would probably object that the things that Go adds, and that node.js
lacks, aren't just window dressing -- sure, there are situations where a
node.js-style fast event loop that avoids blocking operations is all you need,
but there are also situations where you want something more like real threads,
because the problem demands it.

I'm a lisper but not a Clojure expert - but I'd assume that Clojure people
don't consider the existence of e.g. Actors to be "feature bloat". My
impression is more that the difficult/special concurrency-enabling feature of
the language is STM, and language-level support for different concurrency
paradigms, implemented on top of STM, are probably low-hanging fruit once
you've got it.

~~~
pjmlp
Static typing and availability of AOT compilation?

------
dlsspy
There's _so much_ code here.

If I weren't sick, I'd submit a new version that:

1\. Didn't reimplement io.Copy

2\. Didn't avoid io.TeeReader

3\. Didn't do weird things to avoid regular channel ranges.

4\. Didn't do non-standard date formatting.

5\. Didn't reinvent the log package.

6\. Didn't try to convince anyone runtime.GOMAXPROCS(runtime.NumCPU()) was a
good idea (it's not)

In fact, maybe I will anyway. brb
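
For points 1 and 2, a rough sketch of what the standard library offers, not the rewrite promised above. `proxy` and `demo` are hypothetical names. `io.TeeReader` duplicates everything read from the source into a second writer, and `io.Copy` drives the whole transfer:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// proxy copies everything from src to dst and writes a duplicate of the
// stream to logw. It returns the number of bytes copied.
func proxy(dst io.Writer, src io.Reader, logw io.Writer) (int64, error) {
	return io.Copy(dst, io.TeeReader(src, logw))
}

// demo runs proxy over in-memory buffers and returns what each side saw.
func demo(input string) (forwarded, logged string, n int64) {
	var dst, logBuf bytes.Buffer
	n, _ = proxy(&dst, strings.NewReader(input), &logBuf) // in-memory: no error expected
	return dst.String(), logBuf.String(), n
}

func main() {
	f, l, n := demo("hello")
	fmt.Println(f, l, n) // hello hello 5
}
```

In a real proxy, `src` and `dst` would be network connections and `logw` a log file. The in-memory buffers here only keep the example self-contained.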

~~~
cpeterso
> _6\. Didn't try to convince anyone runtime.GOMAXPROCS(runtime.NumCPU()) was
> a good idea (it's not)_

So what is an appropriate GOMAXPROCS? As someone who has only dabbled in a few
Go tutorials, I would imagine that you would want GOMAXPROCS to be NumCPU()
(or even greater) so the goroutine thread pool could "fire on all pistons".
Why does Go's scheduler default to GOMAXPROCS=1 instead of NumCPU()?

~~~
smosher
In my experience the majority of well-written concurrent programs become
I/O-bound on a single processor. It's pointless to add more processors to
that, and can only slow you down, and Go programs are far better behaved in
the non-parallel case.

At other times you should think about the number of processors you want to
occupy. If the objective is to behave like an appliance, then 1:1
schedulers:cpus is not a bad ballpark.

~~~
dlsspy
Then why isn't that the default?

~~~
smosher
The default is 1 isn't it? As I pointed out, this will serve the majority of
concurrent code.

The best number of processes to use is equal to the parallelism of the
solution. Even with highly concurrent problems, this is still most often 1. If
you get it wrong performance will suffer. But in practical terms we have more
to worry about, and if you're talking to the disk and the network more than
you're computing, parallelism will only increase the contention on those
resources. The extra processes will consume more CPU without doing any more
useful work.

So the default is pretty good.

By the way 1:1 isn't the limit either. Sometimes you will want more. If the
problem truly is parallel enough to exceed your CPUs, you may want additional
processes anyway. This will keep things up to speed thanks to the host's
scheduler which is typically preemptive, unlike Go's. This sometimes works
much better if you can pick and choose which routines run on which schedulers,
and I'm not sure if Go exposes that.
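
For reference, a short sketch of inspecting and changing the setting under discussion (`setProcs` is a hypothetical wrapper). One caveat for later readers: this thread predates Go 1.5, and from that release on GOMAXPROCS defaults to the number of available CPUs rather than 1.

```go
package main

import (
	"fmt"
	"runtime"
)

// setProcs sets GOMAXPROCS to n and returns the previous value.
// runtime.GOMAXPROCS leaves the setting unchanged when n < 1.
func setProcs(n int) int {
	return runtime.GOMAXPROCS(n)
}

func main() {
	fmt.Println("CPUs:", runtime.NumCPU())
	fmt.Println("current GOMAXPROCS:", runtime.GOMAXPROCS(0))

	prev := setProcs(1) // restrict Go code to one OS thread at a time
	fmt.Println("was", prev, "now", runtime.GOMAXPROCS(0))
	setProcs(prev) // restore the earlier setting
}
```

The same value can be set without recompiling through the `GOMAXPROCS` environment variable.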

~~~
dlsspy
I have no idea what my response was about. As I said, I was pretty sick that
day. Sorry you had to type all this stuff to explain to me that I'm a moron.
:)

------
octo_t
This has finally let me figure out what annoys me about Go: it's a cargo-cult
language. People saw that Erlang's Actors/processes were really popular and
made it easy to write good, concurrent software.

They then went away and implemented their own language with lightweight
processes and message passing, but missed the fact that actors are the price
you have to pay for the benefits of not sharing mutable data.

And Go completely skipped that part (the most important part).

~~~
timclark
Doesn't go just implement Hoare's communicating sequential processes, as does
Erlang? They share the same inspiration.

You don't need to share state between your goroutines if you don't want to,
either, just as you don't have to use mnesia to share state between Erlang
processes if you don't want to.

I don't think you can really accuse go of being a cargo cult language either,
Rob Pike has implemented CSP multiple times (<http://swtch.com/~rsc/thread/>).

~~~
ridiculous_fish
It is easy to accidentally share state between goroutines. For example, we
wish to print out elements of a list:

    
    
        values := []string{"a", "b", "c"}
        for _, v := range values {
            go fmt.Println(v)
        }
    

Each of these goroutines shares the same variable v, so this code contains a
serious race condition.

~~~
Jabbles
It may be easy, but it's not _that_ easy.

    
    
        values := []string{"a", "b", "c"}
        for _, v := range values {
            go func(){fmt.Println(v);}()
        }
    

Does have a race condition.

    
    
        values := []string{"a", "b", "c"}
        for _, v := range values {
            go func(s string){fmt.Println(s);}(v)
        }
    

Does not have a race condition.
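
A runnable version of the second form, as a sketch: the snippets above never wait for their goroutines, so `main` could exit before anything prints. `collectAll` is a hypothetical helper that adds a `sync.WaitGroup` and gathers the values instead of printing them. Note for later readers: from Go 1.22, in modules that declare that version or later, each loop iteration gets its own `v`, so the closure form no longer races there. Passing the value as an argument is safe in every version.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collectAll starts one goroutine per value, passing the value as an argument
// so each goroutine works on its own copy. The result is sorted because the
// order in which goroutines run is not defined.
func collectAll(values []string) []string {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		seen []string
	)
	for _, v := range values {
		wg.Add(1)
		go func(s string) { // s is a copy made when the go statement executes
			defer wg.Done()
			mu.Lock()
			seen = append(seen, s)
			mu.Unlock()
		}(v)
	}
	wg.Wait() // without this, the caller could return before the goroutines run
	sort.Strings(seen)
	return seen
}

func main() {
	fmt.Println(collectAll([]string{"a", "b", "c"})) // [a b c]
}
```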

~~~
flyinRyan
You're still sharing v here, we can just see from the snippet that the shared
v isn't used inside the anon function. But I suspect what the GP meant by
"easy" is not having to think about this sort of thing. Your solution is good
when you know you need to do this, but it can't even happen in Erlang so it's
not a "gotcha" to watch out for.

~~~
Jabbles
Sharing v isn't the problem. The problem occurs when v is evaluated.

The parent's post has no race condition, as v is evaluated before the
goroutine starts.

The top example of my post has a race condition because there is no guarantee
when v will be evaluated with respect to the loop.

The bottom example has no race condition because v is evaluated on every
iteration and assigned to s, which is used by the goroutine at some point
afterwards.

~~~
flyinRyan
My point still stands: to fix this, you have to realize that (a) v could be
shared here and (b) that sharing could be a problem. I suspect the first time
most newbies get hit with a race condition here they're going to be beyond
baffled.

------
StavrosK
Ugh, I just finished writing the XMPP frontend for an XMPP/IRC bot I'm working
on (<http://www.getinstabot.com>). The frontends are in Go for concurrency,
and ferry messages back and forth from the channels to the backend.

Let me tell you, that problem is _hard_. Go coped pretty well, but the final
thing is a mess of global states, and it's pretty elegant for what the problem
is. I was hoping to avoid having many moving parts, but it ended up needing a
_lot_ of shared state between all processes.

Some problems are just hard, and, no matter how well-designed the language is,
they'll still be hard. Something I miss from the language after implementing
that is the ability to, say, monitor one goroutine from another to see if it
returns (so the former can return as well). I know it's possible with
channels, but when one goroutine is blocked on network Recv(), there's not
much of a chance to listen to channels.

Anyway, yes, parallelism in Go is great, but not everything is magically all
rainbows and unicorns (that's Django). Some problems will be messy and dirty,
and even more so when you use channels.

~~~
lobster_johnson
> _Something I miss from the language_ …

Does sound like you want to use channels, and break your logic into small
independent parts.

You will have one goroutine using blocking Read() in a loop and feeding data
to some channel. When it's done, you write to another channel that exists only
for signaling:

    
    
        // Signal completion when this goroutine exits.
        defer func() {
          doneChannel <- true
        }()
        for {
          data := make([]byte, 65535)
          n, err := conn.Read(data)
          if err != nil {
            errorChannel <- err
            break
          }
          dataChannel <- data[:n] // send only the bytes actually read
        }
    

and then in your other goroutine:

    
    
        for {
          select {
            case data := <- dataChannel:
              // Handle data
            case <- doneChannel:
              // Other goroutine is done; stop waiting
              return
          }
        }
    

The only things shared here are the channels.
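
Put together as a self-contained sketch, with an in-memory reader standing in for the network connection (`readLoop`, `consume` and `consumeString` are hypothetical names, and the 4-byte buffer is only there to force several reads). Closing the done channel, rather than sending on it, lets any number of goroutines observe completion:

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readLoop reads chunks from r and sends them on data. A non-EOF error is
// reported on errs. The done channel is closed when the loop exits.
func readLoop(r io.Reader, data chan<- []byte, errs chan<- error, done chan<- struct{}) {
	defer close(done)
	for {
		buf := make([]byte, 4) // new buffer per read; the receiver owns it
		n, err := r.Read(buf)
		if n > 0 {
			data <- buf[:n]
		}
		if err != nil {
			if err != io.EOF {
				errs <- err // unbuffered: delivered before done is closed
			}
			return
		}
	}
}

// consume starts readLoop and gathers everything it sends until it finishes.
func consume(r io.Reader) (string, error) {
	data := make(chan []byte)
	errs := make(chan error)
	done := make(chan struct{})
	go readLoop(r, data, errs, done)

	var sb strings.Builder
	for {
		select {
		case b := <-data:
			sb.Write(b)
		case err := <-errs:
			return sb.String(), err
		case <-done:
			return sb.String(), nil
		}
	}
}

// consumeString is a convenience wrapper for the demo below.
func consumeString(s string) (string, error) {
	return consume(strings.NewReader(s))
}

func main() {
	got, err := consumeString("hello, world")
	fmt.Println(got, err) // hello, world <nil>
}
```

Because the data channel is unbuffered, every chunk has been received by the time the done channel is closed, so nothing is lost on shutdown.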

The way to avoid too much state and moving parts is to break the problem into
isolated, manageable parts that communicate with channels. Often you will have
hierarchical relationships like this, where one piece of dumb code exists to
pass data from something lower down to somewhere higher up.

It's not hard, although some of the code gets a bit ugly and disjoint at
times, especially in how _anything_ synchronous has to use channels and
goroutines. For example, today I wrote a simple worker pool implementation
that runs a given function in parallel via goroutines and can adjust the
number of workers dynamically at runtime. That function has to be declared not
just as "func()" but as "func(abortChannel chan bool)", and the worker
function _has_ to honour the abort signal when it arrives from the pool. So
channels do leak everywhere, even into APIs. (Yeah, I know I can use "chan
struct{}" to avoid any storage, but I think "<- true" looks nicer than
"<- struct{}{}".)
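
A rough sketch of the abort-channel pattern described here. It is not the pool mentioned above: the worker count is fixed and all names are hypothetical. Each worker selects on both its job channel and the abort channel, and closing the abort channel reaches every worker at once:

```go
package main

import (
	"fmt"
	"sync"
)

// worker squares jobs until jobs is closed or abort is closed.
func worker(jobs <-chan int, results chan<- int, abort <-chan struct{}, wg *sync.WaitGroup) {
	defer wg.Done()
	for {
		select {
		case <-abort: // a closed channel is always ready, so every worker sees it
			return
		case j, ok := <-jobs:
			if !ok {
				return
			}
			results <- j * j
		}
	}
}

// sumSquares runs the given number of workers over inputs and sums the results.
func sumSquares(inputs []int, workers int) int {
	jobs := make(chan int)
	results := make(chan int, len(inputs)) // buffered so workers never block on send
	abort := make(chan struct{})           // never closed here: normal shutdown
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go worker(jobs, results, abort, &wg)
	}
	for _, v := range inputs {
		jobs <- v
	}
	close(jobs)
	wg.Wait()
	close(results)
	sum := 0
	for r := range results {
		sum += r
	}
	return sum
}

// abortIdle starts idle workers, aborts them, and returns once all have exited.
func abortIdle(workers int) int {
	jobs := make(chan int)
	results := make(chan int)
	abort := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go worker(jobs, results, abort, &wg)
	}
	close(abort) // one close broadcasts to every worker
	wg.Wait()
	return workers
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}, 2)) // 30
	fmt.Println(abortIdle(3))                     // 3
}
```

The workers still have to cooperate by checking the channel, which is the API leak described above.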

What is harder is to intelligently handle complex cascading failures. That's
what Erlang, with its supervisor tree, is good at. Go's goroutines are "fire
and forget" and cannot even be terminated programmatically from elsewhere in
the program.

~~~
Jabbles
> I know I can use "chan struct{}" to avoid any storage, but I think "<-
> true" looks nicer than "<- struct{}{}".

I disagree. struct{}{} tells me that the value isn't important. Whenever I use
map as a set, rather than a key-value store I use map[string]struct{} (say),
rather than map[string]bool. Then I am forced to use the double assignment to
check for membership of the set. And that's exactly what I want. I'm able to
make my intent more obvious in the code I write. No one will ever look at it
and say "but what if it's false?" - I dislike using booleans instead of empty
structs in the same way I dislike other C programmers using integers as
booleans.

~~~
burntsushi
> I dislike using booleans instead of empty structs in the same way

Eh? If you have `map[keyType]bool`, then a key lookup is simply the set
membership function. If a key exists, it returns true. Otherwise, false. That
certainly doesn't seem analogous to abusing integers as booleans...

~~~
Jabbles
How do you work out how many elements are in your set?

~~~
_ak
With the built-in len function.
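
A small sketch comparing the two set representations from this exchange (the function names are made up). `len` works on both, but with `map[string]bool` a key stored as `false` still counts as an element, which is the ambiguity the empty-struct form rules out:

```go
package main

import "fmt"

// boolSetDemo shows a set built on map[string]bool: lookups read nicely, but
// an entry stored as false still counts toward len.
func boolSetDemo() (hasA bool, hasB bool, size int) {
	set := map[string]bool{"a": true}
	set["b"] = false // "removed" by assignment rather than delete
	return set["a"], set["b"], len(set)
}

// structSetDemo shows a set built on map[string]struct{}: membership needs
// the two-value lookup, and no stored value can mean "absent".
func structSetDemo() (hasA bool, hasB bool, size int) {
	set := map[string]struct{}{"a": {}}
	_, hasA = set["a"]
	_, hasB = set["b"]
	return hasA, hasB, len(set)
}

func main() {
	fmt.Println(boolSetDemo())   // true false 2
	fmt.Println(structSetDemo()) // true false 1
}
```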

------
carbocation
This is a nice article. I would encourage the author to use the term
"goroutine" instead of "thread" when referring to Go's goroutines, because
they aren't threads. The pros and cons of threads do not apply to goroutines.
What's cool (for me) about this article is that the author knows this fact
(and states so at the outset), but he re-discovers it and internalizes it over
the course of his implementation.

------
EllaMentry
This article and many like it suffer from one of my huge pet peeves:
absolutely terrible coding conventions.

I am a person who likes to scan articles, I'm busy and generally make a read
now, read later, read never decision. The code from first scan was unreadable,
short 1 character variable names, "why is there a hardcoded date marked
2006.01.02-15.04.05 there??", etc.

Readable code takes a little more time - but it's worth it!

Further, the entire example seems contrived. Am I right in thinking that
simply firing up wireshark would solve this problem? Why is the author
continuing to write something in nearly every language when a tool exists for
exactly this purpose, is multi-featured and pluggable?

Even further, message passing! With the rise in recent years of message
passing libraries in nearly every language, multi-threaded, distributed
applications are becoming trivial to write. I do not see the Go code presented
as anything other than messy, I have seen C++ code utilizing message passing
libs that are smaller, prettier and infinitely more maintainable - again with
no mutex or conditional variables!

If you want to sell me on Go, make the code pretty, and present a USP.

~~~
Jabbles
I'm afraid many of the conventions you complain about are standard Go, err,
conventions... (Although see my complaint elsewhere.) Because they are so
widely used, people who use Go won't bat an eyelid. Unfortunately that doesn't
make the article a great introduction to Go.

I hope these talks can help sell you on Go
<http://blog.golang.org/2013/01/two-recent-go-talks.html>

To be more explicit:

Short variable names generally reflect the idea that you know what a variable
is for just by knowing its type. Thus you have a file named f, a time t, a
variadic argument called v. When the type is not enough, a longer name is
recommended. Naming things is hard though...

The hardcoded date is a wonderful piece of the time package, which I fully
appreciate will look bizarre at first. (And therefore isn't a great thing to
use in a first look at Go, without explanation). See the official
documentation <http://golang.org/pkg/time/#pkg-constants>
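
A short sketch of how that works, using the layout string quoted in the complaint above (`stamp` and `exampleStamp` are hypothetical helpers). The layout is not a hardcoded date. It is the reference time, Mon Jan 2 15:04:05 MST 2006, written in the shape the output should take:

```go
package main

import (
	"fmt"
	"time"
)

// layout spells out the reference time in the desired output shape.
const layout = "2006.01.02-15.04.05"

// stamp formats t using layout.
func stamp(t time.Time) string {
	return t.Format(layout)
}

// exampleStamp formats a fixed, arbitrary instant so the output is predictable.
func exampleStamp() string {
	return stamp(time.Date(2013, time.March, 9, 8, 7, 6, 0, time.UTC))
}

func main() {
	fmt.Println(exampleStamp())    // 2013.03.09-08.07.06
	fmt.Println(stamp(time.Now())) // current time in the same shape
}
```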

------
laureny
> Notably, if a package is included but not used, Go treats this as an error
> and enforces removing unused declarations

A good illustration that the Go designers didn't think their ideas through.
This is a real pain in the butt when you are writing code and regularly
commenting in and out sections of code while you are testing things. And every
time you do this, you need to remove or restore the imports. And since Go's
tooling is nonexistent, there is no IDE to do this automatically for you.

This kind of thing belongs in a compiler plug-in (if it was designed with such
a thing in mind, which is not the case for Go), macros (if the languages
supports them, ideally the hygienic and statically typed kind) or an external
tool, not in the compiler.

~~~
jbarham
False. The Go designers have explained many times why they made unused imports
an error. In particular unused dependencies slow down compilation. There's
even a FAQ: <http://golang.org/doc/faq#unused_variables_and_imports>.

A trivial workaround to silence the compiler errors during development is to
use blank identifiers
(<http://golang.org/doc/effective_go.html#blank_unused>). I use them all the
time.
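
A minimal sketch of that workaround, assuming a debugging line that uses `os` has been temporarily commented out (`greet` is a made-up function). Referencing the package through the blank identifier counts as a use, so the import can stay:

```go
package main

import (
	"fmt"
	"os" // only needed by the commented-out debugging line below
)

// Keeps the os import alive while the debugging code is disabled.
// Delete this line once the import is used for real or removed.
var _ = os.Stderr

// greet is a stand-in for real work.
func greet(name string) string {
	// fmt.Fprintln(os.Stderr, "debug: greeting", name)
	return "hello, " + name
}

func main() {
	fmt.Println(greet("gopher")) // hello, gopher
}
```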

You may disagree with their decision, but it's disingenuous to claim that it's
because they didn't think it through.

~~~
WayneDB
Maybe they did think it through, but their conclusion reveals some
inexperience.

It's impractical on many levels. Thus the need for kludgy solutions like blank
identifiers. I'd rather see a strict mode or some other type of compiler flag.

~~~
jbarham
Maybe you don't realize that "they" includes Ken Thompson himself... You may
disagree with the Go design team's decisions, but it's amusingly absurd to
accuse them of inexperience.

~~~
laureny
> Maybe you don't realize that "they" includes Ken Thompson himself... You may
> disagree with the Go design team's decisions, but it's amusingly absurd to
> accuse them of inexperience.

Not really, if anything, Go shows that its designers have a lot of
inexperience when it comes to modern language design.

Go would have been a killer language in the late 90's but it seems to ignore
everything that we've learned about language design in the past decade.

~~~
pjmlp
Quite true, every single feature can be traced back to 80's and 90's languages
that for whatever reason did not manage to become mainstream.

Maybe one could make a table stating the language feature and which language
provided it for the first time.

~~~
pjmlp
Just an update on my comment.

Even if the language is like that, if it helps improve the situation by
teaching young developers that strong typing does not have anything to do with
VMs, I find it quite positive.

