Even in Go, concurrency is still not easy (utcc.utoronto.ca)
106 points by benhoyt on Sept 3, 2020 | 122 comments



That's a classic deadlock bug - two producer/consumer relationships in opposite directions.

Have another goroutine receive and output the "found" results, rather than doing it in the first loop. Now you have a proper pipeline.

The big win with goroutines is that they're green threads. They can block, and you can have lots of them. With regular threads, too many will use up resources. With "async", anything that blocks or just takes too long locks up the whole system. Go has the best of both worlds, which is very useful for the case of a server holding open connections to a huge number of human-paced clients.


I've never worked with a green-threads system in any real sort of environment where I was actually solving real problems at scale, but I always wonder: does the underlying runtime ever push them onto actual threads that can run on another core, or are they always the imaginary concurrency of filling the gaps while another green thread waits the comparative eternity for some I/O?


By default, a Go program runs computations on as many threads as there are CPUs in your system (this can be adjusted via GOMAXPROCS). The execution of all goroutines is distributed amongst these threads. For maximum efficiency you want to avoid spawning many more threads than there are CPUs, as scheduling has overhead and each thread consumes valuable memory for its stack. The Go model of running many goroutines distributed across a few threads is a very efficient solution and also very flexible, as the programmer does not have to make any assumptions about the CPU count of the machine the program runs on.


Yes, the runtime pushes them to actual threads as needed.


I know nothing about Go, but surely the solution is to use something like select?


That only solves work blocked on I/O, and it only solves a subset of that because you can't move all I/O to select/poll/epoll/kqueue. One example is regular files, which cannot be made non-blocking.

There are some async APIs for regular files and we've got io_submit coming in, but it's still not a complete solution and never will be.

On the other hand, spawning threads works everywhere and is cheap enough.


You may also be blocked for non I/O reasons. This is a typical problem in user interfaces. Code designed to deal with 10 items suddenly has to deal with 10,000, and that takes seconds. During this period, your "async" code is stalled. Typical example - text edit box implemented in Javascript trying to deal with a large block of text.


> One example is regular files, which cannot be made non-blocking.

AFAIK this is true on Linux but it’s not true on Windows, which allows “overlapped I/O” on files. Not sure about other operating systems.


With io_uring Linux now also has async file I/O. Though async file I/O in both Linux and Windows is still implemented using a pool of kernel threads, in contrast to network I/O.


I mean Go's select, or whatever the channel multiplexer is called.


Ah, you're saying that you would,

    select {
        case send <- x:
            ...
        case y := <-recv:
            ...
    }
It was ambiguous because the select() syscall is also relevant to the discussion.


Yes, something like that. My saying I know nothing about Go didn't help. I at least knew about select.


Hmm, I don't know. I had this limited concurrency issue on the first Go program I ever wrote, and I'm very new to concurrent stuff, certainly if you only count the past ten years or so.

I figured this out in Go about 30 minutes after I realized I needed to limit concurrency, and I'm a dumbass, basically.

So I'm not sure that "still not easy" is accurate, especially given that I have a lot of trouble conceptualizing async and await as used in C#. In fact, I don't think I've ever successfully used that paradigm in C#. I've been told that C#'s async and await are "easy", and this article says Go concurrency is "still not easy", yet I've had the exact opposite experience in both cases.

In fact I really don't understand why async and await is even a thing anymore, given the paradigm that Go uses and the example it sets. It is almost supernatural in its ease, for me.


The paradigm that Go uses only fits a very narrow case of concurrency, and actually it is quite old.

Modula-2, Concurrent Pascal, Active Oberon already had similar constructs.

Async/await is configurable: not only can you replace the task scheduler algorithms, but if the type being awaited provides certain magic methods, the runtime will use them instead of the default ones.

This allows for very powerful concurrency optimizations and workflows, that still look like plain async/await calls.


Async/await can easily be implemented through channels. I view channels as providing a superset of the functionality of async/await, at least. Granted, there are other paradigms, like lock-free synchronization, which cannot be directly translated to channels.


Something that looks like async/await and has a rather bad performance can be implemented through channels.

Nestable stackless coroutines cannot. By definition they require compiler support.

Your proposed solution will not have any of the advantages of stackless coroutines (effect typing of async functions and perfectly optimal memory use) while suffering all the performance penalty of running on top of a channel.


Channels have close to zero performance penalty. If you are talking about goroutines, you need to come up with data or code. I found goroutines to be really cheap to create.


Not true: https://www.jtolio.com/2016/03/go-channels-are-bad-and-you-s...

I'm pretty sure the implementation has seen some improvements since then. But using a channel for async await is the same as using a global mutex every time you await. That is bound to be contentious on high load.


I think today channels cost a few hundred nanoseconds. For most applications, this is 'close to no overhead'. If you are in a really tight loop, or have goroutines that bump into each other a million times per second, you're going to want a different primitive.

But most programmers aren't going to see any overhead using channels instead of something else.


Please remember the approach is "every context switch should go through a centralized channel". This includes waiting on all other channels.

I'd need to see a benchmark at that level to be convinced the overhead is negligible.

And of course, there is still the question of WHY would you ever want to do that. What can be gained from simulating an async-await model on top of a channel.


So how do you change the scheduling algorithm through channels?


async/await and channels are independent of scheduling. Node.js doesn't have multithreading but has async/await. I find goroutines lightweight enough in my personal testing and haven't tried tweaking anything.


On .NET and C++ they surely are not, given that you can fine-tune them either by providing new schedulers or by supplying magic methods on awaitable objects, which can be either value or reference types.

So not only can you control the scheduling algorithm, you can control how the async/await state gets managed in memory, and you can introduce additional parameters that influence the decision of when to switch tasks across owner threads.

Try to do this in Go via channels, while offering just the channel syntax to the caller.

EDIT: The only thing in common between JavaScript, Python, .NET, C++ and Kotlin async/await is the name of the primitives.


Custom stackless coroutine schedulers are also available in Kotlin.

Python also allows you to customize the event loop, I think, by overriding AbstractEventLoop. An M:N implementation would probably be inefficient though, because of the GIL.

Basically almost all stackless implementations I know of allow for custom scheduling. The fact that some of them can't do M:N and are restricted to 1:N is just a limitation of dynamic programming languages: there is a global interpreter state which prevents you from using multiple OS threads efficiently. You either rely on a GIL (CPython and Ruby MRI) or allow only one thread (Node, Lua).

So GP is right in that not _all_ async/await implementations will allow you to fully customize your threads, but I think the main point here is that stackless coroutines are practically _a prerequisite_ for enabling custom scheduling (as C# et al. allow). You cannot have this custom scheduling with stackful coroutines, because that would require the scheduler to have access to low-level memory management for controlling the stack.

I've never seen a language which allows that. I guess this could be doable on C++, but I really don't see the point, since it will require a lot more effort than the current stackless C++ coroutines.


That is my whole point, this is not doable with Go channels alone as primitive.


Yes, that's not a good primitive. You need some delimited form of call/cc.


> I've never seen a language which allows that.

It is not really hard. You can use the exact same scheduler you would use for stackless coroutines.

For example in C++ land you can use boost.asio to schedule both stackless coroutines and boost.fibers (or whatever have you). Boost.asio predates both and doesn't care.


> I really don't understand why async and await is even a thing anymore

Programming with threads is hard because they can context-switch at arbitrary points. Goroutines are nothing more than (lightweight) threads.

Stackful coroutines (as in Lua) aren’t much better, because they can context-switch at any function call.

Code that uses async-await is different - it can only context-switch at points explicitly marked with `await`.


async/await in the general sense doesn't have anything to do with the preemption policy of a particular language implementation. For instance, C# implements async/await as keywords on top of its existing concurrency model.

That's really what it comes down to: cooperative multi-tasking vs preemptive multi-tasking.

Windows 3.1 had cooperative multi-tasking. Everything went great until one program didn't yield control, and then the whole system ground to a halt. That should sound familiar to any Node.js developers with their event loop.

Cooperative multi-tasking also greatly complicates code that actually uses the CPU. If you look inside libuv, it even has to jump through hoops with its encryption functions by effectively calling yield() in between rounds of calculation.


>If you look inside libuv, it even has to jump through hoops with its encryption functions by effectively calling yield() in between rounds of calculation.

I would have assumed the yield was there to help protect against timing pokes...


Since stackless coroutine schedulers can often be customized, you could (and are encouraged to) launch coroutine code doing heavy computation on a thread-based scheduler.


> Programming with threads is hard because they can context-switch at arbitrary points. Goroutines are nothing more than (lightweight) threads.

Goroutines don’t switch at arbitrary points; they get swapped out during I/O, networking or sleeping (they’re cooperative, not pre-emptive). So they’re not like threads in that fashion; they’re closer to JavaScript’s event loop (I think both Node and Go use libuv?). The only difference is that the await syntax isn’t needed, because Go “awaits” by default; that is, the coroutine will block and yield the thread while it waits.


> Goroutines don’t switch at arbitrary points

That was only ever an implementation detail, though - you always had to program as though goroutines were preemptive. And sure enough, from Go 1.14, “Goroutines are now asynchronously preemptible” [0].

> Go “awaits” by default

Yes, the main selling point of async-await isn’t cooperative vs preemptive, it’s the explicit context switches compared to the implicit context switches of threads, goroutines and stackful coroutines.

[0] https://golang.org/doc/go1.14


I’ll bet your Go code, if it isn’t trivial, has race conditions all over and you just don’t realize it.


It's not trivial and I found a lot of races, yes, but I refactored over a couple days to completely isolate work within a goroutine so that data is only passed in and out of them via channels, and with that all races went away. The tool has been used in production extensively for a couple of years now, converting 3D model files from one format to another, and it's running great, and very fast as well.

So, you're right, but it has been addressed.


Can't that be said for most languages? Is there anything special that other languages do to make race conditions less likely than they are in Go?


I believe it's correct that that can be "said for most languages".

Since you're asking, Rust's ownership/type system, for example, has checks in place that prevent at least some classes of concurrency errors.

For example, you can't directly modify a variable from multiple threads, due to the so-called "move" rules:

    use std::thread;

    let mut myvar = vec![];

    thread::spawn(move || {
        myvar.push(1);
    });

    myvar.push(1); // compile error: `myvar` was moved into the closure


I think Go invites mistakes like this being made over and over, because (in my subjective opinion as someone who writes a lot of Go at work but isn't extremely happy about that) you have to write out the basic low-level concurrency patterns over and over, because the language tries to resist abstraction.

In other langs I'd expect to see this code use a generic map-reduce/parallel map helper function or something like that, but in Go people seem to be happier writing for-loops and spawning goroutines by hand every time.


A lot of people are making tangential points in this discussion (“concurrency is always hard!”) but you’ve put your finger on the key point.

Concurrency is hard, and one fairly successful approach is to provide battle-tested libraries for these high-level use cases (“run M functions with max parallelism N”). Go gives you low-level concurrency tools, but actively resists the kind of abstraction that would let you make reusable high-level tools, so that robust library doesn’t exist.

It’s a strange lack, as it’s meant to be a concurrency-focused language, and the standard library is absolutely terrific in other areas (eg networking and encryption).


It just shows the golang designers' lack of experience in language design (it's stuck in the '70s). They may have written a lot of code, but that doesn't automatically make them good language designers.


I'm not really sure about this. I think the language they designed has the properties they desired, in that they consciously forewent a lot of potential benefits because some notion of simplicity is more important to them. So in that sense I think they were successful at designing a language.

Like some other comment here says, you run into bugs like this early on and they're not necessarily show-stoppers. Maybe keeping the language simple and ensuring that most code is readable if you only know the language and the stdlib (and not some fancy concurrency library) is worth the occasional bug?

Of course that doesn't necessarily mean that Go is the language I want to use for every use case under the sun, but that probably wasn't a goal of Go to begin with.


> I think the language they designed has the properties they desired

That's a tautology though :) Of course any language designer will design a language according to the properties he/she desires.

> you run into bugs like this early on

Not necessarily. Race conditions are hard to find. It's established that "simplicity" is not a straightforward metric to measure. Having a simplistic language just means that complexity is pushed onto the programmer, since most non-trivial programs are complex by definition.

Regardless of what the goals of golang were (they kept changing their definition of "systems language"), the fact that it's being pushed for almost everything, due to hype and fad driven development, and the inexperience of many programmers, is an issue. It's almost ironic that golang is used more outside of google than in.


> That's a tautology though :) Of course any language designer will design a language according to the properties he/she desires.

Eh, you can just fail, too, no? Like I could sit down and set myself the goal of designing a language that achieves goal X while staying under complexity limit Y (by whatever measurement), and then end up having to compromise on either of them because I can't quite figure out a good way to achieve a particularly tricky part of X without adding a lot more complexity than I initially expected. Or I could decide I want to design a language that's portable across many CPU architectures but inadvertently bake in a lot of x86isms that make it awkward/inefficient to implement elsewhere.

> Not necessarily. Race conditions are hard to find. It's established that "simplicity" is not a straight forward metric to measure. Having a simplistic language just means that complexity is being pushed on the programmer, since most non-trivial programs are complex by definition.

Ah, I didn't mean that concurrency bugs are quickly found or solved. I definitely agree with the just-moving-the-complexity-around part. I just meant that soon after getting started with Go, you'll probably have run into a bunch of concurrency bugs already and developed the kind of mental scarring and automatic averse reactions to certain concurrency situations that'll let you cope and still get things done despite it being clearly suboptimal.

> the fact that it's being pushed for almost everything, due to hype and fad driven development, and the inexperience of many programmers, is an issue.

But this is a kind of success too, isn't it? Presenting your solution to a problem in a way that appeals to enough "inexperienced" people to create the hype/fad isn't trivial, so they must have been doing something right. I don't think it's just a case of "$bigcorp does it, so it's cool", plenty of $bigcorp languages or technologies don't really catch on, let alone on Go's scale.

I think there really is a niche where Go is the best solution (or at least a significant local maximum), even if that is only "'inexperienced' programmers can get started with, say, highly concurrent http services", it's still _something_.

(Again, disclaimer: I use Go at work and I like to give my (non-google) employer and my coworkers enough credit to think that it's not just because of hype/fad, so I may be biased.)


> Is there anything special that other languages do to make race conditions less likely than they are in Go?

Yes, other languages provide you with more tools than the basic nuts and bolts. For example a language might provide you with a "parallel for" primitive that is well tested.

The issue is not exactly that Go's concurrency stuff is bad, it's just that it doesn't give you much help so you have to implement it all yourself. Also the lack of generics means it is difficult to write libraries for this stuff.


The frustrating part is that the standard lib is just good enough that you also don’t tend to get good adoption for third party libraries, which could easily provide better abstractions.


Some languages emphasize persistent data structures, or even enforce immutability, which means you can only introduce race conditions via explicit concurrency. That's a lot less surface area for bugs.


Yep, Ada/SPARK: https://docs.adacore.com/spark2014-docs/html/ug/en/source/co...

Search for "race condition", or just "race".

This section in particular is about preventing data races: https://docs.adacore.com/spark2014-docs/html/ug/en/source/co...

This one is about preventing race conditions: https://docs.adacore.com/spark2014-docs/html/ug/en/source/co...


Clojure uses immutable data structures and software transactional memory:

https://clojure.org/about/concurrent_programming


I love clojure, and I’ve written some high concurrency production apps in it. Concurrency is still hard in clojure, and that’s assuming all of your state and logic can live in a single process. That assumption usually doesn’t hold, and then concurrency gets way harder.


Go has a built in race detector that most teams run during CI.

[1] https://golang.org/doc/articles/race_detector.html


This roughly detects data races, not all race conditions.


And not all possible data races.


This does not run if you have more than 10K goroutines :(


Calling the stackful coroutine model offered by Go superior to stackless coroutines ("async/await") is highly controversial.

Stackful coroutines are not new: it might surprise you to know that they are older than async/await, existing in full form by at least 1967, with the first documented implementations starting around 1958 [1]. Stackful coroutines were implemented in a wide array of popular languages in the 1980s, as pjmlp noted.

Stackless coroutines need better clarification, since the paper that defined this term [2] refers to a limited subset of stackless coroutines that we would nowadays just call "generators". To the best of my knowledge, when the paper came out generators were the only type of stackless coroutine in existence, and thus came the perception that stackless coroutines cannot be nested (they can) and that stackful coroutines are strictly superior to stackless coroutines (they aren't). You can see an example discussion here: https://news.ycombinator.com/item?id=16318535

The real difference is not nesting, but the fact that stackful coroutines manage a growable stack dynamically at runtime, while stackless coroutines pass the job on to the compiler to evaluate all possible nesting permutations, and create an optimally memory-efficient stack machine that represents the coroutine.

To summarize the pros and cons of each, I think you can say:

Stackful pros:

* [No colored functions](https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...). Summed up, it says: I can write asynchronous I/O code the same way I write multi-threaded code or non-blocking code. It's definitely more ergonomic, but it also means your code can hold more surprises for you. Not everybody thinks avoiding 'colored functions' is an absolute advantage, otherwise we wouldn't have effect typing.

* Fewer GC allocations: the stack is [re-]allocated in bulk every time a lightweight thread starts (a go statement) or whenever it needs to grow. Stackless coroutines typically allocate a smaller object on every call to an async function.

Stackless pros:

* Clear yield points: you always know the points where your coroutine could suspend and control would move to another coroutine: wherever you see an await statement [3]. This is why function colors are needed - to be explicit about the control flow.

* Less memory waste and a smaller overall allocation size: the "compile-time stacks" generated by stackless coroutines are perfectly efficient, with no unnecessary memory allocated.

I don't think it is clear that one model is strictly superior to the other, but I do believe that if you're aiming for ease of correct use (rather than ease of use), the stackless model is clearly better. It is definitely somewhat harder to wrap your mind around, but it is also somewhat harder to get a deadlock or have a hard-to-detect data race in this model.

[1] Simula 67 (and probably Simula I too), and first implemented in Assembly as early as 1958 by Melvin Conway of Conway's Law fame.

[2] http://www.inf.puc-rio.br/~roberto/docs/MCC15-04.pdf by Roberto Ierusalimschy, the author of Lua.

[3] It's slightly different in an awaitless language like Kotlin - here you actually have to check whether the function you're calling is suspendable or not.


> Calling the stackful coroutine model as offered by Go superior to stackless coroutines ("async/await") is highly controversial.

Yes. Preference among programmers for one model or the other appears close to 50/50.

More languages seem to support stackless coroutines than stackful ones, although that could just be because they’re easier to retro-fit to an existing compiler/VM. It’ll be interesting to see which model new languages choose.


The coroutine and communication by channels paradigm which Go introduced to me (I recognize that it is a very old paradigm, but it was new to me at the time) is just so very easy for me to reason about.


Communicating through channels is completely orthogonal to the type of coroutine (stackless or stackful). I would argue that Kotlin does channels strictly better than Go, since Kotlin channels have none of the problems of Go channels[1]:

1. No blocking forever on nil channels (WTF?)

2. Close is idempotent: no panic when closing a channel twice. Now you don't require 1 or 2 extra channels to manage the proper closing of every channel :)

3. Send and receive behavior on a closed channel is consistent: an easy-to-handle exception in both cases. Go panics in one case and returns a nil value in the other.

4. You can check the closed status of a channel.

5. There are non-blocking versions of send and receive.

I haven't tried Rust channels yet, but I assume they also deal with these issues by having a sane API.

Go was not a pioneer at all here - it just had a Google-level PR to push its "revolutionary" ideas. The concepts of channels is much older than Go. It's not a secret that channels are based on the CSP model by Tony Hoare, and Rob Pike has a long history of researching CSP and integrating CSP constructs into his languages while he was at Bell Labs[2].

Channels are not the only possible construct here, either. I'd argue that earlier languages implemented channel-like constructs in a much safer and saner way: e.g. Erlang's actors or Ada's rendezvous (which dates all the way back to 1983!).

[1] https://www.jtolio.com/2016/03/go-channels-are-bad-and-you-s... [2] https://swtch.com/~rsc/thread/


async/await isn't really concurrency. It's syntactic sugar over callback hell, which can be used to write concurrent software in the "spawn something, wait for it to finish later" model. It has absolutely nothing to help you with passing data between ongoing threads, accidental unlocked access to shared data, etc.


This used to be my assumption long ago, but this is incorrect for almost every implementation async await.

The compiler does not syntactically convert your awaits to callbacks - it converts everything into state machines.


C# converts to state-machines, F# converts to nested callbacks. (Although this may change, or be optional.)


Interesting. The only other instance I remember of (sort-of) awaits being converted into callbacks are LiveScript "backcalls":

https://livescript.net/#functions-backcalls


I think that's just "easy"/"not easy" from a different baseline, not a comparison of C# vs Go. C# async/await is easy compared to something like callback-based asynchronous IO. Go concurrency is "not easy" compared to serial code.


Concurrency is hard and we have very poor support for testing correctness of concurrent and distributed systems. Language abstractions help but they aren't nearly enough (as evidenced by this post). My team at Microsoft leverages Coyote to check the safety of our services against such subtle race conditions. We blogged about using it to reliably reproduce and fix a very subtle bug in a bounded buffer implementation over at https://cloudblogs.microsoft.com/opensource/2020/07/14/extre...

If you're using .NET in your projects, you can start taking advantage of such tools _today_. I would like for such tools and testing techniques to become more and more common place in the industry as concurrent and distributed systems are _hard_ and we should use all the help we can get.


Go comes with thread sanitizer, which you can enable with go test -race ... If your unit test exercises a race condition, this will blow up your test with stack traces of the data race.

It sounds a bit like Coyote, which also looks very useful for C# applications.


Neat to learn about the thread sanitizer. It sounds similar to another tool from Microsoft Research called Torch (https://www.microsoft.com/en-us/research/project/torch/) which automatically instruments binaries to detect data races.

Coyote is similar in some ways but different in others. Coyote serializes the execution of the entire program (running one task at a time), exploring one set of interleavings, rewinding, and then exploring another set, hoping to hit hard-to-find safety and liveness bugs. In addition to finding concurrency bugs in one isolated process, we use it to find bugs in our distributed system by effectively running the entire distributed system in one process and having Coyote explore the various states the system can be in.

It sounded mind-bogglingly cool when I first came across this way of testing distributed systems through FoundationDB (https://www.youtube.com/watch?v=4fFDFbi3toc); we're emulating this kind of testing in our distributed system through Coyote. And unlike FoundationDB, which had to develop their own variant of C++ to be able to do this kind of testing (kudos to them for doing it), Coyote allows us to do it on regular C# programs written using async/await asynchrony, and to benefit from decades of Microsoft Research work in exploring large state spaces effectively.


Link to the TLA+ model of the situation, and the proof that the proposed solution would work: https://lobste.rs/s/ntati1/even_go_concurrency_is_still_not_...


Did Go ever make concurrency easy?

Any modern language allows you to create threads (semantically equivalent to goroutines) and pass messages between them. Most of them even have high-level concurrency support such as parallel map, structured concurrency, reactive extensions, type-safe generic collections...

We could say that Golang makes concurrency using thread semantics scale to a large number of threads. When this paradigm is appropriate, then indeed Go is nice. The select statement is nice. But this is not always the right approach.

Yes, it is true that go needs slightly fewer characters to launch a naked thread than in C# or Java. In that sense, it is easier. Anyone who considers that a major selling point is a walking race condition.

As a tongue in cheek analogy, what if we consider my new language G, in which G is an alias for goto. G is going to finally make control flow easy. I've done away with complicated control flow constructs such as for or while - users wouldn't understand them anyway. Users are encouraged to use the G keyword liberally. After all, it's only 1 character. Indeed, by that metric, it's even simpler than go's go keyword. What could go wrong?


ESL here. What do you mean by “walking race condition”?


A "walking X" is a person who is or is going to create an X. For example, an unpleasant boss might be a "walking turnover problem".


Thank you for your detailed explanation. I really appreciate it.


My personal style here is to use a number of goroutines equal to the desired concurrency. I know goroutines are cheap and there is no particular reason to conserve them most of the time, but this feels more natural to me than using a semaphore--although it is less flexible.

Simplified:

    var wg sync.WaitGroup
    wg.Add(jobs)
    ch := make(chan workItem, jobs)
    // One worker goroutine per unit of desired concurrency.
    for i := 0; i < jobs; i++ {
        go func() {
            defer wg.Done()
            for item := range ch {
                processItem(item)
            }
        }()
    }
    // Feed the workers, then close the channel so they drain and exit.
    for _, item := range items {
        ch <- item
    }
    close(ch)
    wg.Wait()


Here's how I like to do it:

https://play.golang.com/p/AB6exUIkNRg

Contents pasted below.

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        wg := &sync.WaitGroup{}
        e := NewParallelExecutor(uint64(10))

        for i := 0; i < 100; i++ {
            wg.Add(1)
            func(_i int) {
                // Closure to capture i correctly.
                e.Submit(func() {
                    fmt.Println(_i)
                    wg.Done()
                })
            }(i)
        }

        wg.Wait()
        fmt.Println("All done.")
    }

    type Executor interface {
        Submit(func())
    }

    type parallelExecutor struct {
        queue  []func()
        inCh   chan func()
        doneCh chan struct{}
    }

    func NewParallelExecutor(limit uint64) Executor {
        pe := &parallelExecutor{
            queue:  make([]func(), 0, limit),
            inCh:   make(chan func()),
            doneCh: make(chan struct{}),
        }

        // Single scheduler goroutine: all submissions and completions
        // funnel through it, so queue and running need no locking.
        go func() {
            running := uint64(0)
            for {
                select {
                case <-pe.doneCh:
                    running -= 1
                case f := <-pe.inCh:
                    pe.queue = append(pe.queue, f)
                }
                if len(pe.queue) > 0 && running < limit {
                    running += 1
                    fmt.Printf("Goroutines in flight, waiting: %d | %d\n", running, len(pe.queue))
                    f := pe.queue[0]
                    pe.queue = pe.queue[1:]
                    go func(_f func()) {
                        _f()
                        pe.doneCh <- struct{}{}
                    }(f)
                }
            }
        }()

        return pe
    }

    func (p *parallelExecutor) Submit(f func()) {
        p.inCh <- f
    }


Go just sells itself well.

Languages that _actually_ make concurrency easy, such as Haskell, Scala or OCaml, or distributed concurrency easy (Erlang/Elixir), are way ahead of Go but are not as hyped and do not market themselves as well.


You think Go has a bigger hype machine than Haskell and Erlang? The number of blog posts that have been written about how great Erlang is is comparable in total size to the amount of Erlang code that exists.


> You think Go has a bigger hype machine than Haskell and Erlang?

In absolute terms? Yes. In terms relative to the userbase? No. In terms relative to how powerful or well-designed the language is? Certainly.


What don't you like about Go's design? I see this sentiment all the time that Go is poorly designed. What I don't see all the time is why people have this idea. Is it because it's not as strongly typed or mathematically obsessed as other languages? Is it because it doesn't add much that is deeply novel, like e.g. Rust?


Not the person you're replying to, but I'll bite.

I think Go has some pretty good thoughts behind its design. That is not to say that it is perfect by a long shot. Lack of generics is the main reason I've not actually tried to write any.

But I've looked at some, and even helped debug a tiny bit.

And that was the -nice- thing about it; the patterns are simple enough that you can learn it quickly.

The drawback is that you have to write a LOT of code (or Interface hackery) to make complex things.

And, on one hand, the syntax is still the same. On the other hand it's still more code to write.

I think about C# a lot when I think about Go's simplicity; I'd argue there's a sweet-spot in C# where you COULD write code that is expressive and easy to understand but still more flexible. However, at this point there are 3.5 different ways I can think of off the top of my head to go through an Array/List in the language.

In Go there's usually not too many ways to do the needful. That's an advantage for newcomers, but the proverbial ceiling isn't glass; it's hard iron, and you can't go above it.


I think people dislike Go because it is missing the fancy feelings. I am more convinced every year that a lot of programmers just love complexity. Simple things will be blown up and abstracted 3 times. This is still possible with Go of course but it is not as easy as with other languages.

Go is like a rusty old hammer that will just work and there are not too many ways to hold it. I like Go for this. I think it has a great design. I can do most things off the top of my head. Code reviews are less painful. And I like that the language evolves more slowly and does not pile every crap idea on top of itself. I also know that a language doesn't have to fill every niche.

In the end Go is a productivity language. If you want to get stuff done it is an excellent choice. If you want to muse about the beauty or correctness of your code there probably are better choices.


Surprisingly, I was getting stuff done 30 years before Go was invented, with more expressive languages.


Wouldn't have thought it to be necessary to say explicitly but here it comes:

You can also get stuff done with other languages. Big surprise.


> Go is like a rusty old hammer that will just work and there are not too many ways to hold it.

And there are non-rusty hammers that also "just work" and don't exactly introduce an insurmountable dilemma when it comes to holding them, as well as having pretty basic variations to the hammer head that actually make it easier to "get stuff done" (such as a claw or a peen).

But the people using the rusty hammer like to pretend as though anything other than the most basic form of a hammer (double flat head on a handle) is just "fancy feelings" driven musing about "beauty or correctness".


This is the type of comment I'm talking about. You clearly have strong feelings about Go, and have shared a metaphor to express your distaste for it, but I have learned nothing about why you feel this way. Would you mind sharing? What is this "insurmountable dilemma" or the "basic variations to the hammer head" you're referencing?

As its written, your comment reads more like language warfare than discussion


> You clearly have strong feelings about Go, and have shared a metaphor to express your distaste for it

I was extending an analogy that was already presented. The context is, funnily enough, right above my comment if you'd like to read it.

And I don't have strong feelings about Go; its existence is ultimately irrelevant to me as a dev. I do have strong feelings about a certain cadre of Go enthusiasts, however.

> What is this "insurmountable dilemma"

It doesn't exist. That's the point. Said enthusiasts like to pretend as if everybody else is absolutely drowning in a dilemma of which way to achieve X in their language of choice, while Go is for "getting stuff done"™, because obviously no-one ever did anything apart from bikeshedding prior to Go's appearance.

> or the "basic variations to the hammer head"

Any number of programming language features/concepts which some Go enthusiasts like to decry as unbearably fanciful complexity. If it's not in Go then it must be Bad regardless of being objectively superior to whatever Go offers as a replacement. Parametric polymorphism is perhaps the most infamous of these, and even with the Go team moving forward with a specification for generics there are still people that complain about the language losing its "simplicity".

> As its written, your comment reads more like language warfare than discussion

Of course. And relegating people's preferences in their work tools to "musing about beauty and correctness" (presented as caring less about productivity) is "discussion" not language warfare.


I initially just wanted to let your post stay uncommented, but I also don't want people to get the wrong idea, so here is my comment to your last sentence:

> In the end Go is a productivity language. If you want to get stuff done it is an excellent choice.

Maybe for mediocre developers. And there are a lot of them, and that is totally okay - everyone has been one at some point. But more experienced developers, and developers who do their job with passion and strive to improve, will be limited by the language very quickly, which makes them much less productive. Therefore, with good developers, Go will reduce productivity in comparison to some other languages.


What nonsense. If you are limited by a language then maybe you just chose the wrong language for the job. I take it your argument about the vast ecosystem of Go libraries and tools is that they were all created by mediocre programmers?

Another note: most of us are "mediocre" and many of those who think they are somehow great, often turn out to be just "mediocre" themselves when put into a certain situation.

And maybe you are truly a genius, only work alone and Go really limits you. You are free to choose whatever language you want. But don't project your outlier experience onto everybody or even onto teams of people with vastly different skill levels.


> Another note: most of us are "mediocre" and many of those who think they are somehow great, often turn out to be just "mediocre" themselves when put into a certain situation.

Yeah I fully agree - there is nothing bad about that at all. But that does not change what I'm trying to say: once you grow more skilled and gain more experience (which happens to most people), Go starts to limit you a lot more than other languages would. No "genius" needed. And it's not only me thinking that, if you look at the other responses here.

So, hype go as an easy language to learn? Fine by me!

Hype it as a language that is more productive than others? I have to disagree.


Dude, it is not because of mathematical obsession or something that we dislike Go.

Consider checking whether an element `e` is in collection xs

JS: xs.includes(e)

Python: e in xs

Java: xs.indexOf(e) != -1

C++: std::find(xs.begin(), xs.end(), e) != xs.end()

It is already pretty verbose in C++.

Go:

    contains := false
    for _, x := range xs {
        if x == e {
            contains = true
            break
        }
    }

Some fanboys claim Go is readable. Readability is not about conveying what each line does at the micro level; it is about conveying __Intent__.

Go is nothing more than C as far as facilities to convey intent are concerned.


I didn't say that I don't like go's design, but there are no features in go that justify the hype it gets. That's all.

Rust for instance is not among the greatest when it comes to language features either (the borrow checker is quite a thing though), but it combines low-level programming and performance with FP and language features that enable a high level of abstraction. I'm not really a Rust fan either, but that is a personal opinion and has nothing to do with my impression of whether the hype is justified.


Concurrency isn’t easy in any language.


That's a point worth emphasizing. Functional programming is advocated as a means of simplifying programming. When you program with pure functions which always return the same results for the same arguments, things become like mathematics, expressions can be simplified and reasoned about. Yes.

But I don't see how such a purely mathematical viewpoint could fit nicely into the real-time world where processes evolve concurrently and can take different times to do things on different executions. In the real world things are mutable, and concurrent programming must take that into account. Therefore it is "complicated".


Erlang comes pretty close to achieving this. When/if people get past the “weird” syntax it’s a very straightforward model that minimizes a lot of the hazards, though locks are definitely still possible.


I really like Fred (ferd) Hebert's take on erlang and concurrency and distribution problems (Earlier in the book he points out how concurrency and distribution are a similar type of problem).

https://learnyousomeerlang.com/distribunomicon

> distributed programming is like being left alone in the dark, with monsters everywhere. It's scary, you don't know what to do or what's coming at you. Bad news: distributed Erlang is still leaving you alone in the dark to fight the scary monsters. It won't do any of that kind of hard work for you. Good news: instead of being alone with nothing but pocket change and a poor sense of aim to kill the monsters, Erlang gives you a flashlight, a machete, and a pretty kick-ass mustache to feel more confident

> This is the standard 'tools, not solutions' approach seen before in OTP; you rarely get full-blown software and applications, but you get many components to build systems with. You'll have tools that tell you when parts of the system go up or down, tools to do a bunch of stuff over the network, but hardly any silver bullet that takes care of fixing things for you.

As for the weird syntax, I use Elixir (https://elixir-lang.org/) on the daily and its pretty great. See here for a different write up I did https://news.ycombinator.com/item?id=24173635


The actor model struggles when you run up against something that is naturally sequential and the order matters.

If you want 1000 households to visit a set of shops in a repeatable random order until the shops run out of stock, that's a list sort with a seed and a sequential loop.

Make the 1000 households concurrent and sending messages and it becomes a major exercise in clocking, scaling and scheduling.


Erlang processes (Actors) solve this kind of problem naturally, with code that is concise, asynchronous and parallelized. For example, look at solutions for 'Sleeping Barber'. It's possible to write a full solution in <40 SLOC.

Starting 1000 processes is trivial and fast in Erlang. See the first few pages of Joe's presentation from 20 years ago: process creation time ~10us (up to 30k processes); message time ~1us [1]. The code is in his book and the email thread [2]:

[1] https://www.rabbitmq.com/resources/armstrong.pdf

[2] https://erlang.org/pipermail/erlang-questions/2007-July/0280...


It’s rather more tricky than the sleeping barber. It’s more that there are 100 hairdressers, each household has a set of seven of them, and each hairdresser only has a fixed amount of hair colour so not everybody will be served, and you need to be able to repeat the visits deterministically based on a random seed so the whole process is verifiable.


If there's anything I learned from writing concurrent code before and after doing it in Clojure, it's that when talking about uncomplicated concurrency, you have to begin with immutability. Without immutability by default, that conversation is a non-starter.


That's a fair point. Having said that, it's also interesting to realize that most distributed systems (which are concurrent by their very nature) don't have that luxury. Our micro-services interact with databases, event queues, blob stores etc and each of those external entities is shared mutable state. Furthermore, services can crash at any time and can't start from a clean slate (unlike in-process concurrency which doesn't have to worry about that). My meta-point is that while I agree with your sentiment, the reality of modern day services is that you are _forced_ to reckon with mutable complexity when designing distributed concurrent services (and more and more of us are doing that with the shift to micro-services in the industry)


I like Scala but it inherits Java's issues with shared mutable state and limits on OS threads and stacks. Kotlin coroutines look good but I think they're very new. I haven't learned how Haskell might reconcile concurrency and laziness (is forcing a value serialized?)


I am a professional Scala developer and I can count the number of times where I have created or used shared mutable in the last two years on one hand. No one forces you to do it and the Scala community has developed excellent tools to mitigate these problems when using Java libraries.

Same for OS threads. We use green threads in Scala for a long time now, there is a great amount of library support. Here is some example: https://zio.dev/docs/overview/overview_basic_concurrency


I do think that Go has a great set of tools to write concurrent programs. But I think it is a fallacy to believe that as a consequence "concurrency is easy" in a general way. These tools still require the programmer to be aware of the challenges of concurrency. They do make it easier to deal with them.

It is the same story as with garbage collection. Garbage collection prevents some kinds of errors and in general makes it way easier to deal with dynamic memory allocation. However, garbage collection does not mean you don't have to think about allocation patterns and object lifetimes, for example.

Go does make concurrency much easier through the tools it provides. Most important are goroutines, which are very lightweight, so within reasonable limits you don't have to be concerned about the number of goroutines you spawn - but as the example shows, you shouldn't try to spawn more goroutines than there are file handles available to your process, if every goroutine allocates a file handle. Not only are goroutines very lightweight with small growable stacks, but as they are part of the language specification, the compiler can generate code which helps with scheduling; in most cases you have efficient cooperative scheduling, so thread-based preemption is less frequent.

On top of that, channels provide a very easy-to-use abstraction for communication between goroutines. There are a lot of use cases where goroutines and channels give you very easy and safe concurrency. That does not save you from having to understand concurrency issues, especially deadlocks. Ruling those out in a general sense is impossible, because proving that a program is deadlock-free would require proving that all goroutines return, and that is equivalent to the halting problem, which is unsolvable.

It should also not be underestimated that, because goroutines and channels are part of the language spec, they are very commonly used features. Concurrency is present in most Go programs. As a consequence, when one reads a lot of Go code, there will be plenty of examples of their usage. And any library needs to be thread-safe, as the likelihood is very high that it gets called from goroutines.


The problem is defined, but none of the proposed solutions seem fitting. When using a semaphore, it's usually for a reason. I think the solution with the least overhead, and perhaps the most idiomatic one, is to spin up the found-reading loop in a goroutine BEFORE you start any workers at all, and leave the rest of the code as is. This prevents having x number of idling goroutines, where x can be billions (but not in this case). Though again, in this case, using a semaphore and perhaps goroutines at all seems spurious.


I’ve been wondering for like 15 years when a language is actually going to show up where I can just write my code procedurally and the compiler automatically figures out what it can parallelize.

Like if I have 3 definitions in a row that set variables to the result of methods that don’t share memory, that seems pretty obviously parallelizable. Why isn’t anyone doing these sorts of optimizations?

I would have thought it would have been solved by now, but I was wrong.


That sounds 'cheap', but if you think of the clock rate of a CPU, doing those tasks in parallel will take time to set up, process and tear down. That example doesn't really work well.

Fitting problems into grids of processing like SIMD or GPUs is tricky, because not a lot of problems fit them well the way images do. Most of what you're asking for is about understanding business flow, which a language will never do.


Because it doesn't help performance. It only really helps if you execute calculations on different threads - but then you need to prepare for these threads to fail (e.g. be killed by the OS) and to handle this.


kdb+/q automatically parallelizes operations in the most recent release, but since it's closed source, it's unclear how far it actually goes.

It would be cool if the compiler could construct a DAG of operations and automatically offload independent computations onto different cores.

I think this would necessitate a purely functional language, though.


Maybe I'm dumb but I don't understand the bug. I understand why the main code blocks if no tokens are available in the found channel but why would writing to `found` block the code (and why only in the case when there's a lot of go processes)?


We're blocking in `limitCh <- struct{}{}` if we're trying to spawn more goroutines (because `pss` is big, representing there being a lot of processes we're inspecting, not many go processes) than there is room in `limitCh` (used as a semaphore). The newly spawned goroutines never allow the main goroutine to proceed because they never terminate, and they can't terminate because no one is reading from `found` yet, so they're blocking on `found <- P`.


It deadlocks if len(pss) > concurrencyProcesses.


Even in that case, wouldn't the main loop block for a short time but then unblock after the goroutine returns (and the defer function is executed)?

I thought the whole point of limitCh is to support the case where len(pss) > concurrencyProcesses but in this case you're telling me that it's breaking things?


The inner goroutine cannot return because it blocks at found<-P, because found is not buffered and there is no reader.

This would work if the consumer was started in a goroutine first.


Thanks. The part about starting the consumer in a goroutine actually made more sense as a solution than either of the given solutions.

I guess I completely forgot how channels work and didn't realize that until there's a reader, the channel will block (which makes perfect sense in hindsight).


Yes, that's the whole point, but it's completely broken. I've certainly made the same mistake. :D


A more common way to write limited concurrency is to use a worker pool: https://blog.golang.org/pipelines


The following is my solution:

See the playground: https://play.golang.org/p/pXJaGQ0efe8

Source Code is available on GitHub: https://github.com/go-training/training/blob/2ddb95d08c654a6...


The protoactor library for golang: https://github.com/AsynkronIT/protoactor-go.

I can’t recommend it high enough. It makes working with highly concurrent go so much easier and is a blessing for anybody with prior experience in Erlang or Akka.


When I first came to Go, I'd heard a lot of people speak profusely about CSP and how Go made concurrency easy. I was disappointed at just how not easy it was.

For example, Go has no generic support for atomic primitives, arrays, slices, or maps. It has sync/atomic, but it's not generic. I always reach for Uber's atomic library, which has typesafe atomic wrappers such as atomic.Bool. They're so common, you'd think a CSP-aware language would provide keywords to declare things atomic:

  var b atomic[bool]
  if cas(b, false, true) {
    ...
  }
"Classical" Go didn't have errgroups, contexts or cancellation. To build something truly robust, "modern" Go ends up involving all three (plus atomics, of course). That's because almost all complex situations need to build what are effectively nested trees of goroutines that all need to quickly abort and unwind on errors or panics.

But as you use more and more of these primitives, your code (or at least my code) becomes more and more obscured by the layers of error-handling, cancel-detection, retrying, and so on. I really wish some of this was built into the language, especially cancellation. For example, channels don't support contexts. So if you have a loop like this:

  for evt := range taskCh {
    handle(evt)
  }
...then you have to rewrite it to support cancellation. So you have a few options. One is to select on both the channel and context:

  for {
    select {
      case evt := <-taskCh:
        handle(evt)
      case <-ctx.Done():
        return ctx.Err()
    }
  }
Because of this, you can no longer use a simple range loop. Immediately the code got more bloated and less readable.

Another option is to spawn a goroutine whose only job is to abort the channel:

  go func() {
    <-ctx.Done()
    close(ch)
  }()
  for evt := range taskCh {
    handle(evt)
  }
  return ctx.Err()
  
That's a bit better (and you could move the for loop into its own function for clarity, without changing anything else). But it's still worse than a simple for loop!

As a real-world example, here [1] is some code I've been working on lately. A controller starts N workers that need to process a task queue. If the controller is stopped, all the workers need to stop. So I use the pattern above. But I can't use a for loop! That's because I use a task queue abstraction that needs to support features that Go channels don't.

I can't use a for loop or select block to block on the task queue, because only channels support that. So that's another problem with Go: Often, you can use raw channels as your data processing primitive if things are simple enough, but in practice, channels are best for coordination, and you have to build your own primitives — and yet, by building those primitives, you lose language expressiveness.

To address the concrete example, I'd love to do this instead:

  // Range over task queue (custom object with blocking
  // semantics), automatically cancel if ctx is cancelled
  for task := range taskQueue with ctx {
    handle(task)
  }
[1] https://gist.github.com/atombender/6bcff2c2d8fec32bc80ce1f57...


What would a test for this look like, having a lot of processes?


That's a great question. Stress testing, which is what you are suggesting, helps, but is not super effective and often misses bugs. You need tools which can precisely control the task/goroutine scheduling during testing and systematically explore the various interleavings which can happen in the system. We generally don't have good tool support for such testing. There are promising tools emerging, however; here is a case study of one such tool and how it was used to reliably reproduce and fix a subtle concurrency bug: https://cloudblogs.microsoft.com/opensource/2020/07/14/extre...


Maybe make `concurrencyProcesses` configurable and test with a bunch of different values relative to the input size?


try erlang!


Using goroutines in golang is not dissimilar from using threads in other languages (except that you can spawn many more of them), with all the downfalls and gotchas. As a matter of fact, it's even worse than other languages with proper concurrency libraries and data structures (e.g. it has nothing remotely close to Java's `java.util.concurrent` package).

What we see with golang is a phenomenon where people just parrot what some well known figures said at some point in time, without proper evidence or any basis (and in some cases, even when the evidence is counter to those claims).

I'm looking forward to Java's green thread implementation (Project Loom), as it has proper ways to manage cancellation, deadlines and hierarchies, all of which are quite verbose and error-prone, or not supported at all, in golang.


But with Kotlin flow it is!



