Updating the Go Memory Model (swtch.com)
248 points by gbrown_ 16 days ago | hide | past | favorite | 188 comments

> Note that this means that races on multiword data structures can lead to inconsistent values not corresponding to a single write. When the values depend on the consistency of internal (pointer, length) or (pointer, type) pairs, as is the case for interface values, maps, slices, and strings in most Go implementations, such races can in turn lead to arbitrary memory corruption.

That second sentence is one of the painful things about the Go memory model--Go is type and memory safe if you avoid explicitly unsafe things, except for data races on those structures.

I wonder what you can practically do to mitigate that at this point; doesn't seem like you go back and change things to be like Java. Can there be some best-effort detection for races in those specific places that's fast enough to run in prod (like I think they did for maps)? I also recall some Go mailing-list post I can't find suggesting that an SSE 128-bit move, while not guaranteed not to tear, seemed to tear only rarely in their tests on the chips of the time; is that true and is there anything there? I imagine fully preventing the problem with atomic reads/writes of these pairs from the heap is a clear no-go performance-wise, and might involve ABI-changing alignment guarantees.

[Edit: SSE2 behavior is discussed here https://stackoverflow.com/questions/7646018/sse-instructions... which is mentioned in a comment at https://research.swtch.com/gorace which seemed to inspire it. Chance of a race seemed to depend on CPU model at the time, bad on the old Core Duo, better on some server chips another answerer tried. Good on my laptop FWIW!]

Probably hard for anything to happen here because any performance regression is hard to swallow, the race detector is close enough for many, and anything you do won't be a 100% solution anyway. Which is a bummer.

I came here to quote and comment on this paragraph. The article pokes at C/C++ undefined behaviour, claiming Go has no such thing. Except this sentence describes exactly undefined behaviour: after the data race, due to possible memory corruption, anything could happen.

There is also an equivalent smugness about the language in general. It is one thing to say there is no undefined behaviour related to the memory model, but given an optimizing compiler, you need hard analysis to show there cannot be any once the code is optimized, with all the operation reordering that can happen.

The simplicity of the Go memory model, which is vague on purpose, is taken as making it safe. Is it? Mostly it means it is vague, which provides a convenient cover to hide under.

Edit: toward the end, the author talks about adding documentation about compiler optimizations that would be explicitly forbidden. So some of my earlier comments would be addressed.

> There is also an equivalent smugness about the language in general.

This really turns me off about Go. And it seems to attract "tech-splainers" that defend the language choices that are made by talking down to people.

I don't know. A lot of the criticism of Go is just wild over-exaggeration. Is "tech-splaining" just setting the record straight? For example, the guy upthread is insinuating that Go is roughly on-par with C++ as they both have nonzero amounts of undefined behavior. Having used both languages extensively, I can say for certain that you will run into a lot more UB in C++ than in Go, and it will be a lot harder to debug when it happens. Similarly, the guy downthread who is making a mountain out of Go's nullable pointers (and all of the other people who complain about Go's type system)--yeah, they are strictly worse than Rust's enums, but the overwhelming majority of all software ever written has been built with languages that have nullable pointers and a good chunk of that is with languages that have fully dynamic type systems (not only is the type system going to allow your null pointer access, but it will allow all manner of type errors!). Is it "smugness" to put criticism into perspective?

If you look in the past history of language features that are being added or might be added (e.g. generics) you can find a whole lot of past explanation of why the language doesn't need those features in the name of simplicity and why the reader just doesn't understand the brilliance of Rob Pike.

Reminds me a lot of the Mac forums where every poor feature in an Apple product is explained with a "let me help you understand why you're thinking about it wrong" kind of answer -- up until Apple fixes the bug / implements the feature to the applause of the same people who were talking down why it would ever need to get fixed.

And just the tone of the article here turns me off in the way it begins with a bunch of quotes and philosophy. I actually agree entirely with "Don't communicate by sharing memory. Share memory by communicating", but I figured that out myself in probably 2003, writing crappy Perl scripts that utilized parallelism while aggressively avoiding all the concurrency pitfalls. Actor models and Erlang later made completely intuitive sense to me. The principle is entirely correct, but it's fucking weird that a programming language needs to have a list of "proverbs".

> If you look in the past history of language features that are being added or might be added (e.g. generics) you can find a whole lot of past explanation of why the language doesn't need those features in the name of simplicity and why the reader just doesn't understand the brilliance of Rob Pike.

Isn't this just nutpicking[0]? You have that with every language. I can criticize C++ on proggit right now and 3 or 4 people will respond "C++ has every language feature, just use the ones you want/like/etc and ignore the others, your problems are invalid!". Similarly, on an OCaml forum I can find a dozen people who tell me if Jane Street hasn't run into my problem before and solved it, then it's not a real problem and I'm dumb for trying to do it. I can post here or on r/rust (or proggit) and people will tell me that Rust is strictly faster/easier to develop with than any GC language because in the worst case you can always throw `Rc<T>` on everything. I can post on r/python about how hard it is to optimize Python and people will tell me I'm dumb and I can just use multiprocessing, rewrite the slow bits in C/Cython/etc, use numpy/pandas/etc. I can merely register for an account on a Java forum and be berated for my low intelligence. :)

> The principle is entirely correct, but its fucking weird that a programming language needs to have a list of "proverbs".

I don't feel as strongly as you do, but I agree that proverbs are pretty low quality and invite more confusion than they address. Probably only a rung above analogies and a rung below quotations.

[0]: https://rationalwiki.org/wiki/Nutpicking

> Isn't this just nutpicking[0]?

From that article, I'm specifically addressing "Does this movement _promote_ crazies?"

I'd argue the sloganeering attracts them like moths to a flame.

I don't know what is meant by "sloganeering" but if writing a full, nuanced article (e.g., TFA) fits the bill then I'm not sure I agree with your conclusion...

In other words, I agree that "sloganeering" attracts the nuts, but I'm not sure I agree that TFA is sloganeering.

proverbs == sloganeering

the focus on "simplicity" also is both a good thing, and a great way to shut down any discussion.

> proverbs == sloganeering

If the TFA amounted to "don't communicate by sharing data; share data by communicating <mic drop>" then yeah, I would be on your side, but the author went to the trouble of writing a 4K word essay to support his points. That you and others reduce it to mere slogans isn't a valid criticism of TFA IMHO.

To your point, there are actual people in and outside of the Go community who do this kind of lazy argumentation. For example, someone upthread said (and I'm hardly paraphrasing), "Hoare called nullable pointers a billion dollar mistake and Go has nullable pointers so... <mic drop>".

> the focus on "simplicity" also is both a good thing, and a great way to shut down any discussion.

Pretty sure you could levy the same criticism against the Java folks for "configurability" or the C++ folks for "performance/control" and the Haskell folks for "type safety". It still sounds like you're nutpicking rather than observing something unique to the Go community.

I don't think generics are a good example, as from what I understand their stance was "We haven't found a good way to put them in Go, but we understand that you might want them. We'll try some things and see which fits with Go the best." That got warped to "you don't need generics anyways" by some overly zealous people. Maybe a better example would be the type system, with Rob Pike saying something about taxonomy being boring?

I myself like the "proverbs". They're short, simple ways of getting me into the right frame of mind when writing or reading a particular language.

Sometimes people don't actually want help, they just want to complain. Other people misinterpret this as a question or a debate, so their explanatory responses are received negatively. You'll find that this also applies to C++, Rust, every programming language, and anything in general.

> you need to do hard analysis that there cannot be once optimized, with all operations re-ordering that can happen

The previous post demonstrated that this is extremely difficult, beyond the ability of cutting edge research for mainstream languages.


I'm a bit tired of reading motivated blog posts like this that bring up enough related work that you know they know solutions exist to the problems they're bringing up, but for some reason felt it was not necessary to bring up those solutions. This line in particular is plain false:

> None of the languages have found a way to formally disallow paradoxes like out-of-thin-air values, but all informally disallow them.

The post neglects to mention the Promising Semantics paper of 2017 that resolves the out of thin air problem to (as far as I know) everyone's satisfaction, despite pointing out the previous work that brought up the problem. Similar things are true for ARM's memory model, etc.--this is all stuff that's been mostly resolved within the last few years as proof techniques have caught up with compilers and the hardware. Ironically, the thing that's been hardest to formalize by far in a useful way (outside of C++ consume) is--surprise, surprise--sequentially consistent fences!

It also handwaves away as some unimportant point the reason why compilers provide things like relaxed accesses--it's not just (or even primarily) about the hardware, but about enabling useful compiler optimizations. Even if all hardware switched to sequentially consistent semantics, don't expect languages that aim for top performance to abandon weak memory. And personally, I think it's ironic that at a time when even Intel is struggling to maintain coherent caches and TSO, and modern GPU APIs don't provide sequential consistency at all, people are trying to act like hardware vendors will realize "the error of their ways" and go back to seqcst.

I had not looked at the Promising Semantics paper of 2017. Thank you for the reference.

That said, what I have learned from watching this space for a decade is that even formal proofs are no match for the complexity of this general problem. You have to get the definitions right, you have to prove the right statements, and so on. There is still plenty of room for error. And even a correct, verified proof is not really an insight as to why things work.

Experts were telling us in 2009 that OOTA had been resolved to everyone's satisfaction, only to discover that it wasn't, really. Maybe Promising Semantics is the final answer, but maybe not. We need to get to the point where everything is obvious and easy to explain, and we're just not even close to that yet.

Looking at citations of Promising Semantics in Google Scholar I found this Pomsets with Preconditions paper from OOPSLA 2020: https://dl.acm.org/doi/abs/10.1145/3428262. It contains this sentence:

"As confirmed by [Chakraborty and Vafeiadis 2018; Kang et al. 2018], the promising semantics [Kang et al. 2017] and related models [Chakraborty and Vafeiadis 2019; Jagadeesan et al. 2010; Manson et al. 2005] all allow OOTA behaviors of OOTA4."

I take that to mean the jury still seems to be out on excluding all OOTA problems. Maybe the canonical example has been taken care of, but not other variants. And again we don't really understand it.

Disregarding memory models, OOPSLA is not a good source for PL research.

This is very different and it seems a lot of people misunderstand this.

Undefined behavior is not "data structure being corrupted". When data structures are corrupted, program still functions as written.

Undefined behavior is "things fly off the ceiling", no logic applies and programs start behaving essentially randomly because of the optimizations.

Undefined behavior is not broken data. That is still a well defined behavior.

Behavior of a program after data (as in value) races can be defined and constrained.

If you have wild pointer writes (aka arbitrary memory corruption) and almost any non-trivial control flow, the results are pretty much undefined. As in, it is pretty much impossible to specify or reason about what a program will do.

The article:

> ... such races can in turn lead to arbitrary memory corruption.

Tearing writes themselves are not undefined behavior, but use them wrong and you're off into undefined territory, with no way back.

> Undefined behavior is .. because of the optimizations.

Optimizations are popular, but it doesn't matter why the behavior is undefined.

Also, C/C++ was developed gradually, so it can be understood. Go had the benefit of a lot of hindsight and still ended up like this.

It's not "undefined behaviour". It's "implementation-defined behaviour".

Undefined behaviour would be, if the compiler thinks that it always happens, it (the compiler) can do anything, including e.g. transforming the whole program into

    int main() {
        return -1;
    }
In Go's case, the compiler performs no such shenanigans--it just compiles your code as-is, and it's your responsibility to make sure it's correct. But exactly because it's not undefined behaviour, the analysis is much simpler than in C/C++: there's no funny business going on in the compiler, and therefore all effects of the data race are local in code (i.e., no "nasal demons").

> I wonder what you can practically do to mitigate that at this point; doesn't seem like you go back and change things to be like Java.

Wouldn't there be an obvious trade off there, requiring an extra heap allocation and indirection every time a slice is used in a data structure? I've always thought of Go as lower-level than Java (since it has explicit pointers, more control over memory layout, etc), which I think makes it a more useful language overall even if not the best fit for every application (since if you do want Java, it already exists and is mature).

In other words, I think both Java's and Go's decisions make sense here.

FWIW, I meant race semantics similar to Java's--weird things happen but type/memory safety remains. Not the whole Java runtime model with object headers, etc.

It's obviously, uh, quite an Exercise for the Reader(tm) to figure out the best path to avoid memory unsafety given races in line with other Go priorities (as a starting point, imagine a ton of expensive atomics accessing certain pairs on the heap, downgraded to regular loads/stores if the compiler can prove it's safe). But I was trying to say it seems harder to get there now that certain things are set in stone and regressing perf of anything is going to be bad for some user out there.

Sure, I was imagining implementing better race semantics the way Java/C# do, by referencing certain multi-word data by pointer to avoid partial update problems. As it is now, slices are 192-bit values (on 64-bit systems) embedded directly in objects that can be shared across threads; I don't know how you would efficiently update that atomically.

Just in theory: the pointer/capacity pair is atomic, then you read len non-atomically and check it's no greater than cap before using it. The slice header today puts len before cap, but, like, if we're dreaming.

You'd probably hit other walls. I'm def. not trying to say it's straightforward!

Is Java really memory safe in the face of races? The case I'm thinking about is this:

Shared state:

ArrayList A with initial buffer B and length L

Thread 1:

Grows A and replaces B with buffer C, length with M

Thread 2:

(Earlier loaded B into a register)

Code says write to A[M-1]. JVM checks M-1 against M but uses B as it was already stored in a register. Now you have a buffer overflow.

Except, people on this thread seem to think this is impossible. How does Java prevent this?

The buffer itself, being a Java array, is checked. You'll get an exception but you won't get undefined behaviour.

Of course. That makes sense. Thanks!

Yeah this can be a hard problem to solve.

In most managed languages to 'properly' deal with it you typically need to accept locking on some level (like with ConcurrentDictionary in C# or ConcurrentHashMap in Java).

One semi-elegant (yet internally complex) solution I've seen: in C# there is the language-ext library. All the collection types are structs (value types where things like .Add() return a new struct), however the struct itself just holds a reference to an implementation class. This works well from a user standpoint, but the internals of the code are... interesting, to say the least, and can still lead to a lot of allocations if you're not careful.

> doesn't seem like you go back and change things to be like Java

Java is like Go in that regard. That's why you have to use ConcurrentHashMap instead of HashMap for instance.

Java's trick is maintaining memory/type safety in the presence of races. You might crash, and your application might not see the values it expects, but you never get a buffer-overflow-like situation where Java's internal structures might get corrupted in arbitrary ways, for example.

I see your point now. Yes, Java won't segfault, unlike Go on slices/maps/interface values. But even without segfaults, unexpected behavior due to race conditions can still be pretty nasty in Java (causing infinite loops, memory leaks, etc.). Maybe it's better to let the program crash with a segfault.


"Undefined behaviour" has a specific technical meaning. In Go, data races can result in UB. In Java, they can't.

Java may be non-deterministic, or exhibit unintended behaviour in response to a race condition. UB is strictly worse because it could do either of those things... or indeed anything else, segfaulting being the least of your concerns.

This is the "technical meaning" of "undefined behaviour" according to Wikipedia:

> Undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres.

Can we predict the behavior of a Java program with a data race?

A section of Java code with a data race can’t break anything that the same section of code couldn’t have been rewritten to intentionally break without using a data race. For example: it might throw an exception, loop infinitely, or return valid values that the programmer didn’t expect; but it cannot corrupt the internal state of the JVM, violate type-safety, fiddle with private fields, or access data that isn’t reachable from its own variables. Data races cannot be used to violate a security barrier between mutually untrusting sections of Java code.

You're right. That's a big advantage of the JVM. It's memory safe even in presence of data races.

That post from Russ Cox (again) explains quite well why Go is not memory safe in presence of data races, and what should be changed to fix this: https://research.swtch.com/gorace

That's less a definition, and more a description of what UB looks like.

Here's a better definition: https://en.cppreference.com/w/cpp/language/ub

This depends on what "unpredictable" means. Do I know precisely what this java program will compute? No. But I do know some things that it won't do. Its behavior is not completely unbounded.

That's a really bad standalone definition. It would apply to randomly generating numbers, or most attempts at allocating memory and then sorting by address.

Others said this different ways, but I think the scenarios where you don't get an immediate segfault at the bad read/write are the trickiest ones--hard to debug, at least. One flavor is, say, writing past the end of a slice b/c you read the pointer from one slice and len/capacity from another. If it segfaults, it might be later when the clobbered object is next used.

A few years back someone at Dropbox mentioned tricky prod-only races were a real problem for them sometimes, for instance: https://about.sourcegraph.com/go/go-reliability-and-durabili...

That's where I start dreaming about best-effort detection of races in production binaries or even just reducing the chances of torn reads. Years back there might have been other options, like explicit private vs. shared heaps with more controls on the shared heap, but there seems to be a more restricted set of choices now.

I agree that most of the time the segfault will happen later during program execution, which makes debugging very tricky. From that point of view, Java preserving memory safety even in presence of data races is a big win. I guess that's also what makes Rust so appealing for programs needing memory safety with concurrency and no GC.

No, it is not better. Java will preserve the integrity of the runtime. Still, memory safety issues in Go due to races are probably very hard to exploit, so it might not make a huge difference in practice.

Yes, the JVM will keep executing the program, but I'm still not convinced this is strictly better than segfaulting if a thread is stuck in an infinite loop using CPU for nothing, and another is leaking memory.

The problem is, UB is not guaranteed to segfault. A guaranteed segfault when a bug is hit would be great - but randomly getting one of (segfault, silent memory corruption, invalid instructions, ...) is not great, and most people would consider that to be much different than what Java gives you.

It is better than invoking random code because the runtime mistakes some data for a vtable pointer.

I agree that a data race on an interface value in Go is worse than an infinite loop or a memory leak, because it can lead to calling the wrong method on a value, which is hard to debug.

...and easy to exploit

Not easy, but definitely possible:


> We have to admit that exploiting this requires a fairly specific situation in which there is a data race we could trigger and some structs with function pointers around.


I would note that Java is designed to run untrusted code, so fat pointers would be unacceptable as the attacker could easily craft the code required to trigger the race. Go does not claim to provide sandboxing of untrusted programs.

Java will throw a ConcurrentModificationException. Go can cause undefined behavior.

This is not guaranteed at all:

> Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.

Source: https://docs.oracle.com/javase/8/docs/api/java/util/Vector.h...

This is conceptually similar to the race detector in Go, which is also a debugging tool, but without strict guarantees.

Java might throw a ConcurrentModificationException. The docs are quite clear that it's not guaranteed and best-effort only.

> Go is type and memory safe if you avoid explicitly unsafe things, except for data races on those structures

Note that this isn't actually a real exception. maps are implemented in the go stdlib using "unsafe", in addition to compiler magic for the generics part.

I believe that it holds that if you don't use "unsafe" then your program is memory and type safe, it just happens that the go stdlib and runtime use "unsafe", so you can't safely use them either.

I get that this is a distinction without a difference.

> it just happens that the go stdlib and runtime use "unsafe", so you can't safely use them either.

It's still the job of the code author to make it so you can safely use that code, despite internal use of unsafe.

Data races aren't limited to maps, so I'm not sure it's even a distinction at all :)

I think data-races that can cause undefined behavior are limited to maps and slices and other structures that internally use unsafe.

data-races that can't cause undefined behavior are a different class, closer to what java has, and iiuc that's what you get without unsafe.

I don't think the distinction matters, but I think it exists.

The slice is simply unsynchronized, so a user of the slice may pull out one array but a length that was for a different array.


They use `unsafe.Pointer`s under the hood, but I'm pretty sure that implementation detail is unrelated to the race condition. The issue is that writes of these types aren't atomic, so a concurrent thread that is reading the value could observe it in a partially-updated state (e.g., the pointer field has been updated but not the length field).

As far as I understand all interface pointers in go are fat pointers to (vtable, object) and as such can't be updated atomically [1]. Hence any data race on assignment to an interface pointer can lead to UB.

[1] unless there is an additional indirection, I don't really know much about go.

I think Go was wise to begin by telling programmers if they care about these details they're being too clever and will regret it. Detailed memory models so far over-promise and under-deliver and the main justification for them is trying to squeeze out that last few percent of performance which is not what Go is for.


> The first, exemplified by C and C++, is that programs with data races are invalid: a compiler may break them in arbitrarily surprising ways.

The problem isn't just the compiler. If you write code with a data race the CPU might break your program in arbitrarily surprising ways too. Go programs don't rely on your C++ compiler, but they do run on your CPU and not a virtual machine.

Is Go really going to be able to deliver defined semantics here in the face of that?

> The second, exemplified by Java and JavaScript, is that programs with data races have defined semantics, limiting the possible impact of a race and making programs more reliable and easier to debug.

This was definitely the intention. But, do the results match that intention? It seems like in fact Java programmers are still confused by what happened when there's a data race, even though the semantics of a data race in Java are more limited, they're beyond the conception of the programmer anyway.

Do you go less insane if you see only two Great Old Ones than if you were to see a host of them? Or does the distinction not make a difference?

Later this document proposes the new Go model should say:

> such races can in turn lead to arbitrary memory corruption.

That's a much worse outcome than in Java. So I'd be astonished if Go programmers are able to meaningfully debug a program in which a data race has smashed unrelated data structures. In fact, I'm dubious as to what difference is even left between "arbitrary memory corruption" and C++ undefined behaviour. Experts will sometimes draw on their experience to guess correctly what went wrong. Specialist instrumentation might help find your root cause. But these are the same in C++.

> Java programmers are still confused by what happened when there's a data race

IMO, the most important thing is that JVM runtime isn't being confused into executing attacker's code.

Most programmers are unable to precisely predict the behaviour of their programs anyway; they get accustomed to being confused.

If this is the definition then JavaScript doesn't meet it very well, given how often there are browser exploits.

That's not how this works. This discussion is about preventing something like a server from being exploited. How many stack smashes have you seen against a server written in a memory-safe language?

> In other words, Go encourages avoiding subtle bugs by avoiding subtle code.

I find this philosophy strange when they talk about the communication primitives. Channels are extremely subtle in my experience. A channel can be buffered or unbuffered, it can be closed, and there’s different behavior of read/write whether it’s in a select or not. And then you often end up with multiple channels as well. Reviewing code with channels is often more confusing than code that uses mutexes in my opinion.

It took me a loooong time to fully wrap my head around channels. They're definitely not explicit, and they're full of subtle yet devastating bugs when used inappropriately (which is all too easy to do). Some of the boilerplate code around using them is just ugly, too.

That said, there are occasions when channels have proven invaluable. I think the real issue is they were branded as a killer feature but in reality their usefulness is a little more niche than a mutex.

> I think the real issue is they were branded as a killer feature but in reality their usefulness is a little more niche than a mutex.

Agree - and many people are surprised to learn both: a) how fast mutex ops are in practice, and b) how channels use mutexes under the covers.

Mutexes and channels really do overlap in some functionality, but they don't overlap completely.

They could have been a much more powerful primitive, but not having generics means tons of boilerplate; all basic channel patterns involve a lot of copy pasting.

Once generics have been out there for a while and the patterns of use in the community are clear/stable, I think there's a decent chance they'll revisit channels to help clean that up. It certainly is frustrating.

Channels in Go lack priority support, which makes them unable to express some patterns.

Fortunately one can code a replacement for channels in Go using mutexes and semaphores. And that exercise in turn allows one to see that in many cases the channels are just a bad model. Things like having a single polymorphic priority message queue per go routine suits many cases better than dealing with multiple channels.

And this has been known since nineties if not eighties when Ada had to add mutexes after early designs based purely on channels. So it is puzzling why Go made channels such a central feature when the priority queue just works judging by Erlang or many successful C/C++ libraries.

Correct me if I am wrong, but isn't a channel basically the same as a blocking queue in Java? I know Go, but not at an advanced level.

Seems like there are a few differences but nothing I personally would leave normal indexing, kind of working generics, Maven and a choice of 3 world class IDEs behind for: https://stackoverflow.com/questions/10690403/go-channel-vs-j...

To each their own though. I hear a lot of people prefer it.

More or less, but with the ability to "poison" them by closing, which causes downstream goroutines to exit. You can do the same thing in Java by checking for a poison object and then terminating your loop.

The close is built into the language itself which is useful, along with the other built in language functionality. I wrote about it here, with an example of how to do it https://boyter.org/posts/file-read-challange/


I don't code Java, but I have the same sentiment about channels in Go as about queues in Python.

The killer feature over a mutex, in my opinion, is that mutexes encourage sharing by default which is terrible for both program correctness and performance.

That depends massively on the kind of problems you’re trying to solve. I’ve written some software where channels were the right tool, and others where mutexes were the correct choice.

This is a shame, because if done correctly, “share memory by communicating” eliminates a whole class of bugs.

I agree that the ergonomics of Go channels leave a lot to be desired. Also, channels are still a primitive: you still need to build abstractions to do anything nontrivially useful with them. I think something like a good coroutine abstraction, or thread nursery abstraction would be ideal.

You should not blame Go channels for your inexperience with them.

For some scenarios, the implementation will be quite complicated if only mutexes are used, whereas it will be quite simple using channels.

I think your frustration comes from wanting to use channels for every scenario. That is not recommended either. https://github.com/golang/go/wiki/MutexOrChannel

It fails to mention the performance penalty with channels.

For some scenarios where channels and mutexes are both suitable, mutexes are a bit more performant than channels. If you do care about the performance penalty, just choose mutexes. But for the scenarios mutexes are incapable of handling, still using mutexes is surely not a good idea.

This would be a good thing to add to the reference.

Dunno. Tell people "channels are about 3 times slower than mutexes" and watch them scramble the hell out of their program trying to avoid channels, when the reality is "in general, both of them are fast enough that they will not be the bottleneck on your program".

In general (not just Go!), don't send tiny amounts of work across any sort of internal boundary (goroutine, thread, spark, green thread, whatever they call it). The amount of work you send should significantly exceed the costs of sending it. Follow this advice and the perf differences between mutexes and channels will almost certainly not be relevant; fail to follow it and neither of them will be fast enough. I don't think there's actually a lot of programs in the wild where the performance difference between the two would be the make-or-break difference. Non-zero, but not many. In practice it's pretty clear developers will spend much more time worrying over it than is justified.

(A common benchmark I see new programmers apply to any language that claims to be good at multithreading is to try to "parallelize" the act of adding a few million integers together by sending individual integers or addition problems out to threads, and wondering why it's 10-100 times slower even though all my CPUs are at 100%, so why does multithreading suck so hard in this language? The problem is that no matter how cheap the send operation is, it's going to be dozens or hundreds of operations, whereas a single integer addition is generally a single cycle, or possibly, amortized to less than that depending on how good your compiler is. You just need to be sure to send units of work larger than the cost of sending them, which is generally not that hard. This isn't just Go, I've seen this charge leveled at Haskell, Erlang, and Python's multiprocessing too, and I'm sure others have seen it in their own communities. It's a very common mistake.)
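To make the granularity point concrete, here's a hedged sketch (not from the thread) of summing a slice by handing each goroutine a whole chunk rather than individual additions, so the cost of spawning is amortized over many operations:

```go
package main

import (
	"fmt"
	"sync"
)

// parallelSum splits nums into nWorkers contiguous chunks and sums each
// chunk in its own goroutine. Each goroutine gets a large unit of work
// (a whole chunk), not a single addition, so the per-goroutine overhead
// is amortized over thousands of cheap operations.
func parallelSum(nums []int, nWorkers int) int {
	partial := make([]int, nWorkers)
	var wg sync.WaitGroup
	for w := 0; w < nWorkers; w++ {
		lo := w * len(nums) / nWorkers
		hi := (w + 1) * len(nums) / nWorkers
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			s := 0
			for _, n := range nums[lo:hi] {
				s += n
			}
			partial[w] = s // each worker writes only its own slot: no race
		}(w, lo, hi)
	}
	wg.Wait()
	total := 0
	for _, s := range partial {
		total += s
	}
	return total
}

func main() {
	nums := make([]int, 1_000_000)
	for i := range nums {
		nums[i] = 1
	}
	fmt.Println(parallelSum(nums, 8)) // 1000000
}
```

Sending each integer over a channel instead would replace that one-cycle addition with a synchronization operation costing dozens of cycles, which is exactly the anti-pattern described above.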

The amount of work you send should significantly exceed the costs of sending it.

Jerf, you did it again! This is gold! Succinct and to the point.

I'm going to think on this one.

This implies that you know the "costs of sending it."

For channels and other things in the local OS process, you should have a good idea. Give or take being in major swap problems, etc., but I think of that as an "already lost" situation; at that point the problem isn't the expense of moving things between threads or local processes anyhow.

When you get out into discussing network matters, it gets a lot fuzzier. You may "know" that you're only .025ms away from something in some specific system, but your TCP-based API doesn't necessarily "know" that and will happily use the exact same code to do something that takes 4 orders of magnitude larger. That's enough difference to be problematic.

'Send'ing things via a Go channel is all in the same process, between goroutines (sometimes between threads).

Because there is no generic penalty, and worrying ahead of having a problem to solve sounds like a terminal case of premature optimization.

Not sure if it's the case nowadays, but Go channels were historically around 4x slower than the equivalent mutex-based code. This is a Big Deal™ if you're writing performance sensitive code or in a hot loop.



Comparing a FIFO internally wrapped in mutexes and conds, designed to pass messages, to a single mutex makes very little sense. Of course a channel is slower than one of the bare primitives used to construct it!

The proper question to ask is whether your application (or your programmer) lends itself to the CSP paradigm or not, and whether this allows you to write simpler, more robust code.

Extremely performance sensitive code will always need to be written in lower-level paradigms, using e.g. atomics directly, but this doesn't matter for most code. And to be frank, extremely performance sensitive code is past the capability of Go for quite a few reasons.

I've had the opposite happen: because of mindsets like this, it didn't get properly discussed, and we went in blind and had to deal with it later, in a case where it was clear that it would matter.

The mindset is still correct - it depends too heavily on your usecase to make a generic statement.

For anything performance sensitive, the only thing that matters is to profile your code.

I agree with this. Channels have not lived up to their promises, and I mostly avoid them in Go. They're handy sometimes when you want to just block some goroutine (read from a channel that is never written to) or for a few other use cases, but mutexes are my go-to primitive.

it always shocks me when someone says that channels are not immediately intuitive. they make so much sense to me, and they did from the instant I read about them.

combined with goroutines, channels are absolutely one of the best features of this language, to me.

async & await, on the other hand, still confuse me and trip me up today, even after using them for years, because of very weird implementation minutia and edge cases that I always seem to stumble into.

I wish async and await would disappear from the face of the earth and I wish coroutines and channels were in every language I use.

I've been using Go avidly since 2012. I still run into deadlocks and panics (writing to a closed channel) when I try to use channels beyond the simplest use cases. Error handling in a parallel context is also hairy with channels. I've learned to minimize my use of channels and prefer mutexes as my default concurrency primitive.

It really only means that CSP looks nice on paper but (at least as implemented in Go) doesn't work out well in practice. There's lots to like about the language even if the CSP theory didn't pan out.

My understanding is that if you follow the rule of using generator functions (the creator of the channel is responsible for writing to it and closing it), it's impossible to write to a closed channel.

When would that pattern not be useful?

The rule that the writer should be responsible for closing a channel is a good one to keep in mind, but it is often the case that you want to launch an indeterminate number of generators and collect the results from all of them. For example, you may want to make one connection to each server in the config, or to process each file in a directory in parallel. Because the arms of a `select{}` are determined at compile time, it cannot be used to select over the variable generator-owned channels, and you have four more difficult options:

* Use `reflect.Value.Select`[1]. Having to reach for reflect feels ugly for such a common case, and the performance of the reflect-select is much lower than the native select.

* Create a single channel owned by the reader, pass it to each writer, and arrange for this channel to be closed when the final writer exits, through a waitgroup. There is an example under "Parallel digestion" in the Go Concurrency Patterns blog post[2]. Note the little details to get right. We must launch a separate goroutine to monitor the waitgroup / channel closure. If we accidentally do it in-line at the wrong level, everything will work fine if the total number of items written to `c` is less than `c`'s capacity, but will hang once a worker becomes blocked on `c`. Additionally, the waitgroup is threaded directly into the writers, which may be more difficult if those are implemented in some other generic package.

* Wrap the above pattern up into a `merge` function, such as the one under "Fan-in, Fan-out" in the Go Concurrency Patterns post[2]. The lack of generics means we will have to copy-paste this function everywhere we want to use it. Additionally, this launches a goroutine for every channel being watched, which strikes people as "expensive" for such a simple operation.

* We can construct a function that takes two channels and launches a goroutine that selects between the two and writes to a merged output channel. By constructing a tree of these we can merge an arbitrary number of channels. This is really just an optimization of the above.

None of these options are particularly intuitive. Too often I've instead seen developers create a single channel owned by the reader and either:

* Assume it is never closed and the reader doesn't terminate until the application does

* Rely on some external mechanism to know when to stop reading. If the reader can stop reading without confirming that the writers have stopped writing, this can lead to the writers becoming blocked on sending into this channel, which may prevent them from performing necessary cleanup actions (signaling `.Done()` on a waitgroup, for instance) that cause hangs in other areas.

* Thread a cancellation ctx through every reader and writer. This ensures that nothing hangs, but can result in messages that are sitting in the channels being dropped. If other areas of code have an assumption like, "every accepted request will receive a response", this can break that.

In addition, many developers have a gut instinct to add some amount of buffering to their channels, which usually results in these backpressure / channel issues being papered over during low-load unit tests, only to rear their head during higher load integration tests or in production, when the debugging story is much more difficult.

[1]: https://pkg.go.dev/reflect#Select

[2]: https://blog.golang.org/pipelines
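For illustration, a generic `merge` in the spirit of the third option, following the fan-in pattern from the Pipelines post. It is written with type parameters, which postdate this discussion, so treat it as a sketch of the shape rather than what was available at the time:

```go
package main

import (
	"fmt"
	"sync"
)

// merge fans in any number of input channels into one output channel.
// One goroutine per input copies values until that input is closed; a
// final goroutine closes out once every input is drained, so the reader
// sees a single channel that closes exactly once.
func merge[T any](chans ...<-chan T) <-chan T {
	out := make(chan T)
	var wg sync.WaitGroup
	wg.Add(len(chans))
	for _, c := range chans {
		go func(c <-chan T) {
			defer wg.Done()
			for v := range c {
				out <- v
			}
		}(c)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// gen is a generator: it owns the channel it returns and is the only
// writer, so it is also the one that closes it.
func gen(nums ...int) <-chan int {
	c := make(chan int)
	go func() {
		defer close(c)
		for _, n := range nums {
			c <- n
		}
	}()
	return c
}

func main() {
	total := 0
	for v := range merge(gen(1, 2), gen(3, 4)) {
		total += v
	}
	fmt.Println(total) // 10
}
```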

> Use `reflect.Value.Select`

Never really a good idea, and never necessary.

> create a single channel owned by the reader

Channels cannot be effectively owned by their reader(s); the contortions you have to bend the code into to make that work never really make sense. That's just a constraint of the type, but it's hardly a problem -- it makes the thing easier to model. So this isn't really an option on the table.

> a Merge function

Yes! The answer. And goroutine per channel is kind of the point of using them! Nothing inefficient about it.

> a function that takes two channels . . .

Now there's some inefficiency! ;) No reason to do this, given Merge.


> None of these options are particularly intuitive.

The merge option seems perfectly intuitive to me, assuming you understand channels have to be owned by a singular writer.

This kind of trouble is exactly why Rust's channels feel much more intuitive to me than Go's.

With channels in Rust, the channel is closed when either all senders or all receivers are dropped. This means that doing the default obvious thing is also correct, for a much larger set of tasks than made easy by Go's API choices, and it stays correct under refactoring.

> I still run into deadlocks and panics (writing to a closed channel) when I try to use channels beyond the simplest use cases.


The rule is that channels are owned by a single goroutine, which is uniquely responsible for sending on them and closing them. That's basically it. Do that and everything works fine, in my experience.

same, I like go and I like channels in theory but they are too primitive in practice. I am much more likely to use waitgroups and errgroups than anything with raw channels

Channels as a high-level abstraction are pretty simple, but API and implementation details matter.

Go's implementation of channels are very simple to use for some very simple use-cases, but Go's API choices mean there are a lot of subtle, non-obvious details you need to learn and keep in mind to do anything nontrivial.

I really like https://medium.com/justforfunc/why-are-there-nil-channels-in... as an example. Reading from two channels safely should be a simple task, but just doing the intuitive thing will look like it works for many uses until it starts fabricating zero values, or blocks forever, or spins the CPU at 100% doing no work.
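The fix that article arrives at is to disable a closed channel's select case by setting the channel variable to nil (a receive on a nil channel blocks forever); a minimal sketch of that pattern:

```go
package main

import "fmt"

// mergeTwo reads from a and b until both are closed. Setting a closed
// channel variable to nil makes its select case block forever, so the
// loop neither spins at 100% CPU on the closed channel nor fabricates
// zero values from it.
func mergeTwo(a, b <-chan int) []int {
	var out []int
	for a != nil || b != nil {
		select {
		case v, ok := <-a:
			if !ok {
				a = nil // channel closed: disable this case
				continue
			}
			out = append(out, v)
		case v, ok := <-b:
			if !ok {
				b = nil
				continue
			}
			out = append(out, v)
		}
	}
	return out
}

func main() {
	a := make(chan int, 2)
	a <- 1
	a <- 2
	close(a)
	b := make(chan int, 1)
	b <- 3
	close(b)
	fmt.Println(len(mergeTwo(a, b))) // 3
}
```

The intuitive version without the nil trick either busy-loops on the closed channel or collects an endless stream of zero values, which is the failure mode described above.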

I really love channels, but I really hate working with Go's channels.

The channel API in that other language well-known for its good concurrency support is much simpler to learn and use for me, without as many subtle sharp edges, but that's possibly a bit off-topic, so I've removed a detailed comparison.

What you described are all well defined in the docs.

Sure, it's all out there, and it's possible to build useful software using Go's channels.

I was specifically trying to explain why I say that Go's channels are not intuitive, because they require studying and memorizing these other arbitrary complications.

I'm also curious about what docs you're referring to, exactly. Here's what I found when looking for golang channel docs:


If the documentation's described behaviour, along with code patterns to accommodate that behaviour, are intuitive to you after reading this, then you have a very different perspective on the world than I do.

I also found these: https://tour.golang.org/concurrency/2 https://golang.org/doc/effective_go#channels

Unless I've missed it in my reading, I don't see any of these clearly stating that the single-return form of channel receive will fabricate zero values when misused, or describing how you need to replace a closed channel with a nil when selecting on multiple channels to avoid spinning the CPU when it's been closed.

I agree that this stuff is learnable. I have learned it, and so have you. I agree that there are learning resources out there that help with learning the nuances of using Go's channels well.

Hopefully this can help you feel less shocked the next time someone says that Go's channels are not intuitive. If you disagree, can you explain more about how Go's channel management choices are more intuitive than the alternatives to you?

[Edit: I found the documentation on producing a zero value when reading from a closed channel here: https://golang.org/ref/spec#Receive_operator]

> Unless I've missed it in my reading, I don't see any of these clearly stating that the single-return form of channel receive will fabricate zero values when misused, or describing how you need to replace a closed channel with a nil when selecting on multiple channels to avoid spinning the CPU when it's been closed.

You certainly missed this: https://golang.org/ref/spec#Close

You might also need to read https://blog.golang.org/pipelines and https://blog.golang.org/concurrency-timeouts

If you need a good summary, you could read my articles: * https://go101.org/article/channel.html * https://go101.org/article/channel-use-cases.html * https://go101.org/article/channel-closing.html

I too thought Go's channels were intuitive... until I found Channel Axioms: https://dave.cheney.net/2014/03/19/channel-axioms

I keep going back to that page whenever I need it, because I could never remember what the axioms are.

these are 2nd nature to me now, and if I ever use another language that has channels and does NOT have these axioms, then I'm basically screwed.

The main issue is channels are a very low-level primitive (not far from mutexes and atomics). When generics come out, there will emerge a library that handles a lot of the common channel idioms that people are implementing (perhaps somewhat incorrectly).

Things like fan-outs, multi-dispatch, sinks, chaining, simple concurrency, etc. will likely emerge in a lib in the future.

This is definitely the biggest thing I'm looking forward to in Go's post-generics world. It's frustrating to write the same concurrency abstractions like fan-out and error handling for every type, and to reason about channel behavior carefully every single time.

Hard agree about channels being a mess in Go. I avoid channels as much as possible in favor of mutexes but the problem is the community is so aligned towards using channels you get funny looks when you solve problems with mutexes.

Whenever I tried to use channels, I regretted it in the end. I believe channels are the worst part of the language.

I use mutexes.

How could it be? Programming with channels is not only intuitive but fun for many cases: https://go101.org/article/channel-use-cases.html

Yes, channels are not always the best solution for every case. There are cases where mutexes are more suitable. Just choose the best solution for the case at hand. Always sticking to one is not a good attitude.

I agree with this.

We used to do a coding exercise at a previous job that was really well suited to mutexes and whenever someone tried to use channels they ended up with a much more complex solution.

Equally, there are lots of cases which lend themselves to channels really well. Waiting on multiple things at once is hard to do without channels and select, for example.

I agree (well, maybe not the worst, but certainly one of the worst), but in retrospect it's hard to say whether Go would have taken off the way it did, had it not pushed the concurrency angle so hard. I think Go's focus on software engineering (as well as its "stubbornness" in general) is far more valuable than its concurrency features, but there's no question the latter is more exciting and likely to drive growth.

> A channel can be buffered or unbuffered

Yes, with predictable results.

> it can be closed

Yes, by the owner/sender, who then understands it as invalid. The effect of receiving on a closed channel is well-defined and predictable.

> and there's different behavior of [recv/send] whether it's in a select or not

I don't think that's true? What do you mean?

> And you often end up with multiple channels as well

Sure! That's part of their power.

> Reviewing code with channels is often more confusing than code that uses mutexes

I totally agree that goroutines and channels are often over-used by new Go programmers. If what you're trying to do is simply protect concurrent access to some shared state, a mutex is the far better choice.

Not to mention that there is no way to ensure immutability when passing messages in channels, and the language doesn't help you there. This is a recipe for race conditions.

Furthermore, you can't easily have hierarchies of goroutines; everything has to be painstakingly built from scratch and is very error prone.

> Another aspect of Go being a useful programming environment is having well-defined semantics for the most common programming mistakes, ... Quoting Tony Hoare

One of the most common programming mistakes of all, dubbed by Hoare himself the "billion dollar mistake": null pointers. Yet they were put into Go without any consideration.

> One of the most common programming mistakes of all, dubbed by Hoare himself as the "billion dollar mistake": null pointers, yet it was put into go without any consideration.

I get the distinct impression that people who invoke this boilerplate Hoare quote argument don't understand that null pointer bugs are "a billion dollar mistake" because nullable pointers have been idiomatic in every major programming language for the last 40+ years, not because they are individually particularly expensive.

Yeah, Rust-like enums would probably be strictly better than null pointers, but people talk about null pointers like they are going to doom your project when in reality they're mere papercuts--a small subset of all type errors--and there are whole projects that are written in fully dynamically typed languages! Indeed, there are worse problems than type errors in a language--you could have an impoverished ecosystem, poor tools (especially build tools), or abysmal performance, and many of the most popular programming languages have two of the three and null pointers. Go is fortunate to have only null pointers (papercuts) working against it.

I've never bothered to look up the hoare thing, but null as a member of all types increases the amount of time spent studying any given API (and of course, there are lots of null pointer bugs from people who didn't bother to study the API sufficiently).

Yeah, as previously mentioned, I fully agree that nullable pointers kind of suck and exhaustive pattern matching on enums is strictly better. I'm taking issue narrowly with abusing the "billion dollar mistake" quote to exaggerate the severity of the problem.

It's a big problem in Java at least. In Go you have value types by default and they try to make zero a meaningful value.

It's not a big problem when you look at the problems that plague even the most popular programming languages. At this point I can only refer you to my previous posts because I'm repeating myself. :)

Which problems do Java and C# have that golang doesn't? Incidentally, both of them have solutions to the billion dollar mistake, which golang doesn't.

Which is arguably even worse, because it goes on undetected and the code ends up producing incorrect results silently.

The thing that always got me was that closing a channel on one end (I think the receiving end, but I can't remember exactly since I haven't programmed Go for a while) causes an error if the other side is open, but doing that on the other end (I think the front) doesn't give that issue. Sure, there's an explanation for that, but I feel like you could just as easily make a justification for acting consistently on both ends.

In Go, sending a value on a closed channel panics the program, and receiving from a closed channel immediately returns the zero value of the channel's element type (and false, if you use the two-return-values form of the receive operator).

There's no difference based on which side closes the channel, I think, just in how a closed channel behaves when you try to send or receive on it.

Personally, I think it's terrifying that the "easy" option just lies to you instead of crashing the program, but I kind of see how it fits with some of Go's other design decisions that I'd also have made differently (Option/Result instead of pervasive nullability).
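A small program makes those rules concrete; this follows the spec's close and receive-operator semantics:

```go
package main

import "fmt"

func main() {
	c := make(chan int, 1)
	c <- 42
	close(c)

	// Values already in the buffer are still delivered after close.
	v, ok := <-c
	fmt.Println(v, ok) // 42 true

	// Once drained, a closed channel yields the element type's zero value.
	v, ok = <-c
	fmt.Println(v, ok) // 0 false

	// The single-return form (v = <-c) would yield 0 here with no
	// indication at all that the channel is closed -- the "lie".

	// Sending on a closed channel panics.
	defer func() { fmt.Println("send panicked:", recover() != nil) }()
	c <- 1
}
```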

Don't know Go, but this way would be logical. "I've done sending" and "I refuse to handle more messages" aren't that symmetrical.

Exactly. Go's flavour of channels is full of subtle bugs. I could count on N hands the number of times I've had to make a channel to drain a channel.

Channels are pipes. If you want to use a mutex instead, you have to re-invent a queue with a mutex, which is very different; a mutex alone does not just replace a channel.

The thing that has bitten me the most is somewhat related: the lack of any uniform “deep copy” or immutability support. Sure I can make a `chan Foo` but if Foo is a struct with an embedded pointer in it you just aliased a pointer you probably didn’t mean to.

If there were support for optional value types rather than using pointers, that would help too.

He’s stating the goal of the design, not an assertion that they got everything right. Obviously there are lots of places where the design fell short. But the philosophy behind the approach to future design is unchanged. Channels were an attempt to add something very new to the language and so it’s not a surprise that the design ended up being problematic. I’m sure Russ has thoughts on that, too.

But the vast majority of Go code doesn’t use channels. The fact that lots of people have trouble with them doesn’t change the fact that the vast majority of Go code does follow this principle.

I believe the guidance addresses this.

> Programs that modify data being simultaneously accessed by multiple goroutines must serialize such access.

> To serialize access, protect the data with channel operations or other synchronization primitives such as those in the sync and sync/atomic packages.

Go gives us `sync.WaitGroup` and `errgroup.Group` as abstractions to manage concurrent operations. I always end up guiding developers towards using these unless they absolutely can't (like responding to OS signals requires interacting with channels).
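As an illustration of that guidance, here's a hedged sketch using only `sync.WaitGroup` and a pre-sized results slice; `fetch` is a hypothetical stand-in for the real work, and `errgroup.Group` layers first-error propagation onto the same shape:

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll runs one goroutine per URL and waits for all of them.
// Results go into a pre-sized slice, one slot per goroutine, so no
// channel -- and no channel-closing protocol -- is needed at all.
func fetchAll(urls []string, fetch func(string) string) []string {
	results := make([]string, len(urls))
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			results[i] = fetch(u) // each goroutine writes only its slot
		}(i, u)
	}
	wg.Wait()
	return results
}

func main() {
	out := fetchAll([]string{"a", "b"}, func(u string) string { return u + "!" })
	fmt.Println(out) // [a! b!]
}
```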

For OS signals now they added NotifyContext() which uses context instead of channels.

I find channels massively overcomplicate many things. Cool in theory, not so useful in practice.

Anymore, I only use them as a semaphore to keep less than X routines running at once.
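That semaphore use is just a buffered channel whose capacity is the concurrency limit. A sketch; the `maxInFlight` helper is illustrative, not a real API:

```go
package main

import (
	"fmt"
	"sync"
)

// maxInFlight runs n goroutines but lets at most limit of them execute
// the guarded section at once, returning the peak concurrency observed.
// The buffered channel is the semaphore: a send acquires a token and
// blocks when all limit tokens are held; a receive releases one.
func maxInFlight(n, limit int) int {
	sem := make(chan struct{}, limit)
	var mu sync.Mutex
	running, peak := 0, 0
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire
			defer func() { <-sem }() // release

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			// ... real work would go here ...

			mu.Lock()
			running--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println(maxInFlight(20, 3) <= 3) // true: never more than 3 at once
}
```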

You can write Go for years without ever touching channels. I agree though, they are extremely easy to make mistakes with.

Channels and an alt operator are much more powerful than locks but eh I can't remember the edge cases in Go's channels so you might be right,

to me the overriding problem here is that selects need to be carefully refactored when you add in other interactions. of course.

but this means there are, as you say, subtle non-local effects... and these kick in exactly when you don't want them: when you are trying to thread a new feature into an existing codebase.

I prefer to just ignore channels. then that statement makes sense :)

I agree - I find Go's error handling encourages subtle bugs.

Go's errors have basically all the problems of exception handling in previous generation languages (and a few novel ones).

This explains it (tl;dr: nobody bothers to formally define what errors they return, they blindly propagate such unspecified errors, and it leads to ambiguity and unintentional program behavior)




Swift considered this and still decided to have untyped error returns. The issues are:

- wrapping every underlying error in your own error type is not helpful

- but defining every underlying error in your own type confuses your implementation (and especially dependencies) with your interface

- and most errors can't be handled in code anyway

There are a few errors (mostly in file operations) that are individually handled as normal parts of life, but otherwise they really are just there to send back up to the user. It's more important to know where an error can happen than what it is.

The only place I would agree with you is when someone accidentally shadows an `err` variable. Although, I'd probably attribute this more to shadowing than the error handling itself.

Other than that, I can't think of any way the error handling subtly introduces bugs.

Go maps are wonderful examples.

If you index into a map using the two-value form, you get a pair (val, ok), where `ok` is a boolean that is true if the key exists, and `val` is the value in the map or the zero value of the value type.

Go also silently discards the second value -- e.g. `v = m[4]` is valid, and silently drops the `ok` flag.

This kind of means that all Go maps can accidentally act as defaultdicts (in Python terminology), but it's also an excellent way to introduce bugs without meaning to.
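The comma-ok idiom in miniature:

```go
package main

import "fmt"

func main() {
	ages := map[string]int{"ana": 30}

	// Single-return form: a missing key silently yields the zero value,
	// indistinguishable from a stored 0.
	fmt.Println(ages["bob"]) // 0

	// Comma-ok form: the bool reports whether the key was present.
	v, ok := ages["bob"]
	fmt.Println(v, ok) // 0 false

	v, ok = ages["ana"]
	fmt.Println(v, ok) // 30 true
}
```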

The onus is on the programmer to understand what errors might be returned, which isn't always easily gleaned, especially when calling a function that calls other functions that call other functions, each pushing the error up the stack. The error interface does not provide much information on its own.

The mistake of the programmer missing an error that should be handled in a particular situation could be considered a subtle bug, I suppose. This is different to languages that have compiler-enforced error handling, where if you forget to handle a certain case the code won't compile.

None of that is unique to Go, and, indeed, the question of "what errors may be returned" in general is not solved in any language that I know of satisfactorily, especially once you include any form of polymorphism. Compiler-enforced error handling in general only works on the direct possible errors from the function you just called, not the transitive closure of everything that could have happened, again, especially once you include any sort of polymorphism, meaning that you don't even know at the time you're writing the error handler what the possibilities are, and depending on the language, it may not even be possible to statically know. Don't mistake "option" or "sum types" for a solution to this problem; they aren't.

(The only language I know where this is nominally "solved" is Java if you confine yourself to its checked exceptions, but nobody considers that solution "satisfactory", and indeed can be taken as a rough-and-ready proof that there may not even be a "static" solution to this problem that is usable. I'm beginning to think of it as "Errors don't compose, because they contain every possible pathology that would prevent values from composing." At this point, based on the consistent failures trying to create composable error solutions despite substantial effort poured into it across multiple languages, I am going to operate under the assumption there is no such solution until someone produces one.)

Can you give an example?

> it can be closed

..at most once!

Am I the only one who finds Go's concurrency model extremely bug-prone and difficult to work with? Writing an efficient pipeline of goroutines connected with channels that doesn't deadlock is a verbose and subtle mess with a whole bunch of code duplication.

For a lot of the things that I want to achieve, functional abstractions like parallel map and reduce would be much more understandable, and easier to use.

That whole talk about philosophy distracts from the actual tools that Go gives you to manage concurrency which in my opinion can be improved a lot.

I think that the "share by communicating" thing has led some people to think that these goroutine labyrinths are somehow The Way with Go; that's certainly how I felt when I started out with it. I absolutely agree that once things get really pipeliney it can be incredibly hard to debug and there are almost always lockups, and it can start to feel like select-oriented programming. I've found that the more cautious I've become with channels and goroutines, the better luck I've had with Go. When reviewing code these days I go into high alert if I see a channel crossing a function boundary or a naked (i.e. without an errgroup or WaitGroup) go statement.

I think different people read "Go concurrency model" to mean different things.

If we're talking about green threads (goroutines), I'm a huge fan. Having had a lot of success using Gevent in the Python world, it was great coming over to the Go world where that pattern was baked into the language.

If we're talking about channels, I'm really not a fan. Not nearly as easy to use as the equivalent Queue library over in Python, despite being baked into the language. My experience aligns with this post: https://www.jtolio.com/2016/03/go-channels-are-bad-and-you-s... Almost every time someone's suggested a Go channel, I've found it simpler to use a mutex or waitgroup.

I'm referring to the "share memory by communicating" idea mentioned in the article. I don't think channels are bad in general, but the way they work in Go leaves a lot to be desired.

"Writing an efficient pipeline of goroutines connected with channels that doesn't deadlock is a verbose and subtle mess with a whole bunch of code duplication."

It takes 30 lines of code and it's very easy.

Yeah until you need to cleanly shut everything down and avoid deadlocks or unbounded memory use.

It could definitely do with some improvements especially around closing channels and telling goroutines to stop.

That depends on the kind of pipeline you want to build.

It depends, but saying it's very complicated and easy to deadlock is pure BS.

Where is the complexity in Go to have a pool of workers looking for jobs on a channel exactly?

> Where is the complexity in Go to have a pool of workers looking for jobs on a channel exactly?

This is just one kind of pipeline. If this is all you have done, you may have the impression that concurrency (in Go) is easy.

BTW, this is worth reading:

Understanding Real-World Concurrency Bugs in Go https://songlh.github.io/paper/go-study.pdf

I know that paper and it's indeed a good one but I still don't agree with OP that "Go concurrency model to be extremely bug prone".

I agree. Pointing out the difficulty of concurrency as a general premise doesn't invalidate Go's model. For the use cases it was designed for (asynchronous services) its model is probably the best. For use cases it wasn't designed for, it probably sucks.

> pool of workers looking for jobs on a channel

It really feels like the design was driven by this one example only.

> To serialize access, protect the data with channel operations or other synchronization primitives such as those in the sync and sync/atomic packages.

> If you must read the rest of this document to understand the behavior of your program, you are being too clever. Don't be clever.

I'm surprised that they list sync/atomic as a thing to use. As I recall, its behaviour isn't even defined; I followed a long mailing-list thread trying to find out, which ended with 'these are the behaviours we know we want, but it's too complicated to document, so let's just keep this to ourselves'.

Keep reading?

Yes, that was about the original Go documentation. The suggestion still doesn't mention that the difficulty with sync/atomic isn't the memory model per se but rather the lack of yielding to another goroutine.

Something like `runtime.Gosched()`? https://pkg.go.dev/runtime#Gosched

1. The compiler's #1 job is removing code (aka optimizing): it does this by rearranging code, merging code together, and other forms of optimizations. That is to say: "i=0; i++; i++" wants to optimize to "i=2;", but in a multithreaded context, it means that you'll "never" see "i==0", or "i==1". Turns out that "losing" these states can be an issue if you're building multithreaded primitives (lock-free code, atomics, etc. etc.)

2. The CPU and L1 cache also "effectively move" code in relaxed architectures like ARM / POWER9 / DEC Alpha. So it turns out that #1 is true "even if the compiler wasn't involved".

3. Because of #2, might as well fix the compiler / CPU / L1 with the same set of primitives (the "memory model") that defines orderings.

4. Turns out that a large, but minority, number of programmers want to experiment with low-level multithreading primitives: researchers, speed-demon professionals, and others do want to go at "full speed" even at the cost of great complexity. Unifying the promises of the compiler + CPU + L1 cache all together in a SINGULAR model helps dramatically (that way: the programmer only fights the compiler. The compiler has to "translate" the CPU / L1 cache rearrangements into low-level barriers as appropriate)


It turns out that the biggest source of speed improvements exists in this realm of "rearranging code". That's why CPUs do out-of-order execution. That's why CPUs (like ARM or POWER9) want more relaxed execution, to allow the CPU to rearrange more code in more situations. That's why L1 cache exists (and similar: L1 wants to rearrange reads/writes even more aggressively for even faster operations).

If you make a memory barrier (nominally preventing the L1 cache from rearranging code), you SHOULD have the CPU and compiler respect the barrier as well. After all, if the programmer says "this should not be rearranged", then that probably applies to compiler, CPU, and L1 cache all the same.

Ten years on, was the C++11 memory model (which I've used) a success? Compared to the Linux kernel memory model (which I haven't used)? I've heard that compilers can't remove dead atomic reads because they can synchronize in rare situations; that sequential consistency was defined in a broken way and later fixed in a standards revision; that memory_order_consume is impossible to implement in a way that's actually more optimized than memory_order_acquire; and that the C++ memory model doesn't translate well to GPUs.

Is this better than the state of affairs prior to standardized atomics (which I haven't experienced)? Is it better than Go "defining enough of a memory model to guide programmers and compiler writers" (which I haven't used)? Or informally defining a set of use patterns, and writing optimizations around those use patterns rather than a formal model for what code and what optimizations are permitted (resulting in optimization steps that are only incorrect in combination, like global value numbering causing miscompilations[1][2])?

[1]: https://github.com/rust-lang/rust/issues/45839

[2]: https://bugs.llvm.org/show_bug.cgi?id=35229


> the critical detail about [relaxed/unsynchronized] operations is not the memory ordering of the operations themselves but the fact that they have no effect on the synchronization of the rest of the program.

Many people wish C++ memory_order_relaxed had no effect on synchronization and could be optimized like a normal access. It's not. https://internals.rust-lang.org/t/unordered-as-a-solution-to...

> Ten years on, was the C++11 memory model (which I've used) a success?

Concurrent/parallel programming in C and C++ before the memory model was an absolute shitshow. You could either

a) scream "YOLO lol" and resort to abusing volatile (and secretly hoping that no one will ever actually execute your code on a CPU with more than one core [1]), or

b) carefully construct synchronization routines in assembly and try to make sure that the single compiler you support doesn't screw you over in its effort to make your program run super-fast (and slightly wrong) [2], or

c) use a library which handles b) for you.

[1] FreeRTOS does this, and it only supports a single core.

[2] Linux did (does?) this.

Yikes, that is scary. Perhaps Go was more of a clear improvement than dealing with this "shitshow", and C++ has become more usable for concurrent/parallel code in the years since.

I still find data races (sometimes crashes) in the wild on a regular basis. For example, RSS Guard accesses shared data unsynchronized when syncing settings from a server, so performing two types of syncs at once on 2 different threads will crash when they reallocate the same vector. Qt Creator intermittently crashes (or at least used to) in some tricky CMake handling code with multithreaded copy-on-write string lists. And I see apps now and then that perform unsynchronized writes to memory concurrently read in another thread, and it usually doesn't misbehave.

Acquire-Release seems to be an outstanding success. ARMv8 added new assembly language instructions to support it... as did NVidia GPUs (clearly CUDA / PTX is moving towards acquire-release semantics), compilers from all around, etc. etc. So many systems have implemented acquire-release that I'm certain it will be relevant into the future.

Consume-release is a failure, but it seems like it was "expected" to be a failure to some degree. Consume-release was apparently the model that ARMv7 / Older-POWER assembly designers were going for, but it turned out to be far too complicated to think about. No compiler seems to be using consume-release anywhere (instead turning consume into Acquire).

From my understanding, the Linux-kernel operations could be consume-release, but only if the compilers fully understood the implications. (But no one seems to fully understand them). Maybe a future standard will fix consume-release, but best to ignore it for now.

Anyway, ARMv8 and POWER9 have changed their assembly language to include Acq/Release level semantics.

Fully relaxed is... not a model at all and does the job spectacularly! Some people don't want any ordering whatsoever, lol.

Seq-cst is basically Java's model and it works for those who don't care about optimizations (it will necessarily be slower than acquire/release, but there are a few cases where acquire/release is a trap and seq-cst is necessary). It doesn't work on GPUs though, as GPUs don't have snooping caches / coherence IIRC. So the strongest you can get in CUDA-land is acq-release.

I wish there was actual software support for Acq/Release-like semantics, but somewhat more relaxed by way of e.g. specifying two stores (data and pointer-to-data) to require in-order visibility, without enforcing a strong ordering of this store pair relative to other (semantically unrelated) stores.

Barrier-based abstractions could handle that, if they support more than one barrier. For loads, this would allow efficient dependent-load reordering, by essentially enforcing the ordering only where needed for concurrency reasons (this mostly helps speculating loads before the address is confirmed, and not needing to snoop for invalidations of the cache line that contains the speculated address/killing the load), and similarly taking pressure off the store buffer by being less strict about the order in which it commits to L1D$.

RISC-V's proposed WMM has such weak default ordering, but due to using fences, it's overly strict to the point where it performs worse on heavy concurrent code that's littered with atomics, compared to a TSO version (of the same softcore) that "just" prefetches exclusive access for writes. Even when turning RMW into relaxed semantics, so it's just due to the overly strict load fence that effectively trashes all shared-state L1D cache lines.

> Fully relaxed is... not a model at all and does the job spectacularly! Some people don't want any ordering what so ever, lol.

Many people wish C++ memory_order_relaxed had no effect on synchronization and could be optimized like a normal access. It's not. https://internals.rust-lang.org/t/unordered-as-a-solution-to...

Somewhat related, how are other gophers handling dynamic configuration in their go apps? Such as if the app is tied to consul or vault to get some config value. How do you handle the propagation of config changes cleanly, especially if it impacts something "big" like your DB pool? Do you have mutexes everywhere you read from the config pointer you pass around?

I noodled on this a while ago and figured it was easier to just restart the app on config changes than try to change things on the fly =D

I don't use any of those services, but any config that has multiple readers and a single writer (who updates it when it changes) can just be protected by a sync.RWMutex.

The easiest thing to do would be to add a RWMutex to your Config struct. Then have public accessor methods which RLock the mutex while retrieving a field value.

Then when you want to update the config, have something like a `Set(c Config)` method which lets you overwrite the config stored behind the pointer. This Locks the RWMutex for writing, of course.

viper, which nicely handles env vars, also can watch for config changes: https://github.com/spf13/viper#watching-and-re-reading-confi...

if you're rotating creds or need to open/close the DB, this will typically just add another select case to your main method, where you also block on e.g. signal catching to cleanly shutdown the app

Be aware that viper isn't thread safe. You'd still have to copy config by value and then put a mutex on that, then handle reads and writes separately.


atomic.Value can work pretty well as well, their example docs show how to do it as well: https://golang.org/pkg/sync/atomic/#example_Value_config

DBPool gets a bit more interesting. In the Java world you can use providers that give you access to something, which is what middleware tends to do for Go. If you can have your middleware swap the pool config it's loading from, that should work just fine. atomic.Value may work here as well.

Just implement methods in your config struct in such a way that they are safe for concurrent access?! Using a mutex for example.

Why would you need it to lock mutexes everywhere?

> I noodled on this a while ago and figured it was easier to just restart the app on config changes than try to change things on the fly =D

I think this is generally the best approach, as it also applies some pressure for your application to be able to quickly shutdown and restart without affecting users - which is a desirable property to have for many other reasons too.

Usually http handlers or a database functions are pointer methods on some larger Service or DB struct type. If the config changes, you'd have a mutex within that struct and a method that safely modifies its state by locking and unlocking the mutex.

Why would you send a pointer and not just a copy of the configuration? Configuration is usually simple things like URLs, thread-pool sizes, strings, etc... there is no problem passing a copy of that.

Edit: not sure why I'm downvoted, I also think that reloading configuration is a bad idea anyway so you should pass immutable config.

I just appreciate that the Go Memory Model intro basically says "if you have to read this, you are probably wrong" and felt free to skip the rest.

What is there to appreciate from that? That sentence is basically saying "you're too dumb to understand and use this correctly so don't bother." I find it offensive, and so characteristic of Go creators' attitude toward Go programmers.

From the start we've been building the language we want to use, ourselves. And we do use it and are happy to use it.

"You're too dumb" is absolutely not our attitude toward Go programmers. I would find that offensive too.

Also, if we really believed people were too dumb to understand these things I wouldn't have written so much text to try to bring people along for the decision. I'd have just put in the text I wanted and walked away.

Okay. Then I guess it's just a matter of wording. I find the Rustonomicon warning much more inviting:

> Should you wish a long and happy career of writing Rust programs, you should turn back now and forget you ever saw this book. It is not necessary. However if you intend to write unsafe code—or just want to dig into the guts of the language—this book contains lots of useful information.

What I like: (1) the advice not to read this document is presented using the word "should," but in Go's case, the doc first presumes the reader is being too clever and then directly commands the reader not to be clever; (2) the Rust version acknowledges a basic human trait that is curiosity, inviting those curious enough to explore the guts of the language to dig in, but the Go version does not.

I'm sure you have great intention, but can you acknowledge the wording comes across as off-putting and condescending?

Good example of the Go patronizing tone. "We're not C++, with its garbage UB. Oh, but we have memory corruption abilities from data races".

Oh, right.

Yes, I know C++ has more interesting failures for invalid code, but it's pretty misleading to say "Go sits somewhere in the middle", as if you can be a little bit pregnant.

Other than that, yeah, it seems like a good, if not earth-shattering, incremental improvement.

Go’s UB is in a rare edge case. I’ve been writing Go for a decade and I’ve never run into it. No doubt some have, but this is in stark contrast to C++ where UB is all over the place. The salient detail is that UB isn’t a binary like you propose. It’s not like “am I pregnant”, it’s more like “how much alcohol have I consumed”?

The tone of the article says otherwise. It takes great detours to not call what Go has UB.

It's exceptions all over again.

The point isn’t whether or not it’s UB. The point is that having a little bit of UB in a rare edge case is not ideal but most programmers will never run into it. While in C++ it’s so prevalent that it will affect every programmer and lead to many difficult to troubleshoot bugs. These things are not equivalent problems.

If I have a uint64 counter incremented by C code and read by Go code, is it safe to do so without using any locks or atomics?

Surely not safe.
