Context Control in Go (zenhorace.dev)
184 points by todsacerdoti 10 months ago | 115 comments



This touches tangentially on a very interesting idea apart from contexts, at least for me as I've been recently learning Go:

> it’s an anti-pattern for libraries to start their own goroutines. Best practices dictate that you should perform your work synchronously and let the caller decide if they want it to be asynchronous.

This is something I had not discovered yet, probably because it's just a "common knowledge" thing that doesn't get explained in the Go tutorials, but regardless seems like a good idea in general.

It also coincidentally threads well with something I read yesterday: The bane of my existence: Supporting both async and sync code in Rust [1]. While not being well versed at all in Rust, I constantly had precisely this same thought: why not make a synchronous library by default, then let the application choose whether it wants to use it as-is, or to put an async runtime on top of it?

It only makes sense to me, and I would apply this best practice from Go if I was trying to make a Rust library. Especially given that in Rust there is no "official" standard async runtime, so I believe that authors ought to not assume which runtime end users should be forced to depend on.

[1]: https://news.ycombinator.com/item?id=39061839


I think it's totally fine for libraries to spin up goroutines internally. My interpretation is that the library's public interface should appear to be synchronous. As a contrived example:

  package main

  import (
    "context"
  
    "example.com/spider"
  )

  func main() {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    _, _ = spider.Crawl(ctx, "https://en.wikipedia.org")
  }
In this case `Crawl` is a blocking call and, under the hood, it may well spin up a pool of goroutines to crawl a site. It's also really nice that context is available to tie the lifetimes of the main program (goroutine) to child goroutines without coloring functions (like with async-await).

I used to work with the Go pubsub client (https://pkg.go.dev/cloud.google.com/go/pubsub) a lot and that has a whole bunch of scheduling and batching functionality handled in goroutines and on the outside you're calling `topic.Publish(ctx, &pubsub.Message{Data: []byte("payload")})`.
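
To make the shape concrete, here is a minimal sketch of a blocking call that fans out internally. `fetchAll` and `crawlDemo` are hypothetical names made up for illustration; this is not how `spider.Crawl` or the pubsub client is implemented, and the "fetch" is a placeholder string rather than real network I/O.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// fetchAll is a hypothetical stand-in for something like spider.Crawl.
// It blocks until every URL has been handled, even though it fans the
// work out to goroutines internally. No goroutine outlives the call.
func fetchAll(ctx context.Context, urls []string) []string {
	results := make([]string, len(urls))
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			// A real crawler would do network I/O here and honor ctx.
			if ctx.Err() != nil {
				results[i] = "skipped " + u
				return
			}
			results[i] = "fetched " + u // each goroutine writes its own slot
		}(i, u)
	}
	wg.Wait() // the caller only ever sees a plain blocking call
	return results
}

// crawlDemo calls fetchAll the way an application's main might.
func crawlDemo() []string {
	return fetchAll(context.Background(), []string{"a", "b", "c"})
}

func main() {
	fmt.Println(crawlDemo())
}
```

From the outside this is an ordinary synchronous function; a caller who wants it in the background can still wrap the call in `go`.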


If you're willing to plumb through all the knobs imaginable then this is a fine approach. But if spider.Crawl just ran unbounded or with a fixed bound it could trivially become a huge headache.

There are many patterns in Go that are preferable to just starting a bunch of concurrent work.

For example the pubsub client has options to disable batching and limit in flight connections.


> There are many patterns in Go that are preferable to just starting a bunch of concurrent work.

I have not worked in Go for several years, but i remember that when i did, more experienced people told me that this was exactly what you were supposed to do in Go. That where in another language you might set up a queue and a threadpool and so on, in Go, you should just spawn a load of goroutines, and let the runtime sort it out.

Is this no longer the canonical approach?


It's also one thing if you want to provide SingleThreadedApp and MultiThreadedApp instances in your package, but that should be left up to the user -- or made explicit in package docs. (We're spinning up 100 goroutines!)

Over a decade ago I was trying to debug why the UI written in Java was slow on solaris box. But that wasn't the primary reason.

Turns out there were over 3000 java threads running at any given time. The people who wrote Java code wrapped an event class around a thread, then just started kicking off events willy-nilly. So after about 5 minutes the os had thousands of these things to deal with. There wasn't really any processing left over for anyone else.


Indeed, it's down to the library API to give you the right knobs to control for concurrency. It says more about the quality of the library if it mismanages goroutines than it does about whether or not libraries should use goroutines at all.


It's kind of an "I know it when I see it" situation. To sit down and try to write a rigidly-specified set of rules on when it is and is not OK for a library to spin goroutines would be very difficult.

Yet the basic principle isn't that hard: Your code should generally be what is considered to be "sync", and it is up to the user to decide if they want that to be "async" by using a goroutine themselves.

This rule is primarily for libraries that try to be "helpful" by, say, decoding an image unconditionally in a goroutine or something and providing a "promise" of some sort you can read the results from. Don't do that. If a Go programmer wants a "promise"-like behavior, any Go code can be so converted by an end-user at any time and the best thing the library can do in that case is just stay out of the way of the already-ever-present features that allow you to do that.

But on the flip side, I expect a library implementing a parallel map to have its own goroutines. As a parallel map user, I basically don't want to see them or have to think about them. At best, maybe the library has some knobs I can tune, but I don't want to be managing them. That would defeat the entire purpose of such a library. A deliberately recursive and parallel crawling library, documented as such and where that feature is its major utility, fits into this category. By calling ".Crawl" I am clearly asking for this functionality explicitly, by the nature of the contract of the library. This is also a good use case for structured concurrency, which Go does not explicitly implement in the language but which still makes for an easier and safer library than the alternative.


It seems like it would be better if the concurrency were pluggable somehow. Maybe Crawl takes some kind of worker-starting interface, with a suitable default implementation?

Then the job of the crawler is to find new units of work, not to schedule them. In theory it could be done single-threaded by pulling work from a queue.


That would be an inner platform: https://en.wikipedia.org/wiki/Inner-platform_effect

The Go scheduler is already taking units of work called goroutines and scheduling them. It's no big deal to ask the crawling system to have some limit on how many goroutines it'll use, the patterns for that are well-established, and also necessary because it's not all about the goroutines in this case. Crawling needs controls to limit how many requests/sec it makes to a given server, how deeply to recurse, what kind of recursion, etc. anyhow so it's not like it particularly sticks out to also have a concurrency parameter.
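
One of those well-established patterns is a buffered channel used as a counting semaphore. The sketch below assumes a hypothetical `processAll` helper with a `limit` parameter; the names and the peak-tracking are only there for illustration and are not taken from any crawler library.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// processAll runs work on every item but never has more than limit
// workers active at once. The name and signature are illustrative only.
// It returns the highest number of workers observed running together.
func processAll(items []int, limit int, work func(int)) int32 {
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	var active, peak int32
	for _, it := range items {
		sem <- struct{}{} // blocks once limit workers are in flight
		wg.Add(1)
		go func(it int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when finished
			n := atomic.AddInt32(&active, 1)
			for { // record the peak number of simultaneous workers
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			work(it)
			atomic.AddInt32(&active, -1)
		}(it)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

// boundedDemo sums 1..20 with at most three workers at a time.
func boundedDemo() (int64, int32) {
	items := make([]int, 20)
	for i := range items {
		items[i] = i + 1
	}
	var sum int64
	peak := processAll(items, 3, func(n int) {
		atomic.AddInt64(&sum, int64(n))
	})
	return atomic.LoadInt64(&sum), peak
}

func main() {
	sum, peak := boundedDemo()
	fmt.Println("sum:", sum, "peak workers:", peak)
}
```

The concurrency limit is just one more parameter next to rate limits and recursion depth, which is the point being made above.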


Fair enough. But I'm not sure a wrapper around starting a goroutine counts as an inner platform, because it's not doing much work, and it's not really work that the Go SDK does. Choosing when to start goroutines and how many to start is an application concern.

Depending on how it's done, it might be a decent way to structure a crawler?


You know what happens when your library spins a goroutine and that goroutine crashes? Your program crashes, and you don't have the chance of putting any recovery on it.

Your library with buggy goroutines takes down the whole program, and there is nothing you can do to fix it.


Isn't that the same if you make a completely synchronous library without any goroutine? If your library panics then the whole program crashes.


It's not the same because for any single goroutine, you can catch the panic at top level. But it only works if you wrote the top-level code for each goroutine. (For a completely synchronous program, you wrote the main function.)

If all the work is done with one function call, it might be pretty similar to a program crash, except that you can log or restart in main, and you could use it as part of a larger program that does other stuff too.
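
A small sketch of what "catch the panic at top level" looks like for a goroutine you own. `runGuarded` is a hypothetical helper, not a standard library function. The key point is that `recover` only works in a deferred function on the goroutine that panicked, so whoever writes the `go` statement is the only one who can install it.

```go
package main

import "fmt"

// runGuarded runs fn in its own goroutine and waits for it. A deferred
// recover at the top of that goroutine turns a panic into an error.
// Without it, the panic would end the whole process, and a recover in
// the caller's goroutine would not help.
func runGuarded(fn func()) error {
	done := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("recovered: %v", r)
				return
			}
			done <- nil
		}()
		fn()
	}()
	return <-done
}

func main() {
	fmt.Println(runGuarded(func() { panic("boom") }))
	fmt.Println(runGuarded(func() {}))
}
```

If a library starts the goroutine and omits the deferred recover, the application has no place to add one, which is the complaint raised earlier in this subthread.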


From what I understand, Go code is usually written with the assumption that panics are fatal, not recoverable as exceptions. Trying to recover from them, as though they were exceptions, will expose other bugs, like functions not using defer to do cleanup, releasing mutex and such.
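
To illustrate the cleanup point, here is a sketch with a made-up `counter` type. Because the unlock is deferred, a recovered panic does not leave the mutex held; if `incr` unlocked manually at the end instead, the second call in `deferDemo` would deadlock.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type used only for this illustration.
type counter struct {
	mu sync.Mutex
	n  int
}

// incr takes the lock and releases it with defer, so the unlock runs
// even if the body panics.
func (c *counter) incr(fail bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if fail {
		panic("simulated bug")
	}
	c.n++
}

// tryIncr recovers from a panic in incr and reports whether it succeeded.
func tryIncr(c *counter, fail bool) (ok bool) {
	defer func() {
		if recover() != nil {
			ok = false
		}
	}()
	c.incr(fail)
	return true
}

// deferDemo panics once, recovers, then increments normally.
func deferDemo() int {
	c := &counter{}
	tryIncr(c, true)  // panics inside incr, recovered here
	tryIncr(c, false) // would block forever if the lock were still held
	return c.n
}

func main() {
	fmt.Println(deferDemo())
}
```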


>that context is available to tie the lifetimes of the main program (goroutine) to child goroutines without coloring functions (like with async-await)

Isn't it also a kind of function coloring: a function with a context argument vs. without?


I think it's fine for libraries to provide a toplevel `Crawl()` method which manages the goroutines as a convenience, but these libraries should expose the more parameterized methods as well so callers can have more fine-grained control.


> This is something I had not discovered yet

No doubt because the Go team disagrees. They have been abundantly clear that, from their point of view, goroutines should be used and even used as part of the public API when it makes sense.

That said, it is still probably really good advice for newcomers who won't have a good understanding for the cases where it does make sense, and especially because goroutines are the shiny thing every newcomer wants to play with and will try to find uses that don't make sense just to use it. As a rule, you don't want to force the caller into things they might not want. In the vast majority of cases, a synchronous API is what you will want to give them as it is the most versatile.

And, really, that's something that applies generally. For example, in the case of the common (T, error) pattern, you will want to ensure T is always useful even when there is an error. The caller may not care about the error condition. That's not you for, the library author, to decide. The fewer assumptions you can make about the caller the better.


> For example, in the case of the common (T, error) pattern, you will want to ensure T is always useful even when there is an error. The caller may not care about the error condition.

This applies to maybe 0.1% of functions. The overwhelming majority of functions, in the Go stdlib as well as real projects, that return (T, error) return an empty meaningless value for the error cases.


Not my experience. Some very early code did not recognize this, but since then pretty much everyone has come to agree that values should always be useful. If you are writing a function today, there is no reason to not observe this.

In practice, that typically means returning the zero value. To which idioms suggest that zero values should be useful. Rob Pike's Go Proverb[1] even states: "Make the zero value useful." Most commonly when returning (T, error) that zero value is nil. In Go, nil is a useful value!

If the caller wants to observe the error state, great. But it is needlessly limiting if you force it upon them. That is not for the library author to decide.

[1] https://go-proverbs.github.io


One problem I've found as a newcomer to Go (and I'm perfectly willing to accept that I just haven't developed the right "language mindset" yet) is that the zero value can be problematic—particularly for scalar types—because it's often a perfectly valid value in a model where you need a way to indicate an invalid value.

Obviously if there is a possibility of invalidity, you would expect the caller to check the error, but the fact that I always have to return something as the callee, and always have to make sure I'm not accidentally using the value in error conditions as the caller, is just asking for mistakes to me.

I appreciate that it's not the path Go has chosen to tread, but I find Result<T, Error> to be so much more of a foolproof pattern than (T, error), especially considering prevention-of-foot-shooting is an established Go design goal.

(Equally obviously you could use a pointer and return nil, but I find that muddles the semantics, because there are multiple reasons you might opt to use pointers besides the ability to express "no value".)


Given (T, error), what do you return for error when no error occurred? When error is "invalid"? The caller is, no doubt, expecting you to be consistent, so the answer for T no doubt lies therein.

There is nothing special about errors.


If the zero value is valid, I usually just use a pointer to the scalar type in question


Zero and nil values are almost always bogus. The Go language itself doesn't even respect that proverb: the zero value of a map is not a useful map.

There are some rare cases where a 0 value is actually meaningful in some way. But even for types where it is fully functional like integers, it's often not meaningful in the specific context it is used.
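
For reference, a short sketch of what the zero values of maps and slices do and do not allow. `zeroValueDemo` is just an illustrative name; the behavior shown (nil map reads succeed, nil map writes panic, append works on a nil slice) is standard Go.

```go
package main

import "fmt"

// zeroValueDemo reports what works on a nil map and a nil slice.
func zeroValueDemo() (readOK bool, appended int, writePanicked bool) {
	var m map[string]int // nil map
	var s []int          // nil slice

	_, found := m["x"] // reading from a nil map is fine
	readOK = !found && len(m) == 0

	s = append(s, 1, 2) // appending to a nil slice is fine
	appended = len(s)

	defer func() {
		if recover() != nil {
			writePanicked = true
		}
	}()
	m["x"] = 1 // writing to a nil map panics
	return
}

func main() {
	fmt.Println(zeroValueDemo())
}
```

So the nil map is usable as an empty read-only map but not as a general-purpose one, which is roughly where the two sides of this argument split.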


I've seen just two APIs that returned non-nil/non-default T (representing the partially completed work) with a non-nil error, and those were a constant source of bugs and errors. I've changed those to always return dummy empty T, and even though the retries now hurt performance more (they could not re-use partial completed result), it was a much more straight-forward code.


Practically speaking, the (T, error) pattern is pervasive because there isn't any other alternative. Go simply lacks sum types.

> Not my experience.

What experience do you speak of? My 5,000+ hours in kubernetes and terraform space tells me Rob Pike's views are fan fiction at best.


Let's be real, Kubernetes is a Java project with code that just happens to share some resemblance to Go syntax. It's also one of the oldest projects using Go, long predating the "make zero values useful" proverb, so it is not surprising that it doesn't follow the idioms recognized today. Idioms cannot be conceived in advance. They emerge from actual use after finding out what works and what doesn't.

What new code being written today is violating that pattern?


I usually return zero values just because it's easy, not because it's useful. I don't expect the caller to use the return value if err !=nil and haven't heard anything to the contrary on my team. If Go were a more powerful language, we would be returning Either[A,B] not multiple return values, which would guarantee that you rely on one or the other, not some weird in-between case.


> I don't expect the caller to use the return value if err !=nil and haven't heard anything to the contrary on my team.

Yet you admit to following the advice for error, returning the zero value for err and making it useful when you do. If you don't have a meaningful error state, why not just return junk? Clearly you recognize the value of making the return values useful, always. Why make exceptions?


No. And GP explicitly said they don't tend to make it useful, but only do it when it's easy.

Making the return value of e.g. a database handle always "useful" is a ridiculously dangerous idea that can lead to application bugs further down the route because some list/get returned an empty value to continue the pattern of "useful" empty values.

The main reason there is ever a useful value next to a non-nil err is because Go doesn't have a useful way to not do it.


if I need to return some person,error how do I return junk for the person? I just return person{}, error. I guess I could fill person out with a bunch of silly values but why would I do that work? If there was some easier way to make a person and it was filled with junk, I wouldn't hesitate to use it because the caller would never use the value.


Logically, in that case you would return nil, just like you do for error. There is no person to return. nil is how Go signifies the absence of something. nil is useful, as proven by error. Why make exceptions?

It’s funny how people forget how to write software as soon as the word error shows up. I don’t get it.


On top of the issue with nil not being a useful value for most types:

Nil requires pointer values. I.e. it's impossible to know whether something is a pointer to allow for nil, or because a copy would be prohibitively expensive and therefore references are used, or even because it's into a mutable structure.

Go's overlapping of implicit nullability and by value/by reference marker makes it entirely useless to build information into APIs / necessarily promotes the value into a different type to use.


Because nil panics on member accessors... It's the opposite of what you claim to be the standard in go.

Thanks for demonstrating that you forget how to write software around errors.


What are you returning for error in its “junk” state, then? Clearly not nil, else by your assertion your code will panic. error has member accessors you will call - Error() if nothing else.

Methinks you’ve not thought this through. What’s it about the word error that trips up programmers like this?


That's not what my team of gofers decided. Apparently zero is better than nil. I know how to write code but Go is its own thing. I mean the idea of returning multiple values is totally goofy in itself.


> Apparently zero is better than nil.

Zero often is better than nil. Consider something like atoi. If it fails, 0 is often exactly what you want. No need to care about any error state. Although the error state is there if your situation is different. The caller gets to choose.

But for something like a person that doesn't exist, nil is almost assuredly the appropriate representation. You didn't end up with an empty person, or a made up person, you ended up with no person. nil is how the absence of something is represented. Same reason you return nil when there is no error.

There seems to be no disagreement that nil is the proper return value for cases where there is no error. Why would no person be different?

> I mean the idea of returning multiple values is totally goofy in itself.

It is, but then again so is accepting multiple inputs. Neither is mathematically sound, but they have proven useful in practice.


> Consider something like atoi. If it fails, 0 is often exactly what you want. No need to care about any error state.

No, 0 is a bogus value if atoi() failed. 5 would be exactly as appropriate. If I'm parsing a form to find a user's age and they entered "old", their age is definitely not 0. I can't even imagine a scenario where I'd care what value atoi() returned if it returned an error.


> No, 0 is a bogus value

No. 0 is the integer's (what i in atoi identifies) zero value, which carries the expectation of being useful.

The problem here is that your example is using the wrong type. Age is not an integer, it is an age. Use an age type that defines the proper semantics for age when what you have is an age.

Not even the best type system can stop a bad programmer choosing to use the wrong types. Stop being a bad programmer, I guess. No amount of tooling can fix that problem, I'm afraid.


0 is a perfectly good age, plenty of humans have been 0 years old. The point is simply that the first return value of atoi() if the second is non-nil is meaningless. Atoi would have been exactly as useful if it had been defined that atoi() returns 167 and an error if it can't interpret the string as an integer. Code which proceeds to use the first return value of atoi() if the second one is non-nil is wrong code, even if it happens to work for some convoluted scenarios.

Edit to note: atoi() actually doesn't always return 0 if it fails: if the second return value has err.Err == ErrRange, then the first is the largest (or smallest) value that fits in an int, which is 32 or 64 bits wide depending on the platform.
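
A quick sketch of both failure modes of `strconv.Atoi`, for anyone who wants to check. The helper names are made up; the calls themselves (`strconv.Atoi`, `errors.Is`, `strconv.ErrRange`) are standard library. The out-of-range result depends on the platform's `int` size, so the code computes the maximum rather than hard-coding it.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// atoiSyntax parses a non-numeric string and reports the value returned
// alongside the error.
func atoiSyntax() (int, bool) {
	n, err := strconv.Atoi("old")
	return n, err != nil
}

// atoiRange parses a number too large for int and reports the value
// returned and whether the error wraps strconv.ErrRange.
func atoiRange() (int, bool) {
	n, err := strconv.Atoi("99999999999999999999999")
	return n, errors.Is(err, strconv.ErrRange)
}

func main() {
	maxInt := int(^uint(0) >> 1) // largest int on this platform

	n, failed := atoiSyntax()
	fmt.Println(n, failed)

	m, isRange := atoiRange()
	fmt.Println(m == maxInt, isRange)
}
```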


Another example is reading from a file using io.Reader. EOF is returned as an error, but you still need to check the slice for any new bytes that were read before the EOF. A lot of errors are actually good to have, and the call still returns critical data despite the "error".
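
Here is a sketch of that read loop. `lastChunkReader` is a toy type invented for this example so that data and `io.EOF` arrive in the same `Read` call, which the `io.Reader` contract permits; `readAll` is likewise illustrative and not the standard `io.ReadAll`.

```go
package main

import (
	"fmt"
	"io"
)

// lastChunkReader is a toy io.Reader that returns its final bytes and
// io.EOF from the same Read call.
type lastChunkReader struct {
	data []byte
}

func (r *lastChunkReader) Read(p []byte) (int, error) {
	n := copy(p, r.data)
	r.data = r.data[n:]
	if len(r.data) == 0 {
		return n, io.EOF
	}
	return n, nil
}

// readAll consumes r, handling the n bytes before looking at err.
func readAll(r io.Reader) ([]byte, error) {
	var out []byte
	buf := make([]byte, 4)
	for {
		n, err := r.Read(buf)
		out = append(out, buf[:n]...) // use the bytes first
		if err == io.EOF {
			return out, nil // EOF is the normal end of input
		}
		if err != nil {
			return out, err
		}
	}
}

// readDemo reads a short message through the toy reader.
func readDemo() string {
	b, _ := readAll(&lastChunkReader{data: []byte("hello world")})
	return string(b)
}

func main() {
	fmt.Println(readDemo())
}
```

A loop that checked `err` before appending `buf[:n]` would silently drop the last chunk with this reader.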


yeah I get it. you're talking common sense but I'm coding in Go.


Yes, the biggest mistake Go made was introducing the error keyword.

It should have used banana. If it were (T, banana), nobody would have trouble with these concepts. There's just something about the word error that causes programmers to lose their mind for some reason.


> What new code being written today is violating that pattern?

You are putting the burden of proof on me now? How unfair. You didn't bring any. Go to CNCF and pick anything written in Go.

> Let's be real, Kubernetes is a Java project

Let's be real, Rob Pike is the flat earther of PLT. Sum types are Rob Pike's Foucault pendulum.


> You are putting the burden of proof on me now?

No. I don't give a shit about what you do. Where did you dream up this idea?

> Let's be real, Rob Pike is the flat earther of PLT.

No doubt, but when using the programming language of flat earthers, one has to accept that the particular world is, indeed, flat.

But the advice is undeniably sound. There is no programming language where you should leave someone hanging with junk values. You might avoid junk in other languages using some other means (e.g. sum types), but it is to be avoided all the same.


> it’s an anti-pattern for libraries to start their own goroutines. Best practices dictate that you should perform your work synchronously and let the caller decide if they want it to be asynchronous.

This line reminded me that we all need to beware of the "anti-pattern" police. Some developers use this term effectively by explaining precisely why something is an "anti-pattern".

But more often it's used as a way to shut down conversation and any actual critical thinking. There's a lot of nuance behind what makes something an "anti-pattern", and simply declaring something an anti-pattern isn't enough. "This is an anti-pattern and this is why ..." is enough. But I still avoid using the term regardless.

FYI I'm not saying the author is the anti-pattern police, but it does sound like they've found themselves on the police's radar.


Hey. Author here. Yeah, the Go style guide at Google (the setting for the code review) is quite prescriptive. I don't mind labeling things "anti-patterns". It doesn't mean there's a law against using said pattern (if you're in a healthy org). It just means you should be able to justify why you go against the guideline. The whole reason I was comfortable attempting that approach in the first place is because I had successfully used that pattern before and was able to provide a rationale. It just happens that this case differed enough that using this pattern would've been unwise.

You're right that things shouldn't just be classed as an anti-pattern without explaining why. And that's enforced in most Go-heavy orgs at Google. If you're gonna say something is bad without providing an explanation and linking to the relevant sources, the comment is likely to be closed or argued.

I didn't provide an explanation for why this is considered an anti-pattern in this post because that wasn't the topic. But it appears most persons here are interested in that, so maybe I'll write a post on that next week :)


It might also come from the common sense of using the CSP model (goroutines and messages, basically). Being asynchronous or synchronous is not the property of the function; it's the caller's decision. So every function is "synchronous" and it's up to the caller to decide when to spin it into the background and what for.

As others commented, it doesn't preclude you from using goroutines in libraries if you _really need to_. It is just important to remember that it's not obvious to the caller.

In light of this "common sense", async/await concurrency concept doesn't make any sense. Why would function dictate how exactly it should be called by a caller? Is "watching TV" an async or sync action? Depends on the caller – whether they put all their attention into this action or doing it "in the background" while performing other tasks. It's not an inherent property of the "watching TV" function. I have no idea why so many people think that async/await is a good idea for expressing concurrent systems.


I've always seen this as the exact opposite view - from go's concurrency model, every function is "synchronous" so the caller is not given a choice, if they want to run it asynchronously they have to create a new thread, then if they care about the result deal with inter-thread communication.

With async/await, you're explicitly giving control to the caller to decide, you can await this promise now and have the thread treat it as synchronous, you can spawn a new task to run it in the background, or you can join it with other promises where you don't care about order and await for the results of the group.


Interesting. I see two different aspects here:

1) Mental model. My claim comes from the firm belief that the more code is aligned with how we think, the easier it is to reason about the code. I naturally think about actions as they are not async or sync by nature – rather, it's me who's in charge of how the action is going to be executed (back to my "watching TV" example). Human attention here serves as an analogy to utilizing the logical CPU core during runtime.

2) Performance consideration. What you described indeed can work, too, but it comes at a cost. With Go, yes, you have to handle async results yourself (if you care about results), but you now understand the price of this and can make better judgments of the code and complexity and have better performance overall.


There are two different questions here:

1. Should the code be async-aware? That is, should it be able to yield to other tasks?

2. Should the code launch background tasks?

1 is a cross-cutting concern, in a cooperative multitasking environment, anything that is unable to yield is, well, preventing other tasks from running.

In Go, 1 is always true, and implemented by the runtime itself. In Rust, this means using async I/O, like preferring tokio over std.

2 is what the advice is about. The Rust equivalent here would be tokio::spawn.

Writing async-independent Rust code requires you to tackle 1, maybe by writing a trait for all the I/O actions that you require from your executor, which could return blocking futures if you select to run synchronously.

That said, blocking also prevents you from using async for structured concurrency in the library implementation, which may or may not be a big deal for your use-case.


>> it’s an anti-pattern for libraries to start their own goroutines

The specific argument here is that by writing sync functions, the library is more abstract because the caller can decide whether to run the function sync or async. I agree with this, but there are lots of areas where we could issue guidance to make libraries more abstract.

For example, instead of a library function which returns a pointer to allocated memory (e.g., `NewFoo() *Foo`) we should write functions which take pointers to memory and the caller can figure out whether to allocate them on the stack or the heap (e.g., `NewFoo(out *Foo)`). I'm not advocating for this as a general rule of thumb because writing that kind of code in Go would not be very ergonomic, but there's a lot of performance-sensitive code even in the standard library that is written that way.
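
One standard library instance of this style is the `strconv.Append*` family, where the caller supplies the destination buffer. The sketch below uses `strconv.AppendInt`; `formatAll` is a hypothetical wrapper written only to show the call pattern.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatAll joins numbers with commas. The caller-owned buffer is
// allocated once and reused for every number, instead of each
// conversion returning a freshly allocated string.
func formatAll(nums []int64) string {
	buf := make([]byte, 0, 64) // the caller decides where the memory comes from
	for i, n := range nums {
		if i > 0 {
			buf = append(buf, ',')
		}
		buf = strconv.AppendInt(buf, n, 10) // appends into the caller's buffer
	}
	return string(buf)
}

func main() {
	fmt.Println(formatAll([]int64{1, -20, 300}))
}
```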

Another example would be 'inversion of control', wherein functions take interface parameters and callers decide what implementation to pass in.


I agree with not forcing an execution model on your callers, if you can avoid it. I also try to extend the rule to be as flexible as possible for callers. For example, in C++, I don't like to see a function that returns a container. I prefer a function that accepts an output iterator as a parameter.


> most seasoned Go devs would leap out of their seats to tell you it’s an anti-pattern for libraries to start their own goroutines. Best practices dictate...

Then I'm afraid that either the author is not familiar with the standard library, or that it is not built according to the "best practices".

Rant: the phrase "best practice" increasingly irritates me. Basically it has become a synonym for "my own opinion just trust me". It's like the Jedi hand gesture to force your beliefs onto someone to end a discussion.


> Basically it has become a synonym for

It's been a thing for a long time! As far as I know, Feynman first put it into words (regarding science, but applies equally to software engineering) in 1974:

> In the South Seas there is a cargo cult of people. During the war they saw airplanes land with lots of good materials, and they want the same thing to happen now. So they've arranged to imitate things like runways, to put fires along the sides of the runways, to make a wooden hut for a man to sit in, with two wooden pieces on his head like headphones and bars of bamboo sticking out like antennas—he's the controller—and they wait for the airplanes to land. They're doing everything right. The form is perfect. It looks exactly the way it looked before. But it doesn't work. No airplanes land. So I call these things cargo cult science, because they follow all the apparent precepts and forms of scientific investigation, but they're missing something essential, because the planes don't land.

We programmers usually call it "cargo culting", blindly following "best practices" and "design patterns" without any deeper understanding of why and when those should be applied.


To be honest for me when someone talks about "best practices" I expect it to be actual best practices that usually get documented somewhere that is a documentation resource well regarded by the "community" of that tool or language.

So if it happened (no idea) that the author here was taking this "best practice" out of their own ass... well, then yeah I'd agree that's just an opinion and not an actual, community-agreed-upon, very common and very well documented "best" practice in the sense that I usually regard as useful and reliable.

Nevertheless, I agree with the author that the concept of "don't return immediately from your API function, instead block until the function's work has been completely done, and only then return", seems to me as a valid and quite good idea. Regardless of how many internal goroutines might have been used in order to comply with this behavior, the external surface should look like a blocking call. If any caller doesn't want to block their thread on it, they in turn can always run it in a goroutine.


> I expect it to be actual best practices that usually get documented somewhere that is a documentation resource well regarded by the "community" of that tool or language.

It's documented in the Google go style guide: https://google.github.io/styleguide/go/decisions#synchronous...

Note that the advice is slightly different, not "don't use goroutines," but rather "any internal goroutines need to be cleaned up before the function returns"


Hi. Author here. Didn't mean to irritate you. I'm quite familiar with the standard library. And I'm familiar with rants by the Go team about all the early code and patterns they wish they could wave away with a magic wand :) And that the std lib does some things that no user of Go typically needs to do. I could write another post about how net/http goes against most of what this post advises, and for a good reason. But I didn't talk about that because my point would be more easily lost and no one reading my post is writing their own http/grpc server. I agree that someone shouldn't just say "best practices" and move on without an explanation. I didn't delve into this in the post because that wasn't the topic, and I overestimated how common this knowledge is (it's well documented inside of Google, and at the time of writing, I thought it was also published externally).

This "best practice" seems to be what most of the comments on this thread discuss, so I'll likely go in-depth with a post on this next week. Cheers.


IIRC for the http package at least, the original author stated it was a mistake.


This has a lot of interesting accidental rebuttals to some "features" of go.

> If you’re not in an entry-point function and you need to call a function that takes a context, your function should accept a context and pass that along.

This is in contrast to the fact that Go's cheap threading means that you don't need to colour your functions with async or not async. But this quote suggests that you sort of have to do this with context or no context.

It isn't quite as bad as you can "skip steps" such as passing a context to a callback directly rather than needing the function that calls the callback to support contexts. But still in general your functions do have colour if you want to use contexts properly.

> most seasoned Go devs would leap out of their seats to tell you it’s an anti-pattern for libraries to start their own goroutines.

If goroutines are so cheap then why not let the library spawn them. As long as the interface doesn't reveal if they are being used or not it shouldn't matter.


You can't extend the concept of "coloring" a function to all possible environment and parameters a function needs to execute. That's not because that's a useless concept; I actually find it a very important concept to be thinking about and I often explicitly think in terms of trying to minimize the size of such things in my code. But you can't extend the "coloring" concept that far because you've stretched it all out of shape at that point, into an entirely different concept. All code in all languages everywhere is going to have state that flows through some combination of function parameters through the program and have certain requirements for that state without which the functions (methods/whatever) will not run.

Coloration is a very particular very strong instance of such things that is so strong it causes its own special effects and imposes very special constraints on the code. Generally if you need to call something that wants a context but you don't have one, you just pass in the trivially-obtained "context.Background()" and move on. Nowhere near the level of blockage as a color issue.

"If goroutines are so cheap then why not let the library spawn them."

It's not about cost, it's about software engineering, and it's a particular antipattern you may not know about if you're not in the community. As I said in another post, many libraries "helpfully" spawn goroutines to do a thing and offer a promise-like interface to the results. This is the core antipattern being referred to, which I've seen quite a lot. The resulting API is complexified relative to simply having a function that takes parameters and returns results. If you write such a complex API, an end-user of that API can't uncomplexify it. However, if you write the simple, normal function, an end-user of your API who does want that additional functionality can trivially add it, and moreover, they can add it in whatever other combination of things they may want, e.g., perhaps your library is part of a three-step pipeline you choose to run in its own goroutine, or some other complex threading setup you need. It is better for a library to provide the simple "synchronous" API than to try to guess and possibly even as a result forestall the real setup you need.

It isn't a hard-and-fast rule that libraries must never spawn goroutines, it's a particular set of antipatterns being referred to.
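
To make the "trivially add it" part concrete, here is a rough sketch. Everything in it is hypothetical: `lookup` stands in for a simple synchronous library function, and `lookupAsync` is the promise-like wrapper that a caller who wants one can write on their own side in a few lines.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// lookup is a hypothetical synchronous library function.
func lookup(key string) (string, error) {
	if key == "" {
		return "", errors.New("empty key")
	}
	return strings.ToUpper(key), nil
}

// result pairs the two return values so they can travel over a channel.
type result struct {
	value string
	err   error
}

// lookupAsync lives in the caller's code, not in the library.
func lookupAsync(key string) <-chan result {
	ch := make(chan result, 1) // buffered so the goroutine never blocks on send
	go func() {
		v, err := lookup(key)
		ch <- result{value: v, err: err}
	}()
	return ch
}

func main() {
	pending := lookupAsync("go")
	// ... the caller is free to do other work here ...
	r := <-pending
	fmt.Println(r.value, r.err)
}
```

Going the other way, from a library that only offers the channel-based version back to a plain function call, is the part the caller cannot simplify.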


> You can't extend the concept of "coloring" a function to all possible environment and parameters a function needs to execute.

It's not the particular abstraction, it's the concept of two different colors: functions that take or don't take context.Context. Ultimately we do have two colors. Seasoned go devs will have refactored some to the other by drilling through `ctx` or removing it and know exactly what "color" is.

> many libraries "helpfully" spawn goroutines to do a thing and offer a promise-like interface to the results. This is the core antipattern being referred to, which I've seen quite a lot. The resulting API is complexified relative to simply having a function that takes parameters and returns results. If you write such a complex API, an end-user of that API can't uncomplexify it. However, if you write the simple, normal function, an end-user of your API who does want that additional functionality can trivially add it, and moreover, they can add it in whatever other combination of things they may want

Do you not read that as "two colors"?


"Ultimately we do have two colors."

If you consider this a color, we don't have two colors. We have millions.

As I said, it is not that such a concept would be useless; I use it all the time. But it's not "color" any more.

"Do you not read that as "two colors"?"

That is in response to a completely different question about the cheapness of goroutines, not coloration.

Coloration is much stronger than you seem to be understanding. It is not "oh, this function requires that parameter and that one does not, so they must be different colors". It is that you can't correctly run an async function from within a synchronous one and vice versa in the languages in which these are considered completely different things (which does not include Go). While conversion is ultimately possible, it is expensive and high-consequence. If you have a hard time seeing that because, say, a sync function can "simply" bring up an async execution engine for its async calls and an async function can "simply" spawn an entire OS thread to run sync code and collect the results through a promise, consider nesting such an approach arbitrarily deeply as a deep call stack alternates between async and sync calls, which is 100% realistic. It becomes more clear how high-consequence this is if you remember that such programming language constructs must be able to compose essentially arbitrarily deeply.

Context parameters can be satisfied by no-context functions simply by passing "context.Background()", and the result is low-consequence. The code does not come apart at the seams; the code using contexts simply ends up not having any data come from the background (empty) context and the background context will never generate a cancellation event, which is apparently what the caller wants since they are asking for that more-or-less explicitly. If that is not what the caller wants, "ctx, cancelF := context.WithTimeout(context.Background(), time.Second)" is also trivially available, correct, and low-consequence. If you define coloration down to this level, you completely lose the entire point of the original essay, which is the high-effort, high-consequence effects of bridging sync and async code. Defining colors down to "This function takes a file pointer, thus it is 'file pointer colored'" is profoundly missing the entire point. The idea of function color is useful precisely because it is limited to only certain high-cost conversions, not spread so thin as to cover literally every function parameter. Contexts aren't very special; it is literally easier to come up with one if you don't have one ("context.Background()") than it is to come up with an integer, for which you must actually pick one. It isn't anywhere near special enough to justify being called a "color" any more than a file pointer, or a database connection, or any of hundreds of other resource types, most of which impose more constraints on the code than contexts.
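
A small sketch of both options, under the assumption that `slowOp` is some context-aware function you need to call from code that has no context of its own. All function names here are invented for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowOp is a hypothetical context-aware function: it waits for d,
// or returns early with ctx.Err() if the context is done first.
func slowOp(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// noContextCaller has no context to pass along, so it supplies the
// background context, which never cancels and carries no values.
func noContextCaller() error {
	return slowOp(context.Background(), 10*time.Millisecond)
}

// boundedCaller also has no context, but wants an upper bound on the call.
func boundedCaller() error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	return slowOp(ctx, time.Second)
}

func main() {
	fmt.Println(noContextCaller())
	fmt.Println(errors.Is(boundedCaller(), context.DeadlineExceeded))
}
```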


You are the one trying to include "all things like file pointers" as colors, as a strawman. But nobody is trying to say that.

Context is tacked on to goroutines to control async stuff, unlike e.g. filepointers or database-handles.

Using context.Background() is fine and great for tests or e.g. some CLI program. But consider deeply nesting functions that all do context.Background().

Go just smudges everything as gray and calls it a day.

Edit: what I'm trying to say is that go has not completely solved the colored function problem, which I think is what you're implying.


What you're actually saying is "I don't understand the colored function problem".


You're not saying anything.


Maybe you should try the same.


That wasn't a strawman, that was an example to help illustrate a point.

Also it is a little fun that your smudge metaphor tacitly admits the other person is correct


> A straw man fallacy (sometimes written as strawman) is the informal fallacy of refuting an argument different from the one actually under discussion, while not recognizing or acknowledging the distinction.

(Emphasis mine)

Yes, go has an async runtime built in, congratulations. It's just not nearly as good as advertised.


Oh.. genuinely, that's what you have been doing through this thread


> This is in contrast to the fact that Go's cheap threading means that you don't need to colour your functions with async or not async. But this quote that you sort of have to do this with context or no context.

This doesn't have the usual problems associated with colored functions though (ie, calling async functions from non-async functions or vice-versa). If you don't need cancellation, pass a `context.Background()` and you're done.

> If goroutines are so cheap then why not let the library spawn them. As long as the interface doesn't reveal if they are being used or not it shouldn't matter.

Agreed. Providing a synchronous API is what's important.


> you don't need to colour your functions with async or not async.

Context is useful for synchronous functions also, it has nothing to do with async.


Goroutines are cheap, but that is not the point. The point is whether libraries should expose a blocking or a non-blocking API, who should create the goroutines, etc.


TBH: Context feels like a wart to me. It works, but it's not elegant. Golang has an aversion to thread^H^H^H^H^H^Hgoroutine local storage. Instead it provides this kludgey experience.

I really feel that Golang V.2 should invent a better, native, way of controlling threads of execution.


Hmm, I think you think of context differently from me. For me context is something you use to manage execution. I mostly use it to notify different parts of the program that "you can stop what you are doing now". For instance if you are processing a request and the client went away. Or you have run out of time.

You're talking about context as a way to distribute data? I do that as well. For instance to provide auth/session data to requests, but that's usually just limited to one path in my software that does this. (I agree it is clumsy, but not because it is the "wrong" thing to do, but rather the API feels a bit dodgy).

If you are talking about something like thread-local storage, that's really a very different thing from both the control aspect of context and the request data aspect.

What extra functionality do you want for goroutine control and why do you think Go needs it?


What annoys me about Context is there's no way to tell if it's honored by the callee.

And when I'm accepting Context I'm annoyed at having to write handlers for it all through the stack having no idea how/if people will use it.


> What annoys me about Context is there's no way to tell if it's honored by the callee.

Everybody has to be decent enough to do their part.

> And when I'm accepting Context I'm annoyed at having to write handlers for it all through the stack having no idea how/if people will use it.

Thank you for doing yours :)


> What annoys me about Context is there's no way to tell if it's honored by the callee.

On the occasions where I've needed that I've used a WaitGroup and done wg.Add(1) at the point where I start goroutines and then have a defer wg.Done() as the first thing in the goroutine. I don't think the functionality belongs in Context. And if you put it there, you'd just end up complicating things.
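
Roughly what that looks like, as a sketch with made-up names. The workers here do nothing except wait for cancellation; the point is only that `wg.Wait()` tells you every goroutine you started has actually noticed the cancellation and exited.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// runAndStop starts n workers, cancels them, and returns how many
// had exited by the time wg.Wait() returned.
func runAndStop(n int) int64 {
	ctx, cancel := context.WithCancel(context.Background())
	var (
		wg      sync.WaitGroup
		stopped int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1) // counted at the point where the goroutine is started
		go func() {
			defer wg.Done() // first thing in the goroutine
			<-ctx.Done()    // stand-in for real work that honors the context
			atomic.AddInt64(&stopped, 1)
		}()
	}
	cancel()
	wg.Wait() // blocks until every worker has returned
	return atomic.LoadInt64(&stopped)
}

func main() {
	fmt.Println(runAndStop(3)) // 3
}
```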

> And when I'm accepting Context I'm annoyed at having to write handlers for it all through the stack having no idea how/if people will use it.

How would you propose you do it instead?


> On the occasions where I've needed that I've used a WaitGroup and done wg.Add(1) at the point where I start goroutines and then have a defer wg.Done() as the first thing in the goroutine. I don't think the functionality belongs in Context. And if you put it there, you'd just end up complicating things.

I'm not sure this is the same thing. The point of Context is to propagate cancelations or timeouts across multiple layers of your app and libraries, it's not supposed to be useful for directly started goroutines.


Perhaps not, but if you have no idea what ought to be cancelled how would it help you to know that something has been cancelled?

What changes would you make to Context?


I really disagree with this. A function taking a Context is a really important signal to me about the semantics of that function. I also much prefer being able to see context values explicitly passed around, instead of values that magically appear out of the ether, without a clear code path to find out where they came from, what goroutine it's bound to, where that goroutine came from and its lifecycle, etc.


The context is just a big bag of stuff. You don't know what's really in it. Almost any method that needs something from the big bag ends up having a context parameter, but you don't know why that method needs it.


Separating these concerns (cancellation vs. bag-of-request-scoped stuff) might make sense. I'm specifically talking about the cancellation side of contexts. I don't think there's a good answer to this other problem of whether those two concerns should be combined into one mechanism, the options that I know about all have mixed tradeoffs.

I still think an explicit bag-of-stuff is better than an implicit one though.


Believe me I've tried to write Go code that doesn't follow the Go conventions and can't get my PRs approved even with tiny differences. So the idea that I could separate these two concerns might be a good one, but in practice it would be impossible.


Treating a context value as a bag to fetch data out of is the first mistake. They should only ever be used to control things like deadlines and whether or not a part of a function executes. IMHO they should disallow attaching values to a context.


Yes/no/maybe? Context is one of the few ways to get contextual logging and tracing to work in an almost general way in Go. But I have also used it to pass the authenticated user to the handlerfunc. I don't dig it, but it works and avoids the need to keep a static map of request pointers somewhere to figure out which user was in this request...
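
For reference, a stripped-down sketch of the "authenticated user in the context" approach, without the HTTP plumbing. The names `withUser`, `userFrom`, `greet` and `handle` are hypothetical; the unexported key type is the usual way to avoid collisions between packages.

```go
package main

import (
	"context"
	"fmt"
)

// userKey is unexported so no other package can read or overwrite the value.
type userKey struct{}

// withUser returns a child context carrying the authenticated user name.
func withUser(ctx context.Context, name string) context.Context {
	return context.WithValue(ctx, userKey{}, name)
}

// userFrom extracts the user name, reporting whether one was set.
func userFrom(ctx context.Context) (string, bool) {
	name, ok := ctx.Value(userKey{}).(string)
	return name, ok
}

// greet plays the role of the handler that needs the user.
func greet(ctx context.Context) string {
	if name, ok := userFrom(ctx); ok {
		return "hello, " + name
	}
	return "hello, anonymous"
}

// handle plays the role of middleware: it attaches the user (if any)
// and then calls the handler.
func handle(authenticatedUser string) string {
	ctx := context.Background()
	if authenticatedUser != "" {
		ctx = withUser(ctx, authenticatedUser)
	}
	return greet(ctx)
}

func main() {
	fmt.Println(handle("alice"))
	fmt.Println(handle(""))
}
```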


Right, you don't know what's in it, but you know you need to forward it if you start a goroutine.


I also need to forward it to non-goroutine calls because there is a logger in the context and almost every function wants to use the logger. So essentially, we almost always have to pass context as the first argument to any function unless it's some trivial private function.


Yes, but I will curse everyone who didn't pass the logger so we lost the logcontext :(


I always thought that context should just be a goroutine local thing always available, automatically inherited when `go` is executed (obviously with the option to set an explicit one).


I think of contexts as being Go's answer to dynamic variables in earlier languages, like Lisp, and less like thread local storage (like Pthreads). Much like how Go works with errors, explicit is favored over implicit -- being able to see the context pass through, and whether a function expects a context, tells you a lot about the function you are about to call.

If a function does not take a context, you know it probably cannot be interrupted, just like when a function does not return an error, you know it should not fail. In my work projects, this is also a cue that the function does not do any logging since we always carry a zerolog.Logger in our contexts enriched with trace information about the request and handler.

This also makes life easier for me as a reviewer -- I can see the context passing, I can spot when there is a bad pattern, like retaining a context, or failing to handle an error. It does not require me to maintain a detailed mental map of which functions employ dynamic variables or can throw exceptions.


I think Jonathan Blow's programming language, Jai, has an implicit context available to any function. However in Jai the context has a lot of implicit functionality, off the top of my head at least logging plumbing and allocator plumbing.


An alternative would be a language that has structured concurrency built in. [1]

The rules around goroutines and context seem to point in the direction of structured concurrency. For example, if any goroutines started in a function get cleaned up before return then that's following the rules of structured concurrency.

Thread-local storage is bad because it's implicit and causes bugs when used with concurrency; if you farm out some work to another goroutine, it will break.

[1] https://en.wikipedia.org/wiki/Structured_concurrency


> Thread-local storage is bad because it's implicit and causes bugs when used with concurrency; if you farm out some work to another goroutine, it will break.

Can you elaborate on why being implicit is bad and how it causes bugs?

I understand that shared data (via pointers) may cause race conditions and other unexpected behavior, so let's say we require that the thread-local storage can only store values (with value semantics).

If you could point out any issues with that, I'd greatly appreciate it.


It's been a while, but the underlying issue is that threads aren't always one-to-one with server requests. You can have a request where some work is handled by multiple threads. Or, a single thread can do some work for multiple requests.

So one possible bug is that you have a function that implicitly depends on thread-local storage, and then you move some work to another thread and call the function there, and it doesn't work because its dependencies aren't there. You need to manually set up the thread-local storage of each new thread.

Another bug is that if a thread does work on multiple requests (say, a task queue), some thread-local storage could leak data from a different request.

In larger systems, it might even be worse: one request can be farmed out to multiple servers and then you need to pass the context along over the network when doing rpc. This only works for serializable data, but things like deadlines can be propagated, and a request id that ties it all together is useful for logging.

"Which request am I working on" is something that's transient and often doesn't map directly to OS-level objects. (Although it does map one-to-one in simple cases.)


So to paraphrase, the problem is around context passing between threads, which makes sense.

Sounds like that could be solved by a language surfacing the context at thread boundaries like the `go` statement, possibly channels.

Thanks for the answer!


Are we missing a thread pool/executor like abstraction for workers? If we really want callers to be able to control concurrency primitives deeper down the stack, we should coalesce on an executor paradigm that the library can use as its work queue.


The only way to do that would be to introduce a concept that is orthogonal to functions (and their signatures) / errors, and that would mean an incredible increase in complexity. I doubt the Go authors will do that.


I think it splits people the same way that err values split people, which is that Go makes more things values in service of making the control flow plain.


Errors as values would be nice if they weren't so terrible ><

The tooling around them is nearly non-existent in stdlib, and if you want to understand errors from stdlib you will have to guess at string content.

And like, why is there not a stack trace by default?


It's not a hard rule that context should not be struct fields. See https://github.com/golang/go/issues/22602 (context: relax recommendation against putting Contexts in structs)

"Right now the context package documentation says

> Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it. The Context should be the first parameter, typically named ctx: [...]

This advice seems overly restrictive. @bradfitz wrote in that issue:

> While we've told people not to add contexts to structs, I think that guidance is over-aggressive. The real advice is not to store contexts. They should be passed along like parameters. But if the struct is essentially just a parameter, it's okay. I think this concern can be addressed with package-level documentation and examples."


Rule 3: Don’t store contexts - What about this: https://github.com/golang/go/blob/master/src/net/http/client...

  req = &Request{
    Method:   redirectMethod,
    Response: resp,
    URL:      u,
    Header:   make(Header),
    Host:     host,
    Cancel:   ireq.Cancel,
    ctx:      ireq.ctx,
  }
Isn't this considered storing the context?


Yep - (my understanding is) the Go HTTP stdlib module predates the concept of context in Golang, so the implementation was bolted on to ensure backwards compatibility. NewRequestWithContext was only added in Go 1.13 [1]. Previously, requests were cancelled manually with CancelRequest [2]. This is an unfortunate wart of the language - it means it's very easy to accidentally spin up a new Request which doesn't inherit the parent context by calling NewRequest instead. And adding the context via the builder pattern means it's possible to introduce the storage bugs described in the article. My preferred way to consume a context would be to take it in when the work is actually about to be performed - e.g. client.Do(ctx, request)

1 - https://pkg.go.dev/net/http#NewRequestWithContext 2 - https://pkg.go.dev/net/http#Transport.CancelRequest


https://go.dev/blog/context-and-structs

>Exception to the rule: preserving backwards compatibility

See that section of the blog post. It talks about the different approaches they could have taken and why they chose the one they did.


Yea I don’t really agree with the rule that goroutines shouldn’t be started in libraries either. For example, say you are building a library to send metrics to a metrics collector. For me it makes sense that your metrics library contains a buffer of metrics which it batches data together and sends to the metrics collector asynchronously. This would be implemented as having a library goroutine which batches and sends metrics data. I guess in theory your library could have a ‘Flush’ method and then if the application wants async flushing the application can start a goroutine which periodically calls flush. But then the application needs to know the ideal frequency to flush, how to handle failures to flush, how to backoff, etc. These things are probably better done by the library writer.
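
A rough sketch of the shape I have in mind. `Batcher` and everything on it is hypothetical, and the `sent` slice is just a stand-in for actually talking to a collector. The library owns the goroutine and the flush schedule, and `Close` blocks until the goroutine has done its final flush and exited.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Batcher is a hypothetical metrics client that flushes in the background.
type Batcher struct {
	mu      sync.Mutex
	pending []string
	sent    []string // stand-in for the remote metrics collector
	stop    chan struct{}
	done    chan struct{}
}

// NewBatcher starts the library-owned goroutine.
func NewBatcher(interval time.Duration) *Batcher {
	b := &Batcher{stop: make(chan struct{}), done: make(chan struct{})}
	go b.loop(interval)
	return b
}

// Record buffers a metric; it never blocks on the network.
func (b *Batcher) Record(m string) {
	b.mu.Lock()
	b.pending = append(b.pending, m)
	b.mu.Unlock()
}

func (b *Batcher) flush() {
	b.mu.Lock()
	b.sent = append(b.sent, b.pending...)
	b.pending = nil
	b.mu.Unlock()
}

func (b *Batcher) loop(interval time.Duration) {
	defer close(b.done)
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			b.flush()
		case <-b.stop:
			b.flush() // final flush before exiting
			return
		}
	}
}

// Close stops the goroutine and waits for it. Call it exactly once.
func (b *Batcher) Close() {
	close(b.stop)
	<-b.done
}

// demo records two metrics and returns what reached the "collector".
func demo() []string {
	b := NewBatcher(time.Hour) // long interval, so only the final flush runs
	b.Record("requests=1")
	b.Record("errors=0")
	b.Close()
	return b.sent // safe to read: the goroutine has exited
}

func main() {
	fmt.Println(demo())
}
```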


Maybe C# got me spoiled with CancellationToken which seems like a nicer API. And, perhaps, synchronization context as well if you are writing a GUI application and need a render thread, to make sure you yield to the right one. Though if that's not the preferred pattern, publishing a message to a channel on one end and then reading them from another is always an option.


Just wished ConfigureAwait was not true by default


I'd argue here that it's not a problem of context storage, it's a problem of not ignoring cancellation in certain situations. Since context, for better or worse, has two purposes, you may still want a lot of the request-scoped data for later operations after the initial timeout/deadline is done. And since context has a standardized bag of data, it's more future-proof to just keep using the context and its data rather than, say, extracting the data you know about today (like trace/span ids) and storing that for later use.

Fortunately, following appropriate patterns here got a lot easier with 1.21 and the addition of `context.WithoutCancel`. If you're going to store a context for later use, since there's potentially e.g. tracing data you still want to keep, make sure you appropriately `context.WithoutCancel` to keep the data without keeping the original deadline.
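
A quick sketch of the behavior, assuming Go 1.21 or later. `traceKey` and the `"trace-123"` value are made up for the example; the detached context keeps the parent's values but is unaffected when the parent is cancelled.

```go
package main

import (
	"context"
	"fmt"
)

// traceKey is a hypothetical key for request-scoped trace data.
type traceKey struct{}

// afterCancel cancels a parent context and reports the state of the
// parent and of a context detached from it with WithoutCancel.
func afterCancel() (parentErr, detachedErr error, trace string) {
	base := context.WithValue(context.Background(), traceKey{}, "trace-123")
	parent, cancel := context.WithCancel(base)

	detached := context.WithoutCancel(parent) // keeps values, drops cancellation
	cancel()

	trace, _ = detached.Value(traceKey{}).(string)
	return parent.Err(), detached.Err(), trace
}

func main() {
	parentErr, detachedErr, trace := afterCancel()
	fmt.Println(parentErr)   // context canceled
	fmt.Println(detachedErr) // <nil>
	fmt.Println(trace)       // trace-123
}
```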


Question about contexts in general: is there a way to "guarantee" that contexts are used correctly by whoever is consuming my context?

For example, if I call http.NewRequestWithContext(), how do I, as the caller, know that http is doing the "right thing" with that value, rather than ignoring it?

In the OP's example, intuitively it seems like an explicit Stop() function gives the caller explicit control of when to stop and that anyone implementing a Worker (if Worker were an interface) would know that the Stop() function should do cleanup.

However, if I only pass in a context when calling Run(), wouldn't it be easy for someone to ignore a deadline?


Most of the time you accept a context because some downstream function requires it (i.e. I'm writing an HTTP client, and the `net/http` std lib functions require context for some part of it). You can have general confidence that the standard library will respect things like context deadlines, even if the wrappers that invoke that don't necessarily.


I don't really agree that it's an antipattern for a library to create a goroutine. If you consider that starting a worker is, in a sense, an entry point, you can even claim to otherwise conform to The Rules.

Why would I want to make my caller think about scheduling library internals? As long as I'm managing resources appropriately, and exposing knobs as necessary, what's the problem?


Is it correct to assume that the "cancellation" part of `Context` is similar to C#'s `CancellationToken`? Also, it looks like it allows passing some "implicit parameters" to another function. If that's the case, why does Go have a single entity performing two roles: cancellation + implicit parameter passing? I would expect to have these things separated.


I also find it curious that a language with a preemptive scheduler requires manual "yield" points by constantly checking on the context if the current function should stop executing.
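
For anyone who hasn't seen it, this is the kind of check I mean, as a minimal sketch with invented names. The scheduler can preempt a goroutine to run another one, but nothing stops the goroutine from outside, so the function has to look at the context itself between units of work.

```go
package main

import (
	"context"
	"fmt"
)

// processAll calls work on each item, checking for cancellation between
// items. It returns how many items were processed before it stopped.
func processAll(ctx context.Context, items []int, work func(int)) (int, error) {
	for i, item := range items {
		if err := ctx.Err(); err != nil {
			return i, err // cooperative stop: nothing forces this return
		}
		work(item)
	}
	return len(items), nil
}

// demo cancels the context while the second item is being processed.
func demo() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return processAll(ctx, []int{1, 2, 3, 4}, func(item int) {
		if item == 2 {
			cancel()
		}
	})
}

func main() {
	n, err := demo()
	fmt.Println(n, err) // 2 context canceled
}
```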


Go contexts are something that I found confusing for a while. I had assumed they were much more complicated than they actually are. I read the context section in Jon Bodner's Learning Go and realized they are actually pretty simple and I was just over thinking them.


I find that combining context and errgroup, with due care, lets us approximate Structured Concurrency to great benefit. I just think more care is due than most people give.

When a function returns, we should be able to trust that any goroutines it created have already terminated and will not have other side effects. This is important because Go doesn't enforce read/write thread-safety any other way, so we need it to be clear from the code when those reads/writes may happen.

It's also important because side effects from those lingering routines could have other consequences, e.g. the retry for an IO operation could overlap with a past attempt, violating invariants in ways that would be really hard to reproduce and debug.

This sounds really simple, why wouldn't you do it that way? Idiomatic use of errgroup encourages you to do it that way, but not everyone does it that way, and sadly not every project even uses errgroup in the first place. It's very common to see a routine observe cancellation and return immediately, even if it created its own goroutines which it can't guarantee have aborted yet.

Aside, it's also sadly extremely common to see people reinvent errgroup badly with "error channels" that at best don't join the other routines and at worst deadlock them because they block on sending errors that nobody is receiving any more.

That's why if you do this, you basically ban the `go` keyword and strictly use errgroup, even for routines which can't return errors. (WaitGroup can do this too, but it's harder to use right, because the Add/Done count have to add up exactly and there's no enforcement that they do).

If you do this right, then it shouldn't matter whether a function creates goroutines to help with its work, such as timeout channels or parallel processing or what have you. What matters is that the function still acts like a synchronous one from the outside.

The worst I've seen is when people know they have to use errgroup, but they create one large errgroup in main and pass it around as a mutable argument to everything to add more tasks to it. They don't understand that when it's used correctly, it also nests and encapsulates entirely, so it's never the argument to or return from a function.

Of course it gets more complicated if an object has long-running goroutines that outlive any particular function. Then you need to call more functions just to create those wait points. For example, it's not enough to cancel a database cursor as a context, you should still block on closing it, otherwise its own routines can still be running when you go back and start another operation. Again, sadly all too common.

Caveat 1: errgroup only returns the first error, which for many routines is just "cancelled". That's not useful, and it takes the place of what could have been a real error. I suppress errors like that, so that actual cleanup errors, if any, are the ones surfaced.

Caveat 2: errgroup doesn't trap panics, if you want the panic to be surfaced as a neatly packaged error, you have to install your own handler. Every project I have has its own simple version of this, and I've seen many other projects come to the same conclusion.
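
To illustrate both caveats without pulling in the real package, here is a stdlib-only sketch. It is not errgroup: `runAll` is a hypothetical stand-in built on `sync.WaitGroup` that only shows the two ideas, preferring a real error over a bare "cancelled" one and turning a panic into an error. With actual errgroup, a wrapper like `safely` could be applied to the function passed to `g.Go`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// safely runs fn and converts a panic into an ordinary error.
func safely(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	return fn()
}

// runAll runs the tasks concurrently, waits for all of them, and returns
// the first error that is not just context.Canceled. A bare cancellation
// is reported only if nothing more informative happened.
func runAll(tasks ...func() error) error {
	errs := make([]error, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func() error) {
			defer wg.Done()
			errs[i] = safely(task) // each goroutine writes only its own slot
		}(i, task)
	}
	wg.Wait() // no task outlives runAll

	var cancelled error
	for _, err := range errs {
		switch {
		case err == nil:
			// nothing to report
		case errors.Is(err, context.Canceled):
			cancelled = err
		default:
			return err
		}
	}
	return cancelled
}

// demo runs one task that reports cancellation and one that panics.
func demo() string {
	err := runAll(
		func() error { return context.Canceled },
		func() error { panic("boom") },
	)
	return err.Error()
}

func main() {
	fmt.Println(demo()) // recovered panic: boom
}
```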


What about storing the context for lazy-loading? Like a client that's initialized in main() with a ctx, but starts async workers on first-use deeper in and across go routines?



