Error stack traces in Go with x/xerror (brandur.org)
100 points by sicromoft on Aug 23, 2021 | 107 comments



At $DAYJOB we had a Go dependency on a package (maybe pkg/sftp?) that used github.com/pkg/errors to capture a stack trace with the error. These errors were ultimately used in a loop for flow control (maybe testing a lot of files/servers?) where collecting all the stack traces caused a lot of slowdown, heap garbage, and GC pressure.

I used go mod replace to strip the stack trace collection out of pkg/errors, with a fork that does a no-op for that function call, and it was a significant improvement for our use case.


Yes, I've gone back and forth on "all errors should be annotated" or "all errors should have a full stack trace" and frankly, I'm now in the "it's just an error" camp. Stack traces seem great but are kinda insane overhead. If you're bubbling up errors from a real deep place, having a $package.$method: %w wrap the error is nice, but beyond that, it's a headache.


Insane overhead in what way?


I believe the solution to this is to classify errors properly. Any "internal error" as in HTTP 500 Internal Error should be generating a stack trace. Most other expected errors (like your case) should not. I codified this practice in a library I created for Go called errcode [1] which is designed to attach error codes and other meta information where errors are generated.

[1] https://github.com/pingcap/errcode


> I used go mod replace to strip the stack trace collection out of pkg/errors, with a fork that does a no-op for that function call, and it was a significant improvement for our use case.

Is there a write-up or references on how one can achieve this? Sounds like a good practical use case for the replace directive.


Hmmm...

  and (2) a way to cut down on if err != nil { ... }
  boilerplate that pervades all Go code.
Am I weird in liking the explicit error handling? :/


I like explicit error handling, but I wish there were a concise way to handle errors from the same line. A one-line function call `foo()` becomes four times as long once you handle errors.

    result, err := foo()
    if err != nil {
        return nil, fmt.Errorf("Error message: %w", err)
    }
This pattern is just so commonplace, but it's four times as long.

Contrast with Rust, which handles errors as `foo()?` or `foo().context("Error message")?`*. It's still explicit — I like it better than silently-propagating exceptions — but my functions don't wind up being 75% error handling.

* Where the `context` function comes from an error-handling library.


I don't think you're weird - I understand that some people like it and I can kind of see some of the arguments as to why - but the error handling in Go is easily my least favorite thing about it. For two reasons:

1) The aforementioned boilerplate code that makes up a significant chunk of many Go projects and adds no value.

2) With explicit error handling, however deep your call stack is, you are relying on everything in that stack to have done the right thing with regards to error handling. With exceptions, you have to go out of your way to screw them up.

I can't tell you how many times I've gotten dumb error messages like "Invalid string" from a 3rd party library that take forever to debug, and accidentally swallowing an error is even worse. A simple (even crappy) exception message + a stack trace is much easier for me to use for debugging than even the best handcrafted error message in 99% of cases.

I write somewhat equivalent amounts of Python, Java, and Go lately, and each one has their good and bad parts, but there are few things I dislike as much as Go's error handling patterns.


> With explicit error handling, however deep your call stack is, you are relying on everything in that stack to have done the right thing with regards to error handling. With exceptions, you have to go out of your way to screw them up.

I don't think I agree with this argument. In many languages, there's no obvious indicator that any given function may or may not raise / throw an exception. And even if it doesn't throw an exception today, it might tomorrow, and it's easy to forget to update all callers.

Since Go makes errors a return value you have to actively discard the result (by replacing it with an underscore, e.g. `res, _ := getResults()`) or take the time to handle the error.

And just like an intermediary library in Go can swallow the error, so can an intermediary library in an exception language catch and discard an exception.

It seems to me that the result is more errors are properly handled in Go - because they are explicit - while uncaught exceptions often cause bugs that make it to production.

For context, I recently switched jobs from one where I wrote Python for 4.5 years to a job where I've been writing Go for about 5 months.


> With Go making errors a return value you have to actively discard the result (by replacing it with a underscore, e.g. `res, _ := getResults()`) or take the time to handle the error.

Right, but because `err` ends up getting re-used in so many cases, you can re-assign it and forget to do anything about it (which is what I've found in most cases where an error was inappropriately suppressed in Go).
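A small sketch of the reassignment problem, with hypothetical `step1` and `step2` functions. The compiler accepts it because `err` is eventually used, even though the first error is never looked at.

```go
package main

import (
	"errors"
	"fmt"
)

func step1() (int, error) { return 0, errors.New("step1 failed") }
func step2() (int, error) { return 2, nil }

// sloppy forgets to check err before step2 reassigns it.
func sloppy() (int, error) {
	a, err := step1()
	b, err := step2() // overwrites step1's error; compiles without complaint
	if err != nil {
		return 0, err
	}
	return a + b, nil
}

func main() {
	sum, err := sloppy()
	fmt.Println(sum, err) // prints "2 <nil>": step1's failure is gone
}
```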

In most cases, simply bubbling up the error to something that will generically handle all errors is the right thing to do, and in the case of exceptions, even if you don't know about one, that is what will happen. I just sampled a handful of the top Golang repos on GitHub, and found very few cases of anything more than the standard `if err != nil { return nil, err }` pattern that just bubbles the error up.


Last time I wrote go you could write `doThingAndMaybeReturnErr()` without the `_ =`. Has this changed?

The alternative is Zig or Rust's thing, where you still have explicit error handling but you don't repeat the same four lines all the time. This is maybe helpful because in Go one has to read those lines carefully to see if anything unexpected is happening.


Linters such as golangci-lint [1] will warn if you ignore the return value. If you ever write Go again I highly recommend you look at golangci-lint -- the security checks have been really helpful to me.

[1] https://golangci-lint.run/


> Since Go makes errors a return value you have to actively discard the result (by replacing it with a underscore, e.g. `res, _ := getResults()`) or take the time to handle the error.

LoadData(&data)

Does this function return nothing or an error ?


It returns nothing or the compiler/linter/ide will tell you.


Which compiler/linter/IDE is that? The Go compiler and linter certainly don't tell you:

https://play.golang.org/p/k-eTs3dC_wK

Note that Go playground also runs go vet, a static analysis tool.
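For readers who don't want to follow the link, here is a sketch along the same lines (not the playground code itself; `Data` and `LoadData` are hypothetical). A call that drops an `error` return builds without any diagnostic.

```go
package main

import (
	"errors"
	"fmt"
)

type Data struct{ Loaded bool }

// LoadData is hypothetical; it always fails, to make the point.
func LoadData(d *Data) error {
	return errors.New("load failed")
}

func main() {
	var data Data
	LoadData(&data) // the returned error is dropped; the compiler says nothing
	fmt.Println(data.Loaded)
}
```

Nothing at the call site tells a reader whether `LoadData` returns an error or nothing at all.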


You're probably right. I use Goland and I just think of it as a natural extension to Go. I shouldn't.


> Since Go makes errors a return value you have to actively discard the result (by replacing it with a underscore, e.g. `res, _ := getResults()`) or take the time to handle the error.

This is a bad thing because it’s not narrow. It’s like a language’s “catch” statement not allowing you to catch specific exceptions. Once someone has decided to throw away an error with the underscore assignment, all future errors the underlying might return are swallowed as well.

In other words, it takes even more boilerplate to have narrower error exemptions than to have the generic exemption.

> It seems to me that the result is more errors are properly handled in Go - because they are explicit - while uncaught exceptions often cause bugs that make it to production.

An error in Go that is blindly returned all of the way up the stack is no different from an uncaught exception.


There are two types of reason for a program to error:

- the code has failed and the developer needs to be informed what to fix

- the environment has failed and the user needs to be informed

Exceptions are for the former, which Go still supports via panic(). Errors are for the latter, which is why all the boilerplate err code creeps in.

I don’t like getting stack traces from applications when I’m a user. It always feels to me like the application is only half finished. When the error is environmental (eg file system permissions) a stack trace just muddies the water with unnecessary output.

This is why I like Go’s distinction between errors and exceptions.


This is great in theory but in reality no such distinction exists. For example, is trying to read a missing file a failure of the code, or a failure of the user? It could easily be either: the developer hardcoded the file path wrong during development, or the user selected a wrong path in a CLI, ... The distinction between errors and exceptions must be made by the call site, not the definition site. The code calling readFile() knows whether the error is recoverable or not. readFile() itself does not, so the same mechanism should be used for both errors and exceptions.

All this was figured out decades ago, that's why exception systems were invented, but language designers keep making the same mistake over and over again in an effort to simplify the unsimplifiable.


In your example, it should be an error because it’s an environmental problem (could be file system permissions). Sure, it could be a developer fail where they hardcoded the wrong path (surely any kind of testing would pick this up?) but it’s still an error in the environment where had the path been correct it could fail due to something the user misconfigured. So it needs to be handled as a user facing error.

Exceptions do exist in Go, it’s called “Panic”. This whole “Go doesn’t have exceptions” meme is completely untrue. It’s just not the preferred way of error handling because, unsurprisingly, most application errors are going to be environmental. But if you select an out of bounds index in a slice or don’t handle a nil pointer correctly you’d get a panic — because those are clearly developer issues that warrant a stack trace rather than environmental problems that need a friendly user facing error message.


If you make it an environmental error then the programmer is likely to write an error handler that ignores the error because it can "never" happen and the compiler is forcing them to handle it. You now need a linter to prevent this and a way to signal to the linter that "no I really intended to ignore that error, I promise".

Whereas exceptions fail safe, ignoring the error means not catching it, which means if the error does occur, it will load up the debugger.

This is the whole Java "checked exceptions" misfeature again.


This makes no sense. Plus naff developers can be just as lazy with exceptions if they wanted. A try/catch block can be abused extremely easily.


Except that people are too lazy to use exceptions that are checked at compile time.

The main benefit of Go is that the people that defined the library ecosystem actually decided to handle errors. It's the bare minimum, but it's better than just forgetting that the error exists and not documenting it either.

The language is ... not the best, but the libraries tend to be more robust.


Libraries can almost never handle errors - they can only signal them. And Go libraries are particularly bad at this, because they almost always return error strings, instead of meaningful error types, which means you can't easily handle a subset of errors. Even if you did, Go's errors suffer from the same problems as exceptions in many languages: there is no way to document what errors your function can actually return, just that it returns some error.
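A sketch of the difference, using a hypothetical `fetch` library call. With a sentinel (`ErrTimeout`) or a typed error (`QuotaError`) the caller can pick out a subset of failures using `errors.Is` / `errors.As`. With a bare string there is nothing to match on short of parsing the message.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTimeout is a sentinel callers can test with errors.Is.
var ErrTimeout = errors.New("timeout")

// QuotaError is a typed error callers can extract with errors.As.
type QuotaError struct{ Limit int }

func (e *QuotaError) Error() string { return fmt.Sprintf("quota of %d exceeded", e.Limit) }

// fetch is a hypothetical library call with three failure modes.
func fetch(kind string) error {
	switch kind {
	case "slow":
		return fmt.Errorf("fetch: %w", ErrTimeout)
	case "big":
		return fmt.Errorf("fetch: %w", &QuotaError{Limit: 10})
	}
	// String only: the caller has nothing to match on.
	return errors.New("fetch: something went wrong")
}

// classify shows how a caller handles a subset of errors.
func classify(err error) string {
	var qe *QuotaError
	switch {
	case errors.Is(err, ErrTimeout):
		return "retry"
	case errors.As(err, &qe):
		return fmt.Sprintf("reduce to %d", qe.Limit)
	default:
		return "give up"
	}
}

func main() {
	for _, k := range []string{"slow", "big", "other"} {
		fmt.Println(classify(fetch(k)))
	}
}
```

Note that the signature of `fetch` is still just `error`, so the set of possible errors lives only in documentation, which is the second half of the complaint.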

What language are you thinking of when saying Go libraries do better at error handling?


It’s not always laziness, in Java

  stream.map(f)
doesn’t allow f to declare any checked exceptions. We could catch and wrap everything but that obscures the useful code without any improvement in safety.


Your entire argument rests on the weird assumption that the User is responsible for the environment. In the vast majority of software used today, that is not the case.

In SaaS, mobile, and appliances, the environment is entirely or almost entirely controlled by the programmer - a missing file is the developer's problem, and the user can't do anything about it.

In enterprise software, there is a third entity, the Administrator, that has (almost) full control over the environment, and usually there are layers of 3rd party software that control it directly. A missing file or permission error is useless to the user, most likely it needs to be logged somewhere that Admins check, and the user simply told to contact their Admin.

Finally, even in personal desktop software, the environment is jointly owned by the user and the developer - the developer is normally responsible for setting up the initial environment through some kind of installer (msi, deb, make configure etc.), and many environment issues are bugs in the installer, not user errors.

Of course, the same library may very well be used in all of these vastly disparate deployment scenarios - the library can't decide what is the source of an error, so making it arbitrarily decide between two possible kinds of errors is wrong.

Note that even the reverse is not clear; an index out of bounds error could be a user problem - if they are trying to access the 7th element of a 5 element list. The fact that you'd normally do

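  // Valid indices are 0..len(arr)-1, so reject negatives and len(arr) itself.
  if userSelectedElement < 0 || userSelectedElement >= len(arr) {
    return nil, fmt.Errorf("array index out of bounds")
  }
  return arr[userSelectedElement], nil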
Is only because of the convention you were proposing - `return arr[userSelectedElement]` does the same thing alone, if we disregard the convention (in a memory safe language, of course!).


Your argument is just reinforcing the reason for errors to be concise and to the point rather than displaying a stack trace when it’s not the developers who are administering the application.

Trust me, searching through a stack trace on a centralised logging system isn’t fun.

> Finally, even in personal desktop software, the environment is jointly owned by the user and the developer - the developer is normally responsible for setting up the initial environment through some kind of installer (msi, deb, make configure etc.), and many environment issues are bugs in the installer, not user errors.

Desktop Linux, yeah. It’s seldom that simple in servers though. SELinux, custom config, custom iptables rules, network wide UIDs (eg shared storage volumes), there’s so much that can go wrong the moment you do enterprise.

> Note that even the reverse is not clear; an index out of bounds error could be a user problem - if they are trying to access the 7th element of a 5 element list. The fact that you'd normally do

If you’re writing software that doesn’t do input validation and bounds check then you’re a failure of a developer. Sorry but this is the bare minimum I’d expect a developer to do.


> Trust me, searching through a stack trace on a centralised logging system isn’t fun.

A stack trace is still better than a one line error with no context. It's a kind of 80% solution - it's not ideal (a perfect error includes only the relevant context), but what you get for 0 effort is much better than what error codes/values give you for free.

> If you’re writing software that doesn’t do input validation and bounds check then you’re a failure of a developer. Sorry but this is the bare minimum I’d expect a developer to do.

So what is the profound difference between doing input validation in your own code vs letting the array accessor do it?


> A stack trace is still better than a one line error with no context. It's a kind of 80% solution - it's not ideal (a perfect error includes only the relevant context), but what you get for 0 effort is much better than what error codes/values give you for free.

Yeah, a stack trace is better than an “undefined error” type message. But the point of forcing error messages over exceptions is you’re enabling your developers to write meaningful error messages. So your point is moot.

> So what is the profound difference between doing input validation in your own code vs letting the array accessor do it?

- Meaningful error messages (eg has the code failed because of a bug or because of invalid user input?),

- security hardening,

- reducing potential undefined behaviours,

- thorough unit testing,

- self documenting code (the code clearly defines what the happy path is and when it’s possible for a user to break out from that)

Etc


>Exceptions are for the former, which Go still supports via Panic().

The panics are there to be used only in cases where the further code execution is impossible. Not to inform the developer about an error.

Errors are a general solution to report an unexpected event during execution. The destination of such report depends on the part of code the program failed.


I get that what you’ve posted is the official description but it doesn’t really explain the distinction well IMO. Eg there are plenty of environmental reasons you’d need to cease execution but giving the user a stack trace isn’t helpful. So describing the two by who the target recipient of the error is helps. Or at least I think it does.

Personally my one gripe with Go’s errors is that EOF is handled as an error.


> Eg there are plenty of environmental reasons you’d need to cease execution but giving the user a stack trace isn’t helpful.

Not sure I got this part. If you want to stop the program and don't want to print the usual panic messages into the console - you can log/print whatever you want and call os.Exit().

At the same time you can log a stack trace without panicking.

>Or at least I think it does.

Maybe, it's just that "the code has failed and the developer needs to be informed what to fix" applies to errors just as much as to panics. The only difference is that an error is the situation during the execution where you can continue the job and a panic requires execution to stop (in most cases).

At the same time panics are 'developers only'.


You’re arguing the same points I’m making :)


> With exceptions, you have to go out of your way to screw them up.

I remember well java codebases littered with:

    try { .. } catch() { // Todo }
Usually cheaply inserted by the IDE.

I'm pretty sure nowadays there are linters that will ensure you have to do some extra work to actually check in such code, but still...


I've never had a Java IDE insert something like that?

But either way, you have to go out of your way to do that, which is sort of my point.

The one place where you do see dumb boilerplate and chances to screw up is in dealing with checked exceptions, which have been controversial since the very beginning. I think they are one of those things that seem like a great idea in theory, but end up not working out so well in practice, but (like people who appreciate Go's error handling model) there are people who disagree.


Exactly. It is kinda obvious that someone is doing something wrong with the code when they're doing a generic catch (sometimes it is fine, but this will at least raise some eyebrows).

However it is very easy to do the wrong thing using Golang. Just one random library doing `fmt.Errorf` without the `%w` verb is sufficient to lose all information from there on. I would much prefer some kind of annotation that does the correct thing by default (wrapping the error) instead of the "every error should be explicit" approach of Golang.


The above code is probably caused by checked exceptions. In particular, the user wants to implement some interface or override some method, but that method wasn't declared to throw a particular type of exception because the people who wrote it hate extensibility and never want their software to be used in a way they didn't anticipate, so the new implementation can't throw it either.


> I've never had a Java IDE insert something like that?

yeah this indeed happens with checked exceptions and careless developers just wanting to get stuff to compile (we'll handle that properly later) and the IDE helpfully providing the boilerplate that resolves the compilation error (and assumes you'll fill the body)


I don't buy the argument that Go makes error handling explicit. It is too easy to forget to check for an error in Go. Did you know that `fmt.Printf` can also fail? Do you explicitly check for `err` when using it? Probably not, because what Go calls explicit error handling is merely a convention, not something supported by the language, e.g. [[nodiscard]] from C++ or #[must_use] from Rust. And who thought reassigning `err` multiple times was a good idea? That again strikes a lack of functionality in Go.

What saddens me is that these kinds of matters will never be fixed in Go. Go has a stubborn anti-feature mentality and while it does help preventing feature creep, it overall harms the language in the long run. For a language created and maintained by Google this is a huge missed opportunity.


I’m not sure the mentality is exactly “oops, oh well, leave it broken forever” [0]; it is intentionally ponderous. As a stop-gap, linters [1] can get excellent insight from ASTs.

The stability of Go is something I value a lot, so I don’t mind stop-gaps like this too much on balance.

[0] https://github.com/golang/go/issues/20148

[1] https://github.com/golangci/golangci-lint


There has been a go2 proposal for better error handling, so there is no issue trying to improve things or adding needed features. But things go slowly and must be discussed and vetted before being accepted because the language has few features and they will stick so the filter is quite high. See generics, it took a long time but it's coming.


The proposal for better error handling was rejected, as far as I'm aware. Generics needlessly repeated the mistakes of Java and C# in coming out years too late.


I kind of agree, it's better to have generics from the start (we'll see if there are difficulties added), and the error handling was rejected. But the fact that those proposals have been proposed, discussed, studied and for one of them accepted show that the language can evolve.


> Did you know that `fmt.Printf` can also fail? Do you explicitly check for `err` when using it? Probably not, because what Go calls explicit error handling is merely a convention

I was curious about this and it’s true, I didn’t know these functions could err, TIL.

The Godoc states:

> It is conventional not to worry about any error returned by Printf


I hate it. All it does is add if err != nil { return nil, err } to every other line. I love Go, but I have never understood the obsession with generics over putting try into the language. This is the biggest pain in Go, by far.

The vast majority of errors only stop or rollback the current action. An incredibly small amount of code uses errors to detect stop conditions or to retry.


To be fair, generics can allow you to get rid of some of this boilerplate, in principle. Though I would still love to get some built-in error handling.


Then you just have Railroad Programming without the monadic binding operators and currying of the languages where that is the main mode of dealing with errors. That sounds awful.


I know people on HN hate python to some extent because of performance issues but I love the error handling in python. Trying to learn go on the side this is the biggest thing by far that just halted all of that. No error handling? I was baffled. I hope they add this into the language. There's no way I'm wasting time writing `if err != nil` all over my code. try/catch blocks please


I prefer how Rust wraps it in a type. Makes it seems like "part" of the language, rather than something bolted onto every function.


> Am I weird in liking the explicit error handling? :/

The issue is not explicit error handling it’s specifically Go’s, which is verbose and half-assed. Not entirely unlike java’s checked exceptions though unlike checked exceptions we have plenty of other (and I’d argue better) implementations of “explicit error handling”.

Go’s error handling is a relatively minor improvement on C’s, but we’ve gotten quite a ways beyond that since.


I just want Result types I can map over. That would cut out a bunch of clutter. Even if I could pair one fallible function with one infallible function, that's 50% less err != nil

   val, err := int_or_err(foo)
   if err != nil { return nil, err}
   return val.add(bar), nil
Vs

   result := int_or_err(foo)
   return result.map(add,bar)
I'd even take some sort of

   result := int_or_err(foo)
   return result.map(add,bar).tuple()
to return a val,err pair in order to match conventional go.


It looks more concise but if you're like me not into functional programming much, .map() is magic.

Consistency is more important than conciseness. Clear is better than clever. Plus, a function invocation / .map() is heavier than an err != nil check / condition.

Don't get me wrong, I appreciate the Either pattern as an alternative to if err != nil, but I also appreciate the really dumb and straightforward approach of non-clever Go.


I think the main issue is not the boilerplate, as is usually the source of complaints, but the fact that not handling errors is so easy due to shadowing.


Both are problematic. Boilerplate does wonders for mistakes hidden in plain sight - especially when you have that one rare piece of code that actually does something with an error, and it completely slips by code review because anyone who works with Go has long learned to glaze over anything that begins with if err!=nil


Exception error handling is implicit because we don't know which line inside a try block can throw an exception.

However, Go's `if err != nil` is not more explicit than, for example, Rust's question mark operator. The former is more verbose than the latter, yes. But both are explicit in that we can know which line can return an error.

The `if err != nil` is probably okay if there are only a few lines that can return an error, but if most lines in a function can return an error, it will result in too much noise. A real world example where panic is abused as an exception-like approach because the "proper" error handling using `if err != nil` is way too verbose: https://pkg.go.dev/github.com/apple/foundationdb/bindings/go... (Actually Go's stdlib too sometimes abuses panic in a similar way.)


> Exception error handling is implicit because we don't know which line inside a try block can throw an exception

You just need to have some thought when writing your code:

  let a;
  try {
    a = someFailingFunc();
  } catch (e) {
    // handle
  }

  // use a
  // do work


The whole point of Exceptions is to avoid writing try/catch all over the place. Let's take an example of idiomatic exception-based code:

  func doRequests(url1, url2 string) resp throws HttpException {
    resp1 := http.DoRequest("POST", url1)
    resp2 := http.DoRequest("GET", url2 + resp1.ID)
    return resp2
  }
vs

  func doRequests(url1, url2 string) (resp, error) {
    resp1, err := http.DoRequest("POST", url1)
    if err != nil {
      return nil, fmt.Errorf("Error in req1: %w", err)
    }
    resp2, err := http.DoRequest("GET", url2 + resp1.ID)
    if err != nil {
      return nil, fmt.Errorf("Error in req2: %w", err)
    }
    return resp2, nil
  }
Of course, we can write the second example with try/catch as well, but the whole point of exceptions is to be able to write the first one when appropriate.


> The whole point of Exceptions is to avoid writing try/catch all over the place

Of course. I'm talking about when you actually handle your exceptions you should be specific about the lines you're handling.


I tend to agree, I dispute the premise that this is a problem a bit. I mean, I get not wanting to type the same thing repeatedly, but when reading code later it's really nice to have the logic explicitly in front of me and not hidden behind some other syntax or function. It could possibly be done better, but it doesn't need to be done better, IMO.


Explicit error handling and a lack of generics were a pain when I first started with Go, but after a few years I see them as amazing features.


there are arguments against it but I frankly like it being this explicit as a newbie in Go. So you are not alone :)


I’ve been writing Go for a decade and I’ve liked it from the beginning, but stack trace support is welcome.


Yep - my biggest gripe is seeing errors with no context around them - either a stack trace or...well...the context.Context.


I've just started writing C++ for work, and I have to say, I miss stack traces so much. Even with Go errors the wrapping usually made it easy enough, but with how long it takes to get gdb to launch from core dumps (~1m30 on our binaries at work), I really do miss that extra context.


100% Shameless plug. Stack capture and pretty printer for C++: https://github.com/bombela/backward-cpp


This looks super neat. Thanks!


It is not particularly hard to log stack traces in C++, for example from an abort signal handler or attaching them to exceptions.

There is no standard solution though, so you have to rely on 3rd party libraries or use platform specific solutions.


why not just enable them ? there are plenty of minimally invasive and permissively licensed options... that's like -Werror=return-stack-address it's an entire no-brainer


Well, namely it's not my decision to enable them for the entire codebase :) but I may very well try to talk some people into enabling this on debug builds.


I don't think so. I like that it strongly encourages you to actually add human readable context to errors too.

I think people's issue is that there's no "I don't care about errors, just show me a stack trace" option like you get with exceptions, or more or less with Rust's `?` if you don't use `.context()`.


It becomes painful to write in unit tests.

I really liked their `check` proposal. I hope they will bring it back to life.


Do you use stretchr/testify/assert? I find that it makes error handling in tests as natural as assertions on other properties.


Aside from the "assertion" libraries, which I have mixed feelings about, I often use table-driven tests to avoid this. That way each test is generally one line, like `{"1234", 1234}` for testing an Atoi function. Or, I've seen little test helper functions that "fatal" the test on error -- that way it only takes one line instead of three: https://play.golang.org/p/WIALie5fsgd


The thing I hate about table-driven tests is that they make it slightly harder to identify where an error has occurred, and much harder to debug the specific test case that triggered the error, especially in Go where dlv lacks value breakpoints (or it did last time I checked - if they've added that in the meantime, this becomes much less of a problem).


In my experience, the people who complain about this the most are the least likely (in go or other languages) to pay proper attention to handling errors, and are the most likely to pay way too much attention to how code looks. Not just tidy, but a certain style. They're used to not seeing code in places where errors are emitted (exceptions) and putting magic try/catch blocks that often lead to imprecise error handling.


I've learned to consider it an excellent feature, but is Go's implementation/idiom really better than anything that came before it?

I have marginally more experience with Rust (not that much of either) which instead gives Result types, tuples of 'ok' and error values. And, crucially for this thread, they must be explicitly consumed.

That enforced error handling is novel to Rust afaik, and after a bit of getting used to, an excellent feature I think.

But I'm not sure that the idiomatic (because that's all it is really, rather than language feature?) Go is materially different from deciding all your Python functions will return Tuple[TOk, TErr] and sticking to it, or even that different from returning ok types only and raising exceptions, really.


> That enforced error handling is novel to Rust afaik

Hardly, though Rust of course has merit in making it popular, to the point that people think it was introduced by the language.


Well that's why I said as far as I know. What else has something similar?


Anything with ADTs. I know it from the ML family of languages (OCaml, Standard ML, F#, etc.)


I think one difference between Rust and OCaml is that in Rust non exhaustive pattern matches are an error, while in OCaml they are a warning.


> Anything with ADTs.

Well that's not true, that's completely orthogonal?


It's not. "Forced error handling" in Rust (particularly when contrasted with something like Go like we're doing here) is exemplified by the Option and Result types, and how the compiler will maintain type safety around the concepts of success and error. ADTs are what lets it do that, and the MLs do that as well. You can also see this in how TypeScript for example uses ADTs for typesafe error/success state space enumeration.


But you could have ADTs without using them for 'forced error handling', and you could have 'forced error handling' without using ADTs for it - C could require 'consumption' (for a slightly less strict meaning of it) of return values. That's the bit I think is interesting/was new to me. You can't just call `maybe_works();`, you have to explicitly consume its 'result', even if just to propagate the error as `maybe_works()?;`.


Oh, ok. Yeah that in fact does exist in C and C++ the same way as in Rust. Rust doesn't actually by default require you to consume a return value; a lot of functions are just marked as #[must_use]. C and C++ do the same thing with [[nodiscard]] and __attribute__((warn_unused_result)).

There's a go vet pass as well. https://pkg.go.dev/github.com/golangci/govet#hdr-Unused_resu...


No, not really, it's an accurate summation of where the pattern is commonly found. ADTs, like Rust's enum, enable this kind of error handling, and are made more ergonomic with monad behaviors.

Scala's Either[A, B] is another example.


Yes, but Haskell's/Scala's Either is abstract while Rust's Result is specific. SFBAP (Sorry For Being A Pedant).


Can you clarify the distinction between "abstract" and "specific"?

Is Scala's Try[A] "specific"?


Sure. Result is specific, you get a value or an error. Either can be used the same way, but you don't have to. The convention is use Right for the value and Left for the error but you can use it in other ways too, eg Left for an Int and Right for a Float.


Haskell's Maybe type (monad?) comes to mind.


That's true. It's been a while since I did even a little bit of Haskell, but yes. I suppose I didn't think of it because it feels less like 'having to handle' it in Haskell, since that's the norm anyway.

Whereas if you compare Rust to Python, C(++), or Go as I was - having to consume a returned 'result' is more notable.


I think you can just use the `Callers()` function from the "runtime" package to get the call stack, though in general I think it's a bit overkill. I think what you should do instead is wrap errors with '%w' [1], manually adding context to them as you pass them up your call stack. That leads to more readable code and avoids the cost of capturing a stack trace. It's tempting to have the call stack available for debugging, but IMHO it creates too much overhead when doing it indiscriminately for all errors you generate, which for me also goes a bit against the "spirit" of Golang.

[1]: https://go.dev/blog/go1.13-errors
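
A minimal sketch of both ideas together, in case it helps: capture the stack with `runtime.Callers` only where an error originates, then add context with `%w` on the way up. The `stackError` type and the `newStackError`, `stackOf`, `readConfig` and `startServer` names are all hypothetical, made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"strings"
)

// stackError is a hypothetical error type that records where it was created.
type stackError struct {
	msg string
	pcs []uintptr // raw program counters, resolved to names only on demand
}

func (e *stackError) Error() string { return e.msg }

// newStackError captures up to 32 frames. Skip 2 drops runtime.Callers and
// newStackError itself, so the first recorded frame is the caller.
func newStackError(msg string) error {
	pcs := make([]uintptr, 32)
	n := runtime.Callers(2, pcs)
	return &stackError{msg: msg, pcs: pcs[:n]}
}

// stackOf finds a stackError anywhere in the wrap chain and formats its
// frames. It returns "" if no error in the chain carries a stack.
func stackOf(err error) string {
	var se *stackError
	if !errors.As(err, &se) {
		return ""
	}
	var b strings.Builder
	frames := runtime.CallersFrames(se.pcs)
	for {
		f, more := frames.Next()
		fmt.Fprintf(&b, "%s\n\t%s:%d\n", f.Function, f.File, f.Line)
		if !more {
			break
		}
	}
	return b.String()
}

func readConfig() error {
	return newStackError("config file missing")
}

func startServer() error {
	if err := readConfig(); err != nil {
		// %w keeps the original error reachable via errors.Is / errors.As.
		return fmt.Errorf("start server: %w", err)
	}
	return nil
}

func main() {
	err := startServer()
	fmt.Println(err)
	fmt.Print(stackOf(err))
}
```

Storing the program counters and resolving them lazily keeps the capture itself fairly cheap, but it is still work you pay for on every error created this way, which is the overhead being debated here.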


How many errors are you creating that you would start caring about overhead? And why do you think code is more readable if it's littered with guesses on what may be important to track down a problem, rather than the simple call stack?


In my typical Go code many errors are handled internally, e.g. "not found" type of errors returned by database methods. It would be pretty wasteful extracting full stack traces for these and in high-performance scenarios this can really hurt you.

I think stack traces should be mostly reserved for interactive debugging and should not be included in user-facing errors. Why should the users of your program care that foo() called bar() called baz() which then produced an error? They want to know what went wrong and how to fix it (if it's "their" fault), and that is much easier if they get proper, context-specific errors (e.g. "CLI argument 'limit' must be between 1-10"). And if you need a stack trace for debugging a problem you should simply use panic().
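
To make the "context-specific error" point concrete, here is a small sketch. `validateLimit` is a hypothetical function and the message is the one from the paragraph above, with the offending value appended.

```go
package main

import "fmt"

// validateLimit is a hypothetical check for a CLI flag named "limit".
// The message tells the user what is wrong and what range is accepted,
// which a stack trace would not.
func validateLimit(limit int) error {
	if limit < 1 || limit > 10 {
		return fmt.Errorf("CLI argument 'limit' must be between 1-10, got %d", limit)
	}
	return nil
}

func main() {
	for _, limit := range []int{5, 42} {
		if err := validateLimit(limit); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("limit ok:", limit)
	}
}
```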


Because the call stack does not include the values of function call arguments. A good error message does include the argument values relevant to the error AND those of the relevant variables. But I agree it takes upfront effort and often also needs adjustments when debugging an issue later (i.e. adding more info to the error message and rerunning).



One extra tip when wrapping error return values is to use the wrapcheck (https://golangci-lint.run/usage/linters/#wrapcheck) linter. This will tell you when you're returning an error without wrapping it.


Like the article mentions, they didn't bring over stack traces (namely the `Formatter` interface) from xerrors. I wrote a library[1] around it that would generate true stack traces. I don't use it as much as I used to, because I don't want to depend on a package like xerrors that I don't trust to remain maintained, but it was a fun exercise at the time, and very useful while I used it. I wish we didn't have to depend on a tool like Sentry to bring this about, like the author suggests.

[1] https://github.com/ollien/xtrace


github.com/pkg/errors is also an amazing option


Yeah, for sure! I mention it in the README of that library, but one of the motivations I had was to not require you to wrap the errors with my library, like pkg/errors requires you to.


People seem to fixate on stack traces, because other languages present nearly every error in stack trace form. I think you should think about why you want them and make sure you have a good reason before mindlessly adding them. I do collect stack traces in some Go code, because Sentry requires it for categorization, but in general, you can do a much better job yourself, with very little sorcery involved.

A common problem is that when multiple producers produce failing work items and send them to a consumer -- a stack trace will just show "panic -> consumer.doWork() -> created by consumer.startWork()". Gee, thanks. You need to track the source, so that you have an actionable error message. If the consumer is broken, fine, you maybe have enough information. If a producer is producing invalid work items, you won't have enough information to find which one. You'll want that.
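
One way to track the source, sketched with made-up names (`workItem`, `process`, `consume`, and the "billing"/"signup" producers are all hypothetical): each item carries the name of the producer that created it, and the consumer puts that name into the error.

```go
package main

import (
	"errors"
	"fmt"
)

// workItem carries the name of the producer that created it, so a failure
// in the consumer can be traced back to its source.
type workItem struct {
	producer string
	payload  int
}

var errNegativePayload = errors.New("negative payload")

// process is a stand-in for real work; it rejects negative payloads.
func process(it workItem) error {
	if it.payload < 0 {
		return errNegativePayload
	}
	return nil
}

// consume drains the channel and reports every failure with its origin.
func consume(items <-chan workItem) []error {
	var errs []error
	for it := range items {
		if err := process(it); err != nil {
			errs = append(errs, fmt.Errorf("item from producer %q: %w", it.producer, err))
		}
	}
	return errs
}

func main() {
	items := make(chan workItem, 3)
	items <- workItem{producer: "billing", payload: 10}
	items <- workItem{producer: "signup", payload: -1}
	items <- workItem{producer: "billing", payload: 7}
	close(items)

	for _, err := range consume(items) {
		fmt.Println(err)
	}
}
```

A stack trace taken inside `consume` would look identical for every item, while the wrapped message names the producer to go and look at.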

The idea of an error object is for the code to make a decision about how to handle that error, and if it fails, escalate it to a human for analysis. The application should be able to distinguish between classes of failures, and the human should be able to understand the state of the program that caused the failure, so they can immediately begin fixing the failure. It's up to you to capture that state, and make sure that you consistently capture the state.

Rather than leaving it to chance, I have an opinionated procedure:

1) Every error should be wrapped. This is where all the context for the operator of your software comes from, and you have to do it every time to capture the state of the application at the time of the error.

2) The error need not say "error" or "failure" or "problem". It's an error, you know it failed. As an example, prefer "upgrade foos: %w" over "problem upgrading foos: %w". (The reason is that in a long chain, if everyone does this, it's just redundant: "problem frobbing baz: problem fooing bars: problem quuxing glork: i/o timeout". Compare that to "frob baz: foo bars: quux glork: i/o timeout".)

But if you're logging an error, I pretty much always put some sort of error-sounding words in there. Makes it clear to operators that may not be as zen about failures as you that this is the line that identifies something not working. "2021-08-23T20:45:00.123 PANIC problem connecting to database postgres://1.2.3.4/: no route to host". I'm open to an argument that if you're logging at level >= WARNING that the reader knows it's a problem, though. (I also tend to phrase them as "problem x-ing y" instead of "error x-ing y" or "x-ing y failed". Not going to prescribe that to others though, use the wording that you like, or that you think causes the right level of panic.)

3) Error wrapping shouldn't duplicate any information that the caller already has. The caller knows the arguments passed to the function, and the name of the function. If its error wrapping needs those things to produce an actionable error message, it will add them. It doesn't know what sub-function failed, and it doesn't know what internally-generated state there is, and those are going to be the interesting parts for the person debugging the problem. So if you're incrementing a counter, you might do a transaction, inside of which is a read and a write -- return "commit txn: %w", "rollback txn: %w", "read: %w", "write: %w", etc. The caller can't know which part failed, or that you decided to commit vs. roll back, but it does know the record ID, that the function is "update view count", etc.

The standard library violates this rule, probably because people "return err" instead of wrapping the error, and this gives them a shred of hope. And, it's my made-up rule, not the Go team's actual rule! Investigate those cases and don't add redundant information. (For example, os.ReadFile will have the filename in the error message, because it returns an fs.PathError, which contains that. net.Dial is another culprit. Make a list of these and break the rule in these cases.)

4) Any error that the program is going to handle programmatically should have a sentinel (`var ErrFoo = errors.New("foo")`), so that you can unambiguously handle the error correctly. (People seem to handle io.EOF quite well; emulate that.)

You can describe special cases of your error by wrapping it before returning it, `fmt.Errorf("bar the quux: %w", ErrFoo)`.
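
Here is a rough sketch of the four rules applied to the view counter example. The `store` type and the `incrementViews`, `handleView` and `isNotFound` names are hypothetical, and an in-memory map stands in for the database and its transaction.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a sentinel that callers can test for with errors.Is (rule 4).
var ErrNotFound = errors.New("record not found")

// store is a hypothetical in-memory stand-in for a database.
type store map[string]int

func (s store) read(id string) (int, error) {
	n, ok := s[id]
	if !ok {
		return 0, ErrNotFound
	}
	return n, nil
}

// incrementViews names only the internal step that failed (rule 3): the
// caller already knows the record ID and that it asked for an increment.
func incrementViews(s store, id string) error {
	n, err := s.read(id)
	if err != nil {
		// Terse prefix, no "problem" or "failed" (rule 2).
		return fmt.Errorf("read: %w", err)
	}
	s[id] = n + 1
	return nil
}

// handleView wraps again (rule 1), adding what only it knows: the operation
// and the record ID.
func handleView(s store, id string) error {
	if err := incrementViews(s, id); err != nil {
		return fmt.Errorf("update view count for %q: %w", id, err)
	}
	return nil
}

// isNotFound shows the programmatic check that the sentinel enables.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	s := store{"home": 41}
	fmt.Println(handleView(s, "home"), s["home"])

	err := handleView(s, "missing")
	fmt.Println(err)
	fmt.Println(isNotFound(err))
}
```

The failing call reads as `update view count for "missing": read: record not found`, and `errors.Is` still sees the sentinel through both layers of wrapping.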

Finally, since I talked about logging, please talk about things that DID work in your logs. Your logs are the primary user interface for operators of your software, but often the most neglected interface point. When you see something like "problem connecting to foo\nproblem connecting to foo", you're going to think there's a problem connecting to foo. But if you write "problem connecting to foo (attempt 1/3)\nproblem connecting to foo (attempt 2/3)\nconnected to foo", then the operator knows not to investigate that. It worked, and the program expected it to take 3 attempts. Perfect. (Generally, for any long-running operation a log "starting XXX" and "finished XXX" are great. That way, you can start looking for missing "finished" messages, rather than relying on self-reported errors.)
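
A small sketch of that retry logging, assuming a hypothetical `connectWithRetry` helper. It returns the log lines instead of printing them only so the output is easy to inspect; `dial` stands in for a real connection attempt.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoRoute = errors.New("no route to host")

// connectWithRetry calls dial up to maxAttempts times and returns the log
// lines it would emit, including the start and the eventual success.
func connectWithRetry(name string, maxAttempts int, dial func() error) ([]string, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	lines := []string{fmt.Sprintf("starting connect to %s", name)}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = dial(); err == nil {
			lines = append(lines, fmt.Sprintf("connected to %s", name))
			return lines, nil
		}
		lines = append(lines, fmt.Sprintf("problem connecting to %s (attempt %d/%d): %v",
			name, attempt, maxAttempts, err))
	}
	return lines, fmt.Errorf("connect to %s: %w", name, err)
}

func main() {
	calls := 0
	// Fail twice, then succeed on the third attempt.
	dial := func() error {
		calls++
		if calls < 3 {
			return errNoRoute
		}
		return nil
	}

	lines, err := connectWithRetry("foo", 3, dial)
	for _, l := range lines {
		fmt.Println(l)
	}
	fmt.Println("final error:", err)
}
```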

(And, outside of HN comments, I would make that a structured log, so that someone can easily select(.attempt == .max_attempts) and things like that. It's ugly if you just read the output, but great if you have a tool to pretty-print the logs: https://github.com/jrockway/json-logs/releases/tag/v0.0.3)
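
For illustration, a structured version of one of those lines using only encoding/json. The `logEntry` type and its field names are assumptions, picked so that a query like select(.attempt == .max_attempts) would work; they are not taken from any particular logging library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logEntry is a hypothetical structured log record.
type logEntry struct {
	Level       string `json:"level"`
	Msg         string `json:"msg"`
	Attempt     int    `json:"attempt"`
	MaxAttempts int    `json:"max_attempts"`
	Error       string `json:"error,omitempty"`
}

// formatEntry renders one entry as a single JSON line.
func formatEntry(e logEntry) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", fmt.Errorf("marshal log entry: %w", err)
	}
	return string(b), nil
}

func main() {
	line, err := formatEntry(logEntry{
		Level:       "warn",
		Msg:         "problem connecting to foo",
		Attempt:     1,
		MaxAttempts: 3,
		Error:       "no route to host",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(line)
}
```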

Anyway, I guess where this rant goes is -- errors are not an afterthought. They're as much a part of your UI as all the buttons and widgets. They will happen to all software. They will happen to yours. Some poor sap who is not you will be responsible for fixing it. Give them everything they need, and you'll be rewarded with a pull request that makes your program slightly more reliable. Give them "unexpected error: success", and you'll have an interesting bug report to context-switch to over the course of the next month, killing anything cool you wanted to make while you track that down.


I like your argument in principle, but in my experience automatic exception stack traces in logs can get you ~80% of the way there with 0 effort, even more if there are enough happy path logs as well (e.g.

  $time, INFO, reading from /tmp/some-file
  $time, ERROR, FileNotFoundException at $whole-stack-trace
).

Conversely, the Go ecosystem is actually extremely bad with errors, using error strings almost everywhere, and adding very little context at that.


I think it's a wash. Go programs emit "fatal error: success" less than C programs, which is nice. An exception is fine if it has the information you need to debug the problem, but the names of the functions along the way aren't enough on their own. Consider: "program failed: Get "https://does-not-exist.invalid/": dial tcp: lookup does-not-exist.invalid on 8.8.8.8:53: no such host" vs. "IOException: no such host" and then a bunch of stack frames that don't tell you someone put "does-not-exist.invalid" in the config file. (Usually the operator won't type that, they'll type correct-hostnmae.internal instead of correct-hostname.internal, in a config file with 100 other non-one-letter-typo'd hostnames ;)


The way I see it, Proper error logging > Stack traces >> No stack traces. Proper error logging requires effort whatever you do. In some languages, even if you don't put in any effort, you still get stack traces (most exception-based languages), in others you get almost no context (C++, Go, Haskell etc).

If your team is spending time on good error messages, then it doesn't matter as much what language-level error support there is. If this time isn't being spent, I think you'll have a much nicer debugging experience in a system which at least collects stack traces.

Even for your example, let's say the stack trace is:

  NoSuchHostException in stdlib.LookupHost() line 341
  called from stdlib.HttpGet() line 6131
  called from applicationLogic.DoSomething() line 123
  called from main.main() line 3
In Go, with similar effort spent on error reporting along the way (if err != nil { return nil, err } vs automatic exception propagation) you'd find

  Failed to do something: "no such host"
I think there's a much better chance to figure out what happened in the first case.
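
For comparison, a sketch of the two Go styles side by side: the bare `return err` at each level versus one `%w` wrap per level. `lookupHost` is a hypothetical resolver that always fails, so the example needs no network, and the other function names are made up as well.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoSuchHost = errors.New("no such host")

// lookupHost is a hypothetical resolver that always fails.
func lookupHost(host string) error { return errNoSuchHost }

// Bare propagation: every level returns err unchanged.
func fetchBare(host string) error {
	if err := lookupHost(host); err != nil {
		return err
	}
	return nil
}

func doSomethingBare(host string) error {
	if err := fetchBare(host); err != nil {
		return err
	}
	return nil
}

// Wrapped propagation: every level adds the one thing only it knows.
func fetchWrapped(host string) error {
	if err := lookupHost(host); err != nil {
		return fmt.Errorf("lookup %s: %w", host, err)
	}
	return nil
}

func doSomethingWrapped(host string) error {
	if err := fetchWrapped(host); err != nil {
		return fmt.Errorf("do something: %w", err)
	}
	return nil
}

func main() {
	const host = "does-not-exist.invalid"
	fmt.Println("bare:   ", doSomethingBare(host))
	fmt.Println("wrapped:", doSomethingWrapped(host))
}
```

The bare version ends up with just "no such host", which is the situation described above. The wrapped version costs one extra line per level and carries the hostname, which a list of function names would not.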


> This end solution isn’t amazing, but it works. We’d of course far prefer if Go itself had a built-in equivalent.

Totally agree.

For a package like SciPipe, which we have managed to develop with zero dependencies, to maximize future reproducibility of scientific pipelines, it would very much hurt to bring in the first external (Go-) dependency, while at the same time, we really really would be much helped by stack traces.


From what I understand, Go errors don't have stack traces by default for performance reasons.

If all errors had stack traces, everything would be much slower.


> At boundaries between our code and calls out to external packages, make an effort to always wrap the result with xerrors.Errorf. This ensures that we always capture a stack trace at the most proximate location of an error being generated as possible

I've been doing something similar for a while, using `errors.WithStack` from https://github.com/pkg/errors

The error can then be logged with https://github.com/rs/zerolog like this `log.Error().Stack().Err(err).Msg("")`

For human readable output (instead of the standard JSON) use a console writer, see https://github.com/mozey/logutil





