Hacker News
Show HN: A utility for running exhaustiveness checks on “sum types” in Go (github.com)
75 points by burntsushi 217 days ago | hide | past | web | 38 comments | favorite



Language-level sum types are something that I miss a lot when I'm writing Go. I find myself using this error-prone pattern whenever I'm trying to represent state:

    type PersonType string

    const (
        EmployeePersonType PersonType = "Employee"
        ManagerPersonType PersonType = "Manager"
        ClientPersonType PersonType = "Client"
    )
I'm not the only one[1]. Another option is to use iota.

    type PersonType int

    const (
        EmployeePersonType PersonType = iota
        ManagerPersonType
        ClientPersonType
    )
You'll then use generators such as stringer or jsonenums[2] to take care of translating your internal representations into something more system-level and printable. This gets very messy, because now we need to write un/marshallers and code generators for SQL databases, NoSQL databases, JSON endpoints, etc. I'm likely to just keep doing what I'm doing.

Although my described approaches aren't great, I'm glad to see people thinking about how to improve it with tooling. For reference, this is the same thing in Haskell:

    data Person = Employee | Manager | Client
I really prefer this format because the set of possible Person type options is explicitly defined and easy to exhaustively check. You'll get no type coercion from arbitrary strings like "Daughter".

And you can also attach extra data using type constructors:

    data ApptTime = NotScheduled | WalkIn | At UTCTime
At least by attempting to use sum types in Go, you get some compiler checks, which are a lot better than none.
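The usual Go encoding of something like ApptTime is a "sealed" interface, which is the pattern the submitted tool checks. A minimal sketch (all names are illustrative):

```go
package main

import (
	"fmt"
	"time"
)

// ApptTime is a "sealed" interface: the unexported method means only
// types in this package can implement it, approximating a closed sum.
type ApptTime interface {
	sealedApptTime()
}

type NotScheduled struct{}
type WalkIn struct{}
type At struct{ When time.Time }

func (NotScheduled) sealedApptTime() {}
func (WalkIn) sealedApptTime()       {}
func (At) sealedApptTime()           {}

func describe(t ApptTime) string {
	// The compiler will NOT complain if a case is missing here;
	// that's the gap an exhaustiveness-checking tool fills.
	switch v := t.(type) {
	case NotScheduled:
		return "not scheduled"
	case WalkIn:
		return "walk-in"
	case At:
		return "at " + v.When.Format(time.RFC3339)
	}
	return "unreachable"
}

func main() {
	fmt.Println(describe(WalkIn{})) // walk-in
}
```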

Has anyone figured out a better way?

[1] https://github.com/kubernetes/kubernetes/blob/master/pkg/api...

[2] https://github.com/campoy/jsonenums


You may want to take a look at https://github.com/alvaroloes/enumer which can generate functions around your enum to give/accept string representations, plus JSON and SQL scanning.


Isn't trying to hack a language to have the type system you wish it had, rather than simply using the language you wanted in the first place, something of a futile exercise?

If you want a language with ADTs and exhaustive patterns out there, there's plenty to choose from. Using a language with a weak type system and then trying to wish that fact away seems pointless.


If you'd like to point us to a language exactly like Go in every way, except that it does have variant types built in, then by all means.

Excepting that, we all have to accept that languages have their pros and cons, and we are each responsible for weighing those pros and cons against our personal preferences and values.


Not everybody gets to choose the language they want to use at work unfortunately.


We get to choose, when there is freedom to choose whom we work for.

Of course, changing jobs isn't always an option.


And someone can still love their job even if the language in common use isn't their personal preference.


That's the case for me at least. I was even involved in choosing Go at our company a few years back. A language doesn't need to be perfect in order to enjoy using it. :-)


The Haskell tax is high in my city as is the cost of living.


1. It's a bit difficult to change languages when you already wrote tens of thousands of lines of code for your project in one language.

2. Every language has its downsides. You can't mix and match features.


This is an old argument. If a language allows a switch statement where no case being executed is an option, there's an opportunity for bugs. If a language doesn't allow that, some people bitch about it being too pedantic.

A strong argument for insisting on covering all the cases (possibly with an explicit default) is that someday, someone may add a new enum value, type, or whatever, and code that fans out on that value may invisibly become invalid.
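A small sketch of that defensive style in Go, with a default case that fails loudly instead of invisibly (names are made up for illustration):

```go
package main

import "fmt"

type Color int

const (
	Red Color = iota
	Green
	Blue
)

func name(c Color) string {
	switch c {
	case Red:
		return "red"
	case Green:
		return "green"
	case Blue:
		return "blue"
	default:
		// If someone adds a fourth Color, this blows up at runtime
		// instead of silently returning a zero value. A static
		// exhaustiveness check would catch it before running at all.
		panic(fmt.Sprintf("unhandled Color: %d", c))
	}
}

func main() {
	fmt.Println(name(Green)) // green
}
```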


Cool! I didn't honestly expect to see more Go code from BurntSushi. Thought he was a Rust convert :)

(not that someone can't be both, of course)


> (not that someone can't be both, of course)

Indeed! I've been writing Go code daily for years. My open source life is just a bit more consumed by Rust these days. :-)


From the readme:

> The go-sumtype command recognizes this pattern, but it needs a small amount of help to recognize which interfaces should be treated as sum types

Why is this? Have you tried detecting sum types automatically? Maybe some heuristics like: package-level declaration of the interface, has an unexported method, has more than one type that implements it, maybe also the interface itself is exported.


It seems unwise to do exhaustiveness checks automatically on an interface that "might" be a sum type. The declaration is a single line and has the benefit of being completely unambiguous.


> (not that someone can't be both, of course)

The Go vs Rust narrative seems to have thankfully mostly gone away (although I'm sure it will never die out completely). The areas they're targeted at and the things they're best for have now become clearer to the community at large, and they don't necessarily overlap that much.

That said, I don't use either of them at work, so I'd be interested to hear a counter-argument.


> The Go vs Rust narrative seems to have thankfully mostly gone away

I am not aware of that. Has something changed recently? Though IMO Rust's strengths could lead to interesting usage in data processing/analytics platforms like Hadoop/Spark, as opposed to Go's niche in web services.


"Go vs. Rust" was initially only a thing because 1) Google initially positioned Go as a systems programming language, 2) early prototypes of Rust were very Go-like (with both a garbage collector and segmented stacks), and 3) web developers who were already used to seeing Chrome vs. Firefox found it an easy conceptual extension. But Go hasn't been marketed as a systems language for a long time and Rust removed all its Go-like features years ago; the "Go vs. Rust" that we still see is just residual fallout.


It's funny, as a Go and Rust fan, they're still in the same space to me conceptually.

I'm sticking with Go at the moment because moving my shop to Rust is quite difficult (for varying reasons), and because Go offers a nice compromise on ease and power. However, I miss non-nullable types so badly. All that compile-time information is amazing.

I really hope Go adopts some features from Rust.


What are the areas that they target? I'm not too familiar with the communities surrounding either language.


Go tends to be more Web related - back end services communicating over networks, etc. It's taken over things that in the past would often have been done in languages like Python, but as scale increases, performance becomes more important. Google's backend web services are the classic example of usage.

Rust is more in the C/C++ role - 'systems programming'. This leans more towards things like command line tools, system daemons, OS kernels, embedded work on small processors etc. Doing performance intensive tasks in Firefox is probably Rust's most well known application.

That's not to say there aren't areas of overlap where they can both work well, but those aren't so much the areas of emphasis.


This looks very nice. I've been working with sum types (as far as they can be called that) in Go a lot recently, and it's frustrating how bad it is at algebraic data types in general.

Aside from the lack of exhaustiveness checks, one particularly annoying area is writing code for enumeration and transformation. I needed one set of types to have this:

    type Node interface {
      Walk(WalkFunc)
      TransformBottomUp(TransformFunc) (Node, error)
      TransformTopDown(TransformFunc) (Node, error)
    }
I implemented it as an interface precisely because of the lack of exhaustiveness checks, even though it feels wrong to attach code to my pure-data types.

But the code to do it is pure boilerplate. I considered doing something like having GetChildren() and SetChildren() and writing some generic transformer code, but then you have to deal with slices, which means a lot of allocation. I also considered having interfaces like Nonary, Unary, Binary, etc. that encapsulate the number of children, but that's terrible, too.

I might go with simple switch statements instead, and use this utility.


I think that switching on types in Go is generally a total anti-pattern.

In case you need to handle errors, in the end the type doesn't matter to you at all. You are interested in what the implications of the error are, so the library exposing the error should just expose functions like isTemporal(e error) bool. In case you need to inform the user about the exact error, no problem: the returned error has the Error() function, which shows the message no matter the type; use that.

In case you're not matching errors, then you shouldn't worry about the type. The function returns an interface because all that should matter to you is that interface with its common behavior.

Otherwise you're just fighting the language, which doesn't make any sense, as you can just switch to Rust then.

EDIT: Though I'm really happy that people make more tools for analyzing Go code. Good job!


> I think that switching on types in Go is generally a total anti-pattern.

It's not. It's a standard way of implementing variant types. See https://golang.org/doc/faq#variant_types and https://golang.org/src/go/ast/walk.go?s=1311:1342#L41

True, it's not used often. Sum types aren't a great fit for Go (as the FAQ says), but sometimes they are exactly what you want.

> Otherwise you're just fighting the language, which doesn't make any sense, as you can just switch to Rust then.

I find these types of comments really dismissive and unhelpful. A single comment declaration and one Sunday's worth of hacking (thanks to the incredible tooling provided by Go) makes my experience using Go better. Why do I need to switch to Rust? It would take several months and significant social capital to do. (And I'm saying this as a member of the Rust library team!)


Actually, after reading the walk.go code I still wonder what the best way to do it would be. (Not saying it's bad now, though I'll still play devil's advocate.)

As I skimmed through the code, in my opinion you could easily have a Walk(v Visitor) method on each of those node types.

This way you'd have no need for casting, and in the Walk method you could use type-specific functionality directly. The only disadvantage I see is that the walking functionality would be spread out among different places in the code, but this isn't a problem with modern editors.

I'll note this down somewhere and if I have time I'll test it out sometime.



Anyway, my conclusion is: in most cases it IS an anti-pattern. It's justified, I think, only if the, let's say, "observing" function (the one that decides on the type) is really just observing behavior and trying not to intertwine itself with the mechanics of the actual structures. In that case it provides greater cleanliness of your code.

In this case though, as this is the ast package, and walking is a standard functionality of an ast, I think this could just as well be implemented with interfaces.


You should tell that to the Go team so they can stop doing it everywhere.

Edit: I realize this came off as snarky, sorry.

I'm somewhat serious though. In my current Go installation I count 519 type switches in the standard library.

Some exist for perfectly valid factoring reasons. For example, the x509 package switches on the type of the public key it receives in order to look up some x509-specific values for encoding. It would be a layering violation for the public keys themselves to have x509-specific encoding values attached, and Go doesn't permit the x509 package to declare new methods on types defined in another package. (They _could_ define a new interface which extends PublicKey, and declare a new local wrapper type for each supported public key type, embedding the external ones and then attaching methods as needed - but this is sufficiently clunky that I've never seen it done in Go core.)

But there are plenty of other cases, especially in the crypto packages, where the type switching seems gratuitous. For example the crypto.PublicKey interface is actually empty - 100% of public key functionality is implemented through some kind of type assertion or switching. Maybe there really is a good reason for this, but it's not obvious to me.


"I think that switching on types in Go is generally a total anti-pattern."

It's not.

A thing to bear in mind is that this tool is checking interface types that are strictly internal to a package. They contain a package-private implementation method. It is a very common pattern for me to have a package implement a server internally that runs in a goroutine or set of goroutines that is internally communicated with along a "chan command", and does a type switch within its main loop to dispatch the commands. It is useful to have completeness checking on this sort of thing. It would also make sense to have AST nodes, header types, or any number of other internal usages handled this way. The addition of an external tool to do completeness checking goes 80% of the way towards resolving people's complaints with this pattern, as long as they don't mind an external tool. (Which I acknowledge is a matter of strong opinion.)
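A minimal, self-contained sketch of that "chan command" server pattern (all names invented for illustration):

```go
package main

import "fmt"

// command is a package-internal sum: the unexported method seals it,
// which is exactly the shape the completeness checker looks for.
type command interface{ cmd() }

type get struct {
	key   string
	reply chan string
}
type set struct{ key, val string }
type stop struct{ done chan struct{} }

func (get) cmd()  {}
func (set) cmd()  {}
func (stop) cmd() {}

// serve is the single goroutine that owns the map; all access is
// funneled through the chan and dispatched by a type switch.
func serve(cmds chan command) {
	store := map[string]string{}
	for c := range cmds {
		switch c := c.(type) {
		case set:
			store[c.key] = c.val
		case get:
			c.reply <- store[c.key]
		case stop:
			close(c.done)
			return
		}
	}
}

func main() {
	cmds := make(chan command)
	go serve(cmds)
	cmds <- set{key: "lang", val: "Go"}
	reply := make(chan string)
	cmds <- get{key: "lang", reply: reply}
	fmt.Println(<-reply) // Go
	done := make(chan struct{})
	cmds <- stop{done: done}
	<-done
}
```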

Your criticism seems to focus on how objects present themselves externally. From that point of view, it is correct and I generally agree with it... the only issue is that it's not relevant here, because we're talking strictly about internal objects.


Separately, the way Go is a very, very simple language that is picking up a lot of useful tooling makes me wonder about the possibility of designing a language that even more deliberately reifies that. What if you had something like a Lisp, but instead of the macros staying in the code you had a culture of resolving the macros into the simple language and publishing the end result instead of the macros? Bearing in mind that the very purpose of such a structure would be to forcibly ensure that the core language remains simple and the shared libraries stay simple, the core idea here is to use a second-order effect to constrain the power of macros and macro-like things, so I'm not "surprised" by the way this makes macros or syntax checks less powerful, nor am I surprised that writing such a tool requires a higher "activation energy" than simply plopping a macro down. It is well known that macros don't stay simple and often grow quite complicated; some people attribute this, and the fact that macros become their own language so that no two Lisp libraries are necessarily written in the "same language", to why Lisp hasn't managed to conquer the world.

I'm noodling around with ideas here, not promoting Go as this language; if it were designed for that there are probably some things it would do differently. I just find it sort of intriguing that if you don't mind using tools, a lot of the common criticisms of Go can be largely mitigated without necessarily paying the price of putting the complexity in the language itself; even if that doesn't personally appeal to you it's interesting to consider it as a deliberate design paradigm. Is this a good idea? It's almost like having a modular language where you only use the complex stuff where you really get an advantage from it but don't pay for that stuff where you don't. You couldn't use this to bodge in things like the Rust borrow checker that really needs the whole program to use it to obtain the benefits, and a lot of what Haskell does with the typechecker would be infeasible, but it's not hard to see things like "sum type completeness checking" and "generics via code templating" and even things as large as "optional typing" being feasible this way, without the core language authors having to arbitrate all these things, and without the whole community necessarily having to adopt these things to see advantages. Compare with, for instance, the rather large mess Python has become over the years by absorbing feature after feature into the core language. See also the gometalinter [1], in which the community aggregates a collection of smaller tools all written fully independently, with little-to-no coordination needed, all standalone, and then tied into what is, frankly, not a half bad language linter, especially if you play with the config a bit.

[1]: https://github.com/alecthomas/gometalinter


> resolving the macros into the simple language and publishing the end result instead of the macros?

That's basically called code generation. Visual Studio had it more than twenty years ago in the form of the "MFC App Wizard".

Problem is, what if the macro-generated code had a bug, and it was expanded in 77 different places (not to mention in subtly different ways)? We want to correct the macro in one place and re-expand it.

Writing code-generating code, but then just retaining the results of the generation, is a serious "anti pattern".


"Writing code-generating code, but then just retaining the results of the generation, is a serious "anti pattern"."

I wasn't clear. In the Go community, you don't retain just the results of the generation, you retain the source as well, you just publish the results.

Furthermore, when someone says they're musing about ideas, "That's an antipattern" isn't a helpful contribution. Why is it an antipattern? Go deep, not just the superficial reasons. The reasons matter, because it may turn out the reasons don't apply, or that it was an antipattern because of specific conditions in the local landscape that may either not apply or could be made to not apply (e.g., functional programming is an antipattern in Go and the only way to work in Haskell; declarations of "antipattern" are almost always only locally appropriate).

Further, what prompted my musing in the first place is that you can convert "That's an antipattern" into the more scientific theory-type expansion of "If you do that, you will produce unmaintainable code." However, in the Go community, to some extent, we see people doing "that" but not ending up with unmaintainable code. (Please do not simply assert that it must be unmaintainable because that's what this antipattern produces, as that is begging the question. I see code that I would not describe as "unmaintainable" coming out of this process. Such as in the very thing that prompted this discussion; there is nothing about an external totality checker that makes the code unmaintainable.) Reality trumps theory. Since the theory that code generation is often a bad idea is indeed fairly well established, that tends to imply that rather than the theory being "wrong" it is incomplete. What if the community has oversimplified its view of code generation? What if there is a particular combination of difficulty and benefit that might allow us to harness it, as I said, to more truly modularize a language in a way that the entire language is not always pressing down on a programmer in every line?

My theories aren't matching reality, so I find myself interested in the why of that mismatch.


Another clear language-vs-tool split present in Go relates to returning the error type. The language could enforce that return values of type error must always be assigned (and used). But it doesn't, and now there's the errcheck tool.

https://github.com/kisielk/errcheck

I agree that having things like errcheck and go-sumtype all wired up using gometalinter is pretty novel and only partially gross.


> The function returns an interface because all that should matter to you is that interface with its common behavior.

I disagree. If a function wants to limit what you care about then it should return a concrete type. Interfaces exist to increase flexibility, and that includes being able to assert an implementation to gain safe access to the further structured information that underlying object contains.

> Otherwise you're just fighting the language, which doesn't make any sense, as you can just switch to Rust then.

No, using a language feature is not fighting the language. On the contrary, enforcing a pattern of increasing API surface area for functionally checking properties (i.e. the Dave Cheney error method) that would be much more cleanly accessed as structured data is fighting the language, and in my opinion borders on an anti-pattern.


I made a similar comment on the reddit post. This seems like something handled better by interfaces.

In my experience, whenever I end up writing a large type switch on an interface I take it as a code smell. All those cases should be an interface method. Only in rare instances (base case serialization of builtin types for example) would I really think type switches make sense.


Golang irritates people for various reasons. For some reason I've begun to feel the language was partly designed to put in their place the prima donnas who push theoretical languages in blogs all day rather than shipping.

These Haskell and Scala people talk talk talk all about correctness and syntax tricks. Meanwhile golang is nimble and just winning every race. I wish they'd learn from it rather than get so defensive.

Also, it's just confirmation that people get too attached to complicated features in languages as a crutch. YAGNI.

Programming can be straightforward and clear, without spiraling into a contest of clever tricks. You look clever when you get product out the door.

I've been programming for 10 years professionally, and 15 years as a hobby. I'd never heard of a sum type until yesterday.

I've got nothing against people dabbling and hacking on the weekend, but sometimes it irks me when people split hairs over being "correct" and never actually get stuff done.

I think a lot of the "syntax tricks" are a mechanism people use to feel superior. It's like, "hey, if I can't actually build something useful, at least I can blog about how to write a monad transformer".

I'm not trying to be offensive, I'm just failing to see the value in this and trying to make sense of what I'm seeing. Where was the business issue that could only be solved through sum types in the language syntax? Enlighten me.


In our code, we have a closed interface type with many variants, not unlike an AST. We frequently add new variants to that interface. It is nice to have a tool tell us which parts of the code need updating to account for the new variant. This tool does that. It trivially reduces bugs and saves development time.

If you think people like me "don't get stuff done," then please, take a stroll through my Github. ;-)


Speaking from my own experience, I don't get the opportunity to use them; rather, I have to use a language with fewer features in the workplace. The articles are meant to persuade people to use languages with more features. I feel at home with FP, HKT, and ADTs; actually, I'm out of my comfort zone without them.



