>The Tree example illustrates another general guideline: when you need something like a comparison function, prefer a function to a method.
>We could have defined the Tree type such that the element type is required to have a Compare or Less method. This would be done by writing a constraint that requires the method, meaning that any type argument used to instantiate the Tree type would need to have that method.
Yes yes a thousand times yes.
If you have an interface with a single method, or something with a single varying piece of behavior, please strongly consider accepting a function instead. Going from method to function is trivial - pass `thing.method`. Going the other way is impossible at runtime; it requires package-level code at compile time.
This is true for many/most? languages, but it feels like a lot of Go libraries have forgotten it, and they're sometimes a huge unnecessary pain to deal with for that reason.
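To make the asymmetry concrete, here is a minimal sketch (Max, MaxM, Lesser and Price are made-up names, not from the article):

package main

import "fmt"

// Function-based: callers pass whatever comparison they like.
func Max[T any](a, b T, less func(T, T) bool) T {
    if less(a, b) {
        return b
    }
    return a
}

// Method-based: T is forced to define a Less method.
type Lesser[T any] interface{ Less(T) bool }

func MaxM[T Lesser[T]](a, b T) T {
    if a.Less(b) {
        return b
    }
    return a
}

type Price struct{ Cents int }

func (p Price) Less(q Price) bool { return p.Cents < q.Cents }

func main() {
    // Method -> function is trivial: pass the method expression.
    fmt.Println(Max(Price{100}, Price{250}, Price.Less))
    // Function -> method needs a new named type declared at package level.
    fmt.Println(MaxM(Price{100}, Price{250}))
}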
> If you have an interface with a single method, or something with a single varying piece of behavior, please strongly consider accepting a function instead.
Agreed. And if you prefer methods to functions, you can define methods on such a function type!
In my LINQ in Go (https://github.com/nukata/linq-in-go), I define a higher order function type:
type Enumerator[T any] func(yield func(element T))
and several methods on it, for example,
// Where creates an Enumerator which selects elements by applying
// predicate to each of them.
func (loop Enumerator[T]) Where(predicate func(T) bool) Enumerator[T] {
    return func(yield func(T)) {
        loop(func(element T) {
            if predicate(element) {
                yield(element)
            }
        })
    }
}
Naturally, if you need another type parameter for a method, you must define it as a function. For example,
// Select creates an Enumerator which applies f to each of elements.
func Select[T any, R any](f func(T) R, loop Enumerator[T]) Enumerator[R] {
    return func(yield func(R)) {
        loop(func(element T) {
            value := f(element)
            yield(value)
        })
    }
}
Now, here is the function which converts a slice to an Enumerator:
// From creates an Enumerator from a slice.
func From[T ~[]E, E any](x T) Enumerator[E] {
    return func(yield func(E)) {
        for _, element := range x {
            yield(element)
        }
    }
}
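Putting the three together, a usage sketch could look like this (assuming the definitions above; not taken from the linked repository):

loop := From([]int{1, 2, 3, 4, 5, 6})
evens := loop.Where(func(n int) bool { return n%2 == 0 })
squares := Select(func(n int) int { return n * n }, evens)
squares(func(n int) { fmt.Println(n) }) // prints 4, 16, 36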
> If you have an interface with a single method, or something with a single varying piece of behavior, please strongly consider accepting a function instead.
Java neatly solves this by means of Functional Interfaces. Interfaces that have a single method related to the generic type parameter can be called by using lambda syntax. Take for example sorting:
<T> void sort(T[] a, Comparator<? super T> c)
A `Comparator<T>` has a method `int compare(T o1, T o2)`, which allows the caller to invoke `Arrays.sort` by passing a lambda.
I don't think I'd call this "neatly"... more "a weird but pragmatic hack to improve an incredibly common pattern that had no concise option prior to java 8". And it kinda encourages continuing the "keep declaring a lambda that could have just been a function reference if the API had used a function instead" pattern, which is not great.
But I do very much like it. It improves Java's ergonomics quite a lot.
What I like about Java functional interfaces is that they give a place to specify an interface contract. For example, with Comparator, it specifies that instances must implement a total ordering. Purely structural function types don't have that affordance.
A slightly problematic point, however, is that you can assign a method reference from one functional interface to a variable of another functional interface based purely on structural typing, even if they happen to have incompatible interface contracts. For example, if you had a PartialOrder functional interface with the same parameter/return types as Comparator, then you could assign PartialOrder::compare to a Comparator variable although the semantics aren't compatible. Whereas nominal typing prevents you from assigning a PartialOrder to Comparator (no "duck typing"), Java circumvents the nominal type system here for method references, with the result that one may assign incompatible implementations without noticing.
Agreed. I almost never use sort.Sort in practice, sort.Slice is just so much more usable.
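To illustrate the difference (Person and byAge are made-up names):

package main

import (
    "fmt"
    "sort"
)

type Person struct {
    Name string
    Age  int
}

// sort.Sort needs a named type with three methods per ordering:
type byAge []Person

func (a byAge) Len() int           { return len(a) }
func (a byAge) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }
func (a byAge) Less(i, j int) bool { return a[i].Age < a[j].Age }

func main() {
    people := []Person{{"Bob", 31}, {"Alice", 25}}
    sort.Sort(byAge(people))

    // sort.Slice takes the comparison as an inline function instead:
    sort.Slice(people, func(i, j int) bool { return people[i].Age < people[j].Age })

    fmt.Println(people)
}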
Also, based on the recent blog post[0] a method-based approach will be a tiny bit slower than the function-based approach. Not that it matters for most code though.
> This means that a *time.Time pointer has the same GCShape as a *uint64, a *bytes.Buffer and a *strings.Builder.
> In the current implementation of Generics in 1.18, every runtime invocation of a generic function will transparently receive as its first argument a static dictionary with metadata about the arguments being passed to the function.
> That’s right, after the monomorphization step, the generated function shape needs to take as a runtime input the virtual method tables for all its generic arguments.
What an odd design. I'm curious to read the justification for this. I assume it's faster compile times.
And also smaller binaries.
For comparison, our templates-heavy C++ project's binary is 1 GB in debug mode and 200 MB in release mode. It takes 10-20 seconds just to link.
Agreed. If I could retroactively alter Rust's design, I'd remove non-marker traits with methods, and instead have functions and data structures take method or vtable pointers. In case of static dispatch, the method pointers would be passed as compile-time const generic parameters. This allows picking different ordering operators for different trees or sorts defined on the same type, instead of requiring wrapper types defining custom comparison operators, or having individual types or functions add comparator parameters in an ad-hoc fashion (https://doc.rust-lang.org/std/primitive.slice.html#method.so...).
> Agreed. If I could retroactively alter Rust's design, I'd remove non-marker traits with methods, and instead have functions and data structures take method or vtable pointers.
Do you mean single-method traits? Because it already sounds like a pain in the ass for Hash or Ord, but it sounds absolutely horrendous for Iterator.
> This allows picking different ordering operators for different trees or sorts defined on the same type, instead of requiring wrapper types defining custom comparison operators
Sounds like an aggressive worsening of the default, and a solution in search of a problem really: if that’s a really common need you could have a generic wrapper rather than a specific one (like Reverse).
Your complaint about ad-hoc comparator seems odd as your method would essentially require always doing that, unless I’m misunderstanding what you mean.
No, the pointer to vtable can point to a struct value containing multiple function pointers.
> if that’s a really common need you could have a generic wrapper rather than a specific one (like Reverse).
The way I see generic wrappers, it's not the integer that's reversed, but the data structure defined over it, and requiring wrapping/unwrapping reversed integers/etc. is ceremony in the wrong location. With specific wrappers, you could argue it often has semantic significance representing how your specific type isn't just a regular integer, but a Uid or such.
> Your complaint about ad-hoc comparator seems odd as your method would essentially require always doing that, unless I’m misunderstanding what you mean.
My change to Rust would make all trait-bounded generic functions consistently allow supplying a trait implementation separate from the underlying type, effectively transforming this pattern from an ad-hoc change to a language construct more flexible than globally coherent traits. From there on, you could layer on optional ergonomic improvements, like implicit trait implementations or optional default traits (while keeping the first-class ability to supply any trait implementation desired).
Is the idea practical? Maybe. Currently structs can't supply default field values, even though traits can supply default methods. But the hard part is that default trait methods are effectively templates, not functions:
trait A {
    fn f() -> i32;
    fn g() -> i32 {
        Self::f()
    }
}

struct T1;
impl A for T1 {
    fn f() -> i32 {
        1
    }
}

struct T2;
impl A for T2 {
    fn f() -> i32 {
        2
    }
}
To preserve the current semantics (a debatable goal), instead of building method table structs out of free functions (the most minimal and elegant implementation, though less ergonomic) like `static T1_A = A<T1>{...}`, you'd still have to keep a compromise syntax of some sort, closer to current `impl` but allowing you to optionally specify a name for the impl.
Another issue to be addressed is that global coherence can serve as a lint against accidentally implementing the same trait twice for a type (as opposed to deliberately providing two different implementations with different names the user can select.)
The ergonomics of this are really bad without some kind of implicit parameters and trait resolution. Without trait methods, ad-hoc overloading becomes impossible.
You would have to explicitly specify what + operator and what == operator you use everywhere. After all, there could be multiple different additions or comparisons defined for a type, so which one do you mean by `+`?
Another problem is that a data structure's invariant can depend on the trait implementation. For example a tree is balanced with a comparison. This means that the comparison has to be stored in the struct, with the corresponding runtime overhead. With traits you know that there is only a single implementation for any particular type, so if the trait v-table is an argument to every method call there is no risk of different implementations being used at different times.
> This means that the comparison has to be stored in the struct, with the corresponding runtime overhead.
The runtime overhead is solvable in principle by making the comparison a const-generic parameter of the data structure type. But to do this properly requires dependent types, because the type of that const-generic parameter is taken from an earlier parameter in the same definition. It's a dependent sum type, sometimes called a dependent record.
From my understanding, compile-time dependent types (which is all that's needed to mimic statically dispatched generics) are easy, and C++ templates support them (I haven't tried Rust const generics but I hear they're quite limited). Runtime dependent types are much less common, and (given my limited understanding of dependent types) I could describe trait objects (&dyn Trait) as a dependent pair (ptr, Trait-shaped vtable containing *fn taking ptr's type) (though rustc hard-codes support for trait objects, and its job is easier since it never lets ptr change runtime type while its target is used for a &dyn Trait).
Rust's const generics today are limited to the built-in integral types (so char, bool, and the various sizes of signed or unsigned integer), and the constant must actually be an obvious constant, so e.g. a literal, or something you could assign to a const in Rust. So yes that's much less flexible than C++.
Some of the flexibility in C++ here is desirable, much of it is either wanking or outright dangerous. C++ will allow a float generic for example, which is obviously a footgun because good luck explaining why Thing<NaN> and Thing<NaN> are different types...
Rust expects to sooner or later ship Const generics for user-defined types, but to exclude the nonsense of "float generic" the likely rule goes like this: Your const parameter type must derive PartialEq and Eq. It won't be enough to implement them yourself as your implementation might be nonsense (e.g. misfortunate::Maxwell implements Eq but no Maxwell is equal to anything, including itself) you'll need to use the derive macro which promises the compiler these things actually have working equivalence so that Thing<A> and Thing<B> are the same type if A and B are equivalent.
>Some of the flexibility in C++ here is desirable, much of it is either wanking or outright dangerous. C++ will allow a float generic for example, which is obviously a footgun because good luck explaining why Thing<NaN> and Thing<NaN> are different types...
It's not nonsense, there are legitimate use-cases for floating point template params. A classic example is something like generateInlinedVersionOfFunction<someInt, someFloat>() that generates a class or function with some particular values inlined for efficiency. Using nan as a template parameter is stupid, but it's not especially more dangerous than using nan in general, since any issues specific to its compile-time usage will manifest at compile time, not runtime.
It does nobody any favours to just dismiss advanced features Rust doesn't support as dangerous and unnecessary just because most Rust programmers haven't come across a use-case.
> It does nobody any favours to just dismiss advanced features Rust doesn't support as dangerous and unnecessary just because most Rust programmers haven't come across a use-case.
I think requiring decidable equality for const generics is a fine approach for a limited MVP like what Rust is going with at present. Sure there are more general settings but they would need dependent types and the user would have to write proofs of type equality/apartness anytime the issue came up in a compile. Seems like a bit of a non-starter for now, even though it's effectively what enables non-trivial logical features like homotopy types.
> A classic example is something like generateInlinedVersionOfFunction<someInt, someFloat>() that generates a class or function with some particular values inlined for efficiency.
This smells very bad. Seems like a better choice is a macro, and I understand that in C++ the macros are awful and worth avoiding, but Rust has decent declarative macros for this type of work.
You can always just convert your float to a bit-equivalent int and back for the type param if you really want it. I have never seen floats used for template parameters in the wild, other than party tricks like wrapping a float in a struct with an "epsilon" parameter used for the operator<. Which is all kinds of wrong.
AIUI, practical dependent types are always evaluated at compile time, but the difference with something like C++ generics is that the types can include references to program bindings that relate to runtime values. This means that dependent types can express proofs evaluated at compile time about the runtime properties of a program.
Why? If it’s a one-off occurrence you can do it inside the closure; if it happens more often then surely a zero-overhead wrapper type is the ergonomic choice.
> Let’s start with a general guideline for programming Go: write Go programs by writing code, not by defining types
I find this interesting - I think I typically do the reverse: I generally start by thinking of my program in terms of types and then write code to perform operations/transformations on those types (obviously this doesn't work 100% of the time).
That being said, I have never written any Go, but I can't imagine I would approach it differently.
I generally think type-first in Java and most other languages purely by convention - it is just a social and programmatic norm. That being said, Go's conciseness lends itself to creating working code quickly, and the type system generally aims to be just "good enough", broadly nudging you to get working code first and invent the types as they become opportune. This is also why (aside from fear of hurting compilation speed) generics were only added after many years - a feature which some people (myself included, before I actually used Go and loved it) think is needed to use a language for anything beyond "toy" purposes (you don't actually need generics most of the time).
In practice this works out really well (especially since in Go you never need to explicitly extend interfaces - a type implements an interface implicitly by just having the same method signatures the interface declares), and it is liberating: I don't go down a rabbit hole of thinking I need to support future cases and define AbstractAnimal, AbstractMammal extends AbstractAnimal, Dog extends AbstractMammal. This also highlights the other side benefit: since you aren't thinking about what each object's operations are, you can use composition and define the traits a Dog possesses. Later, since those trait interfaces are decoupled, you don't have to repeat them if you realize Snake can implement Hiss() just like Cat can, even though they would traditionally extend and inherit down orthogonal branches (where you would have to duplicate, or do wonky shenanigans like attaching it to AbstractAnimal and throwing UnsupportedOperationException where it isn't actually implemented).
The bottom line is it forces you to not even have to think about YAGNI, because you compose only what is needed to accomplish each task on the fly.
That’s not much different than just using interfaces/traits though (with the added benefit or hindrance depending on person of Graph::walk may be different than Animal::walk)
> git actually has a simple design, with stable and reasonably well-documented data structures. In fact, I'm a huge proponent of designing your code around the data, rather than the other way around, and I think it's one of the reasons git has been fairly successful […] I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships.
This is also very much the norm for high-performance code, when you reach boundaries the code is often secondary to the data structure or memory layout.
I agree with the importance of the memory layout of data structures, but I read the original article a bit differently from you and the person you are replying to.
A common approach to programming is to model the problem, almost to anthropomorphize the components of a system. In this style of programming, the organization of data structures isn't about how to lay out bytes in memory, but about a role needed in the resolution of a problem. I almost think of it as having a bunch of little actors that we want to arrange in a certain way in a scene, shout `.action()`, and have the work be done.
This is a style of programming that I was really into just a few years ago, but when I compared myself to other programmers that I really looked up to, I realized that they didn't engage in this type of programming theater. Their code _did_ have types, but it was to group data together, not to tell a story; it was very concrete and practical. And time and time again they produced better programs than I did, often in less time, and they avoided questions that would plague me for days (e.g., whose responsibility is it to do X?)
So what I understood from the article is to avoid creating types upon types upon types to model the conceptual problem, and rather write code (which certainly includes data structures) that handles the data of the problem.
I feel that this hints at a fundamental division between two ways that different programmers approach problems.
I’m the opposite to you - I tend to think logic-first. So, for example, when thinking about a parser I’d start by writing code to read tokens. I’ve struggled to follow tutorials that instead start by defining types to represent an AST.
Different approaches to problem-solving also lead to different preferences in terms of programming languages. In particular, I’m perfectly happy working in dynamically-typed languages for small-to-medium projects. Many programmers seem to be opposed to dynamic languages on principle - I believe that’s related to thinking types-first.
I do quite a few interviews each week, and one of our problems highlights the differences between these two approaches quite clearly.
Inevitably, the candidates that lay out their types and think about the structure before they start coding come up with convoluted interfaces, leave unused class/struct/type member variables, duplicate data, or are simply unable to complete the problem because they've coded themselves into a corner.
The best candidates decide what the public interface will look like, create the method signatures that they need and then start coding the logic. They add parameters/member variables only when they discover they're needed.
In the end they arrive at a minimal and feature complete solution, rarely getting stuck on the way there. I've changed my own style of thinking about a problem as a result of this. Now I code outside-in, instead of inside-out.
You should start with the most difficult (non-obvious) part first.
If the most difficult part of the problem is the algorithm, then start with that and figure out the types along the way.
If the most difficult part are the types / data representation then start with that and worry about the algorithms later.
If both are difficult, then they need to be considered together and come up with a holistic solution. These are the kinds of problems that often take a two or more rewrites before you come up with something you're happy with.
I think the problem isn't so much "where you start", but rather if you stop and backtrack along the way. What I mean is that if you have "convoluted interfaces, unused class/struct/type member variables, duplicate data" etc. then you started with some approach but never stopped for a minute along the way to redesign things using the knowledge you've learned from the first implementation pass.
I sometimes end up with a messy and crappy solution on my first try, but it does work (sort-of) and then I just start from scratch again, liberally copying stuff over but also deleting/changing a lot of stuff, as needed.
How can you approach a problem with logic before knowing what that logic will manipulate? For example, you say you'd start by writing code to read tokens, but a token is a type.
I suspect the difference depends on what your types can provide?
So in something like Haskell the types are strong enough that going with a types first approach makes a lot of sense. Often once you have the types laid out, many functions will only have a handful of sensible implementations that are even possible at all.
In something as primitive as C or Go (before generics) types don't tell you much at all. So laying them out first doesn't give you much at all. You might make a mental (or physical) note of what you are trying to achieve on what data, and then start coding.
In Haskell you could express a lot of the content of that note in the types; but in C and Go many or even most things are done via side-effects, and their type system doesn't track these at all.
`Token` isn’t a type that a parser has to define - it comes from the previous stage (the lexer), or could even just be `string`. When writing a parser it’s already defined - so do you start the parser by working with the types you already have, or by immediately defining more types?
I’d write a parser by first writing code to open a file (for example) to get a collection of tokens, then loop through them and decide what to do for each one. I’d use those decisions to guide my design of the output types.
Others, it seems, start by designing the output types up front and use that to guide the design of the program logic.
Implicitly, it means not to model a program purely through types and how they interact with each other. In general you do start programming by defining domain types (through a combination of structs, slices and maps), but it is up to the programmer to wire up the behaviors between them, rather than expressing behaviors through the type system (ML-style functional programming) or defining type hierarchies to model the real world (Java-style OOP).
I think you’re conflating “domain types” and types. Here, types is an implementation detail that can be abstracted away / refactored after the logic makes it clear you can reuse things and such
I think it all depends on the problem. If it is known how and what the code needs to do, like transforming the input in a particular way, then the data structures will naturally follow from the actions. But if the actions are not known or are too abstract, like showing a graph or creating an app to calculate taxes, then defining the data structures first is beneficial, as it allows you to make the actions more concrete.
I'd personally rather think at a high level about design before writing software or defining types. But a rough description of the Inputs and Outputs of some new feature/application seems more important to me than writing code.
So… use generics the same way they’re used in other languages? I’m trying so hard to watch this evolution in Go with an assumption there’s careful consideration of something other languages have gotten wrong and it seems like the whole result is… use generics within the language constraints in ways that aren’t problematic. I mean, Go already isn’t TypeScript (which does this great despite complaints). Go on, Go folks. Use generics. If you’re using them wrong you’re gonna get much more direct criticism than general guidelines to use the platform.
This guidance is specific and explicitly calls out some dos and don'ts. Your comment implies that there is one flavor of utilizing generics that applies across all languages and that is most certainly not true.
Other languages don't tend to be as prescriptive about best practice as Go is trying to be here. Just because it's syntactically valid doesn't mean you should type it often. (Most language standard libraries are several contrasting styles shoehorned together. Although, to be fair, even Go suffers from this a bit.)
This guidance isn't heralding in a new age of programming, it's explaining the right way to think about a new feature within the language.
Some of it is definitely trying to walk back the refrain that Go is simple and simple is good, which implies that anything that isn't currently in Go is complicated and therefore bad. If you repeat that too many times, and people buy into it, it becomes hard to roll out any new feature without being a heretic.
But - there are some things about Go generics that are particularly weird. The biggest one is that, in a lot of cases, they can actually just be another way to write runtime polymorphism, not static polymorphism as in languages like C++ (https://planetscale.com/blog/generics-can-make-your-go-code-...).
> But - there are some things about Go generics that are particularly weird. The biggest one is that, in a lot of cases, they can actually just be another way to write runtime polymorphism, not static polymorphism as in languages like C++.
Isn't static polymorphism the whole point of generics? Runtime polymorphism was already possible through interfaces and type switches.
Also you can't type switch a generic variable, so runtime polymorphism with generics seems literally impossible.
I wouldn't say it's the whole point but very important one. For simple types the current Go implementation does produce monomorphic functions but for others, for example interfaces and other pointer types, it does pass around a dict with metadata in runtime.
As far as I understand it is still compile time checked, hence it is a form of static polymorphism. The fact it is still runtime dispatched it is just an implementation choice.
Actually historically most generic implementations do not monomorphize either except as an optimisation: Java, C# (at least for complex types), most functional languages use some sort of type erasure. C++ and, I think, ADA were the exceptions, although it has proven more popular with newer languages.
Haven't the initial designers of Go stepped back? I was under the assumption that Rob Pike, for example, wasn't overly convinced, but was happy to watch what happens.
All of the "when it's useful" examples are either of `any` or `comparable`. There are no examples restricted to an interface (other than `comparable` if that counts). Only the "when it's not useful" examples include a generic restricted to an interface, `[T io.Reader]`.
Since the language allows for restricting to an interface, and this is the language's official blog, surely there must be a recommended use case for it.
func Clone[T proto.Message](in T) T {
    return proto.Clone(in).(T)
}
(proto.Clone(&somepb.MyMessage{...}) normally returns a proto.Message, but this returns a *somepb.MyMessage, which is nice.)
I think we'll find reasonable uses all over the place. The place where I most want it is in the k8s client-go. All of the Reflector/Informer code I've written starts every event reception handler with a typecast of a v1.Object into the actual object you care about. With generics, that type checking can be done for you in advance.
(I haven't thought about how all the codegen will be affected. Right now something like client.CoreV1().Secrets().List() returns a well-typed SecretList; return value dot Items's type is Secret, not Object. Obviously generics can do the same thing, the question is whether or not it's worth it when you already generate the necessary wrappers and "generic methods" anyway.)
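Something along these lines, perhaps (a hypothetical sketch - AsHandler is a made-up helper, not client-go's API):

// AsHandler converts a typed handler into the untyped func(interface{})
// shape that event-handler registration expects, doing the type assertion
// once, up front, instead of at the top of every handler.
func AsHandler[T any](handle func(T)) func(obj interface{}) {
    return func(obj interface{}) {
        if v, ok := obj.(T); ok {
            handle(v)
        }
        // else: not the type this handler cares about; ignore or log.
    }
}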
I don't know Go but I would say most uses of generics in other languages are "Any" style uses.
The difference between a data structure that only takes any ONE type and a data structure that can only restrict to an interface is fairly huge. As the article describes, actual type safety is very nice.
Edit:
One trivial example might be taking in a type and returning the exact same type. I don't think that's achievable with interfaces.
I think there may be confusion in language here. At no point am I asking about doing something that would reduce type safety.
Supposing I have a function that takes [A any] and returns [A any]. As I use the function, it is not returning type "any" aka "interface{}", aka pointer to unspecified type (which has always been possible in Go). It's returning type A, which is specified at compile time, where A can be any type, but has to be the same type passed into the function.
Supposing I have a function that takes [B io.Reader] and returns [B io.Reader]. Again, it doesn't return io.Reader, which again was always possible. Returning type io.Reader would be returning a pointer to unspecified type that implements the io.Reader interface. That sounds like what you thought I was saying. What I'm actually talking about is returning type B, specified at compile time, where B is restricted to types that implement io.Reader, and is also the same type passed into the function.
In both cases, you're making a function that takes a type and returns the same type. In the second case, that type can only be one that implements the io.Reader interface.
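A small sketch of the difference (Tap is a made-up name):

package main

import (
    "bytes"
    "io"
)

// Tap returns the same reader it was given; the result keeps the
// caller's concrete type B rather than collapsing to io.Reader.
func Tap[B io.Reader](r B) B {
    return r
}

func main() {
    var buf bytes.Buffer
    b := Tap(&buf) // b is *bytes.Buffer, so its non-Reader methods still work
    b.WriteString("hi")

    var r io.Reader = Tap[io.Reader](&buf) // instantiated with the interface, it behaves like the old style
    _ = r
}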
Exactly right. Now returning to your question... Is it not clear that you might want a method with this type safety that is also restricted to something more specific than [A any]?
I do want that. My question is why the blog post doesn't have any examples of it, other than in the "not recommended" section. I'm curious what sorts of uses the Go team would recommend (EDIT: and which they would discourage. I know I'd find uses if left to my own devices, but the Go team might discourage some of them.)
So, you could see this as a recommendation/request for a followup blog post.
Since you say you don't know Go, I'll clarify one thing: the Go ethos includes curbing programmers who like to be clever and tricky and sophisticated. It irritates a lot of us but I understand there's probably wisdom in it. Haskell is one of my favorite languages, you don't have to sell me on generics. Now that Go relented and finally implemented them, a lot of people are worried that the "clever" programmers are going to have their way. This blog post, I think, is a means to curb that. So I'm wondering exactly where the line is from their perspective.
I don't think the Go developers wanted to implement generics, but it was something they took a lot of heat for, and so while they relented and implemented the feature, they want to use their position to suggest "best practices" to limit their use.
The author of the blogpost has been making proposals for generics since Go 1.0 was released. Contrary to what some of the smooth-brains on here think, generics weren't implemented because of their incessant crying.
Ultimately a language lives through its users. If no one had complained about the lack of generics, the Go devs would probably not have lifted a finger to implement it, as per their original design.
Haskell (and languages with parametric polymorphism in general) can give you lots of guarantees for free when using generic types. Even if at runtime you only ever use your code with one specific type.
To give a super simple example: if in Haskell you see a function with type 'a -> a', you know that it can only be the identity function, because it's generic over 'a', so the code can't do anything specific.
In contrast, if you see something like 'float -> float' almost anything can happen (that doesn't involve side-effects). It might be identity, it might be sine, it might be square-roots, etc.
The guarantees in Haskell (and I guess the MLs as well?) come from a much tighter restriction than other languages put on parametric types.
For most other languages that support generics/parametric polymorphism, a type like 'a -> 'a would actually be closer to Typeable a => a -> a in Haskell, which can be an infinity of functions.
That is, in Go I can easily write something like:
func silly[T any](x T) T {
    if reflect.TypeOf(x).Kind() == reflect.Int64 {
        var y any = int64(17)
        return y.(T)
    }
    return x
}
EDIT: In Java or C#, I can even do it without reflection:
public static <T> T silly(T x) {
    if (x instanceof Integer) {
        return (T)(Integer)17;
    }
    return x;
}
> To give a super simple example: if in Haskell you see a function with type 'a -> a', you know that it can only be the identity function, because it's generic over 'a', so the code can't do anything specific.
So I can't have any conditionals or pattern matching based on a parameter's type, or provide default implementations of any functions?
In Haskell, you can have conditionals/pattern matching on a parameter's type, in the range of the declared types. For example, if you have a parameter of type `Int | Char`, you can pattern match to see whether it's an Int or a Char. But if its type is completely unbound, you can't do anything with it except pass it around.
There is support for reflection, but you have to explicitly ask for it, using Typeable or Data as far as I understand. That is, your function signature still has to constrain the type parameter to be an instance of Typeable or Data (a "reflectable type").
And almost any type you care about is an instance of Typeable or Data. And your own types can get those instances for free from the compiler, if you ask for them.
Correct. And this is good. It's what makes Haskell so powerful. You can look at a type and immediately know a huge number of rules its implementation must follow. And even better, the type is machine checked. So unlike prose documentation, it can never lie to you.
My personal experience with generics was to port a parallel circular queue library (it used to take interface{} and require a type cast at every use) and an experiment to build all the Lodash slice and collection functions in Go.
Happy to say that both worked really well. Took me less than an hour to understand the system enough for these purposes, and the implementation hides neatly in the library - the code that uses the library has absolutely no mention of anything generic. Having generics also helped switch to cleaner implementation inside as well, because I could now use maps as well without type casts.
Looks like a solid useful system - I could do everything I wanted, all the complexity stays in the library, and changes were pretty minimal and easy to understand. Would recommend.
// returns a pointer to v
func ToPointer[T any](v T) *T {
    return &v
}

// returns a given default value if v is nil, otherwise v
func DefaultValue[T any](v *T, defaultVal T) T {
    if v == nil {
        return defaultVal
    }
    return *v
}
Although a quick grep shows I mainly use these when writing tests.
I hope they allow type aliases on generics soon.
in package a:
type ASomething[T any] struct {}
in package b:
type BSomething = a.ASomething // doesn't work
type BSomething[T any] = a.ASomething[T any] // doesn't work
I found a need for this recently when updating a bunch of code in a common package I have to use generics. The above type alias would have meant I didn't have to update any code that uses that common package - the transition to generics "under the hood" would have been seamless. Oh well, nothing sed can't handle.
If you have e.g. a *string you can't populate it with &"hello world". You have to first assign the string to a variable and then take the address of that variable. This comes up very often in tests and is super inconvenient. Many people write TypeNamePtr functions so that you can call e.g. StringPtr("hello world") to express it in one line and without creating any names. Without generics you had to do this individually for every type.
Sometimes you need to instantiate a struct, usually in a test, that contains a bunch of pointers to built in types. Say a struct representing a json request body with optional fields. You can’t just take a pointer to a string literal `&”foo”` and making a variable for everyone of these fields is cumbersome so you have a function that returns a pointer to the value it’s given.
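For illustration, a sketch of what that looks like with a single generic helper (Ptr and UpdateUserRequest are made-up names):

// Ptr replaces the per-type StringPtr/IntPtr/BoolPtr helpers.
func Ptr[T any](v T) *T { return &v }

type UpdateUserRequest struct {
    Name  *string `json:"name,omitempty"`
    Age   *int    `json:"age,omitempty"`
    Admin *bool   `json:"admin,omitempty"`
}

var req = UpdateUserRequest{
    Name:  Ptr("alice"),
    Age:   Ptr(42),
    Admin: Ptr(true),
}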
Go has escape detection, so any returned pointer to a stack allocated variable becomes heap allocated. The code posted is a fairly common way of taking a stack allocated primitive type and turning it into a heap allocated one.
That's probably a better way of explaining the behaviour. My point was more that this pattern is a very common way of getting a pointer to a primitive type value (since you can't just do &5 in Go). It seems likely that Go would optimise this to just creating a stack allocated value.
Go is only allowed to move variables if it knows there are not pointers to them, so only the compiler can take this decision. The Go runtime is very much not allowed to move variables.
I don't think this is right. Every source I can find says that Go does not currently move heap variables, but it reserves the right to, and code that assumes no movement is wrong.
I also found code that demonstrates stack variables being moved despite a pointer to them, but it's version dependent so I can't test it.
When a variable is moved, then obviously pointers to it need to be updated, but that's not an intractable problem.
Looking further, Go doesn't even provide an API for pinning objects right now[1]. You have to do weird workarounds to allocate static objects, or write code that assumes nothing on the heap moves and will explode if that ever changes.
So, first of all, my (possibly incorrect) understanding is that pure parametric polymorphism doesn't allow you to specify constraints - in Haskell terms, `a -> IO a` is a case of parametric polymorphism, but `Showable a => a -> IO a` or whatever isn't (entirely) - you've got ad hoc polymorphism in there too. So just calling it "parametric polymorphism" would be inaccurate.
My assumption was that they took the terminology from Java, but as I looked more into it I started to doubt this. Java called them Generics because the implementation in Java 5 was descended from a research project called "Generic Java". The Generic Java paper [0] uses the term - and the term "Genericity" - freely but doesn't give an original source for it at a glance. They do note that "Modula 3, Ada 95, Eiffel, and Sather" all support them though.
Wikipedia has an article on "Generic programming" [1] which notes that "[they] are known as generics in Ada, C#, Delphi, Eiffel, F#, Java, Nim, Python, Go, Rust, Swift, TypeScript and Visual Basic .NET", so this isn't exclusive to Go. It cites a paper from 1989 called "Generic programming" as an origin of the term, although it also claims that "Ada has had generics since it was first designed in 1977–1980", and indeed "generic" is a keyword in Ada 83 [2]. The "Generic programming" paper is written about Ada [3]; my hunch is that Ada's where the term comes from. I'd be curious to see if anyone can find an earlier use of that term.
That the term came from Ada (or the research that went into it) is my understanding, too.
It may be associated specifically with Alexander Stepanov [original C++ STL author, but also active in early Ada], or work he did in the early 80s (which contributed to Ada, I believe, and then into C++): http://stepanovpapers.com/
"David R. Musser, and Alexander A. Stepanov: Tecton: A Language for Manipulating Generic Objects. In Program Specification, Proceedings of a Workshop, Aarhus, Denmark, August 1981"
"Generic Programming. The language Tecton, which first used generic programming concepts, is described in “Tecton: A Language for Manipulating Generic Objects” by Kapur, Musser, and Stepanov (1981). The Ada library is described in the paper “Generic Programming” by Musser and Stepanov (1988), and C++ STL in “The Standard Template Library” by Stepanov and Lee (1994). All of these materials are available on www.stepanovpapers.com."
Parametric polymorphism though was present in Standard ML and other functional lanagages from the mid-70s on. But not called generics, and not "generic programming" per se. So, hm.
> "David R. Musser, and Alexander A. Stepanov: Tecton: A Language for Manipulating Generic Objects. In Program Specification, Proceedings of a Workshop, Aarhus, Denmark, August 1981"
The concept of "generic object" introduced in this paper is quite different from today's:
In the proposed framework, a series of refinements constitute a set of design decisions leading to a program. It should be possible to backtrack to any stage of decision making and explore an alternate path of refinements. Different design decisions can thus be compared and evaluated.
A notion that seems to be particularly useful for achieving this kind of organization is that of "generic objects." A generic object is a notion for grouping objects sharing common syntactic structure and semantic properties. Informally, a generic object describes a whole class of non-isomorphic possibilities, unless a stage has been reached at which the described objects are fully defined. Adding properties to a generic object makes it more specific by narrowing the range of possibilities. Once we allow such genericity of objects, the way is open also to the development and use of "generic algorithms," which offer a way of organizing and extending our knowledge of algorithms and control.
We have begun developing a language called 'Tecton" (Greek for "builder") for constructing generic objects and algorithms.[1]
What is called a "generic object" here is what today would rather be called a "base class" of a type hierarchy.
The CLU book (Liskov & Guttag, 1986) talks about "parameterized types". Liskov's 1996 essay in History of Programming Languages II describes the feature under the heading "parametric polymorphism".
The Modula-3 book (Harbison, 1992) has a chapter (Ch.9, p.163) called "Generics". I suspect this derives from Ada (1983?), which used "generic" as an adjective for all kinds of parameterized constructs, including functions, packages, and types.
The first design for generics in Java (Pizza by Odersky & Wadler, 1997) uses the term "parametric polymorphism", which was pretty standard in academia at the time, and originates from Fundamental Concepts in Programming Languages (Strachey, 1967).
I think you should probably think of genericity as a style; like, say, OOP, this style can be supported by language features, but you can insist on writing BBC BASIC in an OO style - the language just gets in your way all the time.
So you could insist on doing generic programming in Go 1.17 say, but it feels very awkward and is slower and anybody reviewing your code is probably going to say "This is the wrong way to use Go" whereas with 1.18 there are more places where this is a reasonable solution.
If you hate OOP, you can write a Java program that's just one main function, it will really suck but it isn't impossible. If you hate Functional Programming you could write SML that's all imperatives and again that's going to be possible but the language is fighting you.
Parametric Polymorphism isn't a style it's a feature, like Ad Hoc Polymorphism (e.g. C++ function overloading) or Subtype Polymorphism (C++ inheritance) which you could choose to use in your programs or not, but doesn't go away just because you aren't using it.
Parametric polymorphism w/ constraints would be called "bounded" polymorphism AIUI. The "Generics" term is kinda confusing because it's so generic it doesn't pinpoint the actual programming feature we're working with.
Sibling comment has some evidence to the contrary but I've always thought of it as a Java marketing term.
It lets you write generic classes e.g. "I can write a generic container class that automatically works for any contained type". It's imprecise (you can describe lots of programming language features as enabling "genericity") but I think it makes sense as a descriptive term.
I felt "generics" became the term of industry to contrast to templates, and then if some environment provides generics it is wholesome and good vs nasty template metaprogramming.
I also think in that in the context of managed environments .NET was also pushing "generics" terminology (again vs templates) at the same time, and across more languages.
As an aside, it is interesting to see Go officially refer to them as generics. For the longest time Pike was overly pedantic about it being parametric polymorphism, not generics.
I can only remember three cases in our codebase (30+ microservices) where we suffer without generics:
1) we have to write many duplicate functions just to check if an element is inside a slice
2) slices are not covariant, so we have many duplicate functions just to convert something like []MyObject to []MyInterface
3) we have to resort to using slices/maps for most data structures, it's mostly good enough, but the intention is less clear (for example, using a map for a set, or a slice for a queue) and it's less performant than more specialized data structures
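For points 1) and 2), the generic helpers are short (Contains and Map here are sketches; the new slices package and similar libraries provide equivalents):

// 1) one Contains for every element type:
func Contains[T comparable](s []T, v T) bool {
    for _, x := range s {
        if x == v {
            return true
        }
    }
    return false
}

// 2) a generic Map covers []MyObject -> []MyInterface (and any other
// element-wise conversion):
func Map[T, U any](in []T, f func(T) U) []U {
    out := make([]U, len(in))
    for i, v := range in {
        out[i] = f(v)
    }
    return out
}

// e.g. ifaces := Map(objs, func(o MyObject) MyInterface { return o })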
Looks like most of it will be covered by the new slices package and various third-party collections packages which are going to proliferate.
We don't use 1.18 in production yet, however, because tooling support (including linters) is still poor.
So far, I'm planning on using them to replace interface{}. I wrote a stupid simple test assert library that abuses []interface{} for inputs/outputs with the assumption that a panic is fine because it'll make the test fail.
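A typed version of that kind of helper can be as small as this (a sketch with made-up names, not the library described above):

package assert

import "testing"

// Equal replaces an interface{}-based check: mismatched types now fail
// at compile time instead of panicking inside the test helper.
func Equal[T comparable](t *testing.T, got, want T) {
    t.Helper()
    if got != want {
        t.Fatalf("got %v, want %v", got, want)
    }
}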
TL;DR use it only when you need parametric polymorphism
> If a function has parameters with those types, and the function code doesn’t make any particular assumptions about the element types, then it may be useful to use a type parameter.
In some languages when a type variable is universally quantified the language simply doesn't allow you to make assumptions about these types. This is of course the right thing to do.
> Another case where type parameters can be useful is when different types need to implement some common method, and the implementations for the different types all look the same.
Looking the same is a clear indication that your code isn't making assumptions about the type. So this is just a restatement of the above point.
Overall this is very good advice. At the risk of sounding elitist, I think this is something everyone should already know, regardless of whether they use Go or not. Of course the article is aimed at beginners, so there's nothing wrong with that, but at HN we can just use the proper terminology such as parametric polymorphism or a universally quantified type variable.
I think it's also aimed at the C++ crowd, where generics = static dispatch and interfaces = dynamic dispatch and people try to eke out the extra performance.
Keep in mind that C++ doesn't provide for parametric polymorphism, one of the more important uses of generics.
Parametric polymorphism gives you guarantees about your program for free. C++'s templates can't do that. In C++ even if the types look generic, your code can make assumptions about the types.
E.g. your generic code from some type 'a' to some type 'a' can implement the identity function for most types, but implement cosine when given a float.
In something like Haskell (or Java, I think?) that sneakiness is not possible without clearly telling the user in the type.
> C++'s templates can't do that. In C++ even if the types look generic, your code can make assumptions about the types.
If you want an assertion that a class pretends to implement certain semantics (a.k.a. implements an interface), then you can have a specialized struct that has to be declared for every class used with your template. Either the user provides it and guarantees that their identity() function implements the required semantics of identity(), or they don't and the code fails to compile. You could also check explicitly whether the template parameter type inherits from a certain interface type, but that tends to exclude a lot of types that do not generally implement any interface explicitly - I can't remember an IIdentityProviderFactoryBuilderManagementJavaJXBeanObserverInterface on int, for example.
Going via the interface route that you described doesn't rescue C++ here.
The thing that parametric polymorphism gives you is that the type system can ensure that your code is not doing anything with your values but pass them around. And you don't need to read nor trust any implementation code for that.
For what you describe you could still implement cosine for float, and 'not' for bools, and identity for char.
We don't want the class to implement certain semantics. We'd want an assurance that certain semantics are _not_ implemented (in some sense). We want an assurance that the code is forced to treat the value as an opaque black box.
C++ 20 Concepts are duck typed. Inertia means it isn't practical to ship something with the semantics of say, Rust's Traits.
Your own concepts could (but most will not) choose to abuse the duck typing to require a "marker" e.g. the natural way to define Gunfighter might be to say it has a draw() function, and we could imagine instead adding a marker named "is_a_Gunfighter" to every Gunfighter. However the built-in concepts intentionally do not do that.
Instead C++ documents a "model" for the standard concepts and explains that it is, as so often, the programmer's responsibility to ensure that everywhere their code need a concept they use things which correctly model the concept, the compiler will only do duck typing.
So, in Rust an f32 isn't Ord (the totally ordered Trait) because well, NaN != NaN. So if you ask Rust's default slice sort to sort a slice of f32s that won't compile - what if the slice has a NaN in it? You will need to either remember that sorting IEEE floating point numbers isn't generally a reasonable thing to do, or, explicitly write your deterministic compare function (maybe as a lambda) and call sort_by() with your comparison. Now it's obvious whose fault it is when your function blows up for NaN.
In C++ the float type is in fact std::totally_ordered and it's your fault if you ever end up with a NaN somewhere and it matters.
IMHO the biggest deciding use case for me is whether or not it's more performant than any other possible solution. So far I haven't seen this to be the case in other people's benchmarks. [1]
Is that so? It kind of sounds like there's cases to consider where there's actually a performance gain, but in certain use cases in languages with implementations of this feature in a different way... From the article I linked:
"Historically, monomorphization has been the design of choice for implementing Generics in systems languages such as C++, D, or Rust. There are many reasons for this, but it all boils down to trading longer compile times for significant performance gains in the resulting code. When you replace the type placeholders in generic code with their final types before the compiler has performed any optimization passes, you create an exciting universe of optimizations that are essentially impossible when using boxed types. At the very least, you get to de-virtualize function calls and get rid of virtual tables; in the best case scenario, you get to inline code, which in turn enables further optimizations. Inlining code is great. Monomorphization is a total win for systems programming languages: it is, essentially, the only form of polymorphism that has zero runtime overhead, and often it has negative performance overhead. It makes generic code faster.
So, as somebody who works on performance for large Go applications, I admit I was not particularly excited about Generics in Go, really. I was excited about monomorphization, and about the potential for the Go compiler to perform optimizations that it simply can’t do when it’s dealing with interfaces. Cue my disappointment: The Generics implementation in Go 1.18 does not use monomorphization… at least, not quite."
This is only the case* when there is actually something to be gained by specializing the implementation of a data type or function for a given type. In my experience that isn't particularly common.
* Excluding Go's implementation of generics, which are frequently less performant than even vtable dispatch, let alone manually copied implementations.
That's a strange take considering generics can provide important type safety benefits in many situations. For example, without generics, there's no way to implement the `MapKeys` example from the article unless you want to give up type safety and resort to unsafe casts. The performance impact is likely to be negligible in most code unless you're calling generic functions in a tight loop and you've exhausted every other optimization opportunity and you're in a position that requires such micro-optimization. That's a lot of ifs.
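For reference, the article's MapKeys example is roughly this (quoted from memory, so treat the exact wording as approximate):

// MapKeys returns a slice of all the keys in m.
func MapKeys[Key comparable, Val any](m map[Key]Val) []Key {
    s := make([]Key, 0, len(m))
    for k := range m {
        s = append(s, k)
    }
    return s
}

Without type parameters you'd need reflection or a hand-written copy per map type, and the type safety of the returned slice is lost.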
The Planetscale article mentions that byteseq (a constraint that is a string or slice of bytes) was actually faster than string or bytes, which was probably just noise in the benchmarking. In general, between the Planetscale article and this one, the takeaway lesson is “Use generics for data containers and other cases that don’t call constraint methods. If you need to call a method, it will probably end up being slower.” It makes sense if you think about it.
Can someone please explain to me of what use generics actually are and how they create faster code or improve the typing-workflow? I'm serious, I have no idea.
Googling it, I get:
> In a nutshell, generics enable types (classes and interfaces) to be parameters when defining classes, interfaces and methods. Much like the more familiar formal parameters used in method declarations, type parameters provide a way for you to re-use the same code with different inputs.
As someone who believes that OOP is bullshit and making things worse instead of better, can someone please put the above paragraph into humanly understandable words which translate into actual practical gains and improvements? Like ... actual use cases, maybe?
A use case that comes up quite often is building a List / Map / Set implementation. You want it to enforce, in a type-safe way, that it only takes objects of the same type, and that when you take an object out again, you get it back as that same type. And you want the compiler to complain when you try to put a String into a List<Integer>, or try to assign the object you get out of the list to a boolean.
And obviously you don't want to write a specific List implementation for every type, like a StringList, DoubleList, IntList, ... especially as every method in it would basically work the same.
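In Go terms, the idea looks something like this (Stack is a made-up example type):

package main

import "fmt"

// Stack works for any element type, but each instantiation only accepts
// and returns that one type - no casts, checked at compile time.
type Stack[T any] struct{ items []T }

func (s *Stack[T]) Push(v T) { s.items = append(s.items, v) }

func (s *Stack[T]) Pop() (T, bool) {
    var zero T
    if len(s.items) == 0 {
        return zero, false
    }
    v := s.items[len(s.items)-1]
    s.items = s.items[:len(s.items)-1]
    return v, true
}

func main() {
    var ints Stack[int]
    ints.Push(1)
    // ints.Push("two") // compile error: "two" is not an int
    v, _ := ints.Pop() // v is an int, no type assertion needed
    fmt.Println(v)
}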
I used generics as an exercise, for a circular buffer [1] - is that OK?
The exercise project is a grep that greps through the content of archives [2]. I needed a circular buffer to keep the last n lines, for displaying the context of a match.
Still, I am feeling a bit uncomfy with returning a garbage value along with the error indication, even if that counts as being more idiomatic.
I keep wondering, why golang doesn't have an Optional monad in its standard library, like they do everywhere else.
I think so, too, but only because that’s idiomatic go. In general, it’s not a good idea to have functions that logically return a value or an error return a value and an error. If you use go, that’s what you do, though.
I think I would not return “ctx.data[0]” for the value returned in case of error, though. That runs the risk of users processing items multiple times if they forget to check for errors.
I would use either
- a fresh new instance of T
- a special instance of T created once (could be passed in to NewCBuf as a value with, to the user, obviously incorrect contents)
The first is cleanest, but may not be performant if creating T’s is costly.
The second may be risky if the returned items are mutable and the caller changes them.
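The cheap version of "a fresh new instance" in generic code is the zero value of T, e.g. (Buffer here is a stand-in, not the CBuf from the linked repo):

package buffer

import "errors"

type Buffer[T any] struct{ items []T }

// Get returns the oldest element; when empty it returns the zero value
// of T and an error, so callers never see live data by mistake.
func (b *Buffer[T]) Get() (T, error) {
    if len(b.items) == 0 {
        var zero T
        return zero, errors.New("empty buffer")
    }
    v := b.items[0]
    b.items = b.items[1:]
    return v, nil
}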
> I think I would not return “ctx.data[0]” for the value returned in case of error, though.
I am not too comfortable with this either; I wonder why golang doesn't have a Maybe/Option monad in its standard library. I mean, that's a very widely used pattern nowadays... or am I missing something here?
I guess they didn't want to copy the T value into the Option instance, but yeah, otherwise you get something like this, or you have to return a pointer.
> In C-like languages pointers are the Maybe/Option type.
I think that's a bit of a stretch, operations on Maybe types can be chained or used with Java streams. You can't do a thing like that with pointers.
They could have done something like Java streams, given that golang has lambdas, but I guess that would not be idiomatic, or something like that...
You do have a point with pattern matching, they don't do that in Go. Scala has it, and Java is getting pattern matching as a recent addition, but they added the Optional type with Java 8.
For a collection like this, definitely agreed. If the users of that circular buffer want to use pointers, they can just pass a pointer type. No need to force it on all cases.
If you're not sure if you need them, you don't.
If you see yourself using reflection to implement a data structure, you already know that you do. I don't get the nonsense around generics, it's so obvious to me...