More specifically the newtype pattern is mostly so that you can implement foreign traits on foreign types.
You can't normally do this because there's a high risk of duplicate implementations. By allowing only local types with foreign traits or foreign types with local traits (and of course local types with local traits), this duplication can't occur: there's only ever one implementation of a trait per type, and the ownership of those implementations is very clear and well defined.
The newtype pattern simply makes a "local" type that wraps a foreign type, allowing you to implement the foreign trait on the new local type and forward calls of the trait methods to the inner type, all while maintaining the single trait impl and ownership constraints Rust imposes.
Further, given all of the compiler optimizations that happen, newtypes are effectively free.
For those who haven't seen it, this is what newtype looks like:
use some::ForeignType;
struct LocalType(ForeignType);
As the parent comment states, it has nothing to do with enums.
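A minimal sketch of the wrapping and forwarding described above, assuming a hypothetical `Names` wrapper around the standard library's `Vec<String>` (both `Vec` and `Display` are foreign to the crate, so the impl is only legal on the local wrapper):

```rust
use std::fmt;

// `Vec<String>` and `fmt::Display` are both defined outside this crate, so
// `impl fmt::Display for Vec<String>` would be rejected by the orphan rule.
// Wrapping the vector in a local tuple struct makes the impl legal.
// `Names` is a made-up name for illustration.
struct Names(Vec<String>);

impl fmt::Display for Names {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Forward to the inner value's own methods via `self.0`.
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn main() {
    let names = Names(vec!["ada".to_string(), "grace".to_string()]);
    println!("{}", names);
}
```

The wrapper adds no fields beyond the inner value, which is why it tends to cost nothing at runtime.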
> More specifically the newtype pattern is mostly so that you can implement foreign traits on foreign types.
For some definition of "mostly".
Most of my newtypes, in both Rust and Haskell, are not there to implement traits; most of the time they just declare, well, new types ;) that wrap other (primitive) types.
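A rough sketch of that second use, with hypothetical `Meters` and `Seconds` wrappers around `f64` (the names and the `speed` helper are made up for illustration):

```rust
// Two distinct types that both wrap the same primitive.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Meters(f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Seconds(f64);

// Hypothetical helper: the signature alone rules out swapped arguments.
fn speed(distance: Meters, time: Seconds) -> f64 {
    distance.0 / time.0
}

fn main() {
    let v = speed(Meters(100.0), Seconds(8.0));
    println!("{} m/s", v);
    // speed(Seconds(8.0), Meters(100.0)); // does not compile: mismatched types
}
```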
There are some Rust packages that transform Rust enums into proper enums, the best one being const_table, or my own table_enum. Java got it right, except for bundling the data with the Enum object instead of storing it in separate tables.
Having said that, I could not be happier that sum types have entered the mainstream.
I think I have much more problem with the idea that table_enum is "proper enums" than with the choice to name Rust's sum types enum.
Because Rust is a low-level, bit-banging language, union needs to exist, which I think rules out calling enum "tagged unions" even if that's one possible way to look at the implementation. In a high-level language which lacks union types this wouldn't be an issue, but Rust is not that language.
I always hated the name. You're not enumerating anything. You are picking a variant, encoding a choice. Maybe "sum type" is too "monad", but surely "variant record" or even "variant" would have been a better name.
In early Rust, the keyword was `tag`. In order to improve familiarity with C++ programmers, bikeshedding raged between `enum` and `union` for years before finally landing on the former.
TBH it's kinda funny how they narrowed it down to two options and then went with the "obviously" wrong option (also, to get C++ programmers on board, wouldn't the obvious name not be "variant"?).
I really wonder what that decision-making process looked like :)
But it's not, hence the aforementioned bikeshedding. When Rust enum variants have data, they're like C unions with automatically-inserted tag checking. When Rust enums don't have data, they're like C enums (they're literally how you do FFI interop with enums in C, which should demonstrate beyond a shadow of a doubt that `enum` isn't categorically wrong). They have the properties of both depending on how they're defined, but at the end of the day they technically have more in common with C enums than C unions (which is why Rust also has `union` for doing C FFI).
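To make the two faces concrete, here is a small sketch with made-up `Color` and `Payload` types: the fieldless enum converts to an integer like a C enum, while the one with data can only be read through a `match` that checks the tag.

```rust
// A fieldless enum behaves like a C enum: each variant is just an integer.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

// An enum with fields behaves like a union plus a tag the compiler checks.
#[derive(Debug, PartialEq)]
enum Payload {
    Int(i32),
    Float(f32),
}

fn describe(p: &Payload) -> String {
    // The only way to reach the data is to match on the variant first.
    match p {
        Payload::Int(i) => format!("int {}", i),
        Payload::Float(x) => format!("float {}", x),
    }
}

fn main() {
    println!("{:?} = {}", Color::Blue, Color::Blue as i32);
    println!("{}", describe(&Payload::Int(7)));
    println!("{}", describe(&Payload::Float(1.5)));
}
```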
> to get C++ programmers on board, wouldn't the obvious name not be "variant"?
Most early Rust contributors were also C++ programmers (due to Mozilla), and I don't recall a single person ever suggesting `variant` as the keyword. I suspect people here are overestimating the mindshare of boost::variant as of 2014.
Why not call Enum Union, and the FFI one CUnion or something? Like, is there something meaningful that Union is for other than C interop? Isn't it functionally identical to a tagged union in every other respect?
C FFI is a very useful thing, sure, but I don’t think it’s exclusive to that. It’s a nice primitive to have access to. I think Rust would feel weird with tagged unions and not untagged ones, just like having both sum and product types is good.
The thing is that accessing an untagged union's data is just inherently unsafe. Say you made a union with a `u8` field and a `bool` field. If you had an instance of it and set its value to `5`, that's just not a valid `bool`. Hence the requirement of `unsafe`; you need to tell the compiler that you _know_ it's a valid value at the point you're accessing it.
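A sketch of that exact `u8`/`bool` case, with a hypothetical `ByteOrBool` union and a made-up `read_flag_checked` helper that only ever reads the field that is valid for every bit pattern:

```rust
// An untagged union: both fields share the same byte of storage.
#[repr(C)]
union ByteOrBool {
    byte: u8,
    flag: bool,
}

// Hypothetical helper: inspects the raw byte instead of trusting `flag`.
fn read_flag_checked(u: &ByteOrBool) -> Option<bool> {
    // SAFETY: every bit pattern is a valid `u8`, so this read is always fine.
    let raw = unsafe { u.byte };
    match raw {
        0 => Some(false),
        1 => Some(true),
        // Reading `u.flag` here would be undefined behavior:
        // a byte like 5 is not a valid `bool`.
        _ => None,
    }
}

fn main() {
    // Writing a field is safe; only reading needs `unsafe`.
    let five = ByteOrBool { byte: 5 };
    let yes = ByteOrBool { flag: true };
    println!("{:?}", read_flag_checked(&five));
    println!("{:?}", read_flag_checked(&yes));
}
```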
I mean... yeah, that's exactly why Rust has tagged unions as well... both are useful in the right contexts, but tagged unions are easier to work with in general. That doesn't mean that (untagged) unions should be safe to work with though, because they can't be.
Rust already has things called unions, which are C-style unions. Using the same word for two different things would definitely have been more confusing, not less...
"Enum" is not the worst term here. In fact you are enumerating the variants. It's only slightly misleading. It implies that you're comparing an integer.
There is an actual enum backing it to encode the choice (tag), and a union to encode the data (if it has data). Or at least that's what I would expect.
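That mental model can be written out by hand. The sketch below is only an illustration of the idea, not a claim about how the compiler actually lays out enums; `Tag`, `Data`, and `Tagged` are made-up names.

```rust
// The tag: a plain fieldless enum recording which field is live.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tag {
    Int,
    Float,
}

// The data: an untagged union of the possible payloads.
union Data {
    int: i32,
    float: f32,
}

// Tag plus union, roughly what `enum E { Int(i32), Float(f32) }` gives for free.
struct Tagged {
    tag: Tag,
    data: Data,
}

impl Tagged {
    fn from_int(v: i32) -> Self {
        Tagged { tag: Tag::Int, data: Data { int: v } }
    }

    fn from_float(v: f32) -> Self {
        Tagged { tag: Tag::Float, data: Data { float: v } }
    }

    fn as_int(&self) -> Option<i32> {
        match self.tag {
            // SAFETY: the constructors only set `Tag::Int` after writing `int`.
            Tag::Int => Some(unsafe { self.data.int }),
            Tag::Float => None,
        }
    }

    fn as_float(&self) -> Option<f32> {
        match self.tag {
            // SAFETY: the constructors only set `Tag::Float` after writing `float`.
            Tag::Float => Some(unsafe { self.data.float }),
            Tag::Int => None,
        }
    }
}

fn main() {
    let a = Tagged::from_int(3);
    let b = Tagged::from_float(1.5);
    println!("{:?} {:?}", a.as_int(), a.as_float());
    println!("{:?} {:?}", b.as_int(), b.as_float());
}
```

With a built-in enum the compiler writes the tag checks, so none of the `unsafe` blocks are needed.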
"Variant" is used in Rust lingo to name the variants, not the whole type.
It is especially confusing for C and C++ coders, because Rust enums are a completely different thing than C enums, which makes the naming choice worse than just inventing a new name.
Better names would be "tagged union" (which is spot-on descriptive - because it *is* a union with a tag slapped on - but "tagged_union" of course is a bit unwieldy as keyword) or "variant" (which is the name for that thing in C++ and elsewhere since forever - it's not like Rust invented those tagged-union-thingies, it just added some convenient syntax sugar to the language to deal with them).
I agree with you! I dislike the name enum and really like how F# does it. In F#, everything is type and what kind of type you create comes down to the type construction literal you use.
```
(* enum *)
type 'a Option = Some of 'a | None

(* struct - well: record. *)
type Company = {
    Name : string
    Age : int
}
```
I didn’t write a lot of compilers in Rust yet, but I find it limiting that you can’t pattern match across a Box<T>, and all ASTs require boxing for recursive references.
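For anyone who hasn't hit this, a small sketch of the limitation on stable Rust, using a made-up `Expr` type and a hypothetical `simplify` pass that rewrites `0 + x` to `x`:

```rust
#[derive(Debug)]
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
}

// Rewrites `0 + x` to `x`. A single pattern such as
// `Expr::Add(Expr::Num(0), rhs)` is not accepted on stable Rust because the
// field is a `Box<Expr>`, so the nested match dereferences by hand.
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::Add(lhs, rhs) => match *lhs {
            Expr::Num(0) => simplify(*rhs),
            other => Expr::Add(Box::new(simplify(other)), Box::new(simplify(*rhs))),
        },
        num => num,
    }
}

fn main() {
    let e = Expr::Add(Box::new(Expr::Num(0)), Box::new(Expr::Num(5)));
    println!("{:?}", simplify(e));
}
```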
The Rust language team wants a more general solution that also works with types like `Arc`, instead of hardcoding a special case for `Box` into the language. This is generally called "deref patterns." But there are tricky open design questions, so the effort is stalled until someone comes up with a proposal addressing them.
I believe it's not stable because it gives std::boxed::Box special treatment in the language. They'd rather find a solution that could work equally well for smart pointers defined outside of std.
Box patterns are likely to be superseded by one of these:
Often, when you have a compiler-y recursive ADT, you will want to allocate the individual nodes in an arena rather than from the global allocator as with Box. And at that point, you can use references (with the lifetime of the arena) rather than Boxes, and you can pattern match across those.
(Note that this doesn't work if your "arena" is just a Vec, which is something people have sort of hijacked the term "arena" for. It needs to be an arena that doesn't reallocate its contents on growth, something like https://docs.rs/bumpalo/latest/bumpalo/, in order to hand out references that live across allocations like this.)
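A sketch of the reference-based version. To stay self-contained it uses plain local variables as a stand-in for the arena, which is an assumption made purely for illustration; with a real arena the nodes would be references handed out by the allocator. `Expr` and `simplify` are made-up names.

```rust
#[derive(Debug)]
enum Expr<'a> {
    Num(i64),
    // Children are plain references that live as long as the "arena".
    Add(&'a Expr<'a>, &'a Expr<'a>),
}

// With references instead of boxes, one pattern can look through both levels.
fn simplify<'a>(expr: &'a Expr<'a>) -> &'a Expr<'a> {
    match expr {
        Expr::Add(Expr::Num(0), rhs) => *rhs,
        other => other,
    }
}

fn main() {
    // Stand-in for arena allocation: the nodes simply live on the stack.
    let zero = Expr::Num(0);
    let five = Expr::Num(5);
    let sum = Expr::Add(&zero, &five);
    println!("{:?}", simplify(&sum));
}
```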
An interesting aspect of sum types (what Rust calls enums) is that you can implement them in the language as a library if you have real unions, but not vice-versa.
I don't find "you can implement tagged unions with regular unions but not vice versa" all that interesting of a statement, since one forces tags on you it's not enlightening that you can't remove them, and with any other structuring mechanism you can add tags back in.
Yes, that's correct, but I think the way people talk about these things, it's not always clear why that's the case. I.e. I think from that description it's not particularly obvious to me that you can't find sneaky ways of removing the tags.
The important thing here is that the tags are fundamental to the subtyping relations. So unless your language is flexible enough to let one redefine what it means to be a subtype, then there's no chance of making unions out of tagged unions.
TaggedUnion{A, B} does not have the property that A <: TaggedUnion{A, B}, whereas with regular unions you must have that property, which is a very hard thing to hack into a language after the fact.
> A commonly said piece of feedback from someone who's learning Rust as a second language tends to be that enums are far better supported in Rust than any other language.
This is nonsense. I doubt that this is common feedback and it is just not correct that it is far better supported in Rust than in any other language. Why would anyone say that? Rust pretty much has support for ADTs, sum types and product types. Just like any modern (functional) language has.
Haskell, Scala, F#, OCaml, etc.
(And I love rust, write it in my day-to-day as well as on my spare time.)
Rust is the first not-weirdass-academic-wtf language to have these features ;)
(I love functional programming and apply the functional approach no matter what language I’m using - but why is it that functional languages are always so deeply down the academic rabbithole that they think eg “car” and "cdr” are intuitive names for “retrieve the first element of a list” and “retrieve the rest of the list” functions??)
I'm sure you know car and cdr are named for historic reasons and that they don't appear in the ML-family languages listed above. But coming back to the name "sum type": I find it very useful alongside "product type" for reasoning about the properties of algebraic data types, like a * (b + c) = a * b + a * c, and I don't understand why "mainstream" programming languages so often avoid the name.
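The distributive law can be read as a pair of conversion functions between the two types. A minimal sketch in Rust, assuming a made-up `Either` enum as the sum and tuples as the product:

```rust
// A generic two-way sum type (hypothetical; not from the standard library).
#[derive(Debug, Clone, PartialEq)]
enum Either<B, C> {
    Left(B),
    Right(C),
}

// a * (b + c)  ->  a * b + a * c
fn distribute<A, B, C>(pair: (A, Either<B, C>)) -> Either<(A, B), (A, C)> {
    match pair {
        (a, Either::Left(b)) => Either::Left((a, b)),
        (a, Either::Right(c)) => Either::Right((a, c)),
    }
}

// a * b + a * c  ->  a * (b + c)
fn factor<A, B, C>(sum: Either<(A, B), (A, C)>) -> (A, Either<B, C>) {
    match sum {
        Either::Left((a, b)) => (a, Either::Left(b)),
        Either::Right((a, c)) => (a, Either::Right(c)),
    }
}

fn main() {
    let x: (bool, Either<u8, char>) = (true, Either::Right('z'));
    let y = distribute(x.clone());
    println!("{:?} -> {:?} -> {:?}", x, y, factor(y.clone()));
}
```

Since the two functions undo each other, the two types carry the same information, which is what the equation says.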
Except Haskell, Scala and F#. They are neither academic nor rabbitholes. You're simply not familiar with them - perhaps a trip outside of your safe space could be something? :)
car and cdr are Lisp and from the 60s, basically. You can liken them to assembly instructions, because that's kinda sorta why they're named like that.
I still stand by what I said about them.
If you write Rust for 1-2 years, I bet you you'll come to look at programming in those languages completely differently. I have a very high carry over, going in to Rust from those languages.
`car` (Contents of Address part of Register) and `cdr` (Contents of Decrement part of Register) are historical terms from the IBM 704, used in LISP. They're legacy from early computing, not academic jargon.
McCarthy transformed them into academic jargon in fact. Knowing that the words originated in the IBM 704 idiosyncrasies not found in other machines, he used them anyway in papers about symbolic computation.
Lisp pairs are flexible objects that can be coupled together into shapes and uses that are not lists. car and cdr can mean first and rest, and ANSI Common Lisp [1994] has those synonyms, but there are uses of cons cells where first and rest do not make sense.
McCarthy must have realized that words that do not invoke any connotations are good for the elements of a flexible pair structure. Choices like first/rest, left/right, top/bottom and others are saddled with semantics that don't match every use.
Programs that need a pair structure in which the two pieces have very specific roles can provide their own synonyms, if their authors feel it makes code more readable.
I seem to recall that Knuth, in TAOCP, at one point, calls the pointers of a binary tree node ALINK and BLINK.
To be fair though, Scala's sum types are a bit of a hack aren't they, or at least pretty cumbersome? Declaring a `sealed abstract class` (or is it trait) and then a bunch of case classes syntactically disconnected from the sealed abstract thing that defines the sum type.
car and cdr are no more confusing than the historical C functions that turn entire sentences into initialisms. And surely nobody would call C an academic language in this context.
This is certainly happening, but one of the reasons that Rust became popular is that it included concepts that were, if not mainstream, at least already existing (lifetimes included).
Java enums store the associated data once per option, while Rust enums store it per value. So they're very different. You couldn't model Rust's `Option` or `Result` as a java enum.
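To illustrate the difference from the Rust side, here is a sketch with a made-up `Planet` enum whose data hangs off the variant (roughly the Java-enum style), next to `Option`, where each value carries its own payload. The `parse_port` helper is hypothetical.

```rust
// Java-enum style: a fixed set of values, each with one shared piece of data.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Planet {
    Mercury,
    Earth,
}

impl Planet {
    // The data belongs to the variant, not to an individual value:
    // every `Planet::Earth` reports the same mean radius.
    fn radius_km(self) -> f64 {
        match self {
            Planet::Mercury => 2439.7,
            Planet::Earth => 6371.0,
        }
    }
}

// Rust-enum style: every `Some` value carries its own, different payload.
fn parse_port(s: &str) -> Option<u16> {
    s.parse().ok()
}

fn main() {
    println!("{}", Planet::Earth.radius_km());
    println!("{:?} {:?}", parse_port("8080"), parse_port("443"));
    println!("{:?}", parse_port("nope"));
}
```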
Prior to Java 17, the sealed classes feature can be partially recreated via private constructors in the superclass. This approach forces you to put all classes in a single Java file, which can be awkward to navigate.
Though from a quick search, it's still just scalar/unary values.
Though it's one of the better implementations of those.
But the big advantage of the more sum-type-style enums Rust provides is the ability to nest values in them while preserving type safety guarantees.
So I still see them as miles better as a language concept.
Java enums are a class with a fixed number of instances.
The thing in Java that is closest to Rust enums would be sealed classes, which were a preview feature in Java 15 and were finalized and released with Java 17.
Er, no, Java enums allow you to do just that. It is one of my favorite features of Java. In essence, you get to embed multiple synchronized maps inside a Java enum (if you want to). All handled 'invisibly' for you.
No, Java enums can contain data but do so in a manner orthogonal to Rust enums. Java enums are not sum types; each variant is simply a (`final`) instance of the enum type which, mostly, is just a regular class type with a private constructor so that no other instances can be created. All variants have the same shape and the author of the enum decides what data the variants contain.
Rust enums are sum types, which means that each variant can have a totally different shape and contain different data, and it's the user of the enum that supplies the data when they create an instance of a variant. This quintessential use case for sum types is not possible with Java enums (although it is now sort of possible with records and sealed interfaces):
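A minimal sketch of such a use case, with a hypothetical `Shape` type whose variants each have a different shape and whose data is supplied by the caller:

```rust
use std::f64::consts::PI;

#[derive(Debug, PartialEq)]
enum Shape {
    // No data at all.
    Point,
    // One unnamed field.
    Circle(f64),
    // Two named fields.
    Rect { width: f64, height: f64 },
}

fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Point => 0.0,
        Shape::Circle(radius) => PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

fn main() {
    // The user of the enum decides what data each instance holds.
    let shapes = [
        Shape::Point,
        Shape::Circle(1.0),
        Shape::Rect { width: 2.0, height: 3.0 },
    ];
    for s in &shapes {
        println!("{:?} has area {}", s, area(s));
    }
}
```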
Most people have not had any exposure to functional programming languages or functional programming in general. When they say "any other language" they mean, any other language that they have used or know anything about I guess.
This is one of those cases where we fail to realize that other people's reality is totally different than ours. Everyone does this all of the time, not just you specifically to be clear, this is a limitation in our brains.
Obviously they would say that because they're not familiar with the other languages. Like it or not they probably come from Python/C++/JavaScript or similar.
> Other languages like Go do not necessarily have enums, but you can represent enums by using something like this (in Go):
This misses the important point that Go lacks pattern matching -- one uses a switch statement instead -- and the Go compiler lacks the ability to check that all branches have been handled in a switch statement.
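For contrast, a short Rust sketch of the exhaustiveness check being described, using a made-up `Status` enum and `label` function:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Active,
    Suspended,
    Closed,
}

fn label(status: Status) -> &'static str {
    // No wildcard arm: adding a fourth variant to `Status` turns this
    // `match` into a compile error until the new case is handled.
    match status {
        Status::Active => "active",
        Status::Suspended => "suspended",
        Status::Closed => "closed",
    }
}

fn main() {
    for s in [Status::Active, Status::Suspended, Status::Closed] {
        println!("{:?} -> {}", s, label(s));
    }
}
```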
Go's concurrency stuff is beautiful. If Go had sum types / Rust-style enums with pattern matching and handled errors as a Result sum type, then it would be a much nicer language.
It's incorrect to describe an enum with two variants as a 'newtype'.
I think the example was misunderstood from https://rust-unofficial.github.io/patterns/patterns/behaviou... (which uses Password, but doesn't use an enum with two variants).
The use of enums in the example is fine; but it's not a "newtype".
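For comparison, a sketch of what a `Password` newtype might look like. This is an illustration written for this comment, not the code from the linked page; the redacting `Debug` impl and the `len` method are assumptions about what one would want from such a wrapper.

```rust
use std::fmt;

// A newtype: one local tuple struct around one existing type.
struct Password(String);

impl fmt::Debug for Password {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Never print the wrapped secret, not even in debug output.
        f.write_str("Password(****)")
    }
}

impl Password {
    // Expose only what callers need; the inner string stays private.
    fn len(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    let secret = Password("hunter2".to_string());
    println!("{:?} ({} chars)", secret, secret.len());
}
```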