The call out to sum types is something I feel. I've been using Rust daily for almost 10 years now, and sum types are absolutely still one of the things I love most about it. It's easily one of the things I miss the most in other languages that don't have them. I'm usually a proponent of "using languages as they're intended," but I missed exhaustiveness checking so much that I ported a version of it to Go[1] as a sort of lint.
This is exactly what I think too. Sum types are so powerful. I feel a lot safer in Python + mypy with sum types (`from typing import Union`) than in anything with C++ [1], even though C++ has a significantly more complex type system (its type system is Turing-complete) and Python is about as type-unsafe as a language can get. C++ made this odd choice, as if any complex type system were better than a simple one. When I was a younger software engineer I agreed with this choice, but now I completely disagree. I think the next generation of software engineers will see that Turing completeness is a bug, and that the whole point of software safety is finding weak features that are just powerful enough to be useful. If you don't believe me, go look at a language like Agda. You do not need Turing completeness: function application, sum types, and primitive/structural recursion/corecursion are just about everything you need as a programmer, and everything else that gives you a Turing-complete language is extraneous.
[1] unless you liberally use `union`s everywhere which is very unidiomatic C++, but can be done if desired.
You can just write your Python modules in another language. I am not sure why the code has to be big. Break it down and make it executable. Not very friendly in terms of threads, but if you're writing an API you are writing it for Linux, which is vastly safer for threads than Windows or macOS.
- nobody uses it in the ecosystem. As outlined in the article, a lot of value of Option/Result is derived from their pervasiveness in the Rust ecosystem. C++ is far from this
- the ergonomics of it are terrible: no pattern matching, structural variants instead of named variants (yes, you can emulate that with wrapper types, but meh), lambda-oriented matching means you cannot as easily do things like early returns (see the Rust sketch after this list), the statement-oriented language limits the usefulness anyway, lack of combinators, and for error handling specifically, lack of the `?` operator
- performance is dubious. I had very steep and unexpected performance cliffs when lambda inlining started to fail for some reason. Having sum types be a language construct guarantees we're not relying on things like lambda optimisation here.
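To make the contrast concrete, here is a minimal Rust sketch of those last two points (the error enum and function are invented for illustration): pattern matching permits early returns, and `?` propagates errors without any visitor lambda:

    // Hypothetical error type, for illustration only.
    enum ParseError {
        MissingComma,
        BadNumber(std::num::ParseIntError),
    }

    fn parse_pair(s: &str) -> Result<(i32, i32), ParseError> {
        // An early return from the middle of a match:
        let (a, b) = match s.split_once(',') {
            Some(parts) => parts,
            None => return Err(ParseError::MissingComma),
        };
        // `?` propagates the error to the caller:
        let a = a.trim().parse().map_err(ParseError::BadNumber)?;
        let b = b.trim().parse().map_err(ParseError::BadNumber)?;
        Ok((a, b))
    }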
Variant types where one has to use objects as types don’t come up that often in API design or data structures in my experience with mobile and system programming on e.g. Linux. I think I’ve genuinely had to use them only a few times.
Error types are probably the most popular incarnation of that. There’s several libraries available and they will be part of the C++ standard.
This is a case of the Rust community overselling a minor feature as a game-changing novelty.
> Variant types where one has to use objects as types don’t come up that often in API design or data structures in my experience with mobile and system programming on e.g. Linux
You can't use what you don't have, so you adapt to the tools you do have. In my C++ days, the team would often write types that logically held several variants. However, they were expressed as product types, with space overhead and error-prone, unergonomic use.
Sum types are game-changing. Look at any Rust project and you'll find enums with data everywhere. They're just a building block for modeling problems, like product types are. A language that misses them is as strange to me as a language without product types. After 10 years of mostly C++11 and C++14 I would never go back to it for this reason alone (although, as outlined in the article, there are other reasons too).
> Error types are probably the most popular incarnation of that. There’s several libraries available and they will be part of the C++ standard.
I just checked the old Rust classic “ripgrep” on GitHub.
In matcher.rs, the enum is used as poor man’s OOP. I saw several match expressions which then call the same function on each matched type.
searcher/mod.rs is an error definition.
glob.rs contains a sort of policy enum which can be implemented once again with OOP or as a policy template.
json.rs usage can be modeled as a single class.
core/app.rs is more involved, but can be modeled as a series of structs with a map from enum -> any. Or as a std::variant. Or using OOP.
I looked at all instances and didn’t see anything game-changing. It’s a nice syntax and it should have really good performance, but such idioms are way too low-level to change any game.
This is a straw man. There is nobody on this Earth claiming that sum types are necessary to solve a problem. Product types aren't necessary either. You could just write everything in Assembly. Or maybe raw bytecode if you want.
As should be patently fucking obvious to anyone who has been commenting on a technology web site for as long as you have, sum types are a tool. They are a tool for expressing clearly and concisely the idea that a value can be exactly one of several possible options. That tool then interacts with the rest of the language based on that invariant, sometimes providing things like exhaustiveness checking and pattern matching. Put all this together, and you have a very succinct and very clear way of representing certain kinds of values in a program.
Your comment might as well go through ripgrep and talk about how functions aren't needed. "They could have just used goto here and there."
> I looked at all instances and didn’t see anything game-changing. It’s a nice syntax and it should have really good performance, but such idioms are way too low-level to change any game.
Sum types were game changing to me when I learned about them over a decade ago. Since then, they have been a significant factor in how I think about and structure data in programs.
I have zero interest in trying to convince someone like you that you should think it's game changing. That's not the point. Maybe you could do some perspective taking and realize that others might just think differently than you.
Not to diminish the value of sum types (they are not specific to Rust, of course), but I think that product types belong in a database / config rather than being hardcoded.
> This is a case of the Rust community overselling a minor feature as a game-changing novelty.
Sum types reify control flow into an object from which said control flow can be retrieved. Compiler checked sum types remove the possibility of retrieving inconsistent control flow.
The transformation is analogous to turning a callback into a future.
It's one of the things that looks unimportant until you use it. After that, its absence is felt repeatedly when working with C++. We don't use tagged unions much because the ergonomics are terrible.
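As a rough illustration of the reification point above (a sketch, not code from any project discussed here), the branch taken is captured in a Result value and retrieved later by matching:

    // The success/failure control flow is stored in a value...
    fn parse(s: &str) -> Result<i32, std::num::ParseIntError> {
        s.parse()
    }

    fn main() {
        let outcome = parse("42");
        // ...and retrieved later; the compiler rejects any attempt
        // to read the Ok value without considering the Err branch.
        match outcome {
            Ok(n) => println!("got {n}"),
            Err(e) => println!("failed: {e}"),
        }
    }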
You clearly haven't tried using variant types properly, then. Sure, `std::variant` is pretty unergonomic, but they really are game-changing. Most of my types are variant types.
Most of your types? Well that certainly deserves congratulations and I hope someone will take advantage of the opportunity.
I think you misunderstood me though if you say I haven’t used them properly. I just didn’t need them or find them that useful because it rarely happens that I want to model a type which has several states.
Specific variant types like std::optional or the Result-equivalents are useful, but not that game changing either. They’re nice I suppose.
I wonder what kind of software you write that such low-level coding idioms make a big difference to the end result. Or what do you mean by game changing?
There are so, so many ways in which this is defective compared to actual sum types. Some of them were already listed, but to me the most crucial, even if it's mostly about theory rather than practice, is valueless_by_exception.
The choice to provide exceptions everywhere as an error-handling mechanism means C++ is obliged to admit that your std::variant may not have a value at all, which blows up all of your type safety. In Rust I can say that this Pet is either a Dog or a Cat, and it cannot be neither; but in C++ a std::variant of a Cat and a Dog might nevertheless be valueless_by_exception anyway. What can you do about that? I guess you could throw an exception...
In Rust if X is either A or B, and Y is either C or D, and Z is either E or F, then a structure with X, Y and Z has only eight possible states. In C++ this structure has 27 possible states because of exceptions.
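For reference, a minimal Rust sketch of the Pet example; the compiler guarantees the value is always exactly one of the listed variants, with no empty third state:

    struct Dog;
    struct Cat;

    enum Pet {
        Dog(Dog),
        Cat(Cat),
    }

    fn speak(pet: &Pet) {
        // Exhaustive: there is no "valueless" case to handle.
        match pet {
            Pet::Dog(_) => println!("woof"),
            Pet::Cat(_) => println!("meow"),
        }
    }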
I get the "theoretical" point, but the issues I encountered in practice are closer to the ones I listed in my response than to valueless_by_exception (which I never encountered, I think?).
Wrt type safety we had more pressing issues with C++ (use after move, implicit conversions, ...)
Exceptions are an optional feature. Also, if a sufficient set of constructors is marked noexcept, then the variant can never be valueless. Implementations are optimized accordingly.
You also don't have to handle the exception, in which case you won't access the variant again anyway. Or it gets handled where the variant is torn down. It's very unlikely that it gets handled where the variant is constructed or assigned to, making it a non-issue.
I meant exceptions as a user of the language. You can decide not to throw. You can mark functions noexcept.
Granted, it's more awkward when you consume 3rd party libraries, but you can still wrap them in noexcept interfaces and be fine with terminating when an exception is actually thrown or do something else. Not much different to a panic.
Yeah. In the C++ codebase I work on, we use folly::Expected<T, E> (essentially similar to Rust's Result<T, E>) heavily. Is it as ergonomic as Result is in Rust? Absolutely not. But it's still a nice design pattern for some of the same reasons people use Result in Rust.
(And C++ std::optional<T> is somewhat similar to Rust's Option<T> type.)
Yeah, we use a fixed-E alias of folly::Expected<> in our codebase which is similar to absl::StatusOr. Something like:
    namespace our_project {
        class Status { ... };

        template <class T>
        using Expected = folly::Expected<T, Status>;
    }
absl::StatusOr<T> looks a lot like our_project::Expected<T>. This pattern of providing your own error type and aliasing Result is also somewhat common in Rust, I believe.
Swift on Linux is supposed to be decent server-side, but as far as I've heard there isn't much use outside of that, and even limited testimonials on that. The bit I have dabbled in Swift, I really liked it. I just wish it grew more out of the Apple ecosystem. It has lots of things I like, but the fact that I can't exactly take it with me to any system and just run with it is why I haven't dived too much into it.
Take a look at Zig (it's not stable yet; that may take a couple of years), where Linux is a first-class citizen. It has `error!type` unions and `?type` optionals, and language constructs to deal with them:
    const number = try parseU64(str, 10); // returns the error
    const number = parseU64(str, 10) catch 13; // default

    // more complex stuff
    if (parseU64(str, 10)) |number| {
        doSomethingWithNumber(number);
    } else |err| switch (err) {
        error.Overflow => {
            // handle overflow...
        },
        // we promise that InvalidChar won't happen (or crash in debug mode if it does)
        error.InvalidChar => unreachable,
    }
I've been checking out Zig over the past week. It is a really nice language and I like it so far. I'm just not spending as much time in it as I'd like since it is still pre-1.0, but I am excited for its 1.0.
Is swift outside of Apple as fully fledged and performant when it comes to concurrency? Last I checked, a lot of the heavy lifting came from GCD (ie the Apple runtime-scheduler)
To me it's not just exhaustiveness, but that sum types (enums) are just like product types (structs): they can have member methods, implement traits, etc. Coming from C++, when I realized Rust let me do that, it blew me away.
Actually, all three of Rust's user-defined types (the sum type enum, the product type struct, and union†) are fully fledged types which can implement traits and have functions of their own (including functions taking a self parameter, thus methods).
C++ unions can actually have methods, although this isn't used very much. However C++ enums can't have methods, even C++ 11 scoped enums ("enum classes") can't have methods, I have no idea why that restriction seemed like a good idea.
† Unions are special because they're crazy dangerous, which is why they're not usually covered in material for learning Rust - you can't fetch from them safely. You can store things in unions safely because the process of storing a value in a union tells the compiler which is the valid representation - the one you're storing to, but fetching is unsafe because you might fetch an inactive representation and that's UB. However Rust does have a particularly obvious union right in the standard library - MaybeUninit - and sure enough MaybeUninit implements Copy and has a bunch of methods.
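To make the parent point concrete, a small sketch (types invented for illustration) of a Rust enum carrying data, with a method and a trait implementation, exactly as a struct could have:

    use std::fmt;

    enum Shape {
        Circle { radius: f64 },
        Rect { w: f64, h: f64 },
    }

    impl Shape {
        // A method on a sum type, just like on a struct.
        fn area(&self) -> f64 {
            match self {
                Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
                Shape::Rect { w, h } => w * h,
            }
        }
    }

    // Sum types implement traits like any other type.
    impl fmt::Display for Shape {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "shape with area {}", self.area())
        }
    }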
> C++ enums can't have methods, even C++ 11 scoped enums ("enum classes") can't have methods, I have no idea why that restriction seemed like a good idea
It is possible that Oracle holding the patent[1] on methods on enums is the blocker, rather than any technical restriction.
That sounds bizarre. So this means that patents are potentially holding back programming language innovation? I hope I don't have to consult with a lawyer every time I invent a new variation on the for loop. (I anticipate that someone will then tell me that there already is a patent for that…)
Why would it be bizarre? It's not exactly a fringe belief that patents in all software are holding back innovation.
I don't think programmers often consult 20-year-old "inventions", so it seems pretty obvious on its face that the supposed benefit of patents, that something is _only_ locked up for 20 years, is quite pointless in software.
Problem is, unless it has been tried in court, you can't be certain. And if you're building something, you might not want to spend time in court having to fight it in the first place. So even if it's 50/50 enforceable/not enforceable, do you really want to spend the time testing if it is?
Patents really have a chilling effect, even if a particular one might not be enforceable.
ISO is also very averse to patents. They will probably not standardize anything patent-encumbered; they would probably require invalidating the patent first before accepting it into the standard.
Having said that, this is the first time I've ever heard of methods on enums being patented. What a ridiculous patent. It's a good thing, then, that C++ doesn't have methods; it has "member functions" :).
Also C++ allows user defined operators on enums, which feels somewhat adjacent.
WG21 (the "C++ Committee") is under JTC1, the Joint Technical Committee ("joint" between ISO and IEC). Now, you might know of a few other famous products of the Joint Technical Committee's sub- and sub-sub-committees, including JPEG (that's the Joint Photographic Experts Group getting a shout-out in the name of the standard) and MPEG. Those standards both required patented "inventions" to implement in full. The patents were held by contributors...
In the case of MPEG the result is MPEG LA, a US company which you need to pay to implement certain important standards. In the case of JPEG the result was a little different, since only the improved Arithmetic Coding of JPEG was patented, people just don't implement the actual standard, they cut out the patented part, so all the world's JPEGs (well, mostly JFIF files, which are slightly different but we call them "JPEGs" anyway) are a little bigger than they need to be for no reason except patents.
So no, I don't buy that "ISO is also very averse of patents" in a sense that would restrict this unless you can show that's a new stance.
Since I heard of this idea, I keep finding places where I'd use it, whether for variant-specific behavior or to remove the redundant type definitions in my API.
A union in C++ is the same thing as a struct, except all of its fields live at the same offset. So you can define any method you want on it, including special stuff like constructors and destructors. No base classes are allowed, though.
A method on a union is much less useful when you can't match on the tag, though. I suppose you could store the tag inside the union. Is that common? I've always imagined C/C++ tagged unions store the tag outside the union.
You would probably use std::variant in C++ if you want tagged unions.
So you could have a struct or class with one std::variant field and some methods which can match on the type of the variant. But it would be kind of clunky.
C++ allows accessing inactive members in the special case when the active member shares a "common initial sequence" with the particular inactive member.
So yeah, it's possible to store the tag in the common initial sequence of all the union members.
This is quite niche and rarely used. Most of the time it makes more sense to have the tag outside. Sometimes the tag in the common initial sequence allows the whole data structure to pack better than with a tag outside of the union.
At least in C, you cannot store a tag inside the union: since all fields of the union live at the same offset (0), the tag will be smashed by the actual value it's trying to describe.
In C, you have to wrap the union in a struct in order to add the tag, and that pattern is fantastically common. Let's have a little geometry-inspired example:
    typedef enum { SHAPETYPE_RECTANGLE, ... } ShapeType;

    typedef struct { ... } Rectangle;
    typedef struct { ... } Circle;
    typedef struct { ... } Triangle;
    typedef struct { ... } Polygon;

    typedef struct { // Outer struct, not a union at this level.
        ShapeType type;
        union {
            Rectangle rectangle;
            Circle circle;
            Triangle triangle;
            Polygon polygon;
        }; // This can be nameless in new(ish) C, which is nice.
    } Shape;
I'm not certain for C, but definitely in C++ it's legal to union a bunch of structures with a common prefix, and then talk about the prefix in the "wrong" variant and that's OK. There may be some restrictions about exactly what is in that prefix, but at least obvious things like an enum or an integral type will work.
So for your example you put ShapeType type in each of Rectangle, Circle, Triangle etc., and then you can union all of them, and the language promises that shape.circle.type == SHAPETYPE_RECTANGLE is a reasonable thing to ask, so you can use that to make a discriminated union.
> I'm not certain for C, but definitely in C++ it's legal to union a bunch of structures with a common prefix, and then talk about the prefix in the "wrong" variant and that's OK
That doesn't sound right to me. Do you have a source? Is that in the standard?
I of course do not own a copy of the expensive ISO document, however, in the draft:
11.5.1 [class.union.general]
[Note 1: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence ([class.mem]), and if a non-static data member of an object of this standard-layout union type is active and is one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see [class.mem].
— end note]
I don't know about the standard, but if cppreference.com is good enough: At https://en.cppreference.com/w/cpp/language/union it says "If two union members are standard-layout types, it's well-defined to examine their common subsequence on any compiler."
i haven't done c++ in a million years, but huh, you can! can't have any types w/ non-trivial copy constructors in a union, though, apparently, which is quite the restriction.
> can't have any types w/ non-trivial copy constructors in a union, though, apparently, which is quite the restriction.
That was removed in C++11. The rule now is:
> Absent default member initializers ([class.mem]), if any non-static data member of a union has a non-trivial default constructor ([class.default.ctor]), copy constructor, move constructor ([class.copy.ctor]), copy assignment operator, move assignment operator ([class.copy.assign]), or destructor ([class.dtor]), the corresponding member function of the union must be user-provided or it will be implicitly deleted ([dcl.fct.def.delete]) for the union.
i.e., you need to provide an explicit version of the special function for the union if any member has a nontrivial implementation.
You can have non-trivial types in a union, but you need to explicitly define a constructor in the union or the containing type (i.e. the default union copy constructor and assignment operator are disabled).
I may be wrong, but I thought it's that you can't have any fields with non-trivial destructors, because the union wouldn't know which destructor to call. So POD types / raw pointers / arrays of the above / structs of the above are allowed, and that's pretty much it.
In fairness to C++, you could derive a class from std::variant and add methods to it that way.
It's less awkward with Rust's enums for sure though. And pattern matching as in Rust is far more expressive (and legible) than what std::variant gives you.
A nice thing about Rust is that it brings many well-established good ideas from the ML branch of languages to the curly-brace branch of languages. And frankly, it does a better job naming them (for people who weren't math majors).
Completely agree. This article [0] by Raphael Poss, explicitly listing how functional constructs look in Rust compared with "pure" functional languages is one of my favorite takes on Rust.
I'm going to make the even stronger claim that a general-purpose statically typed language that doesn't support sum types and exhaustive pattern matching to at least the extent that Rust does is unfit for general use, just like a language that doesn't have product types is unfit for general use.
Dynamically typed languages often have ad hoc sum types.
I know your claim will strike some people as absurd, but I agree. I am honestly flabbergasted that any general purpose language created in the last 20 years doesn't have sum types and pattern matching. To me they are very nearly as fundamental as aggregate types (aka structs or product types).
Suppose I say that cars without seatbelts are "unfit" to use on public roads. Clearly I can't mean it's impossible to use such cars, people used to do it all the time, but perhaps I mean it's a bad idea to use them, and that's harder to argue with which is why we got laws saying you need seatbelts.
And I must say that TypeScript has the most flexible variation of it. I remember a while ago (> 10 years) somebody on Lambda the Ultimate asked how to flexibly represent compiler intermediate representations using the Scala type system, given that there can be many of them with overlaps. These overlaps are not a problem at all in TypeScript.
Until very recently, all of the C++ code I wrote was in the 1998 style or in the 2014 style.
Since C++17, there is std::variant, which is a sum type template that _technically_ allows for exhaustiveness checking via std::visit. It's not pretty, but it gets you there without requiring compiler support.
Most modern C++ codebases do compile-time polymorphism via templates, though. std::variant is a niche use for things like serialization, when you need to make type choices at runtime.
You can see that it starts out as a sum type at the top level. And inside each variant is all sorts of product types and other sum types.
But do note in practice that sum types aren't limited to specialized things like ASTs. In practice, they come up everywhere. For example, here's a small little state machine used to implement a simple "unescape" routine. e.g., converting the string 'a\xFF\t' to the byte sequence 0x61 0xFF 0x09: https://github.com/BurntSushi/ripgrep/blob/44fb9fce2c1ee1a86...
Real support like what Java has? It literally has sealed classes and interfaces that specify all the possible classes that can subtype it. This is enforced by the language and allows for things like exhaustive switch cases. Pattern matching is a related feature which is also available in Java now, though as of yet only limited to records.
agreed, many language features have trade-offs or limited usefulness, but after years in the profession I feel that sum types are something every language should have.
In my experience with languages that lack concise sum types and pattern matching, you end up with data types that have lots of implicit invariants. If you find yourself writing docs or thinking in your head things like "Field X is only set if field Y is true" or "If field X is non-null then field Y must be null and vice-versa", then these are indications that sum types would model the data better. Then these invariants can be followed by construction, and it becomes clear what's available at data access - great for interface clarity, robustness, dev tools, morale, etc.
Relatedly, storing booleans is a smell; imho an enum or sum type is almost always better in languages that have concise syntax for these. True and False are meaningless without context, and so can easily lead to errors where they are provided to a different context or generally misused due to a misunderstanding about the meaning.
> if your type-system is sufficently strong to express this
No fanciness needed, just plain old sum types. It is certainly possible to express those invariants directly in languages with dependent type systems or refinement types like in Liquid Haskell - see https://ucsd-progsys.github.io/liquidhaskell-tutorial/Tutori.... It's typically much easier to reason about and use sum types, though.
Of course these examples are trivial and silly, but I see instances of these patterns all the time in big co software, and of course usually the invariants are far more complex but many could be expressed via sum types. I've seen loads of bugs from constructing data that invalidates assumptions made elsewhere that could have been prevented by sum types, as well as lots of confusion among engineers about which states some data can have.
> Field X is only set if field Y is true
Original gnarly C style pattern:
    struct TurboEncabulatorConfig {
        // When true, the turbo-encabulator must reticulate splines, and 'splines'
        // must be non-null. When false, 'splines' must be null.
        bool reticulate_splines;
        struct Splines *splines;
    };
> If field X is non-null then field Y must be null and vice-versa.
Original gnarly C style pattern:
    struct TurboEncabulatorConfig {
        // When non-null, lunar_waneshaft must be null.
        struct Fan *pentametric_fan;
        // When non-null, pentametric_fan must be null.
        struct Shaft *lunar_waneshaft;
    };
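For contrast, a sketch of how a sum type makes those comments unnecessary (Rust, reusing the invented names): the invalid combinations simply cannot be constructed:

    struct Splines;
    struct Fan;
    struct Shaft;

    // "reticulate_splines implies splines is non-null" becomes:
    enum SplineMode {
        NoReticulation,
        Reticulate(Splines), // splines exist exactly when reticulating
    }

    // "exactly one of pentametric_fan / lunar_waneshaft is set" becomes:
    enum Drive {
        PentametricFan(Fan),
        LunarWaneshaft(Shaft),
    }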
Type systems are of course not the only possible mechanism that can enforce these kinds of invariants. In dynamic languages, schema style solutions (eg malli or spec in Clojure) have advantages: they have more expressivity than typical type systems, you can manipulate them as data, you can use in contexts that are not statically verifiable, like at the data interfaces of yuor app.
I agree it has more flexibility, but more expressiveness is a double-edged sword. Type systems are often not Turing-complete, or at least they are limited in some ways. A language being very expressive means it cannot be run at compile time, since we cannot know if it terminates.
If we can only run it at runtime, we now need tests to hit that path, or to test it manually, to actually know if it works.
Nope, not unions. Sum types. Sum types would be more analogous to tagged unions or discriminated unions.
> i have almost never needed to use such types
Well sure. When is an abstraction "needed"? Before Fortran, nobody "needed" a programming language either. So is a programming language necessary?
That's the funny thing about the word "need." It has very narrow application, and it's precisely why I didn't mention the word a single time in my top level comment.
Unfortunately it's actually kind of difficult to parse your question because of the ambiguity in the surrounding comments. You could mean "tagged union" instead of "union," or maybe not and "them" means "tagged union"...
If you're asking to compare tagged unions and unions, then...
A union is not a tagged union. A union is part of a tagged union. A union on its own is just a region of memory. What's in that memory? I dunno. Who does? Maybe something about your program knows what it is. Maybe not. But what if you need to know what it is and some other aspect of your program doesn't tell you what it is? Well, you instead put a little bit of memory next to your union. Perhaps it's an integer. 0 means the memory is a 32-bit signed integer. 1 means it's a NUL terminated string. 2 means it's a 'struct dirent'. The point is, that integer is a tag. The combination of a union and a tag is a tagged union.
An abstract syntax tree is a classic example of a tagged union.
If instead you're asking to compare tagged unions and sum types, then...
Tagged unions are one particularly popular implementation choice of a sum type. Usually sum types have additional stuff layered on top of them that a simple tagged union does not have. For example, pattern matching and exhaustiveness checking.
Sorry, yes I meant tagged union, a C style union (as in the language's concept) is so useless by itself that I thought being tagged was an obvious given.
In that case, the pattern matching and exhaustiveness means that the value the sumtypes in rust offer is in combination with those other language features, not the mere usage of the sumtypes themselves then, I think that makes more sense.
Yes. When you say "sum type" and you're talking about, say, Ocaml or Standard ML or (mostly) Haskell or Rust, then what comes with that is also the infrastructure around sum types in those languages. They aren't really separable. That includes pattern matching and exhaustiveness checking (the latter is an opt-in warning in Haskell, and can be opted out of in Rust on a per-type basis).
The discussion around "sum type" the concept is to disentangle it from the implementation strategy. Tagged unions are the implementation strategy, and they don't come with pattern matching or exhaustiveness checks.
Of course, many of these terms are used interchangeably in common vernacular. But if you look at the comment that kicked off this annoying sub-thread:
> so, um, unions? i have to say that in many years of programming, i have almost never needed to use such types.
Then this is clearly trying to draw an equivalence between two different terms, but where no such equivalence exists. There's no acknowledgment here of the difference between what a "union" is (nevermind a tagged union) and the "sum types" being talked about in my original top level comment. And the last part, "i have almost never needed to use such types" is puzzling on multiple levels.
Based on your other replies in this thread, as far as I can tell, you're not here to try and understand something. You're here to fuck around and play games with people over definitions. So, I'm not responding to you. I'm responding to the people following along who might get confused by the mess you're making here. More to the point, the Wikipedia article for "sum type" redirects to "tagged union," which just honestly makes this more of a mess.
A tagged union is a specific kind of data structure. It essentially refers to a pair of information: the first is a tag and the second is the union. The union might represent many different kinds of values. The important bit is that the size of that memory is the size of the largest value. The tag then tells you what kind of value is in that memory, and crucially, tells you how to interpret that memory. Getting this wrong, for example, in a language like C can lead to undefined behavior.
Tagged unions show up in all sorts of places. The classical example is an AST. You usually have one "node" type that can be one of many possible things: a literal, a binary operation, a function call, a type definition, whatever. And for any non-primitive node, it usually has pointers to children. And what is the type of a child? Well, a "node"! It's a recursive data type. It lets you represent children as just more nodes, and those nodes might be any other AST node. (And as you might imagine, an AST can be quite a bit more complicated than this. You often want restrictions on which types of children are allowed in certain contexts, and so you wind up with many different tagged unions.)
The use of a tagged union like this can be conceptually thought of as a sum type. A sum type is not a data structure. A sum type is a concept. It comes from type theory and it is the "dual" of product types. Product types represent the conjunction of values while sum types represent the disjunction of values. In type theory notation, you usually see 'A x B x C' to refer to the product type of A, B and C. Similarly, you usually see 'A + B + C' to refer to the sum type of A, B or C.
A sum type can be represented by just about anything, because it's just a concept. You can take something that is typically used for product types and use it to present an API that gives you a sum type:
    type OptionInt struct {
        exists bool // private
        number int  // private
    }

    func OptionIntNone() OptionInt {
        return OptionInt{false, 0}
    }

    func OptionIntSome(number int) OptionInt {
        return OptionInt{true, number}
    }

    func (o OptionInt) IsSome() bool {
        return o.exists
    }

    func (o OptionInt) Number() int {
        if !o.exists {
            panic("option is none, no number available")
        }
        return o.number
    }
This is a very contrived and hamstrung example, because it really doesn't give you much. But what it does give you is the API of a sum type. It is a value that can be either missing or present with an integer. Notably, it does not rely on any particular integer sentinel to indicate whether the value is missing or not. It encodes that state separately. So it can fully represent the presence of any integer value.
As you can see, despite it offering an API that represents the sum type concept, it is actually implemented in terms of a product type, or a struct in this case.
In a language like Go, which does not really support either unions or sum types, this sort of API is really hamfisted. In particular, one could object to calling it a sum type, because it doesn't really represent the "sum typeness" in the type system. Namely, if you get its use wrong, then you don't get a compilation error. That is, you can call the 'Number()' method without checking whether 'IsSome()' returns true or not. In other words, correct use of OptionInt looks like this:
    opt := doSomethingThatMightReturnNone()
    if opt.IsSome() {
        // OK, safe to access Number. It won't panic
        // because we just checked that IsSome() is true.
        fmt.Println(opt.Number())
    }
But nothing in Go stops you from writing this instead:
    opt := doSomethingThatMightReturnNone()
    fmt.Println(opt.Number()) // compiles fine, panics at runtime
This is why "support for sum types" is critical for making them as useful as they can be. In Rust, that same option type is defined like this:
    enum OptionInt {
        None,
        Some(i32),
    }
That's it. The language gives you the rest. And crucially, instead of splitting out the 'IsSome()' check and the access via 'Number()', they are coupled together to make it impossible to get wrong. So the above code for getting a number out looks like this instead:
    let opt: OptionInt = do_something_that_might_return_none();
    if let OptionInt::Some(number) = opt {
        println!("{}", number);
    }
You can't get at the 'number' without doing the pattern matching. The language won't let you. Now, you can define methods on your sum types that behave like Go's 'Number()' method above and panic when the value isn't what you expect. And indeed, Rust's Option and Result types provide this routine under the names 'unwrap()' and 'expect()'. But the point here isn't just about Option and Result, it's about sum types in general. And you don't have to define those methods that might panic. You can just let the language help guide you along instead.
Sum types are an idea. A concept. A technique. Tagged unions are an implementation strategy.
Notice also that I haven't mentioned exhaustiveness at all here. It isn't required for sum types. Haskell, for example, doesn't mind at all if your pattern matching isn't exhaustive. You actually have to tell it to warn you about it, and even then, it's not a compilation error unless you treat it as one. In Rust, exhaustiveness checking is enabled by default for all sum types. But you can disable it for a particular sum type with the '#[non_exhaustive]' attribute. The benefit of disabling it is that adding a new variant is possibly a semver-compatible change, whereas adding a new variant to an exhaustive sum type is definitely a semver-incompatible change.
> But you can disable it for a particular sum type with the '#[non_exhaustive]' attribute.
I’m not sure this is the best description of what `#[non_exhaustive]` does in Rust. It doesn’t so much disable exhaustiveness checking as it marks the given list of variants as incomplete.
In Rust, some pattern matching contexts are required to be exhaustive (e.g. ‘let pattern = …;’, ‘match … {}’), and others are not (e.g. ‘if let … = … {}’). When an enum is tagged as ‘#[non_exhaustive]’, the former are required to include a wildcard matcher that will accept unknown variants, even if all known variants are explicitly matched.
Rust also allows structs to be `#[non_exhaustive]` to mark that the listed fields might not be sufficient. This disables the ability to (outside of a privacy boundary) directly construct an instance of the structure, or to destructure it with an exhaustive pattern.
The privacy boundary applies in both cases by the way. Despite labelling your AmericanCity enum as #[non_exhaustive] since it only has Houston and Chicago in it so far, you are allowed in your own code inside the boundary to write a match which handles only Houston and Chicago. And that's actually probably a good idea, because when you add SanFrancisco to the enum, the compiler will point out anywhere your match clauses don't handle it, whereas if you had a default case (as your 3rd parties will) you'd drop into the default.
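A small sketch of that behavior, using the hypothetical AmericanCity enum from the comment above:

    #[non_exhaustive]
    pub enum AmericanCity {
        Houston,
        Chicago,
    }

    // In a downstream crate, a wildcard arm is required even though
    // every known variant is matched; inside the defining crate it
    // may be omitted.
    fn greet(city: &AmericanCity) -> &'static str {
        match city {
            AmericanCity::Houston => "howdy",
            AmericanCity::Chicago => "hey",
            _ => "hello", // future variants land here
        }
    }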
My goal wasn't to describe it thoroughly, it was to call it out as a means of disabling exhaustiveness checking. Regardless of whether that's really what the essence of '#[non_exhaustive]' actually is, it does disable exhaustiveness checking. It's right in the name. It's an important part of what it does.
Basically, we're talking past each other. I called it out for what it does, but you're pointing out why someone would want to use it. That is, "#[non_exhaustive] on enums indicates that the list of variants is incomplete, and it does this by disabling exhaustiveness checking outside of the crate in which it is defined." Although, of course, not even that is precisely correct. Because your match arms are still forced to be technically exhaustive via the compiler requiring a wildcard arm. But the end effect to the user of the enum is that exhaustiveness checking on the variants of the enum is disabled.
> More to the point, the Wikipedia article for "sum type" redirects to "tagged union," which just honestly makes this more of a mess
I often see Wikipedia articles on CS topics severely lacking, or claiming exact definitions when no such thing exists. Many definitions we regularly use are not exact at all; different papers/books use them differently. E.g., how would you even define OOP, or low-level languages? I would prefer them to use less strong language, as too many people argue on the internet based on Wikipedia definitions.
I agree that they are lacking. In theory, I have the disposition to improve articles like that, but in practice, whenever I've tried, I get shut down by editors. It's been a long time since I've tried though.
It happens on other wikis too, like the Arch wiki.
I realize it's a tired complaint that a lot of people have and editors don't have an easy job, but I am so completely done with contributing to wikis. I know a lot of other people with the same complaint and same conclusion, including experts in the domain of type theory. One wonders whether this is one of the reasons why the articles are not as good as they could be.
The union part is just an (obvious) optimization. An implementation detail. Sum types are not required to use a union; they can be implemented without using a union.
If the data types have constructors and/or destructors that have side effects then it has to be lazy. Otherwise IMO it's not really behaving like a sum type.
This isn't real code; it's a pretend language that's kind of a mix of Rust and C++. But it illustrates that the sum type can be lowered into a non-union representation.
It's not a union, though, it stores references to three memory locations, not just one (as a union represents a single memory location which can have multiple possible interpretations). And the lowered code has nothing protecting it from an invariant where more than one reference is non-null!
Tagged unions with exhaustiveness checking, built into the language in a fairly deep way. Very different from what would come to mind for a C or C++ developer who hears the word "union".
Does your code ever contain class hierarchies with a fixed set of classes? Or variables where certain values have special case meaning? Those are the cases where Sum Types make things immeasurably nicer than the alternatives.
> Does your code ever contain class hierarchies with a fixed set of classes?
no - one of the advantages of OO programming is that class hierarchies (should you feel the need to use them, which mostly i do not) can be expanded. or indeed contracted.
> Or variables where certain values have special case meaning?
very, very rarely (i would say never, in my own code) but of course we have the null pointer as a counter-example. to which i can say: don't use raw pointers.
The Visitor pattern, which is a must in certain niches, is exactly analogous to pattern matching over a sum type, and I am fairly certain that I can objectively say that the latter is a million times more readable and maintainable.
It doesn’t make OOP obsolete, though, at all. Certain other areas are better expressed as hierarchies, e.g. GUI nodes.
I can’t imagine not having the ability for a variable to be one of multiple potential types. I could probably work around it, sure, but why would you want to?
i can't imagine why you would want that. for example, why would i want something defined as an integer hold something other than an integer? this is how strongly-typed languages work.
You definitely want an integer to always be an integer, but you often want to represent "A or B". E.g. user has a contact method that's a phone number or an email address. Result of validation is a validated value or a failure message.
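A quick sketch of those two examples in Rust (names invented for illustration); the "A or B" is stated directly in the type:

    enum ContactMethod {
        Phone(String),
        Email(String),
    }

    enum Validated<T> {
        Valid(T),
        Invalid { message: String },
    }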
The C++ (or Java etc) way to do many of the kind of things people do with Rust pattern matching & sum types would be via OO subtyping polymorphism. e.g. classic visitor pattern, etc. Way more verbose and awkward, and scatters the logic all over the place.
Enums + switch is the other, and far less powerful.
I swear I am not paid to shill for Java or anything, but Java did get sum types recently, and they're quite good. They did it in a backwards-compatible manner via sealed classes that list all of their subtypes.
Switch expressions were also implemented, and they can exhaustively match on the given sum type. Pattern matching is quite limited as of now (only usable with records), but it is coming.
If you’re going to nitpick you better be right, and you’re not. - and I think he might be too humble to say anything but you might want to look into who you’re replying to and their body of work.
Rust has been around for well over a decade - the first “stable” release was 7 years ago.
It’s always been opensource - any random hacker could download it and start using it since it was available. I think you will find kind quite a few people on this site that took it for a spin when it was still under development.
I have nowhere near the credentials of BurntSushi, and even I was dabbling with Rust more than 7 years ago.
Rust was initially created in 2006. Mozilla started sponsoring Rust in 2009. Maybe you're talking about the first stable release, which happened in 2015? Rust has been around for longer than the first stable release though.
First spike at Google Trends for "Rust (Programming Language)" happened back in late 2013/early 2014 (https://trends.google.com/trends/explore?date=all&q=%2Fm%2F0...) so not at all impossible for someone to have ~10 years experience with it. Although that's probably pretty uncommon.
As a bonus, here is a very old (by internet time standards) HN submission titled "Mozilla releases version 0.1 of the Rust programming language (mail.mozilla.org)" - 236 points | Jan 23, 2012 | 82 comments - https://news.ycombinator.com/item?id=3501980
The person you're replying to, BurntSushi, is one of the most well-known Rust library authors and has been publishing open source Rust code since (at least) March of 2014, when the initial commits on the old regex crate were recorded. That's 9 years, just based off of a random quick Github search, which was hardly exhaustive.
I want to switch to Rust, to get away from the C/C++/Nim mix we rely on at work. But embedded development (specifically for the ESP32-S3 chip, but we also have to do a lot of STM32 work as well for our coprocessors) has sort of forced us down this path.
That said, the bulk of our code is Nim, and it is lovely to work in. I just wish we didn't have to rely on C/C++ libraries and toolchains as much. CMake will be the death of me, I swear, and it would be nice to have all the thought we have to put into lifetimes and safe bindings around unsafe C calls be encoded into the language itself, rather than as comments and so on.
For what it's worth, the entire esp32 family is officially supported in Rust by espressif themselves [1], and the stm32 family is probably one of the most widely used cortex-m families in embedded Rust as well[2].
Right, though Espressif's official support for anything should be taken somewhat with a grain of salt ;)
Having used their tools in anger at work for years at this point, there are a lot of gotchas, issues and bugs with their bindings, not all of which are Rust's fault; some are underlying problems with ESP-IDF itself.
That said, it's getting better over time, so I'm keeping a close eye on it (and we keep some experimental projects up to test it every couple of months).
The upside of Nim is that "it's just C" at the end, with all the upsides and downsides that comes with.
I'm very hopeful though!
Edit to add: We started this project quite a while ago, too, which forced our hands. Writing unsafe Rust is a pain, so I didn't want to take on the burden of managing bindings myself at the start. If we were to start again, maybe we'd have made a different choice, but Nim at least eased the pain quite a bit.
We are stuck on the S3, yeah. The C3 is lovely, but doesn’t have everything we require, at least currently. Might change in the future but not right now.
Ah I was writing a reply to someone else as you commented this :) We've got experimental projects with some ports of our firmware to it already, but there are lots of showstoppers that we've run into (some of which are ESP-IDF's fault more-so than esp-rs directly). In the medium-to-long-term, I'm really hopeful we can move to it. Not quite there yet for our production use cases though unfortunately.
If the lack of some services allows it (maybe not, given the chip, but it was enough for me), I recommend using esp-hal + esp-wifi rather than esp-idf-hal; it is a pure-Rust 'rewrite' and has been a great experience compared to esp-idf (C++). The default is bare-metal, but embassy resolves most of the need for an RTOS.
I'll check it out! While it's unlikely that we're going to rewrite this firmware anytime soon, being able to leverage Rust for future projects would be nice. The big thing for us is, we use basically every peripheral and driver that ESP-IDF has, along with some extra third-party ones for some other peripheral chips we run via our daughterboard setup -- and currently today, a lot of that isn't supported yet.
Definitely keeping an eye on it though. The future of Rust on the ESP32 chips is promising!
I don't know what the issue is but I've tried esp-wifi on the ESP32-C3 with the examples and nothing would run. It's a bummer since I'd have liked to leverage the Bluetooth support for my project.
Thanks for mentioning it. I already tried using it a few months ago and brushed it off as still being a WIP, but it still didn't work when I tried again just a few days ago. Asking in the Matrix chat when first trying it sadly only got me a "works for me" from the developers.
What Espressif is doing with esp-idf and porting it to Rust is promising, but overall it still needs work. Using the toolchain to develop on the ESP32 was at least slightly painful half a year ago before they introduced espup[1] (having to keep a patched LLVM around, etc.), and supposedly support for their Xtensa architecture is coming to LLVM soon[2], so this will improve in the future.
I'd also love to see Bluetooth support in esp-idf-svc[3], but they seem to be lacking people with the required knowledge to design and implement an abstraction for it[4].
That’s about my experience too, which is far too fiddly for us to rely on at work for production tasks. I basically keep checking back every six months or so: I’m hopeful that in a year, year and a half, I could replicate what we’ve done with our project on esp-rs entirely. Fingers crossed, anyway, but right now there are quite a few showstoppers.
I appreciate the inclusion of this section. Different people writing C++ can have wildly different experiences, based on the type of things they are working on. This section offers a good reminder about this, and that one person's perspective is not more or less correct than anyone else's.
The most fun thing about reading this comparison is that, for an experienced C++ programmer who is used to managing memory by hand (and fixing segfaults), Rust's memory model doesn't seem that hard, because it makes sense, and he understands why the checks are important.
Rust's memory model is even more familiar to C++ programmers who don't manage memory by hand because they have the same concepts of borrowing and lifetimes.
I was just trying to find a modern C++ tutorial, but there's nothing comparable to the Rust Book on the internet.
I think even now the best resource to learn C++ is to first learn C, then original C++, then all the modern memory management techniques, which is just crazy hard for a new programmer compared to just going through the Rust book 10 times (which is needed to get a deep understanding).
Professional C++ is the 1072-page book that taught me modern C++, for what it's worth. I have the physical copy. And even then I have gripes with some of its contents! The Rust book is second to none, tbh, and I say this as someone who doesn't really write Rust.
As someone who wrote C++ for 20 years, just don't. It's a horrible language(s). None of the codebases look the same, everyone uses different features or different versions of the language. Building is a nightmare, it takes forever to do anything. Run!
I know this is an unpopular opinion but if you want to stick with C++, the solution to this, at least in my experience, is to stop doing things that let you shoot your foot off. C++ gives you every tool in the tool chest, and most of them are not safe. Stick to a safe subset and you've solved 90% of all those stereotypically C++ problems.
I've got a pretty big C++ codebase for my hobby projects, sanded down, polished and perfected over the years without the usual corporate pressure to ship. The few times I run into memory corruption, a memory leak, a segfault, dereferencing a shit pointer, undefined behavior, and so on, it's always, always because I'm doing something I shouldn't be doing. Like working with raw pointers or pointers to pointers to pointers, or traversing an array of bytes to do something there's already a library for, or manually calling delete on something, or using reinterpret_cast<>, or using one of the many footguns C++ happily gives me. The simple key is to just stop doing these unnecessary things.
Yes, but "just don't do that" doesn't scale to large teams with varying levels of experience and discipline. It sounds like you have rules that even you disobey on this project.
Everything you said is true. C++ gives you everything you need to be safe. But that's not the point.
Rust makes it HARD (or very hard) to do things that are unsafe.
It's not what C++ can or cannot do, it's not what Rust can or cannot do. It's what the Rust language and compiler opinionatedly encourage you to do and not to do. This is what's lacking in C++.
I think many people, me included, are complaining that it makes things hard, period. Not in the sense of “rocket science”, as this argument is sometimes smugly dismissed, but in the sense of extra friction which slows things down and decreases the pleasure of coding.
> The few times I run into memory corruption, a memory leak, a segfault, dereferencing a shit pointer, undefined behavior, and so on, its always, always because I'm doing something I shouldn't be doing. Like working with raw pointers or pointers to pointers to pointers, or traversing an array of bytes to do something there's already a library that does, or manually calling delete on something, or using reinterpret_cast<>, or using one of the many footguns C++ happily gives me.
You've never used an out-of-bounds index? Accidentally used an object that was on the stack beyond the function call? Let an integer overflow? The problem with C++ is that all of these things don't look like unsafe operations, and people end up making mistakes with them that are not obvious.
> You've never used an out-of-bounds index? Accidentally used an object that was on the stack beyond the function call? Let an integer overflow?
Can't speak for the OP, but I never see any of these bugs in my C++ code. You can write code with these bugs but that is more of a style choice, practically speaking. The C++ bugs I see are almost always in complex state logic or in rarer cases an unexpected/unhandled error case from a call to outside code, which can happen in every language.
When using GCC for example, there’s a macro one can define which adds bounds checking to all containers.
Integer overflow can be turned into defined behavior through a compiler switch.
Holding pointers to stack variables past the return is not as easy to do inadvertently as an off-by-one. Someone has to store the address in a variable, and this kind of code should raise alarm bells.
A more indirect way is to pass the address of a local variable to a function, which is typical. But then, if a function takes arbitrary pointers and stores them, that should also raise some alarm bells.
The most recent example I debugged for a colleague was a container of callbacks, and one of them ended up changing the container while it was being iterated over. Nothing crazy, just no easy way that we were aware of to catch that kind of thing.
Yes, of course. If you use C++, try to not shoot your foot off. The same advice also applies to Rust or any other programming language. You should still try to follow best practices and not mess up, even if a language makes it less likely to mess up in certain ways.
The question is how often you mess up in each language and how bad those mess-ups are. This depends a lot on the programmer(s) and the kind of project. My personal view is that the space of team/project combinations where you should ever start a new C++ project is now confined to "team desperately wants C++".
When I started moving from C++ to Rust, I found the easiest initial way to make sense of the borrow checker was to basically imagine that almost every argument passed in a function call was wrapped in a std::move.
fair enough, but also the semantics of a Rust 'move' aren't identical to C++ std::move anyway: when you pass a borrow rather than a move, you can "get it back" after the function call.
Makes me wonder out loud if the C++ community isn't going to add some kind of std::borrow/std::return_borrowed to the language. If that's even feasible with the type system and reference system as it is today.
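For readers following along, a tiny sketch of the difference (function names invented): a borrowed argument comes back after the call, a moved one does not:

    fn borrows(s: &String) -> usize {
        s.len()
    }

    fn moves(s: String) -> usize {
        s.len()
    }

    fn main() {
        let s = String::from("hi");
        borrows(&s); // lend it out...
        println!("{}", s); // ...and get it back afterwards
        moves(s); // ownership transferred, like an enforced std::move
        // println!("{}", s); // compile error: use of moved value
    }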
I suppose it comes down to your definition of "by hand," but creating custom allocators for your application and arranging your memory layout carefully can still absolutely be a worthwhile endeavor.
Especially when performance matters (eg: high frequency trading systems, video games, constrained embedded systems, and more), the improved locality that you can get from custom memory management can be worth it all on its own.
For a good talk on this subject, see Lakos' "Local Memory Allocators" presentation:
Sure, but those are still carefully controlled places, and the majority of the code isn't doing it by hand. Not like some old code I've seen, where ownership of memory could be transferred along many different code paths.
Rust’s memory management is not hard, it’s just tedious. If programming in the future will mean annotating the crap out of everything then garbage-collected languages start to become attractive - see golang competing with Rust in unexpected areas.
Perhaps the only hard part is storing and passing references everywhere, which may mean that one has to act like an automaton and patiently type out lifetimes for the Rust compiler. Unfortunately the Rust programming community has settled on exactly this kind of reference usage.
Well, a huge swath of the developer community did realize it like 20 years ago — the vast majority of applications can absolutely get away with a GC, and that’s the correct choice from the perspective of developer productivity and safety.
Rust is a huge win for the small niches where the GC overhead is unacceptable, because it is completely novel in its memory safety, which is absolutely a must and was neglected for way too long. But I never really understood the desire to use it for CRUD apps and the like. Sure, to each their own, but it is arguably just a bad choice. Even OSs could be written in managed languages - it's not like it hasn't been done before, and they can just as well have escape hatches like Rust's unsafe (hell, they are likely even safer if those are only used to manipulate the "external heap").
All true, but I suspect the end goal of Rust's type system wasn't to eliminate GC. It was "fearless concurrency". You can see this in languages that encourage multithreading and have a GC: in Go, memory allocation is easy because of the GC, but even so, naively put working single-threaded Go code in a multithreaded environment and it will likely segfault. GC doesn't solve the concurrency problem.
I suspect it was just a happy circumstance that a type system strong enough to make memory access checkable at compile time is also strong enough to eliminate the overhead of GC. So they did that too.
But the claim that all this tediousness is there to eliminate GC sort of misses the point - that's not the reason they introduced it. Rust's complex type system is a type of formal proof system that eliminates a lot of bugs at compile time. The "sum types" discussion above talks about another class of bugs it eliminates. Eliminating bugs at compile time is the point - not making memory allocation easier.
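A minimal sketch of what that compile-time checking buys you for concurrency, using only the standard library:

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

fn main() {
    // Rc's reference count is not atomic, so Rc is not Send.
    let local = Rc::new(vec![1, 2, 3]);
    // The data race is rejected at compile time, not discovered in prod:
    // thread::spawn(move || println!("{:?}", local));
    // ^ error[E0277]: `Rc<Vec<i32>>` cannot be sent between threads safely

    // Swap in the atomically counted Arc and it compiles.
    let shared = Arc::new(vec![1, 2, 3]);
    let handle = thread::spawn(move || println!("{shared:?}"));
    handle.join().unwrap();
    println!("{local:?}");
}
```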
Well, Rust doesn't solve concurrency issues either; it can only prevent data races, which are only a tiny subset. There are other kinds of race conditions as well, plus deadlocks, livelocks, etc., which are generally unsolvable.
And to be honest, Rust doesn't have all that strong a type system compared to Haskell/ML, which introduced these concepts in the first place. Also, there are plenty of languages in the category of "managed, with a strong type system".
Background: C++ since around 1998, added Python around 2008, Rust around 2016.
While I agree with everything in the article, sometimes I wish I could have it both ways with:
> First of all, generics without duck typing are greatly appreciated. Traits clearly indicate the contract struct or function expects from the type, which is great. This also helps compiler to generate helpful error messages. Instead of “invalid reference to method clone() on line Y” you get “type X does not implement Clone” - clean and informative.
I will say that there are times when I wish I could sneak in some duck-typed traits, particularly `Debug`. Sometimes you are working in generic or dynamic-dispatch code and just want to use `dbg!` but can't, because (1) you didn't think ahead or were too lazy to annotate things with `Debug`, (2) you are a bit paranoid about excluding a type that doesn't implement `Debug`, or (3) you are doing dynamic dispatch and `FooTrait + Debug` only works for generics.
`Debug` is also just one example. Overall, I love the explicitness but at times I do wish I could fudge things a little.
I strongly recommend just writing a #[derive] line for every user-defined type you make. Start off with Debug in there, and you'll often find you already know some other traits that can be trivially derived for the type: Clone, Default, stuff like that.
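Something like this - one line per type, shown here on a hypothetical struct:

```rust
// One derive line up front; Debug alone makes dbg!/{:?} work everywhere.
#[derive(Debug, Clone, Default, PartialEq)]
struct Config {
    retries: u32,
    verbose: bool,
}

fn main() {
    let cfg = Config::default();
    dbg!(&cfg); // works because Debug is derived
    assert_eq!(cfg.clone(), cfg); // Clone and PartialEq come along for free
}
```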
I didn't know Vatic Labs was a Rust shop! (I joke, but they are a finance firm that is notoriously paranoid about you announcing where you work).
I'm glad to hear about your experience with the language as a developer. It's also been my experience that Rust has the magical power where your compile errors are almost always logic bugs. That's pretty cool from a psychological perspective, because you really feel like your C++ compiler is your enemy (or worse, your lawyer) while the Rust compiler is your friend.
The C++ developer experience is absolutely horrendous (with far too much of the unholiness that is cmake), and I hope that shedding some light on the situation can help people understand how to make it better.
To me, one of the best things in Rust compared to C++ is that I no longer have to be super careful all the time to avoid common pitfalls. I just write stuff and fix compilation warnings/errors until the compiler is happy, and things mostly just work (apart from domain-specific bugs, of course).
After getting used to the basics of Rust, I found the cognitive load to become much lower compared to C++. There's just no way I could write correct C++ code in such a careless fashion.
So now I can spend brain cycles on things that actually matter, and it's great.
The author mentioned that Rust errors sometimes force you to restructure your code to satisfy the borrow checker. I’m curious whether anyone has some real-world before-and-after examples of this kind of change.
The most common example is when you accidentally start doing object orientation and try to get a value to hold a reference to another value, when there's no point in doing so from an ownership perspective, just so that it can be part of `self` in the method. The more you try to preserve the model, the wackier the errors get, until one of them is literally unsolvable and you have to junk the object-oriented design entirely. This happens a maximum of twice, because once you've learned the lesson your reflexes change, but it happens to just about everyone who came from an object-oriented language.
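A minimal sketch of the pattern, with made-up types (the commented-out constructor is the part the borrow checker rejects):

```rust
struct Engine {
    horsepower: u32,
}

// The OO reflex is to have a value hold a reference to a sibling value
// so it can be reached through `self`. The lifetime then infects every
// containing type, and a constructor like this cannot be written at all:
//
// fn make_car<'a>() -> Car<'a> {
//     let engine = Engine { horsepower: 300 };
//     Car { engine: &engine } // error[E0515]: returns a reference to a local
// }

// The fix that ends the fight: let the struct own its parts outright.
struct Car {
    engine: Engine,
}

fn main() {
    let car = Car { engine: Engine { horsepower: 300 } };
    println!("{} hp", car.engine.horsepower);
}
```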
I come from OOP languages and mostly do Rust nowadays.
> you have to junk the object-oriented design entirely
I think it's not this black-and-white. OOP at a higher level is about encapsulation and message passing, and such a design is perfectly doable in Rust. I'd say it's even more natural in Rust due to its `impl` concept, where behaviour and data are coupled in a way that's different from most OOP languages.
Who owns which data, and who can operate on it, is something that should be thought about just as carefully in OOP. Java, Ruby, et al. make it easy to make a mess of this, yet that doesn't make all OO design a mess. That really is "bad use of OOP", and no reason to "junk the OO design entirely". At most it's "junk the bad OO design".
I was working on a card game simulator and I had a Vec of players. I needed to pull two players from that Vec so the first player could give a card to the second. In my head I would grab both players via get_mut and then perform my operation. However, get_mut borrows the Vec mutably, and the compiler complained that I had borrowed the Vec mutably two times.
It took me a bit to understand why the compiler complained, but then it clicked: It couldn't prove that get_mut wasn't returning the same item both times.
There were a few solutions. One was to borrow the first player, take a card, drop the &mut and then take the second player. At some point in the future I could use https://github.com/rust-lang/rust/issues/104642 to get_many_mut. I ended up with a pretty inefficient version of get_many_mut that fully traversed my iterator to get the two mut references (which works because traversing the iterator guarantees you won't see the same element twice) and it was fine for a collection of a half dozen players.
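For anyone curious, the first solution looks roughly like this, with a stand-in Player type (the real code obviously differs):

```rust
#[derive(Debug)]
struct Player {
    hand: Vec<u32>,
}

fn give_card(players: &mut [Player], from: usize, to: usize) {
    // The first mutable borrow starts and ends on this line...
    let card = players[from].hand.pop();
    // ...so the second mutable borrow below never overlaps with it.
    if let Some(card) = card {
        players[to].hand.push(card);
    }
}

fn main() {
    let mut players = vec![
        Player { hand: vec![1, 2, 3] },
        Player { hand: vec![4, 5] },
    ];
    give_card(&mut players, 0, 1);
    println!("{players:?}");
}
```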
Basically, he moves from his homegrown pointer chasing, which looked a lot like a doubly linked list, to just using a hashmap and putting keys in his structs - as a naive first approach to satisfy the borrow checker, thinking he would sacrifice performance but at least the compiler would be happy.
Spoiler: it got faster. Far faster. To his surprise.
Nice to see some level-headed anecdotes on the experiences of actually _using_ each language, as opposed to the usual "z0mg rustc fixez all yur mem0ry bUgz!!111one"
And, as someone who's spelunking in the depths of a large CMake project right now, cargo sounds pretty nice.
Cargo is definitely a more pleasant build system than anything the C/C++ world has, until you fall off the happy path.
One of the pain points of trying to design a build system for C/C++ is that a lot of projects do weird stuff that needs to be supported in build systems, which means you end up needing escape hatches to do that weird stuff all over the place. Cargo takes a narrower view, which means it doesn't attempt to do things like package things for install or handle multi-stage builds. It also has the advantage that things like running tests or package dependencies were built in from the start, so it doesn't have to try to support a million different tools that provide that functionality.
Cargo is really nice! First party integration of developer tools is a big step forward and it’s nice to see it become more widespread (other examples include golang and Terraform)
Bryan Cantrill has a number of talks about Rust which are similarly pragmatic
I think the issue with CMake is exactly the issue with C++.
There's a 'modern' way to do things, and a 'legacy' way to do things, and the respective designers of CMake and C++ have decided that it was—for several reasons—better to leave the 'legacy' stuff in the languages, rather than pull a Python and make a clean split.
In 'modern' CMake, there are targets, and properties on said targets, i.e. compile and link options, C/C++ versions, libraries, headers, other custom dependencies, etc.
There are also functions and generator expressions[1] to make control flow a little easier. On top of these, CMake's built-in `find_package` and `FetchContent` make package management a lot easier than it used to be. Want Boost? Just do `find_package(Boost)`, then `target_link_libraries(<my_target> Boost::boost)`. It gets easier still with a proper C++ dependency manager like vcpkg or Conan.
In legacy CMake, all of these were set with global variables, and there was no unified way to handle packages. Going forward, I fully expect at least a plurality of C++ developers to coalesce on a CMake + vcpkg (which has more packages than Conan) workflow.
Bash cannot handle figuring out all my dependencies and how to run them across however many CPUs I have. Nor can it handle all the different builds I run: cross-compile, static analysis, address sanitizer, and so on.
> How these commands change if I want to build the project with sanitizers? What do you mean the sanitizers are not supported by the build process? Why the build script suddenly started printing linker errors?
Doesn't Rust require you to switch to nightly to use sanitizers?
I read the article only to find no meaningful notes in it. I did not expect a versus piece, so I agree with the writer that comparing on a versus basis is a weak approach. Still, I'm someone who has actively maintained C++ experience since 1996, been through Borland, Watcom, Visual C, the old standards, C++11, and C++14, who tries to follow the isocpp.org standards discussions to understand where C++ is heading, and who still uses clang, cl, and gcc regularly. And... I have zero experience in Rust; I have only heard from many seasoned C++ developers that the platform comes with a significant value proposition.
I haven't had a chance to try it yet, nor do I currently have time to experiment, so I was hoping the article would help me understand the differences. The only thing I remembered from it is that compiler error messages differ between the two and are more clearly worded in Rust. Plus the classical anti-C++ example: memory management. C# and Java already addressed that, so I expect Rust has something more unique than that. Can somebody point to a better article that shows some practical reasons to switch to Rust if you have significant C++ experience?
Not that I want to bash the author - it was a nice try at explaining something - but you know, fixing compiler errors in C++ was never a matter of randomly staring at the code in my case.
An important distinction is that rust doesn’t use exceptions for recoverable errors. You don’t have try/catch like you do in C++. Instead you use the Result type to propagate errors. This has the advantage of avoiding many of the downsides of exceptions (like leaving your program in a bad state) while making error handling more explicit.
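A minimal sketch of what that propagation looks like, assuming a made-up config file name:

```rust
use std::error::Error;
use std::fs;

// `?` returns early with the error instead of unwinding the stack:
// both failure paths are visible in the signature and the body.
fn parse_port(path: &str) -> Result<u16, Box<dyn Error>> {
    let text = fs::read_to_string(path)?; // io::Error propagates here
    let port = text.trim().parse::<u16>()?; // ParseIntError propagates here
    Ok(port)
}

fn main() {
    match parse_port("port.txt") {
        Ok(p) => println!("port = {p}"),
        Err(e) => eprintln!("failed to read port: {e}"),
    }
}
```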
This doesn't mean you don't still have to be panic-safe, though. This is mainly important for unsafe code, because it has to preserve memory safety even in the presence of panics. But it can also be important for safe code in an application that recovers from panics.
You are correct that there are circumstances where you have to consider panic safety. But those are rare. Most code is not unsafe, and catching panics is usually bad practice in Rust (though I am sure there are circumstances where it is needed). In C++ by contrast you have to think about panic safety almost everywhere. Rust’s error handling is a big improvement in that regard.
Rust programmers misuse unwrap and the like all the time, to the point that I’ve seen some projects explicitly state that they don’t do that, since nobody wants their library to crash their program because someone was too lazy to propagate errors.
Which brings me to the point: propagating errors is tedious enough that people are looking for shortcuts, which then cause other problems.
For other folks following along and possibly getting mislead by this (partial) nonsense, I'd encourage you to read a blog post I wrote about the topic of unwrap. I think it should clear up most things: https://blog.burntsushi.net/unwrap/
At a meta level, one of the reasons why there is so much focus on 'unwrap()' in particular is because it is often the precise point at which in your code where a runtime invariant is broken and thus leads to a panic. Nearly all code has runtime invariants in one form or another. The question is what happens when they're broken. In languages that are memory unsafe by default, the answer is often (but not always) "undefined behavior." In languages like Rust, or Python, or Go, the answer is often (but not always) "the process quits." The reality is more complicated than that, but those are a fine first approximation. For example, breaking a runtime invariant doesn't have to lead to undefined behavior or process termination. It can simply result in a logic error that leads to unexpected behavior.
Of course, making the issue more complicated is that sometimes 'unwrap()' is abused. And indeed, sometimes it is used in cases where an error ought to be returned. I find this to be generally pretty rare in popular libraries. But the key point here is that you can't just say, "oh I see unwrap() in a library, so now I'm going to scream ABUSE!!!!" It's more complicated than that.
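To make that concrete, here's a small sketch (the function is hypothetical) where the panic point documents a broken runtime invariant rather than dodging an error:

```rust
// Invariant: callers guarantee the slice is non-empty. If that is ever
// violated, a panic right here is the correct outcome, not laziness:
// the bug is in the caller, and there is no sensible error to return.
fn largest(xs: &[i32]) -> i32 {
    *xs.iter()
        .max()
        .expect("largest() requires a non-empty slice")
}

fn main() {
    println!("{}", largest(&[3, 1, 4]));
}
```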
How is it nonsense if you admit at the end (albeit after a straw-man of me allegedly screaming “abuse”) that it happens?
Result is being sold on HN as a great solution to error handling, while there are several crates needed to polish its rough edges - rough edges which have led to e.g. unwrap abuse in the past.
It is tedious to shuffle Results around, but this is never part of the sales pitch. And for those of us who don't want to depend on all sorts of crates for the simplest things, that's the reality.
Re your reply “Maybe improve your reading comprehension. I said "(partial) nonsense." Therefore, some of what you said wasn't nonsense. But some is. What a fucking revelation.”
Partial nonsense isn't nonsense by definition, I believe, but I won't dwell on that point. I am surprised to see a Rustacean being (mildly) offensive and even saying the f-word. I must say that I am proud of you; too many nowadays act like polite, pedantic robots. :-)
I didn't say "nonsense." I said "(partial) nonsense." If we use our powers of logical deduction, it therefore follows that I wasn't claiming that everything you said was nonsense. Hence the reason I added nuance to my comment, unlike yours. I'm trying to tease apart the convoluted mess you're making. You're aware of Brandolini's law, right? This is a microcosm of that. Some of what you're saying is bullshit, but not quite all of it, and aspects of it are rooted in truth and reasonable experiences. (For example, I also do not "depend on all sorts of crates for the simplest things.")
My tactic is to present nuance. You come back to me and, instead of actually engaging with the nuance, start whinging about what's being "sold on HN." Yawn.
> It is tedious to shuffle around results, but this is never part of the sales pitch.
This is a good example. Your sentence starts with something that I wouldn't agree with as a general rule, but I could certainly see it being true in specific circumstances. And even then, I'd want to explore those circumstances. That is, is it really a property of `Result` that makes it tedious, or is the problem of error handling in that context itself? It could be either.
But then you follow it up with whinging about "sales pitch." Are you talking about some kind of sales pitch put out by the Rust project? Because if so, please link it to me. Or are you talking about a bunch of random HN commentators that can be overzealous about just literally anything? I don't see anyone saying Result is the best thing since sliced bread. What I do see are people describing positive experiences with it, especially in relationship to alternative error handling paradigms. Is that really a sales pitch?
I mean, look at my top-level comment in this entire post. I sang the praise of sum types, and I did it by echoing what the OP said. Is that a sales pitch? Am I saying that sum types are the "best" at something? Am I saying that they have literally no costs at all? Am I saying that useful programs can't be written without sum types? Am I saying that all languages should be designed to include sum types?
No. No. No. No and no. Did some other people go a touch too far and make broad pronouncements about "correct" language design? Oh yeah, absolutely. Are those people some kind of singular phenomenon unique to Rust? Fuck no. And it is absolutely fucking baffling that I need to explain that to someone who has been on HN for as long as you have. This observation is so banal that it's blub.
The Rust community has always had a deep interest in selling Rust, and did so through blog posts, comments, and projects. I've seen this with many other languages on HN, but the zeal with which Rust is constantly shoved in everyone's face is a tad more irritating than what I remember of the seemingly grassroots interest in Ruby, Haskell, or Objective-C.
When I read your top-level comment I don’t read a story from a random developer, I read an endorsement from a prominent member of the Rust community trying to paint Rust in a positive light, but conveniently omitting any negative aspects. This happens way too often to be a coincidence: the polite Rust developer educating others about the benefits of Rust may as well be a recurring character in this series.
Yet as companies actually start to use Rust and hit various problems, these are not given the same amount of attention. Obviously you don’t care about that, but this kind of submarine advertisement is a pet peeve of mine, so don’t be surprised if I continue to comment.
So what I'm getting from your comment is that it's not possible to say something positive about a project like Rust unless all commensurate trade offs are accounted for. Otherwise, the comment is a "submarine advertisement"? I didn't pipe into a conversation about non-Rust. The OP is about Rust. The OP mentioned sum types. I commented endorsing what OP said and to call extra attention to it, because sum types (with pattern matching and exhaustiveness checking) are amazingly useful. I also legitimately do not believe they have many downsides, if any at all. They might have downsides within the context of a particular language design (for example, Go, where their interaction with default values and interfaces would potentially be quite weird). But in general, no, sum types are pretty close to an unmitigated good thing in my view.
Does Rust writ large have downsides? Oh absolutely! So unless you're telling me I need to exhaustively enumerate every downside of Rust every time I mention something positive, then I don't know what you're getting on about.
> so don’t be surprised if I continue to comment.
That's not surprising? You've been posting low quality commentary in Rust threads for literal years. If there are people sick of the "advertising" for Rust, then there are also people who are sick of the people whinging about it. What would be surprising is if you started posting well informed productive comments in Rust topics.
Somehow I must've been lucky enough to avoid this unwrap abuse in crates I use.
I think you may have meant that converting between errors in Rust can sometimes be tedious, but propagation is pretty easy.
The conversion is something that generally needs to happen rarely (like a few times per project?), and `thiserror` and `#[from]` make it pretty ergonomic.
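A sketch of that, assuming `thiserror` is in the dependencies: the `#[from]` attribute generates the `From` impls, so `?` does the conversions silently.

```rust
use thiserror::Error;

#[derive(Debug, Error)]
enum AppError {
    #[error("I/O failure: {0}")]
    Io(#[from] std::io::Error),
    #[error("bad number: {0}")]
    Parse(#[from] std::num::ParseIntError),
}

// `?` converts io::Error and ParseIntError into AppError automatically.
fn read_count(path: &str) -> Result<u64, AppError> {
    let text = std::fs::read_to_string(path)?;
    Ok(text.trim().parse()?)
}

fn main() {
    match read_count("count.txt") {
        Ok(n) => println!("count = {n}"),
        Err(e) => eprintln!("{e}"),
    }
}
```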
A lot of these Rust complaints are just people not reading the docs (not trying to say Rust doesn't have legitimate usage friction; of course it does).
Pro tip: most popular crates have excellent documentation (I know it's a shocker coming from other languages). So, check stuff before you use it.
I assume this is because the ecosystem lowers the barrier to entry for writing and generating documentation (compared to other ecosystems).
Second, I never claimed it was an issue or a bad thing at any point. Someone asked for examples of libraries that panic, and I gave an example.
I must be missing something because it makes no sense to me that people seem to be responding defensively and claiming that I either don’t understand the Rust docs or am pointing out problems in Rust.
A panic is an unrecoverable error caused by violating runtime invariants. A bug exists in the program and there is no (correct) way for the program to continue execution. For example, an array was accessed at index "-1".
Exceptions are a particular technique for dealing with errors in general, one that can be used for both unrecoverable and recoverable errors. They infamously trade programmer convenience for hard-to-understand bugs, fragile runtime behavior, and unmaintainable code.
Although the code generated might be similar, it's not a "distinction without a difference" because the language doesn't provide the same tools for interacting with panics as say C++ does with exceptions. There is no "try catch" for panics, you either catch them at the thread boundary or catch them with catch_unwind. Everything in the std lib is built with Result for errors that can be handled, not panic.
Programming is hard, but that doesn't mean that certain language design choices don't have meaningful impact on what type of code becomes easier/harder to write.
Other people are saying similar things, and I ask you listen to what they're saying, rather than talking about people being "traumatized" by a programming construct.
Panics and exceptions are not the same thing. For one, it's not possible to recover from a panic. You can handle a panic from within a handler before the program terminates, but you cannot resume execution as you can with exceptions.
You can recover from a panic though? I don't think you're required to call resume_unwind after catch_unwind. The underlying machinery is the same as C++ exceptions, so there's no reason that wouldn't work.
Obviously this is not something you're encouraged to do often, but there's applications where it might be reasonable.
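For example, a minimal sketch of recovering (this assumes the default panic=unwind; under panic=abort there is nothing to catch):

```rust
use std::panic;

fn main() {
    // catch_unwind returns Err if the closure panicked.
    let result = panic::catch_unwind(|| {
        let v = vec![1, 2, 3];
        v[99] // out-of-bounds index: panics
    });

    match result {
        Ok(x) => println!("got {x}"),
        Err(_) => println!("recovered from the panic"),
    }
    println!("still running"); // execution resumed after the panic
}
```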
> In C++, you can comfortably measure the size of error messages in kilobytes. Infinite scroll in the terminal emulator is an absolute must because oh boy does the compiler like printing text.
And you usually want the first line, because that's where the line number you actually want to go to is. Perhaps removing repetition and not printing lines of stdlib headers would help.
> I feel more at piece shipping Rust code that C++ code.
More a piece? More in pieces? :D
I too have had lots of crashes and UB in my life, but more because of pile-of-garbage architecture and interoperability than because of the language. (Un)surprisingly, Rust emerged after people had learned their mistakes in C/C++, so it gets a lot of attention and high expectations.
Err... C++20 added concepts but not definition checking, so templates are still duck typed. That is, it is still possible for a substitution failure to occur after performing a concept check. Definition checking would eliminate such failures.
In every language that I can think of that supports generic programming (including Go), there's support for definition checking.
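In Rust terms, that means a generic body is checked against its declared bounds at definition time, not at instantiation time:

```rust
use std::fmt::Debug;

// This body is verified when it's defined: remove the `Clone` bound
// and `x.clone()` becomes a compile error immediately, long before any
// caller instantiates the function with a concrete type.
fn describe<T: Debug + Clone>(x: &T) -> T {
    println!("{x:?}");
    x.clone()
}

fn main() {
    let n = describe(&42);
    println!("{n:?}");
}
```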
Only C++20 has concepts, which is a lot newer than Rust is. I've yet to ever see a C++20 concept in the wild or anyone using one outside of an example.
Heck, if you ask the average C++ programmer about concepts they'll likely have not even heard of them.
> C++ has a very long history (over 50 years) and to say he's some expert matter on the subject after a mere 4 years of "professional experience" is quite baffling to me.
It's understandable that you're baffled, especially given that the OP didn't claim to be an expert. The OP pretty clearly did not claim any authority, and instead went out of their way to clearly disclose not just the number of years of experience they have, but what they worked on.
> Waste of time
It's worse than a waste of time. Comments like yours make HN a worse place. Extrapolating some minor English writting buugs intto an asessment on how good they're code is? Man, what an absolute stinky pile of bullshit.
He didn't claim expertise. It is actually more interesting to me to hear someone's opinion with a medium level of expertise in the language. We've had plenty of experts and beginners already weigh in.
The article specifically mentions not holding said candle as a positive. I would too. They're confusing at best and undebuggable at worst, and virtually everything you can do with them that you can't do with generics is something it'd be better for everyone if you didn't do.
Oh, here we go again. "Everything Rust can't do is an antipattern" is not a valid argument. Tuples, for instance, are a library feature in C++ but have to be a language feature in Rust. There are plenty of other useful things that C++ can do and Rust cannot.
So, in response to 'templates are generics + things you shouldn't do', your counterargument is a non sequitur about tuples, a video titled 'Code You Should Learn From & Never Write', and a video about introductory templates which doesn't actually contain any counterexamples?
I disagree, Rust is already being used for tons of real-world projects. It’s quite rare that you miss the flexibility of templates (and can’t accomplish a similar thing using macros or generated code).
1. How much psychological trauma it brings to the average developer that most C/C++ libraries today don't ship a static library by default.
2. How much pain it causes when the build system just cannot copy the whole damn library into somewhere called lib and include, and instead spills files into hundreds of random locations, even though we know it would build just fine if we copied it into one place.
3. How challenging returning a struct with an integer status in it is for average programmers.
All of these must be true, because it seems that is the reality.
[1]: https://github.com/BurntSushi/go-sumtype