> There was this vague concern about splitting the ecosystem. The concern is understandable, coming from an ecosystem that has been controlled by a central implementation for a decade. I would file it under growing pains. Rust can't, when it grows up, always be a single-implementation language.
This is why formal specifications matter.
The Rust community says languages like C and C++ didn't have specifications for a while. That's true, but it's a flimsy argument because the development velocity of both C and C++ prior to formalization was a fraction of Rust's current velocity. C and C++ also had fewer features pre-specification than Rust has pre-specification. Rust is also continuing to add features.
Without a specification how do we know which Rust implementation is correct and which has bugs? It's easy to point to the reference implementation in 2022, but that could change in five or ten years. What happens if drama from within the reference project causes a fork? People like to imagine forks as easy to differentiate, but it's never that simple. It's almost always very nuanced, gray, and muddy with very good arguments on both sides. Which implementation is the correct Rust in that case? None of this is clear or obvious.
C++ doesn't have a formal specification either. What C++ has is a document that says in informal prose how the compiler is supposed to behave. Rust in fact has such a document too: the reference manual [1]. It even uses spec language like EBNF to describe grammar productions.
You can quibble about whether Rust's reference manual is more or less detailed and useful than that of C++. I'd readily acknowledge that the C++ specification has more detail than the Rust reference, right now. But this is a difference of degree, not of kind.
Honestly, I've long been in favor of just renaming the Rust "reference manual" to the Rust "specification" and calling it a day, so we can stop arguing about whether Rust has a spec and do the more useful work of improving the document that we have. Other languages like Go call their equivalent document the "spec", even though there's no real difference in detail between the Go "spec" and the Rust "reference manual", and nobody criticizes Go for that naming decision.
> Without a specification how do we know which Rust implementation is correct and which has bugs?
The gcc-rs project answers this question explicitly in their FAQ, which I've linked downthread:
> If gccrs interprets a program differently from rustc, this is considered a bug.
They also go on to say:
> Once Rust-GCC can compile and verify all Rust programs, this can also help figure out any inconsistencies in the specification of features in the language. This should help to get features right in both compilers before they are stabilized.
If you're looking for a specification, additional implementations help that effort, not hurt it.
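To make the "any difference from rustc is a bug" rule concrete, here is a minimal sketch of the idea in Rust. The two implementations are stood in for by plain closures, and `first_divergence` is a hypothetical helper name, not anything from gccrs or rustc. It only illustrates using one implementation as the oracle for another.

```rust
/// Returns the first input on which `candidate` disagrees with `reference`.
/// `reference` plays the role of the reference implementation (the oracle);
/// `candidate` plays the role of the alternative implementation.
/// Both are hypothetical stand-ins for illustration only.
fn first_divergence<'a, I, O: PartialEq>(
    inputs: &'a [I],
    reference: impl Fn(&I) -> O,
    candidate: impl Fn(&I) -> O,
) -> Option<&'a I> {
    inputs
        .iter()
        .find(|input| reference(*input) != candidate(*input))
}
```

Every input this returns is, by the FAQ's rule, a bug in the candidate, or, more rarely, a place where the reference behaviour turns out not to be what was intended.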
Note that "formal specification" generally means "a specification written in a formal language", very often Hoare logic.
Here's a paper that presents one possible way to formalize the building blocks for control flow (if and goto basically) in a very, very, very, very simple language: https://www.cse.psu.edu/~gxt29/papers/controllogic.pdf ; the core is Fig. 3.
Let's be honest: maybe 0.00001% of people who program can understand this properly without spending 100 hours learning the associated formal baggage. Doing something like this at the scale of "real" languages is unrealistic and useless: how many contributors would GCC or LLVM have if the requirement was to read Hoare logic fluently in order to transpose e.g. the C / C++ formal spec into code?
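For a sense of the notation, here is the textbook Hoare-logic rule for a conditional. This is the standard rule from the literature, not a copy of Fig. 3 in the linked paper. A triple {P} S {Q} reads "if P holds before S runs, then Q holds afterwards".

```latex
\[
\frac{\{P \land b\}\ S_1\ \{Q\} \qquad \{P \land \lnot b\}\ S_2\ \{Q\}}
     {\{P\}\ \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2\ \{Q\}}
\]
```

And that is one rule for one construct in a toy language with structured control flow; a real language needs this for every feature, plus all the interactions between them.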
The most generous response I could give to that is "Maybe".
You mentioned C and C++ several times, but for C++ what actually happened is that they just shipped a half dozen distinct languages with the name C++ in different years. C++ 98 and C++ 20 are similar languages, but only in the same way that the 1998 Ford Fiesta and the 2020 Ford Fiesta are similar cars. They're occupying the same niche, some of the fittings are familiar, others are not. Many of the parts are different.
Rust has no plans to do that. If you have some code that went on the shelf in 2015 for Rust 1.0 and blow the dust off it, it compiles with a brand new Rust compiler today in 2022, and works just fine alongside brand new code written today. Actually almost all of it could still be pasted into new code, although some of it would look a bit unnecessary and clunky to a new Rust programmer. "Grandad," they might ask, "why are you specifying the type here when it would obviously be inferred correctly anyway?" Well, in 2015 that type wouldn't have been inferred.
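As a rough illustration of the "Grandad" style, here are two hypothetical functions that do the same thing. I'm not claiming these exact annotations were ever required by Rust 1.0. The point is only that fully annotated code and inference-reliant code are the same language and live happily side by side.

```rust
// Over-annotated style: every type is spelled out even where the
// compiler could infer it.
fn doubled_explicit(values: &[i32]) -> Vec<i32> {
    let doubled: Vec<i32> = values
        .iter()
        .map(|v: &i32| -> i32 { *v * 2 })
        .collect::<Vec<i32>>();
    doubled
}

// The same logic, leaning on inference: the closure's argument and
// return types and the target of `collect` all come from the signature.
fn doubled_inferred(values: &[i32]) -> Vec<i32> {
    values.iter().map(|v| v * 2).collect()
}
```

Both compile today and return identical results for the same input.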
> Without a specification how do we know which Rust implementation is correct and which has bugs?
Reading exercise for you: Look up "Pointer Provenance" and read about the problem. Then, read whatever version of C++ "formal specification" you think you're relying on. Huh, it doesn't mention provenance anywhere in this document.
You may need to go back and re-read the stuff you read at this point. This is a difficult problem, and the compiler must care about it deeply to produce reasonable machine code for even fairly simple programs. But the specification doesn't mention it. What does that mean?
I'll save you some time: The compilers do not implement the standard, and they haven't for decades. What they implement resembles the specification, but not very closely, and never where it conflicts with their duty to generate machine code you'd actually be willing to run. Some C++ programmers are very angry about that, but WG21 shows no sign of doing anything about it decades later. C++ 23 still won't fix the provenance problem; it will once again be kicked into the long grass.
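If you haven't met provenance before, the usual illustration is two pointers that may hold the same address but were derived from different objects. Here is a minimal sketch in Rust, with a hypothetical function name. Whether the two arrays actually end up adjacent in memory is up to the compiler, so the result is not guaranteed either way.

```rust
/// Compares the one-past-the-end address of `a` with the start address of `b`.
/// Raw pointer `==` compares addresses only, so this can return `true` when
/// the two arrays happen to sit next to each other in memory.
fn same_address_different_origin(a: &[u8; 4], b: &[u8; 4]) -> bool {
    // `wrapping_add` does address arithmetic without requiring the result
    // to stay in bounds; the resulting pointer is still derived from `a`.
    let end_of_a: *const u8 = a.as_ptr().wrapping_add(a.len());
    let start_of_b: *const u8 = b.as_ptr();
    end_of_a == start_of_b
}
```

Even when this returns `true`, `end_of_a` was derived from `a`, so using it to access `b` is not the same thing as using `start_of_b`. Optimizers rely on that distinction, and that is the part the comment above says the standard's text leaves out.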
Even aside from provenance and similar issues, C++ is riddled with soundness bugs where your program is meaningless and the compiler, being pragmatic, will do something, but what it does is unspecified. This type of problem relies on a get-out in the C++ Standards Document triggered by the phrase "ill-formed, no diagnostic required", meaning that what you wrote isn't a C++ program and none of the rules in the standard apply, yet your conforming C++ compiler need not even warn you about it. It might compile your code anyway, even though what (if anything) the result does is entirely arbitrary.
There are some aspects of 'systems' programming languages that don't have a good history of being properly documented. Pointer provenance, as you mention, is one of them. The semantics of concurrent access to memory is another.
So it's not too much of a mark against Rust that it doesn't have proper documentation for those aspects either.
But there is much, much more to a large programming language like Rust than those things, and Rust is missing documentation (of the sort that intends to be complete and correct) in many important areas that don't have that sort of excuse.
> So it's not too much of a mark against Rust that it doesn't have proper documentation for those aspects either.
There is obviously the benefit that this generally only affects unsafe code (as opposed to C/C++ where the scope is less clear), but Rust just passes the buck to LLVM and inherits the partially baked solutions being used in practice for C++. And if Rust ever changes this, it will involve either splitting the language into distinct versions or breaking strict backwards compatibility.
> But there is much, much more to a large programming language like Rust than those things, and Rust is missing documentation (of the sort that intends to be complete and correct) in many important areas that don't have that sort of excuse.
As somebody who has contributed to both rustc and LLVM/Clang (and would probably rather program in Rust than C++ at this point), I think it would be much easier to implement C++ from scratch than Rust for this reason. The C++ spec is incomplete and some portions are internally inconsistent, but at least there's a real spec. When you hit one of these problems you can just look at what other implementations do in practice. Rust doesn't even have anything resembling a spec.
> When you hit one of these problems you can just look at what other implementations do in practice. Rust doesn't even have anything resembling a spec.
But it does have a reference implementation. The first quoted sentence describes what alternative implementations are already doing: looking at what `rustc` does and asking for clarification upstream when necessary. I can envision cases where work towards parity uncovers `rustc` behavior that wasn't intended, and `rustc` is then changed to conform to the explicit intent, however unlikely this might be in practice.