Yeah, their initial explanation is that it's the "safety features", but I was under the impression that literally everything related to Rust's safety happens at compile time. Their big selling point is the "zero cost abstractions".
Bounds checking is a “zero cost abstraction” in the sense that it is both “pay for what you use” and “you couldn’t implement it yourself any better.” That said, the implementation is not as fast as it theoretically can be, because the compiler isn’t always removing bounds checks that are provably true.
When const generics land, you will be able to make those assertions! A basic implementation is in nightly.
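Roughly what such an assertion could look like, as a minimal hypothetical sketch (assuming const generics plus inline const blocks; the function is made up for illustration):

```rust
// Hypothetical sketch: the assertion is evaluated per monomorphization, so an
// out-of-range access becomes a compile error rather than a runtime check.
fn third<T, const N: usize>(arr: &[T; N]) -> &T {
    const { assert!(N > 2) }; // compile-time proof that index 2 exists
    &arr[2] // with N > 2 known, the optimizer can drop the runtime check
}

fn main() {
    let xs = [10, 20, 30, 40];
    println!("{}", third(&xs)); // prints 30
    // third(&[1, 2]);          // would fail to compile: N > 2 does not hold
}
```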
This is taking zero-cost abstraction to the extreme, and I think it waters down the concept to the point that it almost isn't useful.
One can argue that any feature is a zero-cost abstraction for the exact set of trade-offs it has (in the limit: "you couldn't implement a feature that executes this exact sequence of instructions 'mov ...' any better if you did it by hand").
I think focusing on a hypothetical perfect coder is more useful, someone who never fails an assertion or bounds check. This person does not need any safety/convenience features (unlike us real humans), but they can still use some such features without a penalty. Bounds checks are not one of them. (But... At least they don't have much of a pervasive/whole program cost.)
Maybe! I think that the concept is a bit more focused than most people think it is. I can appreciate this perspective though. It's a fuzzy concept, so there's decent room to poke and prod around the edges :)
Yeah, that's fair. I think there's a strong argument that bounds checking satisfies the "don't pay for what you don't use" rule (although one could say that every slice having to carry around its length, doubling its size, is a pervasive cost, but that's fairly weak, given the other things it enables), but I think the other rule is much, much less clear.
You're misunderstanding what "you don't pay for what you don't use" means. It means that a program that does not contain [] in it does not include code for []. There's no "mandatory runtime" code for []. It doesn't mean you can use [] in some special way and somehow get less code.
And even if we were talking about your understanding, I don't see why get_unchecked would be disqualified. unsafe is a part of the language for a reason.
I know, the meaning of "zero cost abstraction" is really "zero cost to not use the abstraction."
But that's like saying if you don't use rust you don't pay for it. Just because there is the unsafe escape hatch in the language, you don't get to label the language as zero cost abstraction. Because practically speaking there is a lot more runtime cost to rust than people will tell you.
Many languages (almost anything without a runtime) meet that definition of zero cost abstraction then.
If you use a Rust construct that does bounds-checked accesses, you're explicitly using bounds checks. You're not paying for anything beyond the bounds checks that you're using.
If you use a Rust construct that elides the bounds checks (e.g. get_unchecked), there are no bounds checks, and you don't pay any cost for the fact that Rust supports a bounds-checked subscript operator.
If you want to compare Rust's subscript operator, do so with an explicitly bounds-checked version of the C code. That's the appropriate comparison for "zero-cost abstraction". Because the whole point of "abstraction" is it's not a change to the observed runtime behavior, it's just a language abstraction.
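As a rough illustration of the two things being compared (hypothetical functions; `get_unchecked` is the real slice method):

```rust
// The checked subscript: you asked for the bounds check, so you pay for it.
fn checked(v: &[u32], i: usize) -> u32 {
    v[i] // panics if i >= v.len()
}

// The unchecked version: no bounds check is emitted, so there is nothing
// left to pay for the fact that the checked operator also exists.
fn unchecked(v: &[u32], i: usize) -> u32 {
    // SAFETY: the caller must guarantee that i < v.len().
    unsafe { *v.get_unchecked(i) }
}
```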
I don't entirely agree. As an analogy, suppose that you want to pass a pointer to an object as a function argument. As the programmer, you expect the object will necessarily remain allocated as long as the function uses it, since it's kept alive elsewhere, but you might have made a mistake. Before Rust, your options would be:
1. Use a C-style raw pointer. This has no overhead but is unsafe.
2. Use a reference counted or garbage collected pointer. This provides safety but is a non-zero-overhead abstraction. In other situations, reference counting can be necessary to manage unpredictable object lifetimes, but in this case we're assuming it would only be used to guard against mistakes, so the cost represents overhead. Even if some language implements reference counting just as fast as you can implement reference counting in C, that doesn't make it zero-overhead.
But Rust adds a new option:
3. Use a borrow-checked reference. This is safe, yet zero-overhead compared to the unsafe option, as long as your usage pattern can be expressed in terms of Rust lifetimes.
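Sketching the difference between options 2 and 3 (hypothetical code; `Config` is made up):

```rust
use std::rc::Rc;

struct Config { retries: u32 }

// Option 2: reference counting guards against lifetime mistakes, but pays
// for an allocation plus count updates at runtime.
fn with_rc(cfg: Rc<Config>) -> u32 {
    cfg.retries
}

// Option 3: a borrow-checked reference. Misuse is rejected at compile time,
// and the call compiles down to passing a plain pointer.
fn with_ref(cfg: &Config) -> u32 {
    cfg.retries
}
```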
Going back to array indexing, the analogous situation is that you want to access an index that you, the programmer, expect to always be in bounds. Option 1 corresponds to unchecked indexing, and option 2 corresponds to bounds-checked indexing. But Rust brings nothing new in this area: there is no option 3.
Yet an option 3 is theoretically possible: safe unchecked indexing if you can prove to the compiler that the index can never be out of bounds. This can be done in languages with dependent types or built-in proof systems. (It can even be done in Rust with a certain neat hack [1], but with a lot of limitations.)
I'm not saying Rust is bad for not having dependent types. As far as I know, it's an open research problem how to make those features feel ergonomic and easy to use, even to the extent that Rust lifetimes are (i.e. not totally). And on the flipside, bounds checks usually don't cause very much overhead in practice, thanks to CPU branch prediction, plus the optimizer can sometimes remove them.
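A small illustration of that last point (hypothetical function; whether the check is actually removed depends on the optimizer):

```rust
// The loop bound is clamped to the slice length up front, so the compiler
// can usually prove i < v.len() and fold away the per-access check.
fn sum_first_n(v: &[u64], n: usize) -> u64 {
    let n = n.min(v.len());
    let mut total = 0;
    for i in 0..n {
        total += v[i];
    }
    total
}
```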
But I'd say that Rust's choice not to go there (...yet...) justifies calling its current approach a non-zero-overhead abstraction.
It's literally the definition. And very little qualifies. An abstraction is zero-cost if there's no runtime penalty, including if the code you'd write by hand to accomplish this goal is no better than what the compiler synthesized for you via the abstraction.
If the use of the abstraction forces your data model into a suboptimal representation, it's not zero-cost. If the compiler emits code that's worse than the straightforward manual implementation, it's not zero-cost. If the emitted code involves runtime checks that aren't necessary without the abstraction, it's not zero-cost.
For example, reference-counting is an abstraction over tracking the lifetime of your object. In some cases (where ownership isn't clear) the retain count introduced by reference counting is required and therefore not a cost, but in most cases the point at which the object ends up freed is actually predictable, and therefore the cost of all the reference counting is something that would have been avoided without the abstraction. Therefore, reference-counting as a replacement for manual alloc/free is generally not zero-cost.
Or how about iteration. Rust has an external iterator model (where you construct an iterator and run that, rather than passing a callback to the collection). In most languages an external iterator model is a non-zero cost, because you need to construct the iterator object and work with that, which is a bit of overhead compared to what the collection could do with an internal iterator model. In Rust, external iterators are frequently zero-cost because they usually get inlined. So you can write a high-level loop that includes a filter and map and the end result is very frequently just as good as (if not better than) what you'd have done if you wrote the iteration/filter/map by hand without Rust's iterator abstraction.
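For instance, a chain like this (made-up example) typically compiles down to the same loop you'd write by hand:

```rust
// External iterators: the filter/map/sum chain is inlined away.
fn sum_even_squares(xs: &[i64]) -> i64 {
    xs.iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| x * x)
        .sum()
}

// The hand-written loop it usually optimizes to, for comparison.
fn sum_even_squares_manual(xs: &[i64]) -> i64 {
    let mut total = 0;
    for &x in xs {
        if x % 2 == 0 {
            total += x * x;
        }
    }
    total
}
```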
I understand what you're saying, but that's not how I understand people use the term "zero cost abstraction". That term usually refers to things you can use and pay zero cost for using it. It does not usually refer to abstractions that impose a cost for use, but no additional cost if you don't use them, as you suggest.
A quick Google search of the top hits for the term seems to align with my understanding.
Maybe there is a term for what you are talking about, like "pay for what you use".
What part of my comment made you think I didn't understand that?
Was it when I wrote: "it really [is] zero cost to not use the abstraction"? Or when I wrote: "if you don't use rust you don't pay for it"?
I understand the concept fine. It is rust's marketing gimmick and not a useful tool to describe the language.
Rust is no more "zero cost" than C++, Fortran, Ada, or Objective C, and even D can make the claim to some extent (you can opt out of the GC and use malloc/free directly). I'm sure there are plenty more I'm missing. If you use the Java GC that doesn't collect (used for real-time programs), even that probably fits the description.
Under your description an opt-in generational GC is zero cost. You can't have vector accesses be checked (along with other small performance issues) and then use the excuse that you can write code in a way that doesn't use checked access or pay that feature's cost. I can do that in a lot of languages.
Also to some extent you should take the ecosystem into consideration, where a large chunk of it goes to lengths to not use unsafe, such as using a vector for homogeneous objects to get around pointer semantics. That incurs a huge cost, and that should be counted against the language because it makes the other ways too difficult, or the main libraries used rely on those techniques. (And if you don't count Rust's ersatz standard library, then you can't count Java's standard library, since you can always write Java code that doesn't use a GC or extra features - I've done it a few times.)
I saw so much promise in Rust, but it seems to have gone to complete crap.
> What part of my comment made you think I didn't understand that?
This part:
> But that's like saying if you don't use rust you don't pay for it. Just because there is the unsafe escape hatch in the language, you don't get to label the language as zero cost abstraction.
Rust isn't "zero-cost" because of the unsafe hatch; that's completely orthogonal. It's zero-cost because if you don't use a feature you don't pay for it. The fact that you need unsafe to get non-checked subscripting isn't particularly relevant to the fact that using non-checked subscripting in Rust means you're not paying for the existence of checked subscripting.
> Under your description an opt in generational GC is zero cost.
You're conflating implementation with semantics. If you have a choice between different allocation strategies that all result in the same observable runtime behavior, using a garbage collector over manual alloc/free is a cost. With manual alloc/free there's no runtime tracking to determine when an object should be freed, it's entirely static. Using a GC dramatically simplifies this from the developer's perspective and avoids a whole class of bugs, but comes with a runtime cost. Meanwhile for single-owner models, Rust's Box type has no runtime overhead compared to alloc/free, since there's no runtime tracking, the alloc and free points are determined statically by the compiler.
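A quick sketch of that single-owner case (hypothetical type):

```rust
struct Node { value: u64 }

// Box's free point is decided statically by the compiler, so there is no
// runtime tracking to pay for, unlike a refcount or a GC.
fn use_node() -> u64 {
    let n = Box::new(Node { value: 7 }); // one allocation
    n.value
} // `n` is freed right here, like a hand-placed free()
```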
C++ popularized the term. And if C++ is commonly described as having designed its features to be zero-cost (or you only pay for what you choose), there's nothing wrong with Rust describing the same concept with the same term.
What languages don't have a runtime? Even C has one (albeit a very small one). Nobody labels Rust the language as a zero cost abstraction (that'd be silly - there is a cost to learning it!). Rather, they try to provide zero cost abstractions. A great example is that there are no green threads in Rust, because they consciously removed them as they penalized Rust's performance regardless of whether people used green threads or not.
This is exactly how you'd write the feature, by hand, if you were implementing the language.
That the optimizer could, but does not, do as much optimization as it theoretically can, means that it has more work to do. But that's different than the feature being written in a sub-optimal way.
I don't know if you edited your comment, or if my pre-coffee reading comprehension just missed the part "because the compiler isn’t always removing bounds checks that are provably true."
This got me wondering why we don't include dedicated hardware for bounds checking in CPU architectures. Intel made an attempt with MPX but from a brief glance on Wikipedia it looks like a fail (slower than pure software approaches like AddressSanitizer, among other issues).
I really like dependent types as a concept! But they're not really in any mainstream languages, so I haven't had much opportunity to play around with them. :(
There are! It is done in compiled Lisp and Scheme runtimes such as SBCL to remove provably redundant bounds checking. For example, in (when (< x 4) (when (>= x 0) (...))) SBCL will infer that if x is an integer, it must have one of the values 0, 1, 2 or 3. That knowledge is useful because if x is used as the index of an array containing 4 or more items, the bounds checking can be elided.
FWIW it isn't hard to write safe Rust code that elides bounds checks, but one needs to write proper code.
If one pinpoints a performance issue due to bounds checks, it is also always trivial to disable them in a safe way by writing proper `unsafe` code when it makes a difference.
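For example (a hedged sketch with made-up functions): the indexed loop below keeps a check on `b[i]` because nothing bounds `b`, while the iterator version gives the compiler both lengths up front.

```rust
// Safe, but b[i] still carries a bounds check: b's length is not tied to i.
fn dot_indexed(a: &[f64], b: &[f64]) -> f64 {
    let mut acc = 0.0;
    for i in 0..a.len() {
        acc += a[i] * b[i];
    }
    acc
}

// Also safe, and the per-element checks are gone: zip bounds the loop by
// both lengths before it starts.
fn dot_zipped(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```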
Bounds checks are always on. Integer overflow checks are disabled in release mode. Bounds checks are necessary for memory safety, whereas integer overflow checking isn't.
There are unsafe methods on Vec that allow unchecked access to the array, but you need to write them in an unsafe block (some code that is performance sensitive does this).
The explanation I heard from a Rust core dev (explaining why Rust tends to be slightly slower than C++) is that the borrow checker pushes programmers toward a slightly different coding style. Where C programmers tend to pass references around, Rust programmers tend to take copies to avoid the borrow checker.
Passing the smaller reference will be faster in single-threaded code. In multi-threaded code I'd expect making a thread-local copy to be a win, and indeed multi-threaded Rust programs tend to be quicker than their C/C++ counterparts.
I don't know... while it may not be the case here, I wouldn't be exactly shocked if the fact that Rust restricts your ability to do some things ends up having a performance penalty, even if any safety checks themselves happen at compile time.