
You can’t “turn off the borrow checker” in Rust - ngaut
https://words.steveklabnik.com/you-can-t-turn-off-the-borrow-checker-in-rust
======
kibwen
I think there's a mental mismatch between groups of people who talk about
"turning off the borrow checker". The borrow checker is a tool to validate
references. Sometimes people using references in Rust might feel like the
borrow checker makes using references too cumbersome in a certain situation,
so they switch to using a tool other than references (Rc, indexes into a Vec,
etc). But this isn't bypassing the borrow checker; it's bypassing references
themselves. The same phenomenon happens in C++; if references start to be a
pain, you might switch to using something else (shared_ptr, indexes into a
vector, etc.).

When this happens in C++, we don't call this "bypassing the borrow checker".
You don't need a borrow checker to know that references aren't always the
right tool for a given job. It's the same in Rust.

~~~
steveklabnik
Yup. Part of why I wrote this post is for exactly this reason. This phrase is
used colloquially, but I think it misleads a lot of people on how Rust
actually works.

------
kbwt
I thought this was going to be a response to Jonathan Blow's video about how
doing your own memory management is effectively turning off the borrow
checker:
[https://www.youtube.com/watch?v=4t1K66dMhWk](https://www.youtube.com/watch?v=4t1K66dMhWk)

The takeaway being that the borrow checker doesn't magically prevent the use-
after-free class of bugs. Although you will never experience a segmentation
fault in safe Rust, the bug is still there and your program keeps running in
an invalid state. The symptoms change, but they are no less dangerous.

To make the problem even more obvious, think of allocating a large array to be
used as a heap and handing out indices to implement your own malloc. You have
bounds checking to prevent indexing outside the bounds of the heap, but it
doesn't really help when the elements have logically different lifetimes and
occupy different parts of the array. I don't think this is a contrived example
either. A less obvious version of this can easily creep into large or complex
systems, as evidenced by the Entity Component System in Rust example.

~~~
sdegutis
So it’s the difference between crash early (C) and don’t crash but run wrong
(Rust), inherent by design in the main selling point of Rust (borrow checker)?

~~~
gliptic
Why is everyone assuming C is "crash early" as opposed to "undefined behaviour
will crash if you're lucky and cause RCE in the worst case, or any weird thing
in between"? The selling point of the borrow checker (at least one of them) is
that it doesn't allow undefined behaviour unless you explicitly enable unsafe
operations. If there was a way to detect UB and predictably crash in a
performant way, you could probably implement that in Rust as well. In fact,
that's often what is done with generational indexes and similar.

------
twarge
As a non-coder (physicist) writing Rust, the thing that really struck me was
that the time between _successful compile_ and flawless operation was
significantly shorter than for the C and Python I write. Furthermore, this
difference when writing anything _threaded_ is simply breathtaking. In my
experience there are simply fewer corner cases that the Rust compiler lets
through.

~~~
blueprint
I think that makes you a coder :)

------
orf
> This means that we can combine it with Option<T>, and the option will use
> the null case for None:

This sounds interesting, can anyone elaborate on this?

~~~
kibwen
One of the design goals of Option is to be a library-level replacement for the
language-level null value found in languages like C. Furthermore, one of the
design goals for Rust itself is to have its abstractions be zero-overhead.
With a naive implementation of Option, these goals would be in opposition.

To illustrate why these goals would be in opposition, look at a trivial
example of a tagged union (enum) in Rust:

    
    
      enum Foo {
        Bar(i32),
        Qux(u32)
      }
    

At runtime, any value of type Foo will _either_ be in the Bar state, _or_ it
will be in the Qux state. Both Bar and Qux hold types that are 32 bits in
size, and since only one of those states can be active at a time, we know that
Foo only needs 32 bits of storage to satisfy both these states. But
additionally, it needs _extra_ storage to store the runtime information
telling us _which_ state it's currently in. The smallest "extra" amount of
storage that can be added to a type is 8 bits, so we would expect every value
of type Foo to be 40 bits in size at runtime. In fact it's larger, due to
alignment and padding, so Foo will be 64 bits in size at runtime.
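The sizing argument above is easy to verify with `std::mem::size_of` (a small sketch):

```rust
use std::mem::size_of;

// 32 bits of payload, plus a tag, rounded up to 4-byte
// alignment: 8 bytes (64 bits) total.
#[allow(dead_code)]
enum Foo {
    Bar(i32),
    Qux(u32),
}

fn main() {
    assert_eq!(size_of::<Foo>(), 8);
}
```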

Let's bring it back around to Option, which is an enum that looks like this:

    
    
      enum Option<T> {
        Some(T),
        None
      }
    

A null pointer in C will be the same size as a non-null pointer in C: 64 bits,
assuming a 64-bit platform. A Rust reference would also be 64 bits, and these
cannot be null. If we were to try to use Option on a reference to "opt-in" to
nullability, what size would that "nullable reference" be _assuming a naive
implementation of Option_? Well, firstly we'd need storage for the value of
the reference itself (64 bits), and then, as per above, we'd need our "extra"
storage to tell us at runtime whether our Option is a Some or a None. And
again, because of alignment and padding, this would theoretically result in a
type that is 128 bits in size in total, which is a real shame since in theory
distinguishing between two states only takes a single bit of storage. Overall
this would be a performance regression from C, where nullability does not
impose any space overhead.

Fortunately, Rust's implementation is not naive. Remember: Rust references
cannot be null. That means that the Rust compiler knows that any type that is
a reference will _not_ contain a value that is all zeroes at runtime. Rust
leverages this knowledge for optimization: for any Option containing a
reference, only a single pointer-sized piece of memory is needed, and the None
case will be represented by a value of all zeroes. This means that the Option
is now a zero-overhead abstraction for this use case, because Option<&Foo>
will be the same size as &Foo.
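Again, this is checkable directly (a small sketch):

```rust
use std::mem::size_of;

fn main() {
    // None reuses the all-zero bit pattern a reference can never
    // have, so wrapping a reference in Option adds no space at all.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());
    assert_eq!(size_of::<&i32>(), size_of::<usize>()); // one pointer
}
```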

And this smart logic isn't hardcoded for the Option enum. Any enum, written by
anyone, can automatically benefit from the ability to "hide" the enum tag in
such "uninhabited" values. The OP's example of NonNull<T> is, like references,
an example of a type that has an uninhabited value that permits this
optimization. Others include the NonZeroU8 type and its friends, where
Option<NonZeroU8> will be the same size as a standard u8, though they give up
the ability to represent zero (in the future this may be extended to allow
arbitrary user-defined types which can make whatever values they want act as
uninhabited for the purposes of enum size optimization, but it will take some
work to get there).
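The NonZeroU8 case looks like this (sketch): zero is the uninhabited value, so Option can claim it for None.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

fn main() {
    // NonZeroU8 can never be 0, so Option uses 0 to mean None:
    // the whole Option still fits in a single byte.
    assert_eq!(size_of::<NonZeroU8>(), 1);
    assert_eq!(size_of::<Option<NonZeroU8>>(), size_of::<u8>());
}
```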

~~~
tlb
Is it possible for Rust to store this in 64 bits?

    
    
      enum Boxed<T> {
        Number(f64),
        Some(T)
      }
    

ie, a type which can either be a double or a pointer by encoding the pointer
as invalid (NaN) floating point numbers? Many JS engines use this trick.

~~~
lachlan-sneff
No, because NaN is a valid floating point number in rust.

~~~
alkonaut
There really should be a "normal" f32 and f64 type with guarantees for non-
NaN, similar to the nonzero integers. More important than size: these floats
would be totally ordered, unlike the partially ordered regular IEEE floats.

Edit: turns out these exist in various forms, e.g. "noisy float"

------
jononor
My TLDR/alternate title: "unsafe Rust retains most safety benefits of Rust
(including the borrow checker)"

------
shawn
Note that you can drop down to unsafe C-style code in Rust. [https://doc.rust-
lang.org/stable/nomicon/](https://doc.rust-lang.org/stable/nomicon/)

Anyone who claims Rust is simple should ensure they thoroughly understand that
book.

Another way to "turn off" the borrow checker is to write a scripting language
that compiles to Rust which automatically annotates all variables with the
longest lifetime possible, and spits out mutable references depending on
whether you actually mutate anything.

There's also RefCell, which lets you defer borrow checking till runtime. It's
handy for pretending like your references are immutable.

~~~
kibwen
I don't think that anyone's claiming that Rust is objectively a simple
language. Simpler than some other languages, certainly, but it's a medium-
sized language at best.

Furthermore, unsafe code gives new users a firm boundary of complexity that
can be ignored. New to Rust? Don't use the unsafe keyword. Using the unsafe
keyword? Read the nomicon first. It's quite useful for onboarding to know that
all the C-style UB shenanigans are behind a gate that can be ignored until
you're comfortable with the rest of the language.

------
mlevental
not that steve isn't correct but Rc and Arc effectively (in exchange for
runtime overhead) "turn off" the borrow checker. i'm sure i'll get yelled at
but it's just this week i had the borrow checker yelling at me for something
and i realized that the appropriate thing to do was use Arc (yes, of course
i'm not advocating for ref counting instead of being more precise).

~~~
masklinn
Rc and Arc are about (shared) ownership, not borrowing. If they're turning off
anything, it's ownership. Borrowing works the same way as ever.

In fact that's an advantage of Rust here: because of the borrow checker you
can _safely_ get a reference to an Rc's contents and hand it off to something
without having to alter the refcount, so you get significantly less refcount
traffic than in other languages where such a thing is unsafe (or not under
your control). That's especially important for Arc.

It also provides for nice _safe_ optimisations like "move out of this Rc if
I'm the only owner" (Rc/Arc::try_unwrap).

