
Blue Team Rust: What Is “Memory Safety”, Really? - ALX9
https://tiemoko.com/blog/blue-team-rust/
======
staticassertion
One thing I've realized with Rust is that its guarantees are a moving target.
Today we can guarantee (barring compiler bugs) memory safety in safe Rust,
but not in unsafe Rust.

But that story is improving. For one thing, we have people working to build
safe abstractions for more unsafe use cases, at zero cost. We also have people
improving fuzzing, and in theory safe rust code grows at a much faster rate
than unsafe rust code, so fuzzing is far more tractable. We have people
working on proving more about rust code, even when unsafe is around.
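
As a rough illustration of what a "safe abstraction" over unsafe code means here, a minimal sketch follows. The helper `pair_mut` is hypothetical (it is not from the article or the standard library): it validates its inputs with ordinary safe checks, and only then uses raw pointers, so callers never write `unsafe` themselves.

```rust
/// Hypothetical helper (an assumption for illustration): returns mutable
/// references to two distinct elements of a slice, or `None` if the indices
/// are equal or out of range.
fn pair_mut<T>(slice: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    // The safe checks that the unsafe block below relies on.
    if i == j || i >= slice.len() || j >= slice.len() {
        return None;
    }
    let ptr = slice.as_mut_ptr();
    // SAFETY: both indices are in bounds and distinct (checked above), so the
    // two references point at different, valid elements and cannot alias.
    unsafe { Some((&mut *ptr.add(i), &mut *ptr.add(j))) }
}

fn main() {
    let mut v = vec![1, 2, 3];
    if let Some((a, b)) = pair_mut(&mut v, 0, 2) {
        std::mem::swap(a, b);
    }
    assert_eq!(v, [3, 2, 1]);
    // Invalid requests are rejected instead of causing undefined behavior.
    assert!(pair_mut(&mut v, 1, 1).is_none());
    assert!(pair_mut(&mut v, 0, 3).is_none());
}
```

The unsafe part is a few lines that can be reviewed, fuzzed, or proven on their own, which is why the amount of unsafe code matters for tractability.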

I'm quite excited to see how far Rust is able to go. I don't believe the state
today is the end at all.

~~~
dataking
While some patterns that call for unsafe today might be eliminated in whole or
part, code that violates the ownership and borrowing rules will have to remain
unsafe. I think fuzzing is not guaranteed to reach all unsafe code paths, nor
provide full test coverage. Ideally, you want a) some sort of formal proof
that the unsafe code cannot violate memory safety, b) 100% branch coverage for
unsafe code, or c) both (because profilers and proofs can be wrong too :)

edit: grammar.

~~~
foota
What I really want is an integration of automated proof checking with unsafe
code, allowing completely safe rust programs.

Additionally, this could be extended to safe code to allow removing overhead
from safety in things like bounds checks and Rc.

~~~
zozbot234
This requires a formal semantics for Unsafe Rust. It's a hard problem, albeit
one that's being worked on.

~~~
foota
I'm aware that however it's done it'll be hard.

Does it really require formal semantics for unsafe rust though? I'm not
familiar enough with rust to give an example, but if you imagine there's the
unsafe rust level and beneath that the "machine code" (not actually machine
code, just at the abstraction level equivalent to it) you should be able to
hand write what the rust code is doing, without requiring the compiler to
construct the machine level operations.

With a formal semantics the proof checker just checks that the written proof
(at rust level) matches with the rust code, but requires as you said an
understanding of how unsafe rust interacts with the proof, whereas with a
proof written at machine level, you don't need to understand the rust
semantics, you just need to translate the rust to machine level and then check
the proof there.

Perhaps the machine level could be some layer in llvm? I'm only a little
familiar with compilers, and hardly at all with more complicated compiler
theory, but this seems reasonable to me.

~~~
kd5bjo
> With a proof written at machine level, you don't need to understand the rust
> semantics, you just need to translate the rust to machine level and then
> check the proof there.

This approach would only be able to verify a particular compilation result as
safe. If you want to verify that it will always be safe, you need to be
comparing against the behavior of future compilers, which requires some kind
of contract about their behavior. “Formal semantics” is the technical term for
that contract.

------
saagarjha
> Bounds checks are more effective than the stack cookies a C compiler might
> insert because they still apply when indexing linear data structures, an
> operation that's easier to get right with Rust's iterator APIs.

Not only that, bounds checks always work, while stack cookies are possible to
bypass either by luck or by information disclosure.
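
A small sketch of the "always work" part, assuming default panic settings (unwinding rather than abort). The helper names `read_at` and `index_panics` are made up for this example.

```rust
use std::panic;

/// Hypothetical helper: checked lookup that reports failure as `None`.
fn read_at(data: &[u8], index: usize) -> Option<u8> {
    data.get(index).copied()
}

/// Hypothetical helper: reports whether plain indexing panics for `index`.
/// Relies on the default `panic = "unwind"` setting.
fn index_panics(data: &[u8], index: usize) -> bool {
    panic::catch_unwind(|| data[index]).is_err()
}

fn main() {
    let buf = [1u8, 2, 3, 4];
    assert_eq!(read_at(&buf, 2), Some(3));
    // One past the end is rejected every time; there is no secret value
    // to guess or leak, unlike a stack cookie.
    assert_eq!(read_at(&buf, 4), None);
    // Plain indexing performs the same check but panics instead.
    assert!(!index_panics(&buf, 3));
    assert!(index_panics(&buf, 4));
}
```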

~~~
Animats
The key thing with bounds checks is to hoist them out of inner loops. If you
don't have that optimization, people will turn them off because of the
performance impact. Except in inner loops, the performance penalty isn't
usually that bad.

~~~
zozbot234
> The key thing with bounds checks is to hoist them out of inner loops.

The compiler can't always hoist the check on its own, because program behavior
might depend on the bounds check occurring in the loop. But you can write an
assert!() outside the loop to hoist it explicitly, and verify that the bounds
checks are optimized away - or use unsafe unchecked access when they aren't.

~~~
IshKebab
How do you write the assert? I've not heard of that before.

Oh you mean `assert!(array.len() > 100); for i in 0..100 { array[i]; }`

I don't think that _guarantees_ that bounds checks will be hoisted. It's just
a strong hint. I mean in this case it will almost certainly work, but in more
complex cases it might not and the compiler is still free to emit bounds
checks without telling you.
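
Spelled out as a compilable sketch (function names are hypothetical; whether the per-iteration checks actually disappear depends on the optimizer, so the only way to know is to inspect the generated assembly):

```rust
/// Hypothetical example: sums the first 100 elements.
fn sum_first_100(array: &[u32]) -> u32 {
    // One up-front check. This is only a hint: it gives the optimizer the
    // fact it needs to drop the checks inside the loop, but nothing forces
    // it to do so.
    assert!(array.len() >= 100);
    let mut sum = 0u32;
    for i in 0..100 {
        sum = sum.wrapping_add(array[i]);
    }
    sum
}

/// Alternative phrasing of the same hint: take a sub-slice once, then
/// iterate over it without indexing.
fn sum_first_100_slice(array: &[u32]) -> u32 {
    let head = &array[..100]; // single bounds check; panics if too short
    head.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

fn main() {
    let data: Vec<u32> = (0..200).collect();
    assert_eq!(sum_first_100(&data), 4950);
    assert_eq!(sum_first_100_slice(&data), 4950);
}
```

Both versions stay in safe Rust, so if the optimizer does not cooperate the result is slower code, not a memory safety bug.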

It would be nice if there was an explicit way of forcing an error if the
bounds check was not hoisted. Similar story for lots of other optimisations -
autovectorisation, tail call optimization, etc.

Some game developer made a good point that sometimes fast is correct, i.e.
it's actually a bug if autovectorisation or whatever doesn't happen, so you
really need a way to guarantee it.

~~~
nicoburns
In Rust, the way to force removal of bounds checks is to use iterators rather
than a for loop.

~~~
IshKebab
I don't think that forces their removal either. It just happens to help the
compiler enough that it can remove them automatically most of the time.

As far as I know the only way to _guarantee_ that bounds checks are not used
in a block of code is to use `unsafe`.

~~~
dexterlemmer
Iterators do indeed not force the removal of bounds checks, but that's simply
because there are no bounds checks to remove in the first place. That's because
iterators actually do use `unsafe` internally. They are a safe abstraction.

    
    
      // This is unidiomatic. Don't do this.
      for i in 0..100 {
        // Oops! Array indexing. Here's a nasty bounds check!
        array[i] * 2;
      }
    
      // Very unidiomatic. Never do this! Not even as an
      // optimization. You should properly use safe abstractions
      // instead.
      for i in 0..100 {
        // Oh, NOOOO! This isn't supposed to be C!
        unsafe { array.get_unchecked(i) * 2 };
      }
    
      // This is still unidiomatic unless you want to use
      // it to explicitly show your code has side-effects.
      // However, we're completely rid of the bounds check and
      // yet it's perfectly safe.
      // The only way this could be unsafe is if RustBelt's
      // formal proofs, reviews, audits
      // and (probably) MLOC of thoroughly tested and
      // possibly fuzzed code using this in practice
      // have all somehow missed an unsoundness bug
      // in one of the more used safe abstractions in the
      // core library for years.
    
      for a in &array {
        // Look, Ma! No indexing, therefore no bounds check!
        // Internally `get_unchecked` is (conceptually) used
        // but it's perfectly safe!
        a * 2;
      }
    
      // This is idiomatic Rust. Again, there's no bounds check
      // to start with, since the safe abstraction knows
      // exactly how many elements are in the array and
      // that the borrow checker will ensure nobody can
      // possibly invalidate its indexing.
      // Ditto what was said above that the safe abstraction
      // is pretty much guaranteed to be sound.
      array.iter().map(|a| a * 2).collect::<Vec<_>>()

------
vinay_ys
Really well written article with a nice diagram that concisely explains all
the memory safe/unsafe areas of a rust program.

~~~
Klasiaster
The diagram arrows of »Only valid references«, »No dangling pointers«, and »No
data races« only point to the Heap Memory but should also point to the Stack
Memory. One reason is that a function can lend stack pointers to the
functions it calls, at which point the called functions see no difference
between stack and heap pointers. Valid references are relevant in the same
way. Other reasons are that the borrow checker prevents data races for data
on the stack, and it also disallows returning a pointer to the function's own
stack frame, which would be a dangling pointer.
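
A minimal sketch of both points, with hypothetical function names: the callee `bump` accepts a reference without knowing where the data lives, and the commented-out function shows the kind of return the borrow checker rejects.

```rust
/// Hypothetical callee: it cannot tell, and need not care, whether `value`
/// points into a caller's stack frame or into the heap.
fn bump(value: &mut i32) {
    *value += 1;
}

fn demo() -> (i32, i32) {
    let mut on_stack = 41; // lives in this function's stack frame
    let mut on_heap = Box::new(41); // lives on the heap
    bump(&mut on_stack);
    bump(&mut on_heap); // `&mut Box<i32>` deref-coerces to `&mut i32`
    (on_stack, *on_heap)
}

// The borrow checker rejects returning a pointer into the current frame:
//
// fn dangling() -> &'static i32 {
//     let local = 5;
//     &local // error[E0515]: cannot return reference to local variable `local`
// }

fn main() {
    assert_eq!(demo(), (42, 42));
}
```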

------
ncmncm
The pervasive mention of "C/C++", as if C and C++ were the same language with
the same failure modes, soured the whole thing for me, for reasons:

In modern C++, C bugbears are just not a problem that demands much attention;
I had one (1) memory mistake in five years, caught in initial testing.

There are still plenty of bugs, of course, but they are overwhelmingly
specification bugs: code is doing what was asked for, but what was asked for
was wrong.

Babysitting the borrow checker steals attention from preventing those actually
very common problems. In effect, the borrow checker has also cost all the
time later spent tracking down and fixing the bugs it kept you from avoiding,
and all the time spent adapting to the bad interfaces it led to, which more or
less worked.

Attention is, _by far_, the scarcest resource every programmer manages.
Anything that burns attention without adequate return is actively harmful.
Coding at a higher level, using powerful, safe libraries trusted not to cost
too much, is how C++ programmers get the safety that Rust coders grind out on
the borrow checker. C++ has many, many features, not in Rust, meant
specifically to help capture powerful semantics in libraries that raise the
level of coders' attention.

The large and still growing suite of such features, and the powerful libraries
written using them, account for the almost exclusive use of C++ in all the
highest-paid, most demanding applications in fintech, medtech, HPC, CAE,
telecom, and aerospace. Rust will never be able to call those libraries.

Still, Rust is the only extant language plausibly gunning for C++'s role. It
starts with the advantage of leaving behind many of C++'s worst backward-
compatibility boat anchors, but its focus on low-level safety features
detracts attention from the high-level coding support that makes them
decreasingly relevant. Rust is adding features and users at a rapid rate, but
C++ picks up, easily, _many_ more new users in each week than the total
headcount of Rust programmers working in that week, and will continue. To be
remembered in ten years, Rust will need to become useful to many, many more
programmers than it is now winning over (HN buzz notwithstanding).

Gunning for C++ is a losing strategy. Planning to coexist with C++ will have
better results. Rust is an overwhelmingly better language, on every axis, than
C, Go, Java, C#, ObjC, Visual Basic, Delphi, and COBOL, all being coded today
mostly by people who will and often should never use C++. Every conversion
from those is a big net gain for the world. Rust will pick up few C coders,
even though this would produce the greatest benefit to society, just because
almost all who might jump already did. But the rest are wide open.

Promoting memory safety is not the way to win those, either because they have
it (at enormous cost) or don't value it. To win those coders, Rust needs to
take memory safety as given, and promote fun, performance, and a future.

Java won big in 1995 by offering Microsoft sharecroppers a way out. Rust could
be the way out for a new generation, if only it can raise its sights from C's
too familiar failings.

------
doonesbury
Op - thanks. A great read. Helpful and informative.

------
nullc
I'm getting an increasingly bad taste from Rust. In spite of the hype, 9 out of
10 Rust programs I download end up with a panic in my first 10 minutes of
using them.

JavaScript is a memory safe language, yet there is plenty of broken code
written in it.

Memory safety is a critical step forward but it's wasted if it's paired with an
ecosystem that thinks memory safety means that software doesn't need to be
correct or is so fixated on reimplementing things for "memory safety" that
understanding the problem domain and actually making things that work falls by
the wayside.

~~~
Recursing
> 9 out of 10 rust programs I download end up with a panic in my first 10
> minutes of using them.

Do you have some examples? Firefox, ripgrep and fd are the only programs I use
that I know are written in Rust (I know only a small part of Firefox is in
Rust), and they work fine for me.

~~~
gameswithgo
Sounds like a completely made up story really. Where would he find ~20 odd
rust programs to download and try to report such a stat?

