
The Next Steps for Single Ownership and RAII - verdagon
https://vale.dev/blog/next-steps-raii
======
Animats
Hm. That's an interesting approach. The constraint reference encapsulating a
ref counter is useful.

The "destructors with parameters" are interesting, but it's not clear what
prevents using the object (or trying to) after destroying it.

None of this seems to address locking. The trouble with locking in C-type
languages is that the language has no idea what data the lock covers, and thus
it's hard to do an analysis for race conditions.

 _" We made two tiny classes, Request and RequestHandle. Each had only a
pointer to the other. When one was destroyed, it would reach into the other to
null out the pointer, thus severing the connection."_

I've wanted something like that for Rust. Rust has problems with back
references in trees and lists. There's no safe way to do those. What you need
for a back reference is something like what Vale has there. The back reference
has to be a non-owning optional reference. The only allowed values which can
be assigned to it are the owner and None. Changing the ownership implicitly
sets it.
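
The closest safe shape in today's Rust is an Option<Weak<...>> back pointer,
but it forces Rc<RefCell<...>> ownership, manual upkeep, and refcount
overhead. A quick sketch (the names here are illustrative):

      use std::cell::RefCell;
      use std::rc::{Rc, Weak};
      
      // The back reference is a non-owning optional: either the owner or None.
      struct Node {
          value: i32,
          parent: Option<Weak<RefCell<Node>>>, // set by hand, not implicitly
          children: Vec<Rc<RefCell<Node>>>,
      }
      
      fn main() {
          let root = Rc::new(RefCell::new(Node {
              value: 1, parent: None, children: vec![],
          }));
          let child = Rc::new(RefCell::new(Node {
              value: 2,
              parent: Some(Rc::downgrade(&root)), // manual, with refcount cost
              children: vec![],
          }));
          root.borrow_mut().children.push(Rc::clone(&child));
          // If the owner goes away, upgrade() yields None instead of dangling.
          assert!(child.borrow().parent.as_ref().unwrap().upgrade().is_some());
      }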

(This is one of the two big safety holes in Rust. The other is partially
initializing an array. There's no way to tell the language that "0..N of
this array are initialized". So "Vec" has to use unsafe code. If both of those
holes were plugged, there would be far fewer excuses for "unsafe".)

~~~
verdagon
Thanks!

There are three things preventing someone from using the object after
destroying it:

    
    
       * If it was an owning reference, the compiler will ensure nobody else uses it.
       * The program will halt if there are any live constraint references to an object we destroy, preventing anyone from using one of those.
       * If it was a weak reference, the language will (conceptually) null that out when we destroy the object.
    

Locking goes deep into Vale's concurrency story, but I can try to sum it up:

    
    
       * By default, an object can't be visible to two threads; every thread has its own isolated region (similar to Pony and Rust).
       * We can use a mutex to contain a region, and Vale's "region borrow checking" can make sure references don't escape the lock's duration (see the Rust analogue below).
    
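
For comparison, Rust's Mutex<T> already expresses that second point at the
type level: the type says exactly what data the lock covers, and borrows of
that data can't outlive the guard. A tiny sketch:

      use std::sync::Mutex;
      
      fn main() {
          let m = Mutex::new(vec![1, 2, 3]); // the lock covers exactly this Vec
          {
              let mut guard = m.lock().unwrap();
              guard.push(4); // references into the data live only as long as
          }                  // the guard; letting one escape this block would
                             // be a compile error
          assert_eq!(m.lock().unwrap().len(), 4);
      }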

I'm curious about that second safety hole Rust has. I don't think Vale has
that same hole, as we initialize every array with a size and a lambda, and
there's no opportunity to stop it halfway through (well, our form of
"exceptions", involving pure functions or rollbacks, could stop it and keep
everything safe). You can see an example in our little roguelike sample [1],
on line 415.

[1]: roguelike.vale,
https://github.com/ValeLang/Vale/blob/9541d414ce6b1c2745de19f1335e4f213cae32cf/Midas/test/tests/roguelike.vale#L415
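
For Rust readers, the analogous construction is building the whole collection
from a closure, so it's fully initialized before anyone can touch it. A sketch
of the comparison (mine, not from the article):

      fn main() {
          // Size plus lambda: the array exists only once every element does.
          let squares: Vec<u64> = (0..10).map(|i| i * i).collect();
          let fixed: [u64; 5] = std::array::from_fn(|i| (i as u64) * 2);
          assert_eq!(squares[3], 9);
          assert_eq!(fixed[4], 8);
      }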

~~~
Animats
_I'm curious about that second safety hole Rust has._

All variables are supposed to be initialized before use. But a Rust Vec, like
a C++ std::vector<T>, has both an in-use "size" and a usually-larger pre-
allocated "capacity". This allows adding elements without getting a new memory
block.

Implementing Vec safely in Rust runs into the problem of initializing the
capacity not yet used. Not initializing it is unsafe, so Vec has to have
"unsafe" code.

I'd argue for this:

First, data types should be divided into "fully mapped" and "not fully
mapped". A fully mapped type is one where all bit combinations are valid.
Fully mapped types don't have to be initialized to be memory-safe. This
includes integers, and floating point on most systems, but not booleans. (C++
calls this "plain old data", or POD, but allows booleans and enums. Whether
booleans with values other than 0 or 1, or out-of-range enums, should be
considered "plain old data" is a good question for a language designer.)

If it's not fully mapped (worst case, it has a pointer or reference in it),
what do you do about uninitialized data in the unused "capacity" area? If it's
not initialized, you have a pointer with a junk value, as far as the language
is concerned. If it has to be initialized, you have a problem if
initialization requires an input value.

So I'd argue for the concept of a partially initialized array as a language
feature. For simplicity, only being initialized from the beginning to a limit
(0..N) should be supported. (Yes, there is such a thing as a giant sparsely
initialized hash table where allocation info is stored outside the table. See
Google's sparse hash table. That's very rare.) The idea is that you designate
some field of the structure as the "size", and accessing beyond "size-1" is a
subscript error. But you can initialize at "size", and this increases "size"
by 1. You can decrease "size" if you own the whole object.
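
To make that concrete, here's a rough Rust-flavored sketch of the invariant
(the type and API are mine): today the "0..len is initialized" rule lives in
unsafe code inside Vec; the proposal is that the compiler enforce it instead.

      use std::mem::MaybeUninit;
      
      struct PartialArray<T, const N: usize> {
          buf: [MaybeUninit<T>; N], // only 0..len is initialized
          len: usize,               // the designated "size" field
      }
      
      impl<T: Copy, const N: usize> PartialArray<T, N> {
          fn new() -> Self {
              // An array of MaybeUninit may safely be left uninitialized.
              PartialArray { buf: [MaybeUninit::uninit(); N], len: 0 }
          }
      
          fn push(&mut self, value: T) {
              assert!(self.len < N, "capacity exceeded");
              self.buf[self.len].write(value); // initializing at "size"...
              self.len += 1;                   // ...increases "size" by 1
          }
      
          fn get(&self, i: usize) -> &T {
              assert!(i < self.len, "subscript beyond size-1");
              // SAFETY: the len invariant guarantees 0..len is initialized.
              // This is the one unsafe step a built-in partially initialized
              // array type would let the compiler check for us.
              unsafe { self.buf[i].assume_init_ref() }
          }
      }
      
      fn main() {
          let mut a: PartialArray<u32, 8> = PartialArray::new();
          a.push(7);
          assert_eq!(*a.get(0), 7);
      }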

If you have both of those features in a Rust-like language, you should not
need "unsafe". Except to call other languages, maybe.

~~~
algesten
I don't think you can have a "fully mapped type" the way you describe it.
Uninitialized memory in today's compilers does not have an observable value.

https://www.ralfj.de/blog/2019/07/14/uninit.html

~~~
twoodfin
The compiler’s representation of an uninitialized lvalue might not be
observable as the article describes, but an uninitialized heap allocation from
the OS (i.e. malloc()) is observable by design.

Of course a language and its clever implementation can make reading
uninitialized values from malloc()'d objects UB, with all that implies, but
that's a spec choice.

So I believe you could define what's returned by malloc() as safely castable
to one or more "fully mapped types", which would solve the problem, at least
for vectors of those types.

~~~
Animats
Yes, you need to be able to at least do that, for big I/O buffers and such.
Those are usually raw arrays of bytes, so no problem there.

The problem I was discussing is with growable arrays, where the memory
underneath is partially initialized.

------
drewm1980
Interesting work! When I read it, I came away with the impression that Vale
provides guarantees like Rust, except that you get runtime errors rather than
compile-time errors, and that the guarantees for thread safety are weaker.

It's likely these are well-motivated design decisions, since you designed this
with knowledge of Rust. As a programmer relieved by the transition from C++ to
Rust, going back "halfway" in the direction of C++ will be a hard sell, even
if it makes some things easier.

~~~
ridiculous_fish
Not the author, and still new to Rust, but I think this approach is
interesting in Rust as well.

Consider visiting a tree:

    
    
        enum Tree {
           Branch(Box<Tree>, Box<Tree>), // left, right
           Leaf(i32)
        }
    
        fn postorder_visit(t: &mut Tree, f: fn(&mut Tree)) {
             // ???
        }
    
    

A recursive implementation of `postorder_visit` is trivial but risks stack
overflow. A heap-allocated stack (say, `Vec<&mut Tree>`) runs into problems
with the borrow checker, which knows about the C stack, but not _your_ stack.

So an overflow-safe implementation must use unsafe, but still wants to have
confidence in ordered destruction. You could make it a `Vec<*mut Tree>`, but
now there are zero checks!

The idea here is, in debug mode only, to replace `Box` with an `Rc` that
checks its refcount upon drop, and `&mut Tree` with Rc as well. Now your debug
mode does not require unsafe, and your release mode is fast. You have
dynamically enforced ownership in a regime inaccessible to the borrow checker.
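
A rough sketch of the debug half (the names are mine; a release build would
swap in Box plus raw pointers behind the same API via #[cfg]):

      use std::rc::Rc;
      
      struct Owned<T>(Rc<T>);
      type Alias<T> = Rc<T>;
      
      impl<T> Owned<T> {
          fn new(v: T) -> Self { Owned(Rc::new(v)) }
          fn alias(&self) -> Alias<T> { Rc::clone(&self.0) }
      }
      
      impl<T> Drop for Owned<T> {
          fn drop(&mut self) {
              // The "Rc that checks its refcount upon drop": halt
              // deterministically if an alias would dangle, instead of
              // silently dangling in the unchecked release build.
              assert_eq!(Rc::strong_count(&self.0), 1, "alias outlived owner");
          }
      }
      
      fn main() {
          let owner = Owned::new(42);
          let alias = owner.alias();
          drop(alias); // alias released before the owner: the check passes
          drop(owner);
      }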

~~~
btschaegg
Since I'm still starting out with Rust, there's probably just something I'm
missing here, but couldn't you easily solve that with `Cell<T>` or
`RefCell<T>`?

~~~
ridiculous_fish
I hope to see an expert answer to this question. In my lame attempt, Cell was
too awkward and RefCell was also awkward with a side of runtime cost.

------
kccqzy
> When one was destroyed, it would reach into the other to null out the
> pointer, thus severing the connection.

How does this deal with concurrency? Must you now always use mutexes when
handling this kind of pointer?

Also the discussion about destructors having a return value or taking
arguments forgets to mention a simple trick: an rvalue-reference qualified
member function. To call it you must use std::move() on the object. Linters
can then warn about use-after-move. That method can do the real work of
destruction while the destructor can be quite trivial.
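
For comparison, Rust expresses this natively: a method that takes self by
value consumes the object, so use-after-move is a compile error rather than a
lint. A tiny sketch with illustrative names:

      struct Transaction;
      
      impl Transaction {
          // Plays the role of the &&-qualified member function: calling it
          // moves the object and does the real work of destruction.
          fn commit(self) { /* real cleanup here */ }
      }
      
      fn main() {
          let tx = Transaction;
          tx.commit();
          // tx.commit(); // error[E0382]: use of moved value: `tx`
      }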

~~~
verdagon
Great question! In C++, yes. When I did a clasp in C++, I had a mutex on each
side, where the "primary" one locked its mutex then locked the other, and the
"secondary" one would instead lock its mutex and _try_ -lock the other. Then,
it would sever the pointers in both.

Vale, like Rust, isolates its threads' memory regions from each other, and
also allows borrowing from mutexes. (Vale will use region-based borrow
checking rather than Rust's object-based kind)

You're quite right about the rvalue-reference-qualified member functions!
There's one hiccup though: it can't be used with unique_ptr. I think there's an
obscure note in the article that mentions that C++ could use Rust's "Arbitrary
Self Types" to really close this hole. At that point, C++ might be able to
have this kind of improved RAII!

~~~
steveklabnik
Rust does not provide any specific isolation between threads' memory. You can
Send an &mut T to another thread, and it will be able to mutate your memory.

~~~
nybble41
> You can Send an &mut T to another thread, and it will be able to mutate your
> memory.

Mutable references are exclusive. If you Send an &mut T to another thread then
it isn't really _your_ memory any more—you can't read from it or write to it
unless the other thread somehow Sends the reference back.
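
That exclusivity is easy to see with scoped threads, for example:

      fn main() {
          let mut v = vec![1, 2, 3];
          std::thread::scope(|s| {
              let r = &mut v;   // exclusive borrow...
              s.spawn(move || { // ...Sent to another thread,
                  r.push(4);    // which may mutate the memory
              });
              // Using `v` here would be a compile error (E0502): the borrow
              // is out on loan until the scope joins its threads.
          });
          println!("{v:?}"); // fine again: the borrow has been returned
      }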

~~~
steveklabnik
If it gets sent back, or if the thread ends execution, either of which
releases the &mut T and the borrow. It isn't your memory _temporarily_. If it
was forever, you'd be passing ownership, not a reference.

~~~
atq2119
In other words, Rust provides the isolation that was originally mentioned.

That there are mechanisms by which the isolation boundary can shift does not
change that.

~~~
steveklabnik
I guess we’re arguing about definitions :) nothing about this is specific to
threads, and to me, thread isolation means that threads are forbidden from
writing to each other’s memory entirely. Maybe my understanding of the
definition is idiosyncratic!

~~~
nybble41
I think the point that verdagon was making regarding "isolation" was that the
Rust ownership and borrowing rules prevent any _concurrency_ issues from
arising. At the OS level threads, by definition, share all their memory—that's
the primary factor that distinguishes them from processes—so it doesn't really
make sense to talk about memory belonging to a particular thread at that
level. Most other languages do not have that level of isolation and would not
actively prevent multiple threads from accessing the same mutable memory
location.

In one sense it's true that in Rust terms the actual _ownership_ of the memory
remains with the original thread, and the receiving thread is only borrowing
it, but the reality is a bit stricter than that would imply since the
borrowing thread has exclusive access for the duration of the borrow and can't
be forced to give it up. To me, "which stack was the object allocated from" is
less important than "which thread actually has control right now".

~~~
steveklabnik
Yeah, and it's possible I'm over-indexing on something like an Erlang-style
green thread model, where shared state is (almost) forbidden, and thinking
that's what was being referred to.

------
steveklabnik
The Rust section is missing the various Cell types; they'd help here,
depending.

~~~
verdagon
Yes indeed! Cell types can be used for a lot of different situations depending
on the specific requirements, but there's no good way (that I've found) to
really represent the Plane/Airport example in a way that doesn't discard
safety, incur a runtime cost, or freeze the containing Vec... I may very well
have missed something though.

~~~
comex
> but there's no good way (that I've found) to really represent the
> Plane/Airport example in a way that doesn't discard safety, incur a runtime
> cost, or freeze the containing Vec…

That's true. But according to your blog post, neither does Vale: it makes you
choose between unsafe mode and reference-counting-overhead mode. From what I
can tell, the advantages you're claiming are:

1. Optimized reference counting: Valid, though I'm wondering how much
difference it makes.

2. Constraint references: Valid, but as far as I can tell this could be
easily implemented as a wrapper around Rc, like the C++ version is around
shared_ptr.

3. Mutable aliasing: I think this is the most important one. In Rust you can
do this by wrapping all your struct fields in Cell or RefCell (so that the
'immutable' references you get from Rc are actually mutable ones), but the
ergonomics are poor (see the sketch below). Other reference-counting
languages (e.g. Swift) do this ergonomically, but have no way to opt out and
use a borrow checker for increased performance. There's definitely room for a
language to combine features of both.

4. Ability to compile the same code as either fast or safe: Doesn't appeal to
me, because safety matters most in production, where you might actually get
exploited; but YMMV. I know game developers tend to have a different outlook.
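
Regarding point 3, a minimal sketch of that pattern, assuming a Plane/Airport
shape like the article's (the field names are mine):

      use std::cell::RefCell;
      use std::rc::Rc;
      
      struct Plane { fuel: u32 }
      struct Airport { planes: Vec<Rc<RefCell<Plane>>> }
      
      fn main() {
          let plane = Rc::new(RefCell::new(Plane { fuel: 100 }));
          let airport = Airport { planes: vec![Rc::clone(&plane)] };
      
          // Rc gives shared ownership; RefCell restores mutation through the
          // shared handles, traded for runtime borrow checks and ceremony.
          plane.borrow_mut().fuel -= 10;
          airport.planes[0].borrow_mut().fuel -= 10;
          assert_eq!(plane.borrow().fuel, 80);
      }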

~~~
verdagon
Absolutely right! Both Rust and Vale have their respective overheads.

Fast Mode is as fast as C++, and strictly safer than e.g. the C++ + ASan
approach, but Resilient Mode is where the speed+safety story gets interesting.

I didn't have much opportunity to go into it in the article (it was already
going on 15 pages, hah!) but there are some other aspects to Vale which should
drastically cut down on reference-counting overhead:

    
    
       * Immutable region borrowing: we guarantee an entire region is immutable, so all references into it are temporary and zero-cost (verified by the region borrow checker).
       * Bump-calling: a pure function can make its own new region which uses a bump-allocator for all allocations, and at the end we copy out the return value.
       * As mentioned, optimized ref-counting; we hope to reuse Lobster's algorithm, which removes 95% of ref counts.
       * Non-atomic ref-counting, because of the region isolation (like Rust's Rc, as opposed to its Arc), which is faster and more optimizable.
    

I would of course agree that Fast Mode isn't for every use case; I would use
it for a game or WASM, but not for a production server, where priorities are
different. We think that the above four optimizations will give Resilient Mode
speed on par with Rust and C++. Benchmarking will show us how close we can
get!

------
cevans01
The trick of switching the implementation of owning_ptr and constraint_ptr
based on a compile flag is very neat.

Is there any risk of the compile flag influencing which object owns the
reference, and therefore causing a kind of "heisenbug" where it doesn't crash
during the safe mode but still has dangling pointers in the fast mode?

~~~
verdagon
Thanks! We stand on the shoulders of giants, this method has been in use in
the wild for a while, and Gel introduced it back in 2007.

Behavior will be the same in all three modes. There is, however, a chance that
testing and development didn't cover a certain code path, and we would trigger
unsafety in production, similar to unsafe blocks in Rust. When one encounters
unsafety in Vale, they'll be able to just re-run in Normal Mode to instantly
identify what caused it.

Normal Mode is very conservative (it halts early, when a constraint ref
becomes dangling, rather than when it's dereferenced), so combined with test
coverage, it can give high confidence in safety, and is strictly safer than
even C++ with ASan.

If that's not enough, Resilient Mode has zero unsafety, and with the
optimizations we'll be using (nonatomic RC, Lobster's algorithm, immutable
region borrowing, bump calling, etc.) it should be incredibly fast in
practice, possibly on par with Rust and C++, and, with bump calling,
exceeding them in certain cases.

------
sagarm
> Safe Handling of Aliases

Er, why wouldn't you just use automatic storage duration? That would look even
simpler than the Vale implementation and have the same benefits:

    
    
      class BigClass {
        A a;
        B b{&a};
        C c{&a};
        D d{&a, &c};
      };
    

You could of course explicitly write the constructor if you choose.

> Destructor Parameters

This can be done in C++: both unique and shared ptrs can take a deleter
instance. So for the rollback example:

    
    
      std::unique_ptr<Transaction, Transaction::RollbackWithMode> tx(
        new Transaction(), Transaction::RollbackWithMode(mode));
    

But my question is -- why? At this point you are losing one of the main
benefits of RAII IMO: scheduling cleanup work right next to initialization
work, allowing you to assume it is taken care of as you read the rest of the
function:

    
    
      Transaction tx(conn, Transaction::ROLLBACK_MODE_TUMBLE);
      ...
      if (ok) { tx.commit(); }
      return;
    

Similarly with the Future example -- IMO this is far clearer:

    
    
      void Producer(Future<T> future) {
        auto auto_reject = AutoRun([&] {
          if (!future.ready()) future.reject();
        });
        ...code, including potentially many branches....
      }
    

where AutoRun is something simple like

    
    
      template <typename Functor>
      auto AutoRun(Functor f) {
        class AutoRunner {
         public:
          AutoRunner(Functor f) : f_(f) {}
          ~AutoRunner() { f_(); }
         private:
          Functor f_;
        };
        return AutoRunner(f);
      }

~~~
verdagon
Great question! We could also use a deleter, set up when we create the object,
but thats often too early to know what parameters to pass into the destructor.

In some cases, you do know the destruction parameters up-front, and there I
would agree that a deleter is great.

Sometimes you can't know the parameters up front, so can't easily get them
into the deleter when you initialize. The call-once std::function and the
future are good examples: These both take parameters which you might not know
until later on. If you already know the values of a future when you're
creating it, it probably wouldn't be a future.

    
    
      future.~resolve(resultOfSomeAsynchronousCalculation);
    

------
saurik
The thing this is doing with "named deconstructors" seems to essentially be a
form of linear typing, which is already well-modeled. And while the article
claims this is "incompatible" with exceptions, I don't think that is true: it
only means certain kinds of objects can't be on the stack if an exception were
to occur without a "default" deconstructor (similar to a default constructor).
Being able to get that scenario as a compile error is actually "cool" (as you
can always model exception handling as just an automated expansion of an
error propagation monad, at which point the code either would have worked
before or it wouldn't have), so I wouldn't throw the baby out with the bath
water so soon.

~~~
verdagon
I'm not familiar with that particular flavor of monads; is that kind of like
how Javascript's Promise has a .then() and a .catch()?

Also, someone mentioned yesterday that this is a form of linear typing, since
everything must be destructed exactly once. But we would still run into a
problem if we _did_ make any objects with no zero-arg destructor...

...except now that I'm typing this out, it occurs to me that we could just use
the "Bailing Past Destructors" [1] rule: hold the error, call the
appropriate destructor, and continue returning the error upwards. (Vale would
use a Result type, not exceptions)
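
A rough Rust-flavored sketch of that rule (the types and names are mine, not
from the article): hold the error, run the right named destructor with its
arguments, then keep returning the error upwards.

      struct Transaction;
      
      impl Transaction {
          fn commit(self) -> Result<(), String> { Ok(()) } // named destructor
          fn rollback(self, reason: &str) {                // named destructor
              println!("rolled back: {reason}");           // with a parameter
          }
      }
      
      fn step() -> Result<i32, String> { Err("disk full".into()) }
      
      fn do_work(tx: Transaction) -> Result<i32, String> {
          match step() {
              Ok(v) => { tx.commit()?; Ok(v) }
              Err(e) => {
                  tx.rollback(&e); // destruct with the right arguments...
                  Err(e)           // ...then continue bailing upwards
              }
          }
      }
      
      fn main() {
          let _ = do_work(Transaction);
      }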

Wow, the solution was there all along, and I never thought to retrofit it to
C++. Perhaps I'll revise the article tonight to take out the mention of
exceptions =)

[1] "Vale: The Interesting Parts",
[https://docs.google.com/document/d/1t0zzW0K9jilbCkuAulZfDlZc...](https://docs.google.com/document/d/1t0zzW0K9jilbCkuAulZfDlZcYioRO10Ny5K7LEciyMI)
(fair warning, written for language enthusiasts!)

