
Simple, Fast and Safe Manual Memory Management [pdf] - dmmalam
https://www.microsoft.com/en-us/research/wp-content/uploads/2017/03/kedia2017mem.pdf
======
Animats
That's very clever.

It has most of the problems of garbage collectors that need to know object
layout. "(void *)" will break this.

They don't mention I/O. If you have an I/O operation in progress, its buffer
areas need to be locked. But the object-moving operation is lazy, so that's
not too bad. You have to avoid getting into a situation where object-moving
gets started, clashes with a pending I/O operation, and has to block. But this
is Microsoft working on their own runtime, so they can fix I/O if needed.

------
7197ghr918hf
> We guarantee type and memory safety by introducing a new exception
> (DanglingReferenceException) with the following semantics: a dereference to
> a deleted object will either succeed _as if that very object was not yet
> deleted_ , or result in a DanglingReferenceException.
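
A toy model may make the quoted semantics easier to picture. The sketch below is not the paper's allocator. `ToySlot`, `mark_deleted`, and `reclaim` are hypothetical names, and the "reclaimed" flag only stands in for whatever point the real runtime stops keeping the memory readable.

```cpp
#include <cassert>
#include <stdexcept>

// Named after the exception in the quote; the class body is an assumption.
struct DanglingReferenceException : std::runtime_error {
    DanglingReferenceException() : std::runtime_error("dangling reference") {}
};

// Hypothetical toy slot: "deleted" and "reclaimed" are separate states.
class ToySlot {
public:
    explicit ToySlot(int v) : value_(v) {}

    // The program deletes the object; the memory is not reclaimed yet.
    void mark_deleted() { deleted_ = true; }

    // Stands in for the allocator reclaiming the memory at some later point.
    void reclaim() {
        if (deleted_) reclaimed_ = true;
    }

    // Either succeeds as if the object was never deleted, or throws.
    int read() const {
        if (reclaimed_) throw DanglingReferenceException();
        return value_;
    }

private:
    int value_;
    bool deleted_ = false;
    bool reclaimed_ = false;
};
```

In this model a read between `mark_deleted()` and `reclaim()` still returns the old value, and a read after `reclaim()` throws. There is no third outcome where the read returns some other object's data.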

I would worry about what might happen in a variable-sized object
reuse/freelist type scheme combined with this allocator. A dangling reference
might not contain garbage that allows an attacker to control program flow,
while still containing data that is exploitable in other ways.

So this is not an entirely safe way to do things. Arguably it would be worth
the speedup. But many of the techniques for making this approach safe . . .

> such exceptions can be detected with the combination of rigorous testing and
> support in the allocator for a debug mode that enforces stronger semantics
> (i.e. exceptions on every dereference to a deleted object) at a higher cost

. . . can also be used to test C++. For example, we use
[https://github.com/google/sanitizers](https://github.com/google/sanitizers) a
lot at the office to detect these sorts of errors.

Still, an interesting and clever result. Nicely done!

------
egnehots
Albeit more complex, the main advantage of Rust- and Cyclone-like solutions is
that their guarantees are checked at compile time.

For a critical application I vastly prefer to get these annoying compilation
errors rather than the illusion of productivity now and then getting runtime
exceptions in a shipped product.

Be happy to fail as soon as possible.

~~~
falcolas
The truth of the matter is, so long as you're dealing with non-deterministic
system states or unsafe regions of code, you will always be prone to failing
at runtime. Incorrect programmer assumptions, out of memory errors (which are
becoming more common with memory-constrained containers), parsing implemented
using an ok-or-panic idiom, attempting to read a file you have no permissions
to access, having to read from volatile memory regions, cosmic rays flipping
memory bits... all can cause even Rust programs to fail at runtime.

Runtime failure is a reality of every program, best to account for it.

~~~
oconnor663
> all can cause even Rust programs to fail at runtime

For sure, a safe Rust program can fail in a dozen different ways. My favorite
is just creating an unsigned 0 and trying to subtract 1 from it -- in debug
mode that'll panic and probably crash the whole program. But unless you touch
the `unsafe` keyword, it should be impossible to fail in a way that causes
_undefined behavior_. That's the big difference.

Edit: You're right, I missed the point :| :| :|

~~~
falcolas
The whole article is about preventing undefined behavior WRT memory allocation
and deallocation, so that's kind of beside the point.

------
millstone
One reaction, with the caveat that I may have misunderstood some or all of
this:

One of the great advantages of manual memory management is that it does not
require a runtime that can identify all live references. A pointer may be cast
to an int, NaN-boxed, etc. - no problem.
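
For readers who have not met NaN-boxing, here is a minimal sketch of hiding a pointer inside a `double`. It assumes a 64-bit platform where user-space addresses fit in the low 48 bits. The names `box_pointer`, `unbox_pointer`, `kQuietNaN`, and `kPayloadMask` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Exponent all ones plus the quiet bit: any payload below still reads as NaN.
constexpr std::uint64_t kQuietNaN = 0x7FF8000000000000ULL;
// Assumption: addresses fit in the low 48 bits.
constexpr std::uint64_t kPayloadMask = 0x0000FFFFFFFFFFFFULL;

// Pack the pointer bits into the NaN payload.
double box_pointer(void* p) {
    std::uint64_t raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & ~kPayloadMask) == 0 && "pointer does not fit in 48 bits");
    std::uint64_t bits = kQuietNaN | raw;
    double d;
    std::memcpy(&d, &bits, sizeof d);  // bit copy, avoids aliasing problems
    return d;
}

// Recover the pointer from the payload bits.
void* unbox_pointer(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return reinterpret_cast<void*>(
        static_cast<std::uintptr_t>(bits & kPayloadMask));
}
```

To anything that scans memory or registers for references, the boxed value is just a floating-point number, so a runtime that relies on finding every reference cannot see it.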

In this scheme the "promoting live objects" phase works by copying objects to
a new allocation. Crucially, this avoids the ABA problem by always walking
forwards in memory. Second, it does not attempt to eagerly update all
references - instead it permits references to become dangling. So let's say we
have a dangling NaN-boxed pointer.

Now later, our NaN-boxed dangler is unpacked and dereferenced. This causes a
SIGSEGV since the page is no longer readable. The signal handler notices it
and attempts to fix it up.

IMO this is where things go off the rails:

> If the object was promoted, the handler scans all registers and the current
> stack frame, and patches all stale references to promoted objects.

We were so close, but now the runtime needs to be able to distinguish
references to promoted objects from values that happen to share their address.
We almost got away without stack maps.

> Therefore, we modify the compiler to emit meta-data describing the location
> of heap references in registers and on the stack for every load and store
> instruction (instead of just gc safe points)

Ok I give up. Now we're just building a garbage collector.

This does not help our NaN-boxing example. There's no way to easily inform the
NaN-boxed value that it needs to update; we could hack it by comparing the
register's value after the dereference, but that solution is hard to love.

IMO more interesting would be to modify the compiler to require all
dereferences to be updatable, and propagate that back through to the original
reference. In C++ speak we could imagine this:

    void *dereference(void *&ptr) {...}

and force the clients to deal with the fallout of back-propagating the new
value. It's a lot of work, but the carrot is "pointers cannot dangle and you
don't need a GC and you don't need Rust-style borrow checking" which sounds
pretty rad.
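
One way to picture the by-reference idea is a toy forwarding table. This is only a sketch under stated assumptions, not the paper's mechanism and not a complete design. `ToyHeap`, `forwarded`, and `deleted` are hypothetical, and only a single forwarding hop is handled.

```cpp
#include <cassert>
#include <stdexcept>
#include <unordered_map>
#include <unordered_set>

// Hypothetical toy runtime state, purely for illustration.
struct ToyHeap {
    std::unordered_map<void*, void*> forwarded;  // old address -> promoted copy
    std::unordered_set<void*> deleted;           // addresses of deleted objects

    // Takes the reference by reference so a stale value can be patched in
    // place: the caller's own variable ends up holding the new address.
    void* dereference(void*& ptr) const {
        if (deleted.count(ptr) != 0) {
            throw std::runtime_error("dangling reference");
        }
        auto it = forwarded.find(ptr);
        if (it != forwarded.end()) {
            ptr = it->second;  // back-propagate the promoted address
        }
        return ptr;
    }
};
```

Because the caller hands over the storage location rather than a copy of the value, the runtime never has to guess which registers or stack slots hold references. The cost is that every client, including one holding a NaN-boxed pointer, has to unpack, dereference, and re-pack explicitly.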

Anyways seems interesting but could be bolder?

~~~
snaky
Do we know of any ways yet to get "pointers cannot dangle and you don't need a GC
and you don't need Rust-style borrow checking" other than region-based memory
management (MLton, Cyclone)?

~~~
frankmcsherry
Ownership / affine types?

It's hard to say, because some folks will jump up and down and say "reference
counting is GC!", and then some folks could just as justifiably say "regions
are GC!"

~~~
snaky
Maybe linear types, actually? Affine may not be enough.

~~~
frankmcsherry
I suspect affine is fine, but I'm not an authority. The distinction (as I
understand it) is only that linear types require that you consume all
instances, whereas affine allow you to leak if you want. Leaking doesn't seem
to lead to dangling pointers, but I've been wrong before.

