
Ownership and Borrowing in D - systems
https://dlang.org/blog/2019/07/15/ownership-and-borrowing-in-d/
======
WalterBright
I've been working on a method for adding ownership and borrowing to the D
programming language, and people kept asking me to explain how it works, so I
wrote this article about it. I had originally thought it would be impossible
to add this to D, but the more I thought about it, the more tractable the
problem became.

~~~
abainbridge
Nice article as always. I don't think I fully grok this stuff (in other
languages I've read about too). Perhaps you can help me have some clarity of
thought.

1\. What should I do when I want to make a data structure like a doubly linked
list or a graph - ie one where multiple pointers point to the same object?

2\. Memory is only one kind of resource where ownership needs to be tracked.
Others include file handles, network sockets etc. Can you comment on whether
the OB mechanism solves the problem in general, or is mainly for memory
tracking?

~~~
YawningAngel
You could make the entire thing `const` and operate in a purely functional
fashion - though this approach might be absurd in the context of systems
programming (I can't claim to know, as I've no background in the area
whatsoever).

~~~
twic
How do you make an immutable doubly linked list?

~~~
WalterBright
Build it, then cast it to immutable in @system annotated code.

------
logicchains
If D manages to implement sound borrow checking this will be huge! Brings
it a step closer to matching Rust's feature list.

\- Zero-cost abstractions: Yes!

\- Move semantics: Yes!

\- Guaranteed memory safety: Yes! (in code marked with the appropriate
annotations)

\- Threads without data races: Yes!

\- Trait-based generics: No!

\- Pattern matching: No! (Although like C++, it can be done as a library:
[https://code.dlang.org/packages/sumtype](https://code.dlang.org/packages/sumtype))

\- Type inference: Yes! (Plus unlike Rust, you can also mark function return
types auto, and let them be inferred too).

\- Minimal runtime: Yes! (the GC could be disabled if using exclusively
borrow-checked code)

\- Efficient C bindings: Yes!

D also brings some features of its own to the table that Rust doesn't yet
match.

\- Higher-kinded types (template template params): it's possible to implement
Haskell-style monads etc. in D, if for some reason you felt the need.

\- Pure functions: these allow you to statically guarantee that a piece of
code doesn't perform any IO and is a completely deterministic function of its
inputs. This makes reasoning about a large codebase much easier, compared to
Rust where there's no way to statically guarantee a function isn't doing any
IO.

\- Fast compile times: the reference DMD compiler is almost as fast as the Go
compiler, at least for code that doesn't do a bunch of compile-time
calculation.

\- Compile-time compute: Rust now has const fn, but that still only supports a
limited subset of the language. D allows almost the entire language to be used
at compile time.

\- Variadic templates: Something C++ users might miss when coming to Rust,
they allow for the creation of things like tensors of arbitrary dimension
(myTensor<6, 3, 2, 5, 6, 20>) on which operations are checked at compile time
to ensure sizes are compatible (so it's a compiler error if you try to
multiply two tensors of the wrong size or wrong number of dimensions).

~~~
coder543
The problem with D is that it is so non-uniform. People talk about the betterC
mode, but far from all existing D code can run in this mode. Now people will
talk about this ownership mode, but very little code will really use it. To
become an expert in D or C++, you really need to understand what feels like a
dozen different languages... and that’s not what I’m looking for in a single
language.

D’s concept of ownership (for now) doesn’t seem to have any plan to support
partially moved structs, which are a major ergonomic feature.

D has organically grown just about every interesting feature ever conceived,
but I would much rather use a purpose-built language, which is what Rust feels
like.

I don’t want a C++ that’s better at being C++ than C++ is. I really enjoy how
consistent and expressive Rust is. I also appreciate how intentional Go is
about being Go.

A lot of people enjoy C++, and some people enjoy D. I enjoyed C++ back when
Rust wasn’t an option... then something better came along. D and C++ try to be
everything to everyone. I like D better than C++... but I have seen no
compelling reason to trade Rust for D.

Variadic templates are much better expressed by either tuples or const
generics. A fixed length array of a constant generic length is the Rustic
solution to tensor-like problems, and it is being rigorously developed.
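
A rough sketch of that const-generics approach, restricted to two dimensions for brevity. The `Matrix` type and its `mul` method are hypothetical names made up for illustration, not part of any library:

```rust
// Hypothetical matrix type: both dimensions are part of the type.
#[derive(Debug, PartialEq)]
struct Matrix<const R: usize, const C: usize> {
    data: [[i64; C]; R],
}

impl<const R: usize, const K: usize> Matrix<R, K> {
    // An R x K matrix times a K x C matrix gives an R x C matrix.
    // The shared inner dimension K is enforced by the type checker.
    fn mul<const C: usize>(&self, rhs: &Matrix<K, C>) -> Matrix<R, C> {
        let mut out = [[0i64; C]; R];
        for i in 0..R {
            for j in 0..C {
                for k in 0..K {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: out }
    }
}
```

With `a: Matrix<2, 3>` and `b: Matrix<3, 1>`, `a.mul(&b)` type-checks as a `Matrix<2, 1>`, while `a.mul(&a)` is rejected at compile time because the inner dimensions differ.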

Arbitrary compile time computation should be separated from the body of the
program. If you need to download files or read the file tree at compile time,
that should happen in a build script. Rust provides build.rs for that.
Precomputing values without I/O is conceptually just an advanced kind of
compiler optimization, and that’s why Rust is basically seeking to make const
fns only pure functions.

The fast compile times are a double edged sword, since those binaries are
noticeably slower than Rust binaries, from what I’ve seen. I hope that Rust
will one day have a much faster compiler for development builds.

I also think every sign is showing that Rust is starting to get some real
traction, so I don’t think I’m alone in these opinions... but they are mostly
just that: opinions. You’re welcome to your own.

~~~
Scarbutt
If you don't mind, I'm curious what "partially moved structs" are.
Structs have always felt like a chore to me, complicating things more when
compared to maps in dynamic langs. Maybe Rust has something for dealing with
the "explosion of types" issue and dealing with partial information in structs
that I don't know about. Skimming through the Rust book, it looks like Option<T>
is the solution to missing data in structs.

~~~
coder543
Partially moved just means that the compiler tracks which fields have been
“moved out” of the struct and prevents you from either accessing those fields
or attempting to reuse the partially moved struct as if it were still a whole
struct, since it is no longer whole... it has given up some of its members,
conceptually.
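
A minimal Rust sketch of what that looks like in practice. The `Request` struct and `split` function are made-up names for illustration:

```rust
struct Request {
    url: String,
    body: Vec<u8>,
}

// Moves `url` out of the struct, then keeps using the field that was not moved.
fn split(req: Request) -> (String, usize) {
    let url = req.url; // `req.url` is moved out; `req` is now partially moved

    // Both of these would be rejected by the compiler at this point:
    // let whole = req;          // `req` is no longer a whole struct
    // println!("{}", req.url);  // `req.url` has already been moved

    let body_len = req.body.len(); // `req.body` was never moved, so this is fine
    (url, body_len)
}
```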

In the article linked by this discussion, the author specifically mentions how
fields can’t be moved out in D’s implementation.

Option is the correct approach for dealing with information that might be
missing, but that’s not really related to partially moved structs.

Structs document what is available in a given value, to both the compiler and
to you. If that’s not convincing, then I’m not going to try more here... it’s
off topic. But I will say you _can_ use maps in Rust or other statically typed
languages. They’re just not meant to be an alternative to well-defined
structs.

~~~
didibus
Can Rust now handle partial move semantics on structs? I remember reading it
couldn't do it before, and there were threads about how you should make small
structs because of that, or that you should not use methods, because they take
ownership of the full struct and not just the fields that they actually use,
etc.

~~~
littlestymaar
It can, with the big caveat that it only works well if you don't cross a
function boundary.

~~~
didibus
Ah, so passing a struct as an argument to a function will still take ownership
or borrow the entire struct?
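
A small Rust sketch of that caveat, as far as borrows of fields go. `Stats`, `log_mut` and `record` are hypothetical names:

```rust
struct Stats {
    hits: u32,
    log: Vec<String>,
}

impl Stats {
    // The signature borrows all of `self`, even though the body only touches `log`.
    fn log_mut(&mut self) -> &mut Vec<String> {
        &mut self.log
    }
}

fn record(s: &mut Stats) -> u32 {
    // Inside one function the checker can see that these two borrows are disjoint.
    let hits = &mut s.hits;
    let log = &mut s.log;
    *hits += 1;
    log.push(format!("hit {}", *hits));

    // Across a function boundary only the signature counts, so this is rejected:
    // let hits = &mut s.hits;
    // let log = s.log_mut(); // needs all of `*s`, but `s.hits` is still borrowed
    // *hits += 1;

    // Fine here, because the field borrows above are no longer in use.
    s.log_mut().push(String::from("done"));
    s.hits
}
```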

------
Animats
Oh, nice. It's hopeless for C/C++ for legacy reasons, but D - that could work.

Two memory safety related things to consider. Rust needs unsafe code to do
these things.

1) Backpointers. Rust's ownership system doesn't allow backpointers, at least
not without reference counts. So some standard data structures, such as doubly
linked lists and various types of trees, can't be coded in safe Rust.

A backpointer is defined by two objects which have an invariant locking them
together. If you could talk about a backpointer in the language, and tell the
compiler where its forward pointer is, you could enforce the necessary
invariants. If A contains a pointer to B, B's backpointer must point to A. Any
code which modifies either of those pointers must maintain the consistency of
that relationship. That's checkable at compile time.

This fits well with ownership. A owns B. B's backpointer is a slave. All
that's needed is type syntax which says "backpointer b.reff always points to
my owner."

There are lots of interesting cases to be worked out, especially around
destructors. Those are worth checking at compile time, because people get them
wrong. It's really nice if you can be sure that a data structure always tears
itself down cleanly when you drop the link to its root.

2) Partial initialization. Collection classes which allow growth are hard to
do in safe code, because they require arrays which are partially initialized.
It should be possible to talk about partially initialized arrays in the
language. Maybe some predicate such as "valid(tab,n,m)" indicating that array
tab is valid from n to m inclusive. You can't access an element outside the
n..m range. The checker for this needs to know some simple theorems, like
"valid(tab,n,m) and valid(tab,m+1,m+1) implies valid(tab,n,m+1)". Then it can
track through a loop and verify that the array is initialized for the given
range.

Somebody will complain that they want sparsely initialized arrays containing
types that really need initialization. Tell them no.
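
For comparison, here is roughly how the partially initialized case tends to look in Rust today: the "valid range" is an invariant kept by hand, and reading a slot needs `unsafe`. `FixedBuf` is a hypothetical type for illustration, and it tracks a half-open range `0..len` rather than the inclusive `n..m` above:

```rust
use std::mem::MaybeUninit;

// Hypothetical fixed-capacity buffer. Invariant: slots 0..len are initialized.
struct FixedBuf<const N: usize> {
    slots: [MaybeUninit<u64>; N],
    len: usize,
}

impl<const N: usize> FixedBuf<N> {
    fn new() -> Self {
        FixedBuf { slots: [MaybeUninit::uninit(); N], len: 0 }
    }

    // Extends the initialized range by one slot; returns false when full.
    fn push(&mut self, value: u64) -> bool {
        if self.len == N {
            return false;
        }
        self.slots[self.len].write(value);
        self.len += 1;
        true
    }

    // Reads are only allowed inside the initialized range.
    fn get(&self, i: usize) -> Option<u64> {
        if i < self.len {
            // SAFETY: i < len, and every slot below len was written by `push`.
            Some(unsafe { self.slots[i].assume_init() })
        } else {
            None
        }
    }
}
```

The compiler does not verify the invariant in the SAFETY comment; a `valid(tab,n,m)` predicate would be a way of stating it so that it could be checked.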

~~~
om2
Backpointers can be more than one step away, for example with a circular
linked list, or with a directed (not-acyclic) graph. It's harder to see how to
enforce these more complex invariants at compile time (but maybe not
impossible).

~~~
Animats
That's where the language designer has to say "no". Trying to make ownership
semantics work for arbitrary graphs is too much trouble.

I had to deal with this once writing a 3D collision detection system. Objects
were convex hulls, with vertices, edges, and faces, all pointing to each
other. I had an implementation in C to look at, and it was a mess. Ownership
was so tangled that they couldn't delete objects.

So I rewrote it in C++, with each object having collections of vertices,
edges, and faces. All links between them were indices, not pointers. Lots of
asserts to validate all the consistency rules. Easy to delete, and much better
cache coherence. Chasing pointers to tiny objects all over memory belongs to
the era when cache misses didn't dominate performance.

So that's the answer. Some data structures are too complex for ownership
semantics. So do them another way and have all the parts be owned by some
master object.
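
A minimal sketch of the index-based approach, written in Rust rather than C++ and applied to a doubly linked list instead of convex hulls. All names here are invented for illustration:

```rust
// Links are indices into `List::nodes`, not pointers.
struct Node {
    value: i32,
    prev: Option<usize>,
    next: Option<usize>,
}

// The master object: it owns every node.
struct List {
    nodes: Vec<Node>,
    head: Option<usize>,
    tail: Option<usize>,
}

impl List {
    fn new() -> Self {
        List { nodes: Vec::new(), head: None, tail: None }
    }

    fn push_back(&mut self, value: i32) -> usize {
        let id = self.nodes.len();
        self.nodes.push(Node { value, prev: self.tail, next: None });
        match self.tail {
            Some(t) => self.nodes[t].next = Some(id),
            None => self.head = Some(id),
        }
        self.tail = Some(id);
        id
    }

    // Runtime check of the backpointer rule: if A.next is B, then B.prev is A.
    fn check(&self) {
        for (i, n) in self.nodes.iter().enumerate() {
            if let Some(nx) = n.next {
                assert_eq!(self.nodes[nx].prev, Some(i));
            }
        }
    }

    // Walks the back links from the tail and collects the values.
    fn backward(&self) -> Vec<i32> {
        let mut out = Vec::new();
        let mut cur = self.tail;
        while let Some(i) = cur {
            out.push(self.nodes[i].value);
            cur = self.nodes[i].prev;
        }
        out
    }
}
```

Dropping the `List` frees every node at once, and no unsafe code is needed. The cost is that a stale index is a logic bug the compiler will not catch, hence the asserts.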

------
sriram_malhar
The borrowing scheme works for pointers that are passed in to functions, but
not ones returned from functions. As long as you don't go against the grain of
the scope, you are fine. It is a natural and simple restriction that can be
addressed by local data flow analysis.

The problem, as I see it, is going against the grain of the scope, a problem
faced by iterators. For example, a loop like

    for (p = f(s); p; p = p.next) {}

won't be allowed, since f may not be able to return a borrowed pointer (like
s.ptr). Rust solves this using parametric lifetimes.
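
A sketch of how that loop might be spelled in Rust, assuming a simple singly linked `Store` (all type and function names here are hypothetical):

```rust
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

struct Store {
    head: Option<Box<Node>>,
}

// Plays the role of f(s): the returned reference borrows from `s`,
// and the lifetime parameter 'a records that in the signature.
fn first<'a>(s: &'a Store) -> Option<&'a Node> {
    s.head.as_deref()
}

// Rough equivalent of the for loop: follow `next` until it runs out.
fn sum(s: &Store) -> i32 {
    let mut total = 0;
    let mut p = first(s);
    while let Some(node) = p {
        total += node.value;
        p = node.next.as_deref();
    }
    total
}
```

This version uses shared references for brevity; the same signature shape applies with `&'a mut` when the nodes must be mutated through the returned pointer.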

In general, I am saying that returning a non-owned pointer from a function is
common, and required. Am I wrong in this assumption? If not, is there an
alternate architecture to get around this problem? Note that marking the
pointer as const is not an option.

~~~
WalterBright
The solution is to have the function not return a pointer but return a ref. A
ref is a non-owned pointer.

~~~
e12e
Would this be similar to passing argument(s) and return pointers/variables to
a C function - or similar with assembler, reserving some registers for return
values (aka caller manages memory)?

But with the D compiler helping enforce correctness?

~~~
WalterBright
Not sure what you mean, but sounds like it is.

------
jstimpfle
> I’ve been using malloc and free for 35 years, and through bitter and endless
> experience rarely make a mistake with them anymore. But that’s not the sort
> of thing a programming shop can rely on, and note I said “rarely” and not
> “never”.

Speaking as someone who often chooses the bitter path (because that's how I
like it), this is an honest look at the state of things that I rarely find in
articles about programming.

------
twic
This seems to be all about heap-allocated objects. Can the ownership machinery
reason about lifetimes of stack-allocated objects at all?

For example, i would like to declare a buffer on the stack, read some data
into it, make const pointers into the buffer, put those pointers in heap-
allocated structs, put those structs in a vector, and do some manipulation. As
long as all the heap-allocated structs expire before i return from the
function where the buffer was declared, this is entirely safe, but if they
escape, i'm in trouble.

~~~
WalterBright
> Can the ownership machinery reason about lifetimes of stack-allocated
> objects at all?

It already does. If you compile with the -dip1000 switch, you'll be unable to
escape references to the stack for @safe code.

------
azakai
Very interesting!

One of the key points here:

> This means that [Ownership/Borrowing] can be added to D code incrementally,
> as needed, and as time and resources permit. It becomes possible to add OB
> while, and this is critical, keeping your project in a fully functioning,
> tested, and releasable state. It’s mechanically auditable how much of the
> project is memory safe in this manner.

------
pcwalton
Does this mean that you can't take unique/mutable references to the inside of
reference-counted objects? Is there no escape hatch, like Rust's RefCell?

~~~
WalterBright
You can, it's just that the RC object itself cannot be implemented with the
@live checking on. The RC object can then present itself with a @live
compatible interface.

~~~
qznc
What does that look like? Do I annotate it as @undead?

~~~
WalterBright
While I like your style, in D we do it with @trusted annotations.

------
jbb123
I think this looks very good. The main thing though is that it is opt-in where
you want it. If it became the default, or required, then it would make the
language difficult and unsuitable for many things, but as an option it's great
:)

------
beeforpork
Is the proposed embedding of ownership/borrowing into D complete? Rust has
these ugly lifetime annotations, which are one of the reasons I do not like its
syntax. Why is there nothing like this in the proposal for D? Or why is it not
needed?

~~~
pjmlp
Most languages with some form of GC (Swift, D, C#) that are looking into
adopting some form of ownership look at it from the productivity point of
view, where you would only make use of it in the hot path, the very last mile
so to speak.

------
twic
So, if i have a mutable pointer, i can make multiple const pointers which
borrow from it - but i'm not allowed to use the mutable pointer while those
borrows exist. Makes sense.

Can any of the const pointers outlive the original mutable pointer?

I would guess not, because you're not allowed to let an owning pointer fall
out of scope, and you can't pass the owning pointer to a freeing function
while the const pointers are alive.

So is there any way to start with only a mutable pointer, and end up with only
a const pointer? Say i am gradually building up some object, i want to use a
mutable pointer to do that, but once i'm done, i want it effectively frozen,
with only a const pointer to it.

~~~
irishsultan
The article says that const pointers can't release their memory, so that would
lead to memory leaks. (Obviously in some cases you might want to keep that
const pointer until the program is finished, but it's not a generic solution
to the "I want a pointer that can't be mutated" problem, because often you do
want to drop the value at some point.)

~~~
twic
That's a very good point!

I was coming from a Rust mindset, where ownership and mutability are somewhat
separated. Although you can't own an object through an immutable reference,
you can own one through an immutable binding. But i am slowly learning that
D's model is sufficiently different that there isn't a useful comparison here.
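
For reference, the Rust pattern meant here is roughly the following (`build` is just an illustrative name):

```rust
fn build() -> Vec<u32> {
    let mut v = Vec::new(); // mutable binding while the value is being built
    for i in 1..=3 {
        v.push(i * 10);
    }

    let v = v; // rebind immutably: still owned, but frozen from here on
    // v.push(40); // rejected: `v` is not declared as mutable

    v // ownership is intact, so the value can be returned or dropped as usual
}
```

The immutable binding still owns the vector, so it is freed normally when the owner goes out of scope.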

