
Learning Rust with Entirely Too Many Linked Lists - xwvvvvwx
http://cglab.ca/~abeinges/blah/too-many-lists/book/README.html
======
Animats
Rust needs back pointers as a primitive. A back pointer and a forward pointer
are locked in an invariant relationship (A points to B which points back to
A). The borrow checker needs to know about that to check back pointers
properly. Then you could do trees, doubly-linked lists, and various other
graphs safely.

(The two basic constructs that are hard to express safely in Rust are back
pointers and partially initialized arrays. There's also some trouble with
concurrency primitives. See [1])

[1] https://people.mpi-sws.org/~dreyer/papers/rustbelt/paper.pdf

~~~
dbaupp
Do you have any thoughts on how it might even begin to be possible to do back
pointers safely, in ways that don't already exist in Rust?

I find it very confusing that you often complain about Rust's complexity, and
then also complain that it isn't complex enough (i.e. doesn't have the
features required statically reason about complex ownership graphs at compile
time). It's fair that maybe there's complexity budget spent in places that you
think aren't quite right, but I haven't picked up any reasonable plan for
modelling back pointers (from anyone), nor many specific pain points that
could be solved in your suggested way while maintaining the low-level nature
of Rust.

~~~
Animats
_Do you have any thoughts on how it might even begin to be possible to do back
pointers safely, in ways that don't already exist in Rust?_

First, you need to be able to talk about them. A forward ref/back ref pair has
to be identified with some form of declaration at compile time. The forward
pointer is an ordinary owning pointer. The back pointer is special. So add a
variable attribute "back", to be used much like "mut", but only on a
reference field of a struct. Let's see if that's enough.

The invariants to be enforced are:

\- If b.bak points to a, a.fwd must point to b.

\- If a.fwd points to b, b.bak must be Option(None) or point to a.

These can be enforced by some simple update checks when a.fwd or b.bak is
updated. The normal sequence of events is something like:

    a.fwd = &mut b;
    b.bak = &back a;

or, when breaking a link,

    b.bak = Option(None);
    a.fwd = None; // drop ownership of b, which deallocates it.

You have to clear the back reference before dropping the object, to avoid a
dangling pointer situation. The compiler might optimize that out, observing
that b is dead at deallocation.

If you try to change b.bak to anything but Option(None) or a, that's an error.
If you try to change a.fwd while b.bak is not Option(None), that's an error.
This can usually be checked at compile time.

Open questions:

\- Can the compiler figure out which field of a is the forward ref? That's the
ref that owns b. Or is syntax needed in the declaration of the a struct for
that?

\- What about arrays of forward refs?

~~~
dbaupp
You've missed a far more fundamental question: how does this interact with
aliasing XOR mutability? That's the core reason why back pointers are hard:
they result in complicated relationships between things, in that an &mut
pointer to a child may not be independent of its owner. I suspect one would
need annotations for different ways to handle this, like single threaded
checking of mutability (which already exists: Cell and RefCell) or multi-
threaded (Mutex, RwLock), essentially meaning that this would be inserting
into the language things that don't need to be there.

In any case, even switching to the correct Option::None syntax, your example
doesn't actually make sense: &mut isn't an owning pointer, so it doesn't make
sense to try to construct an owning tree with it. Maybe something like

    a.fwd = Some(b);
    a.fwd.as_mut().unwrap().bak = Some(&back a);

is what you're trying to get at? In either example, the mutability question
appears: in yours, b is mutated while an outstanding &mut borrow exists, and
in mine, a is mutated while an outstanding &back borrow exists.

This all seems like it might be possible to solve for some special cases, but
pointer aliasing is one of the hardest parts of static analysis. It's
essentially impossible to do in the general case, and my intuition is that
back pointers get you very close to the general case. It's not even obvious to
me how one can get a binary tree with back pointers to work well.

 _> The compiler might optimize that out, observing that b is dead at
deallocation._

Only if b doesn't have a destructor (or has a destructor that can be
sufficiently inlined).

~~~
Animats
_That's the core reason why back pointers are hard: they result in
complicated relationships between things._

That's why they need built-in checking, rather than hacks using "unsafe". Use
of back references is going to have to be restricted so you can't get two
mutable handles to the same object. The trick is making that work through
restrictions which can be checked locally.

~~~
dbaupp
That's not my point. A general built-in checking scheme that handles all the
variations of linked lists and trees will likely have to have quite a few knobs
to tweak for the various trade-offs one might want to make, and will likely
have to be very general to handle all the variations, making that general
infrastructure significantly more complicated (both to implement in the unsafe
block that is the compiler, and, likely, for users to understand) than focused
`unsafe` blocks, which can, and often are, packaged up into safe interfaces
(like reference counting with weak pointers).

 _> Use of back references is going to have to be restricted so you can't get
two mutable handles to the same object. The trick is making that work through
restrictions which can be checked locally._

Talk is cheap, and I'm not sure this talk actually says anything of note.
Those "ideas" seem to be the most obvious first considerations for working out
how this feature might even work:

\- disallowing two mutable handles is the fundamental rule of Rust,

\- the ability to check locally is a strong convention for programming
language implementations, and a _very_ strong one for Rust.

Having _some_ concrete idea that addresses just those two points (no need to
worry about syntax or anything like that) would be an improvement on the
current situation: I haven't seen any for this other than "runtime checking"
(already exists in Rust) and "no back references".

\---

In other words, vague assertions like "Rust should support back pointers" with
syntax ideas are jumping way ahead. The language typically tries to provide
building blocks to allow things to be built in libraries, rather than lumping
opinionated features into the language. These features will often then require
unsafe to create safe abstractions, but this isn't bad: the compiler itself is
essentially one large unsafe block.

The place to start would be trying to create a safe back-pointer interface in
a library, and seeing if there's anything that is too hard to make truly safe.
This has in fact already been done, with Rc/Weak and Arc/Weak, but for zero-
overhead/non-reference-counted scenarios, I think a good place to start would
be a smart pointer pair like Rc but with a constructor `fn make<T>(val: T) ->
(Forward<T>, Back<T>)`.

~~~
Animats
It's hard to make compile-time checks for consistency between two different
places in the code using a library.

------
gradschool
An old school c programmer wants to know if there's a Rust idiom for
recovering gracefully from failed allocations (i.e., when malloc returns
NULL). Otherwise, what good is a type safe program that crashes due to a heap
overflow? If this is the wrong kind of question, I'm listening.

~~~
bryanlarsen
Where are you using this? Embedded? By default Linux is set to overcommit
memory, so malloc always succeeds even if you're out of memory.

~~~
SeanDav
> _"By default Linux is set to overcommit memory, so malloc always succeeds
> even if you're out of memory."_

How does Linux handle 1 trillion mallocs of 1MB, on an 8GB system with a
500GB hard drive, without failing?

There is nothing magic about Linux, it still cannot allocate and use memory
which physically and virtually is simply not there.

~~~
steveklabnik
When overcommit is set, it's not malloc that fails; the OS kills your process.
So, checking the result of your allocations will never let you handle this
kind of error.

~~~
rcxdude
Though if the allocation is ludicrously oversized (not sure the threshold, but
2x present memory will do it) it will just straight up fail without over-
committing.

------
emerged
I feel that Rust is something of a religion at this point. Curious how many
downvotes I might accrue for expressing that perspective. Might be wrong, but
my sense has been that challenges with Rust are written off in an off-hand way
and its features exaggerated.

Glad to see promising new languages pushing progress forward, though. I'm head
high in C/C++/asm for decades so many of the complaints about those languages
fall flat with me since I'm over the hurdles via sheer years of experience.

~~~
mikebenfield
> challenges with Rust are written off in an off-hand way and its features
> exaggerated

I think people are pretty straightforward about Rust's challenges. The borrow
checker can be a hurdle and probably needs more work (although it is getting a
big boost soon with non-lexical lifetimes). Compile times are not great. The
macro system is powerful but can be confusing. People from some mainstream
languages may have trouble wrapping their heads around the type system.

Regarding its features, these attributes:

* it can do low level, fast code;

* it's not garbage collected, but is memory safe;

* it has zero cost abstractions (and, conversely, it _doesn't_ have ridiculous, unnecessary, slow abstractions);

* it doesn't ignore the last thirty years of innovation in programming languages (by which I mean it has a modern type system, closures, variant types, and a few other things);

* it doesn't have 30-40 years of cruft;

* it has a standard and very high quality build/packaging system;

* its developers are thoughtful and have good taste;

* it has a reasonably big community and has a good chance of catching on even more widely;

are huge. You say you think its features are exaggerated, but to my mind,
what's to exaggerate? That list looks pretty great to me.

~~~
posterboy
half of these "features" are your own subjective opinion

~~~
mikebenfield
They are? I'll grant you the one about the taste of the developers. Which
other one is opinion?

~~~
posterboy
after zero cost abstractions you got carried away.

------
derefr
Question: does a hash table with linked-list chaining count as a non-niche
use-case for linked lists, or is the linked list just counted as an
implementation detail rather than a separate data structure in that context?

(Though, Rust's own std::collections::HashMap just does robin-hood hashing
instead of separate chaining, and everyone's just going to use that, so "what
data structure should a separately-chained hash-table use in Rust" is mostly a
moot question.)

------
PaulHoule
Makes Rust look hard.

~~~
steveklabnik
Getting this code right is hard in many languages in Rust's space; Rust just
forces you to get it right, whereas they let you get it wrong.

