
Learn Rust with entirely too many linked lists (2019) - pcr910303
https://rust-unofficial.github.io/too-many-lists/
======
pornel
Learning Rust by writing a linked list is like learning Python by writing a
CPython extension. Sure, you'll learn a lot, and it may even be useful, but
it's not the typical experience of using the language.

I've seen people completely confused that such a simple "CS 101" thing is such a
mess in Rust. But nobody actually writes linked lists in Rust:

• The borrow checker wants to have clear single ownership, and doesn't support
reference cycles. Lists happen to have mixed ownership. I'm not sure if it's
even possible to prove at compile time that a list will never have a cycle,
but it seems at least in the Sufficiently Smart Compiler territory.

• Safe linked lists are in the std lib if you really need them, but std has
also plenty of other containers that are more efficient on modern hardware.

Compile-time safety checks can't prove all valid programs are valid (halting
problem). Instead of going after the enormously complex problem, the borrow
checker rules are actually pretty simple. It's a feature, not a defect.
Simplifying the problem to single ownership with shared-immutable vs
exclusive-mutable makes it much easier to reason about borrow checking. If
something doesn't fit these rules, you either use a different approach, or use
`unsafe`, and then wrap it in a higher-level safe abstraction that fits the
model.
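
To make the "use `unsafe`, then wrap it in a safe abstraction" step concrete, here is a minimal sketch. The function name `split_in_two` is made up for illustration; it mirrors what the standard library's `split_at_mut` does. Handing out two mutable slices into the same buffer does not fit exclusive-mutable borrowing as the checker sees it, so the body uses raw pointers, while the signature stays entirely safe for callers.

```rust
// Hypothetical helper (not from the thread): splits one mutable slice into
// two non-overlapping mutable halves, like the standard `split_at_mut`.
pub fn split_in_two(slice: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = slice.len();
    // Checked up front so the unsafe block below can rely on it.
    assert!(mid <= len, "mid out of bounds");
    let ptr = slice.as_mut_ptr();
    // SAFETY: the two ranges [0, mid) and [mid, len) are in bounds and
    // do not overlap, so no element is reachable through both slices.
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

Callers only ever see two ordinary `&mut [i32]` borrows, so the usual rules apply to them again.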

~~~
peteretep
> But nobody actually writes linked lists in Rust

It feels entirely like you didn't even start reading this. He spends a whole
long first page bitching about Linked Lists and why nobody uses them in Rust.

~~~
cormacrelf
I come back to this from time to time as a reference for doing the uglier
things in Rust. Writing Iterator<Item=&T> with raw pointers, recursive drops,
etc. It's better than reading the std source code, because it's got a guide
right next to the code. I think I'm using it as intended. The title might be
suboptimal/confusing for what it's trying to do. It's less a resource for
newcomers than an illustration of someone bumping their head against the wall
to do hard things.
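
For readers who have not seen those two things, here is a rough sketch of what they look like on a singly linked stack. It is a simplified version in the spirit of the guide, not a copy of it: the iterator uses plain shared references instead of raw pointers, and the type names are only illustrative.

```rust
// A minimal singly linked stack, sketched for illustration only.
pub struct List<T> {
    head: Option<Box<Node<T>>>,
}

struct Node<T> {
    elem: T,
    next: Option<Box<Node<T>>>,
}

impl<T> List<T> {
    pub fn new() -> Self {
        List { head: None }
    }

    pub fn push(&mut self, elem: T) {
        let next = self.head.take();
        self.head = Some(Box::new(Node { elem, next }));
    }

    pub fn pop(&mut self) -> Option<T> {
        self.head.take().map(|node| {
            self.head = node.next;
            node.elem
        })
    }

    pub fn iter(&self) -> Iter<'_, T> {
        Iter { next: self.head.as_deref() }
    }
}

// The compiler-generated drop would recurse through every Box<Node>,
// which can overflow the stack on a long list. Unlinking the nodes in a
// loop keeps the drop depth constant.
impl<T> Drop for List<T> {
    fn drop(&mut self) {
        let mut cur = self.head.take();
        while let Some(mut node) = cur {
            cur = node.next.take();
        }
    }
}

// Borrowing iterator: yields &T and walks the list front to back.
pub struct Iter<'a, T> {
    next: Option<&'a Node<T>>,
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<Self::Item> {
        self.next.map(|node| {
            self.next = node.next.as_deref();
            &node.elem
        })
    }
}
```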

------
Animats
Doubly-linked lists are hard with Rust's ownership system. I've argued for
Rust having backpointers as a built-in type the borrow checker understands.
You have to maintain the invariant that if A points to B, B points back to A.
A is an owning pointer, and B is a non-owning pointer locked in a relationship
with A. Easy to check if the language lets you say that's what you're doing.
Rust lacks that.

If you have safe backpointers, most tree-type data structures with
backpointers can be constructed. A nice feature to have.
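
For comparison, this is roughly how the owning-pointer/backpointer pairing is usually approximated today with `Rc` and `Weak`. It is only a sketch with made-up names (`TreeNode`, `add_child`), and it shows the gap being described: the "B points back to A" invariant is upheld by convention and checked at run time, not by the borrow checker.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical tree node: `children` are owning pointers,
// `parent` is a non-owning backpointer.
pub struct TreeNode {
    pub value: i32,
    pub parent: RefCell<Weak<TreeNode>>,
    pub children: RefCell<Vec<Rc<TreeNode>>>,
}

pub fn new_node(value: i32) -> Rc<TreeNode> {
    Rc::new(TreeNode {
        value,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

// Sets both directions together. Nothing in the type system forces
// callers to go through this function, so the invariant is by convention.
pub fn add_child(parent: &Rc<TreeNode>, child: Rc<TreeNode>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(child);
}

// Following the backpointer can fail at run time if the parent is gone.
pub fn parent_value(node: &TreeNode) -> Option<i32> {
    node.parent.borrow().upgrade().map(|p| p.value)
}
```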

~~~
magicalhippo
As a non-Ruster, any particular reason they didn't include this? A lot of the
code I've done over the years involves graphs, which have a lot of circular
pointers and/or backpointers. I find it kinda weird they'd make such a common
thing a PITA in a new language.

~~~
dodobirdlord
Rust's ownership rules aren't for the purpose of making the user's life hard,
they are the least restrictive system that could be devised that allows the
compiler to uphold Rust's safety guarantees. It was not too long ago that it
was common knowledge that a programming language either has a garbage
collector or has manual memory management, or possibly both. Safe Rust has
neither, but not without the effect of making awkward certain kinds of
programming that are fundamentally difficult to make safe.

The Rust question to ask here is: Do you really need _pointers_ ,
specifically? Or would some other reference mechanism work? You could put all
of your graph nodes into a linear data structure like an array, and have them
point to each other by holding a list of indices instead of a list of
pointers. Or you could give all of your nodes unique keys and keep them in a
table, and have them hold references to each other by key. The compiler will
not try to prove the correctness of your graph algorithm, and in the event of
programmer error that leads to dangling references your program will have to
handle the scenario of following an index or a key and not finding a value, so
bugs will not introduce memory unsafety.
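
A minimal sketch of the index-based variant, with hypothetical names (`Graph`, `GraphNode`). Edges are plain `usize` indices into one `Vec`, and a bad index comes back as `None` instead of reading freed memory.

```rust
// Graph nodes live in one Vec and refer to each other by index.
pub struct Graph {
    nodes: Vec<GraphNode>,
}

pub struct GraphNode {
    pub label: String,
    pub edges: Vec<usize>, // indices into Graph::nodes, not pointers
}

impl Graph {
    pub fn new() -> Self {
        Graph { nodes: Vec::new() }
    }

    // Returns the index of the new node.
    pub fn add_node(&mut self, label: &str) -> usize {
        self.nodes.push(GraphNode {
            label: label.to_string(),
            edges: Vec::new(),
        });
        self.nodes.len() - 1
    }

    // Returns false if `from` is out of range. `to` is deliberately not
    // validated here, to show that a dangling index is still harmless.
    pub fn add_edge(&mut self, from: usize, to: usize) -> bool {
        match self.nodes.get_mut(from) {
            Some(node) => {
                node.edges.push(to);
                true
            }
            None => false,
        }
    }

    // A stale or bogus index yields None.
    pub fn node(&self, idx: usize) -> Option<&GraphNode> {
        self.nodes.get(idx)
    }

    // Labels of all reachable neighbours; dangling edges are skipped.
    pub fn neighbor_labels(&self, idx: usize) -> Vec<&str> {
        match self.node(idx) {
            Some(node) => node
                .edges
                .iter()
                .filter_map(|&e| self.node(e))
                .map(|n| n.label.as_str())
                .collect(),
            None => Vec::new(),
        }
    }
}
```

The program still has to decide what a missing node means, but that is a logic bug to handle, not undefined behaviour.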

There's also ongoing work on memory arenas in the nightly compiler, and I
believe some libraries. Putting a graph into an arena is a good way to appease
the compiler, because the entire arena will be freed at the same time,
ensuring that dangling pointers will never exist between graph nodes.

~~~
dmitrygr
> You could put all of your graph nodes into a linear data structure like an
> array, and have them point to each other by holding a list of indices
> instead of a list of pointers

Rust practitioners keep proposing this solution. It is like you have never
heard of caches or do not understand that modern CPUs have multiple special
prefetchers for pointer chasing and no prefetchers for "rust fanatics"

This solution will be SIGNIFICANTLY slower on any modern CPU. By a wide
margin! Since you'll keep having to wait for main memory as the cache
prefetchers have no idea about this insane scheme and will not prefetch the
data for you. They will for actual pointers.

~~~
adito
> This solution will be SIGNIFICANTLY slower on any modern CPU. By a wide
> margin!

I'm not an expert and would like to know more. Can you point to any benchmark
demonstrating those claims? Would love to see the actual numbers.

~~~
fredsanford
Search for ArrayOfStructures or StructureOfArrays to see benchmarks similar to
what you're asking for

------
kybernetikos
I really value this guide.

Writing data structures in a language is one of the standard ways that I
convince myself that I understand it. Rust really messes hard with that
mindset.

This guide talked me through all the things I'm not used to worrying about in
garbage collected languages, showed me the error messages that I'd been
banging my head against, explained what they meant, and did so with humor.

~~~
Igelau
This Gonzo approach of running headlong into a bad idea is refreshing.

------
jasondclinton
I went through this a few years ago for fun. It's a great guided tour of the
compiler errors when working with complex memory safety needs. The most
important thing to know about these is that you would almost certainly never
do these in a real project: lists are in `std` but even then the vast majority
of cases should use `Vec`.

~~~
grok22
I am learning Rust due to its memory safety features and comments like this
rub me the wrong way for some reason -- they give me an uneasy feeling that I
will run into unexpected limitations in the language because these corners are
not exercised enough. Because people who know the language are saying that the
kind of data-structures that I deal with day in and out in my day-to-day work
are not the "preferred" thing to do.

I do system software/networking software and I deal with a lot of linked lists
and trees and these comments make me feel that Rust may not be a good language
for my use-case in spite of me wanting the memory safety features.

~~~
pas
Do you currently use C? C++ perhaps?

I mean basically no other language that currently comes to mind fiddles so
much with the basics.

In Rust, just as in Java, people write highly optimized safe and fast data
structures and others build upon that.

In C/C++ it seems every project reinvents the wheel to a large degree.

Or am I mistaken?

~~~
grok22
I use C. And you are right, in my field there is a tendency to a large degree
to re-invent the wheel of data-structures :-). Sometimes justified, but most
times not. But mostly because C doesn't usually have well-known, industry-
standard libraries for common data-structures. Or even if they do, it's kind
of trivial to implement basic data-structures (not saying they will be bug-
free!) instead of relying on some random distribution of a library from the
Internet.

BTW, most times, the problem is not performance, but tight control of memory
usage.

~~~
johnisgood
Yeah, and there is a reason for why there are not many "industry-standard
libraries for common data-structures" out there in C. I think the reason (or
one of the reasons) is that there are zillions of ways to implement them, and
"one size does not fit all". I often have to ditch the standard library's
implementation of this and that in other languages, too, because they are not
fine-tuned enough for my use case.

That said, there are lots and lots of C libraries installed by default on
Linux distributions (via the distribution's own package manager!) that are
reused by other projects.

------
SAI_Peregrinus
Lots of these comments seem confused about the purpose of this. It's not about
using linked lists (that's easy, they're in std::collections), it's about
implementing them. So the common non-kernel/embedded use cases are well
supported, just use std::collections::LinkedList! But if you're in a no-std
context then you might need this.
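
For the "just use it" case, a short sketch of the standard type in action. The helper names `rotate_left` and `demo` are made up here; the `LinkedList` methods are the ones from `std::collections`.

```rust
use std::collections::LinkedList;

// Moves the front element to the back; both operations are O(1).
pub fn rotate_left(list: &mut LinkedList<i32>) {
    if let Some(front) = list.pop_front() {
        list.push_back(front);
    }
}

pub fn demo() -> Vec<i32> {
    let mut list = LinkedList::new();
    list.push_back(2);
    list.push_back(3);
    list.push_front(1); // list is now 1, 2, 3
    rotate_left(&mut list); // list is now 2, 3, 1
    list.into_iter().collect()
}
```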

------
_bxg1
When I was first getting started with Rust this tutorial was eye-opening. It
gave me a clear view into the somewhat impenetrable world of working with
Boxes (pointers to the heap) directly, in the context of the borrow-checker.

It also convinced me that you usually just want to use the standard library
data structures if you can :P

------
zackify
Can’t wait to follow this. I’ve been going through the official book these
past couple weeks. As someone who’s never used systems languages (mainly a js
/ node person) I was able to write a redis-like database quickly on top of
TCP. I’ve also picked up async / await and how it works in rust.

There’s so much to learn but the compiler is surprisingly helpful. I loved
Typescript and it is like TS on steroids. When I get stuff compiling I have so
much confidence.

------
matthewaveryusa
>You're doing some awesome lock-free concurrent thing.

I can confirm lists are pretty rarely used. The four times I've used a linked
list in 10 years of professional programming were:

-quick&dirty hashmap in C

-threadsafe queue

-keeping track of a set of objects that can't be copied ( threads)

-LRU cache

In C++ at least, the nice thing about a list is you can push/pop in the
front/back without a reallocation or invalidating iterators.

~~~
chongli
Linked lists are used a lot in game programming where the worst-case behaviour
of std::vector and similar structures is undesirable. Linked lists may be slow
for a variety of reasons but they're simple and predictable which is very nice
when you're trying not to miss the 16.67ms per-frame window.

~~~
estebank
I thought it was the other way around: games use arrays and non-arbitrarily-
growable vectors precisely to be able to iterate over everything quickly in a
deterministic fashion. Especially because a lot of what games have to deal with
is "visit every node".

~~~
chongli
They use arrays for static vertex, texture, etc. data to send to the GPU. They
don't use arrays for things that are being created and destroyed all the time,
such as units in a strategy game, they use lists for those things.

~~~
estebank
But do they use linked lists for them? I would have assumed you would use an
arena for something like that, precisely because you already need to iterate
over all of them and can represent any graph as keys into a generational
index, so that reading stale keys is trivially handled.
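
To show what I mean by a generational index, here is a small sketch with made-up names (`GenArena`, `Key`); real crates differ in the details. Each slot carries a generation counter that is bumped on removal, so a key handed out before the removal no longer matches.

```rust
// A key is a slot index plus the generation it was issued for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Key {
    index: usize,
    generation: u64,
}

struct Slot<T> {
    generation: u64,
    value: Option<T>,
}

pub struct GenArena<T> {
    slots: Vec<Slot<T>>,
    free: Vec<usize>, // indices of empty slots available for reuse
}

impl<T> GenArena<T> {
    pub fn new() -> Self {
        GenArena { slots: Vec::new(), free: Vec::new() }
    }

    pub fn insert(&mut self, value: T) -> Key {
        if let Some(index) = self.free.pop() {
            // Reuse an empty slot; its generation was bumped on removal.
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            Key { index, generation: slot.generation }
        } else {
            self.slots.push(Slot { generation: 0, value: Some(value) });
            Key { index: self.slots.len() - 1, generation: 0 }
        }
    }

    pub fn remove(&mut self, key: Key) -> Option<T> {
        let slot = self.slots.get_mut(key.index)?;
        if slot.generation != key.generation {
            return None; // stale key
        }
        let value = slot.value.take()?;
        slot.generation += 1; // invalidates every outstanding key to this slot
        self.free.push(key.index);
        Some(value)
    }

    pub fn get(&self, key: Key) -> Option<&T> {
        let slot = self.slots.get(key.index)?;
        if slot.generation == key.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}
```

All live values sit in one `Vec`, so "visit every unit" is a linear scan, and a stale key simply returns `None`.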

Again, I'm not a game dev and this is just my understanding after reading up
on these topics after watching
[https://youtu.be/aKLntZcp27M](https://youtu.be/aKLntZcp27M).

~~~
chongli
They certainly used linked lists all over the place in StarCraft, WarCraft,
and Diablo [1]. I don't know that many game developers would use complicated
data structures like the one you've described here. Linked lists are fantastic
for insertion/removal in arbitrary places.

Game development is not really about showing off with cutting-edge CS
research, it's about getting things done. Maybe your arenas with generational
indices would be better, but they could take a long time to figure out and
lead to a big mess that doesn't go anywhere.

[1] [https://www.codeofhonor.com/blog/tough-times-on-the-road-
to-...](https://www.codeofhonor.com/blog/tough-times-on-the-road-to-starcraft)

~~~
jfkebwjsbx
You are wrong on several fronts:

+ Linked lists are a thing of the past.

+ Game engines use complex data structures and algorithms. Some are
cutting-edge research implemented from papers, especially in graphics.

+ A generational index is not complex.

Yes, game development (as opposed to game engine development or video/audio
rendering) is a mess. That does not mean the actual technical fields involved
are simple.

~~~
renox
While I don't disagree on your first two points, I disagree with your third:
there's nothing complex about generational index.

------
ubercow13
This is great as an intro to Rust, I preferred this as a Rust starter tutorial
to the main rust book or any other tutorial I tried

~~~
Waterluvian
I'm loving both. The book is comprehensive and teaches you even the most basic
concepts, so it's great for a broader skill range.

But I love this one too because it digs into some CS archaeology and that
helps me dig into the theory and history. I doubt I'll ever have to implement
a linked list but knowing how they work and are implemented and their
advantages and drawbacks is great.

Tangentially, there's a wonderful game called Human Resource Machine which
teaches linked lists and other assembly-like programming without you even
realising it.

~~~
ubercow13
The book is very comprehensive but it didn't click for me as a beginner, I
found much of the exposition left me with unanswered questions while it went
on to cover more ground. I'm not sure if those questions were answered later
but I got lost very quickly. Maybe it's aimed at people with more C++
background?

This linked list tutorial answered every question I thought of almost exactly
as I thought of them, which made it a joy to read. I also liked Rust By
Example [1] over the book. Based on my experience I'd recommend this tutorial
and then implementing something using Rust By Example as reference. But
everyone learns differently!

[1] [https://doc.rust-lang.org/rust-by-example/](https://doc.rust-
lang.org/rust-by-example/)

------
wcrichton
Another great space of exercises in this vein are in-place operations on
binary trees.

Specifically, rebalancing [1] proved particularly tricky for my students. I
think it's a good litmus test for whether you understand ownership, borrowing,
and algebraic data types.

[1] [http://cs242.stanford.edu/f19/assignments/assign6/#22-bst-
in...](http://cs242.stanford.edu/f19/assignments/assign6/#22-bst-interface-40)

------
geraldbauer
A little more light-hearted but the same style to learn Ruby with Entirely Too
Many Fizz Buzzes. See
[https://yukimotopress.github.io/fizzbuzz](https://yukimotopress.github.io/fizzbuzz)

------
naasking
I think the doubly linked list is to Rust's ownership semantics as the double
slit experiment is to Quantum Mechanics: each exposes a seemingly
counterintuitive behaviour, and truly understanding why requires a depth of
understanding indicative of true mastery.

------
caconym_
A while ago now, this was the tutorial that got me to the point where I could
implement linked structures in Rust. Highly recommended.

------
winrid
I'll try this at some point. My first couple forays into Rust made me miss
Java and C.

~~~
rascul
If you're anything like me, rust was a bit difficult at first due to
ownership, lifetimes, traits, and the borrow checker. At some point though,
things finally just clicked in my head and now I have the understanding I
needed and my difficulties went away. After that point, I began writing safer
code in every language because I was using the same ideas that rustc brute
forced into my brain. It took probably a few weeks or so before I got it. Now
I rarely have (those) issues, and when I do it's probably because I'm trying
to do something weird or I was drinking and missed a place where I obviously
should have used .clone() or maybe a reference. I'm just a hobby coder though,
perhaps at a beginner-intermediate level with a background of writing unsafe
python and c. Your mileage may vary.

~~~
winrid
I guess it just doesn't solve problems for me yet. The problems Rust solves I
don't run into. Even with big Java apps with lots of concurrency and crap I
hardly shoot myself in the foot. Oh well.

~~~
rascul
Seems to me that if rust doesn't solve problems you have any better than the
other languages at your disposal, it might not be the best choice to solve
those problems for you. That is very reasonable to me.

~~~
winrid
Right! Also I suppose I'm feeling very disagreeable today. :)

------
ajross
> Mumble mumble kernel embedded something something intrusive.

And this is why kernel/embedded/something/something developers don't take Rust
as seriously as you want them to.

You can't simultaneously declare your language the best choice for system
software development _and_ treat the long-evolved patterns of those paradigms
as a joke.

There are very good reasons for intrusive data structures, not least of which
being their ability to operate in contexts where no heap is available. If you
don't understand them or don't want to talk about them or want to limit your
discussion to situations with different requirements, then say so.

~~~
roblabla
The actual context of your quote:

> Just so we're totally 100% clear: I hate linked lists. With a passion.
> Linked lists are terrible data structures. Now of course there's several
> great use cases for a linked list:
>
> - You're writing a kernel/embedded thing and want to use an intrusive list.

So I’ve got no clue what you’re railing about. The project specifically
acknowledges that there is a need in kerneldev for those data structures.

I’m a kernel dev using Rust for my kernel. I use both intrusive linked list
and growable vectors in it. The thing is, the sentiment expressed in the
article really resonates with me: in most cases, growable vectors are a better
choice, performance-wise.

~~~
throwaway17_17
The quote that the GP is talking about is included below, which is copy/pasted
from the project page, and the Mumble mumble line is the heading for a
paragraph:

‘’’Mumble mumble kernel embedded something something intrusive.

It's niche. You're talking about a situation where you're not even using your
language's runtime. Is that not a red flag that you're doing something
strange?

It's also wildly unsafe.’’’

Also, prior to this the author claims the following, where the first line is
also a section heading:

‘’’ I can't afford amortization

You've already entered a pretty niche space’’’

These fiat rulings based on one author's generalization of what is ‘niche’
are what I assume GP was commenting on. These are the kinds of dismissals that
some developers take issue with, as GP states.

~~~
saagarjha
Kernel development _is_ niche. Most of the time you don’t need linked lists.

~~~
varjag
When all you have is Rust everything starts to look like an adjustable array.

~~~
saagarjha
What’s an adjustable array? A vector?

~~~
varjag
A vector is a one-dimensional array. Adjustable arrays can have an arbitrary
number of dimensions, although I am not sure if it's a thing in Rust.

~~~
dpbriggs
I've spent years programming rust and I'm not sure what you mean. Do you mean
arrays of tuples, or nested arrays?

~~~
varjag
I mean a multi-dimensional array.

~~~
dpbriggs
Thanks! Why does everything look like a multi-dimensional array for you in
rust?

~~~
varjag
An array (an ADT) can be arbitrary dimensional, with one-dimensional case
often called vector (and two-dimensional called matrix). An array can also be
adjustable, both in one and multi-dimension variants.

So a vector can be both adjustable and non-adjustable, and some languages do
have both versions. Some languages have an adjustable one-dimensional
array/vector as the only dynamic aggregate/ordered datatype.

And to answer the post I replied to originally, a vector in rust is a one-
dimensional adjustable array. To answer your question above, no that's not
what I meant, sorry for not being clear enough from the beginning.

~~~
dpbriggs
Thanks for clarifying.

I think we're just using different definitions. The only real definition of an
array I've encountered in work and school is that it's just a one dimensional
collection of elements, usually of static size. A vector depending on context
is usually the same thing as an array but you can conveniently change the size
dynamically. Lists can be whatever you need them to be given the context, they
just need to be sequential.

Of course you can represent higher dimension structures by linearizing indices
(x + row_size * y, etc).
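
As a quick sketch of that linearization (the `Grid` type and its method names are made up for illustration), a two-dimensional grid can sit on top of a flat `Vec` like this:

```rust
// A 2D grid stored in one flat Vec, row after row.
pub struct Grid<T> {
    row_size: usize, // number of cells in each row
    rows: usize,
    cells: Vec<T>,
}

impl<T: Clone> Grid<T> {
    pub fn new(row_size: usize, rows: usize, fill: T) -> Self {
        Grid { row_size, rows, cells: vec![fill; row_size * rows] }
    }

    // Maps (x, y) to a flat offset: x + row_size * y.
    // Returns None when the coordinates fall outside the grid.
    fn offset(&self, x: usize, y: usize) -> Option<usize> {
        if x < self.row_size && y < self.rows {
            Some(x + self.row_size * y)
        } else {
            None
        }
    }

    pub fn get(&self, x: usize, y: usize) -> Option<&T> {
        self.offset(x, y).map(|i| &self.cells[i])
    }

    // Returns false if (x, y) is out of bounds.
    pub fn set(&mut self, x: usize, y: usize, value: T) -> bool {
        match self.offset(x, y) {
            Some(i) => {
                self.cells[i] = value;
                true
            }
            None => false,
        }
    }
}
```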

I think people are getting confused as most don't consider arrays to be
arbitrarily dimensional without some scheme.

Completely off topic but you've reminded me of this great article:
[https://hypirion.com/musings/understanding-persistent-
vector...](https://hypirion.com/musings/understanding-persistent-vector-pt-1)

------
xlap
Basically anything published about Rust is turning me away from the language.

It could be unfair though: Technical writing (and that includes humorous
opinionated pieces) has declined dramatically in the last 10 years.

Or perhaps writing as a whole has declined.

~~~
Dylan16807
It wouldn't hurt to be specific.

