
Linked Lists Are Still Hard - signa11
https://brennan.io/2017/04/21/linked-lists-are-still-hard/
======
joosters
Ah, Hacker News comments, you are so predictable.

Naturally, this article inspires multiple comments declaring that the way the
kernel allocates or arranges memory is so _obviously_ stupid and wrong, and
that they know the right way of doing it.

I love the _certainty_ of Hacker News posters! So confident in their own
abilities and opinions. Never a moment of concern, introspection or doubt.
What a world to live in!

Also, has someone thought of writing a 'it should be re-implemented in Rust'
comment auto-generator? Or is that already running here?

~~~
fao_
You might like [http://n-gate.com/](http://n-gate.com/)

It's a nice antidote to Hacker News, on days when Hacker News is too... Hacker
Newsy :^)

~~~
Jaruzel
I don't think my eyes are ever going to recover after reading[1] the
n-gate.com about page. :O

---

[1] 'reading' being a highly optimistic term.

~~~
fao_
Really? I find it easier to read than most modern "well designed" websites.

------
tytso
The real problem is that the developer (who is a graduate student) doesn't
understand how to use the kernel structure. You can't just have "your own
linked list". In order to do what you are doing, you have to add a list_head
structure to the struct sock structure just for that linked list, and that
struct list_head has to be initialized when the struct sock is initialized,
and list_del() is called when the struct sock is released --- oh, and you need
to handle locking properly to avoid races between adding a struct sock to the
list and removing it. Or you can use RCU, but you do need to avoid races one
way or another.

It's not hard, but you do need to understand the idiom. Why aren't there safety
mechanisms? Because the kernel is optimized for speed, and instead of adding
run-time checks which can be expensive, there are various debugging tools,
including slab poisoning and KASAN to find such issues.
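
The idiom described above can be sketched in userspace C. This is a rough illustration, not kernel code: `struct list_head`, `container_of`, `list_add` and `list_del_init` are small reimplementations that only mirror the kernel names, `struct my_sock` and the `my_sock_*` and `tracked_*` names are hypothetical stand-ins for `struct sock` and its lifecycle hooks, and a pthread mutex stands in for whatever kernel lock (or RCU scheme) would really be used.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular, doubly linked list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Recover the enclosing object from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

/* Insert node right after head. */
static void list_add(struct list_head *node, struct list_head *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

/* Unlink node and leave it pointing at itself, so a second unlink is harmless. */
static void list_del_init(struct list_head *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    init_list_head(node);
}

/* Hypothetical stand-in for struct sock: the link lives inside the object. */
struct my_sock {
    int id;
    struct list_head tracked;   /* dedicated to this one list */
};

static struct list_head tracked_socks = { &tracked_socks, &tracked_socks };
static pthread_mutex_t tracked_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called when the object is initialized: the link must start out valid. */
void my_sock_init(struct my_sock *sk, int id)
{
    sk->id = id;
    init_list_head(&sk->tracked);
}

/* Add under the lock so it cannot race with a concurrent release. */
void my_sock_track(struct my_sock *sk)
{
    pthread_mutex_lock(&tracked_lock);
    list_add(&sk->tracked, &tracked_socks);
    pthread_mutex_unlock(&tracked_lock);
}

/* Called when the object is released: unlink before the memory goes away. */
void my_sock_release(struct my_sock *sk)
{
    pthread_mutex_lock(&tracked_lock);
    list_del_init(&sk->tracked);
    pthread_mutex_unlock(&tracked_lock);
}

int tracked_count(void)
{
    int n = 0;
    pthread_mutex_lock(&tracked_lock);
    for (struct list_head *p = tracked_socks.next; p != &tracked_socks; p = p->next)
        n++;
    pthread_mutex_unlock(&tracked_lock);
    return n;
}

/* Returns the id of the most recently tracked object, or -1 if none. */
int first_tracked_id(void)
{
    int id = -1;
    pthread_mutex_lock(&tracked_lock);
    if (tracked_socks.next != &tracked_socks)
        id = container_of(tracked_socks.next, struct my_sock, tracked)->id;
    pthread_mutex_unlock(&tracked_lock);
    return id;
}
```

The sketch uses a re-initialising unlink rather than a plain `list_del()` so that releasing an object that was never tracked is a no-op; whether that is the right choice in real kernel code depends on the surrounding lifecycle.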

~~~
brenns10
Another possibility: the developer understands these things, but wrote a blog
post with an explanation that omits some details in order to appeal to a
broader audience.

------
qeternity
I am hardly a kernel programmer, but it seems the real issue here is still
that the release_sock() call isn't actually doing what he assumed.

~~~
brenns10
Yeah, this is the real problem. To be clear though (the article skimmed over
lots of details to remain generally accessible), I was not discussing the
release_sock() function which releases a lock on a socket [1]. Instead, my
code registers a set of function pointers in a struct [2]. While
release_sock() does have documentation, the function pointer of the same name
does not, and the only examples I have are other files of code within the
MPTCP protocol implementation. The callback is related to the function, to be
sure, and so I should have connected the dots, but this is how we learn.

While I was making a baseless assumption, I like to believe that it was
baseless because there was not enough available information, and not because I
was ignoring obvious sources of information :P

[1]: [http://lxr.free-electrons.com/ident?v=4.1;i=release_sock](http://lxr.free-electrons.com/ident?v=4.1;i=release_sock)

[2]: [https://github.com/multipath-tcp/mptcp/blob/e77e2ca0b5339af2...](https://github.com/multipath-tcp/mptcp/blob/e77e2ca0b5339af20b67d6c8d81169470900e6da/include/net/mptcp.h#L230)

------
andreasvc
Why would one ever prefer a linked list over a dynamic array? I know about the
asymptotic performance, but if you take actual memory performance into account
(locality, cache, pointers are slow, etc), it seems dynamic arrays should be
simpler and perform better.

~~~
chriswarbo
> Why would one ever prefer a linked list over a dynamic array?

(Singly) linked lists are trivial to implement in an immutable way (the
article talks about the difficulty of _doubly-linked_ lists, implemented in a
_mutable_ way).

It's trivial to have linked lists share a tail; this makes datastructures like
cactus stacks really easy.

Linked lists are amenable to reasoning by induction (e.g. in Coq, Agda, Idris,
etc.).

Linked lists are amenable to lazy algorithms (e.g. iterators/generators).

Linked lists can be heavily optimised, e.g. using stream fusion.

That's off the top of my head. Note that in many cases it might be preferable
to use linked lists in the source, but have them compiled to some other
representation like arrays (or, in the case of fusion, a single tight loop).
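
As a small illustration of the tail-sharing and induction points, here is a minimal C sketch. The names `struct cell`, `list_sum`, `list_length`, `branch_a` and `branch_b` are invented for this example; the cells are `const` and statically allocated, so nothing is mutated or freed.

```c
#include <assert.h>
#include <stddef.h>

/* An immutable cons cell: once built, neither field is written again. */
struct cell {
    int value;
    const struct cell *next;   /* NULL marks the empty list */
};

/* Defined by induction on the list: base case (empty), step case (head + rest). */
int list_sum(const struct cell *l)
{
    return l == NULL ? 0 : l->value + list_sum(l->next);
}

int list_length(const struct cell *l)
{
    return l == NULL ? 0 : 1 + list_length(l->next);
}

/* Two lists sharing the tail 3 -> 4. Neither can disturb the other,
 * because no cell is ever modified after construction. */
const struct cell c4 = { 4, NULL };
const struct cell c3 = { 3, &c4 };
const struct cell branch_a = { 1, &c3 };   /* 1 -> 3 -> 4 */
const struct cell branch_b = { 2, &c3 };   /* 2 -> 3 -> 4 */
```

Here `list_sum(&branch_a)` is 8 and `list_sum(&branch_b)` is 9, while both lists point at the same `c3` cell. A heap-allocated version would additionally need reference counting or garbage collection to decide when a shared tail can be freed.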

~~~
biocomputation
* Also, lists are a reasonable way of managing unrelated types.

------
dooglius
There seems to be a lot of jumping to conclusions and knee-jerk reactions in
this thread. As a C developer, and having worked on Linux drivers, the
description here is quite strange; memory is always freed by the same "layer"
that allocated it. The implementation of release_sock
([http://lxr.free-electrons.com/source/net/core/sock.c#L2536](http://lxr.free-electrons.com/source/net/core/sock.c#L2536))
appears to have a callback, and
appears to not free memory, so it doesn't look like we have the full story
here. What is clear is that the author misunderstood the "right way" to use
the API in some way, and the lack of good documentation for a lot of kernel
APIs is a big problem.

Perhaps Rust could have prevented the bug? I don't know. I haven't looked into
Rust a great deal, but the impression I get from what I've read is that it
severely limits expressiveness (read: the ways in which you can define APIs)
unless you encapsulate things in blocks annotated "unsafe", at which point you
lose any checking that Rust does. Would it even be possible to port the sockets
code to Rust without fundamentally changing the APIs and data structures? To
the Rust people: it comes off pretty badly when you throw around claims like
"C effectively is unsafe everywhere" because "unsafe" in English doesn't mean
what it means in Rust nomenclature; calling something "unsafe" implies it has
bugs or security holes. It is perfectly possible to write bug-free secure C
code, and also perfectly possible to write buggy or insecure Rust code. What I
think you mean is: C code is roughly equivalent to what you'd find in a Rust
block annotated "unsafe", and thus cannot perform some of the compile-time
checks that Rust can. Calling it something else like "lowlevel", "expressive",
or "tricky" would be better.

~~~
brenns10
> so it doesn't look like we have the full story here

I omitted so many details writing this -- otherwise the main story would have
dragged on way too long. My comment here [1] gives some context.

[1]:
[https://news.ycombinator.com/item?id=14184148](https://news.ycombinator.com/item?id=14184148)

------
noelwelsh
"I guess the moral of this story is that tools are excellent, and you should
probably use them."

This is, in my opinion, a rather generous interpretation of events. What I see
when reading this is a system (the Linux kernel and surrounding infrastructure
like the C language) that is very poorly designed.

For example, resource allocation is a fundamental issue and a constant source
of problems. We've known this as a field for a very long time. Why are
fundamental things like resource allocation not at least standardised in the
kernel? Better would be to enforce safe resource usage by static checking.
Rust is an example of what can be done here, but much more is possible. (I'm
not trying to suggest the kernel should be rewritten in Rust. I'm just using
it as an example that resource usage [in particular, memory allocation in
Rust] can be done in a way that prevents errors.)

Burning huge amounts of human effort is a popular approach to software
development but I really hope we can do better in the next few decades. The
worst thing about the current situation is the programmer blames themself
("This one kept me up until 3AM. In hindsight, every bug looks simple, and
when I figured it out, I was embarrassed that it took me so long") rather than
wondering why this problem is even possible in the first place.

~~~
foldr
One issue with Rust (which I know you weren't actually suggesting for use in
the kernel) is that it's difficult to implement recursive mutable data
structures in a way that makes the borrow checker happy. Recursive mutable
data structures are pretty common in the kernel, I should think.

~~~
deckiedan
I thought the idea with rust was that it was low-level enough that such
problems could be expressed with zero overhead in an unsafe block, and then
given a safe interface for users further up the chain?

There's nothing wrong with scary-potentially-misused-unsafe stuff as long as
you can isolate it and are therefore less likely to misuse it unsafely...

C effectively is unsafe everywhere, and other languages that are safe are non-
zero-overhead, or so far up the stack that they cannot be used in
kernel/system level stuff.

Is that right?

~~~
steveklabnik
Yes, your parent is saying that it would be nice if Rust's safe subset could
understand this, so you wouldn't need to resort to unsafe. You can absolutely
do this stuff with unsafe, it's just, well, unsafe.

~~~
deckiedan
How could a safe system actually do it? Without some kind of run time garbage
collector? Doesn't this kind of problem really require run time (either manual
or automatic) care of memory by definition?

~~~
Manishearth
You can do it inefficiently and kinda manually in rust via weak pointers.

Otherwise GC is basically the only way in rust. You can have tightly-scoped GC
in rust that doesn't affect the runtime of everything else, though. No good
GCs of this kind exist right now.

There are some type system tricks in hypothetical type systems you can do to
make it possible to deal with these without runtime tracking. You can use the
type system to ensure Detach traits get implemented, where Detach is for
breaking cycles in a pass before destruction.

------
nickpsecurity
"It turns out that MPTCP sockets are allocated using a slab cache. I found
this out early on, but forgot about it nearly as soon as I found out."

The real problem isn't linked lists. It's that the author learned how
something worked (the spec), forgot about that, and then essentially coded
against the wrong spec. The title might have instead been: "I forgot something
and broke a kernel." Much more accurate given people who know they're dealing
with slab allocators, linked lists, etc usually use them correctly. They can
also sleep a bit better instead of being up until 3am debugging.
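
To see why forgetting about an object cache can matter for list code, here is a toy sketch. It is not the kernel's slab allocator and not the article's actual bug: `struct obj`, `cache_alloc`, `cache_free` and `demo_stale_link` are hypothetical names, and the only property borrowed from real slab caches is that a freed object can be handed back later without being zeroed.

```c
#include <assert.h>
#include <stddef.h>

struct obj {
    int id;
    struct obj *list_next;   /* link in some long-lived tracking list */
    struct obj *free_next;   /* link in the cache's own free list */
};

#define POOL_SIZE 4
static struct obj pool[POOL_SIZE];
static struct obj *free_list;
static int pool_used;

/* Hand out a recycled object if one exists; its old fields survive as-is. */
struct obj *cache_alloc(void)
{
    if (free_list != NULL) {
        struct obj *o = free_list;
        free_list = o->free_next;
        return o;
    }
    if (pool_used < POOL_SIZE)
        return &pool[pool_used++];
    return NULL;
}

/* Return an object to the cache without touching its other fields. */
void cache_free(struct obj *o)
{
    o->free_next = free_list;
    free_list = o;
}

/* Returns 1 if a recycled object is still reachable from the old list. */
int demo_stale_link(void)
{
    struct obj *tracked = NULL;      /* head of the tracking list */

    struct obj *a = cache_alloc();
    a->id = 1;
    a->list_next = tracked;
    tracked = a;                     /* a is now on the tracking list */

    cache_free(a);                   /* bug: freed while still linked */

    struct obj *b = cache_alloc();   /* the cache recycles a's memory */
    b->id = 2;

    /* The tracking list never had b added, yet its head now is b. */
    return tracked == b && tracked->id == 2;
}
```

Because the pool is static memory, the sketch itself has no undefined behavior; in a real allocator the same pattern is a use-after-free, which is the kind of issue slab poisoning and KASAN are meant to surface.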

------
ReligiousFlames
There are numerous, tested macro libraries to prevent reinventing common data-
structure/algorithm bugs.

Here's one:
[https://github.com/troydhanson/uthash](https://github.com/troydhanson/uthash)
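
As an illustration of the macro-library approach, here is a minimal sketch that uses the BSD `<sys/queue.h>` macros instead of the library linked above, only because that header ships with glibc and the BSDs and so needs no extra download. `struct job`, `job_queue` and `demo_queue` are names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct job {
    int id;
    TAILQ_ENTRY(job) link;   /* prev/next pointers generated by the macro */
};

/* Declares struct job_queue, the head type for a tail queue of jobs. */
TAILQ_HEAD(job_queue, job);

/* Build a three-element queue, remove the middle one, sum what is left. */
int demo_queue(void)
{
    struct job_queue q;
    struct job a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };
    struct job *it;
    int sum = 0;

    TAILQ_INIT(&q);
    TAILQ_INSERT_TAIL(&q, &a, link);
    TAILQ_INSERT_TAIL(&q, &b, link);
    TAILQ_INSERT_TAIL(&q, &c, link);

    TAILQ_REMOVE(&q, &b, link);      /* the pointer surgery lives in the macro */

    TAILQ_FOREACH(it, &q, link)
        sum += it->id;

    return sum;                      /* 1 + 3 */
}
```

The macros take care of the pointer updates, but they do not manage object lifetime or locking; those remain the caller's job.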

------
monomaniar
see the resume of the author. good boy.

------
grabcocque
A linked list is almost certainly not what you want either. They're so deeply
cache-unfriendly that their appearance in kernel code should be viewed as
suspect.

~~~
UK-AL
I was under the impression that linked lists are quite common in the Linux
kernel.

~~~
huhtenberg
They indeed are.

