Linus Torvalds' good taste argument for linked lists, explained (github.com/mkirchner)
717 points by mkirchner on Dec 6, 2020 | 326 comments



I'm not sure why the article removed comments from the code and replaced variable names like "indirect" with "p". Here are the two code samples verbatim from Linus's presentation:

  remove_list_entry(entry)
  {
      prev = NULL;
      walk = head;
  
      // Walk the list
  
      while (walk != entry) {
          prev = walk;
          walk = walk->next;
      }
  
      // Remove the entry by updating the
      // head or the previous entry
  
      if (!prev)
          head = entry->next;
      else
          prev->next = entry->next;
  }
  
  remove_list_entry(entry)
  {
      // The "indirect" pointer points to the
      // *address* of the thing we'll update
  
      indirect = &head;
  
      // Walk the list, looking for the thing that
      // points to the entry we want to remove
  
      while ((*indirect) != entry)
          indirect = &(*indirect)->next;
  
      // .. and just remove it
      *indirect = entry->next;
  }
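For anyone who wants to actually compile these, here's a minimal self-contained sketch; the node type, the global head, and the function signatures are my assumptions, since the slides show only the bodies:

    #include <stddef.h>

    struct node {
        int value;
        struct node *next;
    };

    static struct node *head;    /* assumed global list head */

    /* CS101 version: track the previous node and special-case the head. */
    static void remove_list_entry_basic(struct node *entry)
    {
        struct node *prev = NULL;
        struct node *walk = head;

        while (walk != entry) {
            prev = walk;
            walk = walk->next;
        }

        if (!prev)
            head = entry->next;
        else
            prev->next = entry->next;
    }

    /* "Good taste" version: walk a pointer to the pointer that will be updated. */
    static void remove_list_entry(struct node *entry)
    {
        struct node **indirect = &head;

        while (*indirect != entry)
            indirect = &(*indirect)->next;

        *indirect = entry->next;
    }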


I can't help but notice that in this code, as well as in the article, there are no checks for a) a NULL list or b) that the item is in fact part of the list. A simple fix is:

  while (!!walk && walk != entry) {
     prev = walk;
     walk = walk->next;
  }
or

  while (!!indirect && !!(*indirect) && (*indirect) != entry)
          indirect = &(*indirect)->next;

...because without those checks it's pretty easy to see where you'll crash... but by putting in those checks, perhaps the second method might actually be less 'elegant'.

Edit: did I miss something? Re-checking this, the second method will still crash if/when next is NULL (because you can't get the address of NULL), which means the 'elegant' version seems to be a quagmire of lurking faults.


That code is not a general purpose library function; it assumes the element exists.

If you call that method and the element doesn't exist, presumably something has gone wrong already. Adding a NULL pointer check is not going to fix it. You just silently ignore the error. You'll prevent the crash, but there's no mechanism for handling the error.


And instead of adding a check and crashing visibly here you wait even longer until your program misbehaves due to the error? The longer the program runs after something went wrong, the harder it is to debug, and the more likely it becomes that you get an exploitable bug.


Where you add safety checks like that is a top-down system organization issue, not something you can armchair-quarterback while looking at twelve lines from a million-line codebase. There is simply no way to determine if an assert is appropriate without further context, and it has nothing to do with the choice of algorithm.


Performance is a feature and that null check has a cost. The code is correct. The null check is unnecessary. The bug is passing null in.


NULL being passed in is not the only problem. Dangling pointers, coding mistakes by other team members and race conditions during initialisation can all contribute to this blowing up. Some argue (correctly) that it's best to have it blow up, other that assertions are the way to go. I'm of the latter ideology. Make your debug build do assertions, but release code be crash proof by doing NULL checks.

There are several comments indicating the performance cost of a NULL check. Go and look at the disassembly and see the code. You should find that it equates to a JZ, which at worst costs 1 clock cycle. Am I incorrect?

So, another comment below mentioned 10 million deletions per second. Ignoring the fact that you should be using a doubly linked list at that point (like the kernel does), on a single-threaded 2.4 GHz CPU a one-cycle opcode (JZ) works out to 10,000,000 / 2.4 GHz, roughly 4.2 ms of CPU time per second. With branch prediction, I expect it to be far less.

So, the NULL check is almost free. Even on embedded systems, it'd still be a good idea to keep it in, to insulate yourself from flipped bits (eg, solar flares, or degrading memory) and coding mistakes elsewhere.

In these days of security research, I strongly advocate for defensive code: it means that each function has had its edge cases thought about, which means the developer has spent time thinking about the stability of their program, which cannot be a bad thing. Don't make your code a deliberate point of exploitation.
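For concreteness, a sketch of that "assert in debug, stay crash-proof in release" style; the node type and global head are assumptions carried over from the samples above:

    #include <assert.h>

    void remove_list_entry(struct node *entry)
    {
        assert(entry != NULL);        /* blow up loudly in debug builds       */
        if (entry == NULL)            /* stay crash-proof in release builds   */
            return;

        struct node **indirect = &head;

        while (*indirect && *indirect != entry)
            indirect = &(*indirect)->next;

        if (*indirect)                /* tolerate "entry not in list" as well */
            *indirect = entry->next;
    }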


This makes a big assumption that the null, which is the result of a programmer error, can be recovered from. What would you do once you catch the null? Log that there was a null, maybe with some debug information, then halt? I personally would prefer a proper crash handler to do all of that for me.

I guess my question is, what's wrong with crashing, in the case of a real screwup, where something has gone horribly wrong, assuming you have a proper crash handler?


It's one instruction in the generated code and we can reasonably expect the branch predictor to do the right thing, but branches and error handling can inhibit vectorization, which is a much higher performance cost. In this particular case you probably aren't vectorizing much, but it is not always the case that error handling is almost free.


Safety is a feature and that null check prevents bugs. I'd rather have slow but safe code rather than fast but buggy code.


If the null check silently ignores the attempt to remove a non-existent item from the list, then the null check did not prevent a bug, it merely hid it.

IMO, even though this is C, the Zen of Python still applies, which states:

> Errors should never pass silently unless explicitly silenced.

It's the caller with the bug, not the library code.


Can we all at least agree that if the language could express non-nullable pointers, we'd all be better off? (Something like `Option<T>` in Rust, or Swift's `?` sugar for their `Optional`, etc etc)

Then the type of the function would just declare that the pointer can't be null, and code which sends a nullable pointer would refuse to compile.
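C can't express that in the type system, but as a rough approximation GCC and Clang offer the nonnull function attribute (a sketch; it yields warnings for obvious violations rather than a hard compile error):

    /* Declares that the first parameter must never be NULL; passing a
       literal NULL triggers -Wnonnull, and the optimizer may assume
       the pointer is non-null inside the function. */
    void remove_list_entry(struct node *entry) __attribute__((nonnull(1)));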


Adding checks creates more code branches and tests. Passing null in is the bug.


Can you show it has any impact on benchmarks? Isn't this something speculative execution should take care of quickly?

"Premature optimization.... " you get the drift.


Make it work

Make it right

Make it fast

This is an ordered list, and the people who forget that make a lot of work for the people who don’t


I would argue that if you have code that is trying to remove a non-existent item from a list, then making it silently fail is only giving the appearance of making it work.

It's definitely not right, while probably not really working either. The immaterial performance gain from removing the null check is completely irrelevant.

It's a foot-gun for sure, but it's up to you to not pull the trigger.


Are you arguing that we should leave known crash bugs in the code?


I'm arguing that we should use asserts or similar defensive coding mechanisms to ensure that preconditions are met instead of merrily continuing to run our program with cascading error effects.


I've had this argument a ton with people used to new high level languages that make null safety "too easy" like:

  next?.doThing()
Where doThing is never called if next is null.

Languages that do this use "scary" operators to crash on null:

  next!!.doThing()
And it's drilled into people's heads the latter is a Bad Thing (tm)

-

You really need to consider context in null handling.

Imagine an app that alerts a nurse when the patient's heartbeat is out of range.

In an application where the UI context might have been closed out, it's common to see

  someUiContext?.showAlert()
But what actually happens if the context is gone?

It's better to crash, and have part of your startup procedure be communicating that a crash occurred and that the doctor should check whether something went wrong, than to continue silently.

-

The problem is when you tell people this, the kneejerk reaction is always "are you saying we intentionally add crashes"!

Because safe null handling was specifically added to avoid the situation where that crashes...

(For example, if the app was a news reader and the alert was "Article failed to load", you wouldn't want the app to crash just because the user left a certain page before the alert was shown)

But I think the pendulum has swung too far at this point, people are so used to just sweeping nulls under the rug, and it's not great for finding issues


Yes - and that is the point the OP you responded to is making. This is not a generic library function. It is used in a specific setting where preconditions exist and presumably are checked _prior_ to calling this code.

Sure you could make the argument that we don't _know_ they are being checked, but it's a pointless discussion. Who cares? _If_ preconditions are met, this code is safe, if they aren't, it's not safe. Since we don't know one way or the other, there's no point in discussing further. The kernel developers know their stuff...


Imo every piece of code (for reasonable definitions of "piece") is supposed to check its own preconditions and not rely on the caller to check them.


It's a bad idea, especially in kernel/low-level/performant code. It's OK to check some values at specific points (like when a value is passed from user space), but in general it's bad practice to check it at every single function. You trust in your program flow.

Imagine if at every function down a complex stack you go with:

    if (!ptr1 || !*ptr1 || *ptr1 > 5 || param1 < 0 || param2)
        /* etc.... */
    {
        return NULL;
    }
(used arbitrary names and values).


This is nice in theory but a bad idea in practice (as a blanket statement, I am all for checking preconditions in general). An easy example is binary search, which I think is a reasonable "piece" of code by your definition. One should never check its preconditions _inside_ the binary search function (that the list of elements being searched is partitioned by the search predicate). Checking the precondition is O(n) while the algorithm is O(lg n).
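A sketch of the compromise often used here: keep the O(n) precondition check, but only in debug builds, so release builds keep the O(lg n) cost. The names and the array-of-int signature are mine:

    #include <assert.h>
    #include <stddef.h>

    static int is_sorted(const int *a, size_t n)
    {
        for (size_t i = 1; i < n; i++)
            if (a[i - 1] > a[i])
                return 0;
        return 1;
    }

    /* Returns the index of key, or -1 if absent. */
    int binary_search(const int *a, size_t n, int key)
    {
        assert(is_sorted(a, n));   /* O(n) check, compiled out when NDEBUG is set */

        size_t lo = 0, hi = n;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (a[mid] < key)
                lo = mid + 1;
            else
                hi = mid;
        }
        return (lo < n && a[lo] == key) ? (int)lo : -1;
    }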


You're right, but checking for null is not terribly expensive compared to iterating over a linked list.


The thing is that you aren’t really advocating for just that... Since this is pretty general support library code, the result of this philosophy is defensive checks in all support functions and that means checks every spinlock, mutex/semaphore, and god knows how many other places... everything really. We also would want these checks at multiple levels since we shouldn’t trust that other people checked things. This would have a significant effect on performance and many would prefer their system not to run however-many percent slower. All to avoid a possible issue that can’t be very large or systems would be kernel panic’ing all over the place.

From a black box, zero-knowledge, perspective it’s maybe worth remembering that code built this way successfully runs mission critical systems all over the world every day. Thoughtful people would have switched to other (slower, more pedantic) systems if doing things differently was a real overall life improvement. People have had the choice, there are plenty of OS’s out there... some far more formal/pedantic. Linux wasn’t the safe choice back in the day, it was just better by several important metrics consistently over time.

Perhaps these insanely experienced kernel devs know what they are doing.


This isn't possible in many cases - consider the simple C library function strlen, one of whose preconditions is that it must only be called on a zero-terminated string. There is no way to write code in strlen to check that this is true.
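To make that concrete, here's roughly what strlen has to do (a naive sketch, not the real libc implementation): there is simply no information available inside the function to tell where the buffer ends.

    #include <stddef.h>

    size_t my_strlen(const char *s)
    {
        const char *p = s;
        while (*p)       /* if s is not NUL-terminated, this walks off the end */
            p++;
        return (size_t)(p - s);
    }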


Which is one of the reasons why coding standards like MISRA forbid using strlen (at least in C++, I guess even MISRA doesn't want to force you to write safer C with sane string types).


Asserts have run-time cost. What you need is to ensure those preconditions are met in the first place, statically, by formal methods for example.


Which is ideal, but generally beyond reach for now. So assertions it is.


It’s not a bug, in this case. It depends on the behaviour the library is specified to handle, but if you ask me to remove a specific item from a list, and that item isn’t in the list, what should I do?

I can fail so you can catch it, or if you don’t catch it, you’ll at least see exactly what went wrong. You expected the item to be in the list and it wasn’t. Your preconditions were wrong from the beginning and you have an (unknown) bug in the code.

The alternative is that I can’t find the item and I silently suck it up and don’t notify you. If that’s the api we’ve all agreed on, that’s fine. It was on you to check for the thing in the list first and then remove it. But it’s a bit weird if you think about it - now you need to traverse the list twice. Once to check for the existence (and fail out yourself) and once to remove.


I think a better approach is to return a bool or something like that, rather than crashing. But I agree that silently handling this case is the worst of the options.
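A sketch of that API shape, reusing the indirect-pointer walk from upthread (the node type and global head are assumptions):

    #include <stdbool.h>

    bool remove_list_entry(struct node *entry)
    {
        struct node **indirect = &head;

        while (*indirect && *indirect != entry)
            indirect = &(*indirect)->next;

        if (!*indirect)
            return false;      /* not found: let the caller decide what to do */

        *indirect = entry->next;
        return true;
    }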


What you are espousing is known as “defensive programming” which is very much a bad thing (tm).


I'm interested in hearing more about why defensive programming is bad. Do you have some links for me?


Clearly you've never worked with contract developers whose only purpose is to write a crap ton of code to get a "feature" out that crashes as soon as it encounters the real world. As they say, "doveryay, no proveryay" and it's served me well.


> As they say, "doveryay, no proveryay"

I'm sorry if this might be common knowledge, but I've honestly never heard this and can't understand it.

Is it saying that it's better to do something than to (formally) prove it, like the opposite of (I think) Knuth's famous quote?



Then it ought to document that it assumes that entry exists and that this walk will not hit the end of the list. Which also implies the list is not empty.


It should, yeah, but if it can safely assume there will be no nulls, it will save a nontrivial amount of defensive programming checks for each cycle. And IIRC, the Linux kernel uses tons of linked lists, millions of items per second sometimes; any operation that can be removed will make a big impact. Relatively.

But for general purpose code, e.g. libraries where performance isn't super important but stability is, more defensive programming would be a thing to do.


I believe the implicit requirement is that the item is known to exist within the list. With that requirement in place the extra guards aren't requisite (they're implicit) and in kernel code this probably offers a performance edge and is a reasonable requirement. Either having an implementation that has an extra check once at the end of the list, or that verifies the presence of the item and passes that node through to a deletion subroutine, would be good alternatives but still utilize the same pointer to structure view.


That's a pretty big freakin thing to leave out, though. Code that depends on its context is, some would say, bad.


Simply put, safety slows code down. It's a matter of whether you know what's happening underneath or not. The more you try to make C completely safe, the more you slow it down and therefore remove the need to have written it in C in the first place. Whether that's a good thing or not is an exercise for the implementer.


True, but costs of comparison to NULL aren't massive. It is one of the fastest things out there.

It was my experience when tuning a linear algebra library (just for internal use) that such comparisons are almost unobservable in total performance tests.


If you're writing a linear algebra library, then the bulk of executing code should be floating point vector operations in tight loops with known bounds. A NULL pointer check in the prolog where you set up the loop is going to be negligible.

This is very different from kernel code, which by its nature isn't very computational but doing resource management all day long and pretty much all it does is stuff that looks like walking linked lists.


That was a very specific linear algebra library; we toyed around with huge but sparse matrices. Not anything graphics-related, rather number theory-related. But there were a lot of integer comparisons in there.

This was around 2001-2002 anyway. Things might have changed.


> costs of comparison to NULL aren't massive

Depends on how often you are doing the comparison. Is it 1 time/second, or 10 million times a second?

There is a difference.

If the code above is the one in charge of putting/removing network packets in a queue, or putting threads ordered by priority for the scheduler, then you should consider the side effects of checking for NULL.

If you are going to implement this function in a library for Jimmy The Programmer[1], then check for NULL.

[1] https://blog.codinghorror.com/new-programming-jargon/


It's written in C because C is popular, has platform support and the code was originally written in C, not because the kernel should be a security nightmare.


That's irrelevant to what Linus is saying. He's not saying "this is good code, but only with the caveat that it's written in C and run in the Linux Kernel". He's saying that changing it to make it far more difficult to read and modify, but cleverer, has made it better code in general.


See, and this is where I (and I'm guessing you) might beg to differ. I'm a web developer, so the vast majority of my time is spent working in JavaScript. Our team has been working in ES6 for the past ~2 years. Our lead developer just LOVES him some ES6 - destructuring, aliasing, lambdas, you name it, and he's all-in.

Me, though? Well, let's just put it like this: when we find some framework-level bug that's written in "clever" ES6 syntax, our first step in debugging is almost ALWAYS to rewrite the given function traditionally, without any of the ES6 shorthand. And the reason we do that is because reading and debugging a whole stack of anonymous lambda calls is a PAIN IN THE ASS. Or figuring out where a certain variable is coming from when someone uses overly-complex destructuring syntax to magically pull a value from deep within a nested object.

I mean, don't get me wrong, I do like and use almost all of the modern ES6 niceties, but I also feel like it's much more difficult to parse and understand code compared to what we were all writing a few years back. People will, I'm sure, be arguing about what constitutes "good code" for decades to come, but to me, when working in an evolving codebase, especially with other people, plain ol' human readability is paramount. If people can't figure out what your code is doing without throwing in a breakpoint and stepping through line-by-line, you've failed at writing good code. And this will be my opinion right up until the day humans stop writing code by hand.


This is why Linus' code shouldn't be copied verbatim and used in the real world. The Kernel is an extremely isolated and contained system.

At the very least, the code should add comments on the assumptions that are vital... otherwise n00bs will copy the code as gospel and run into all sorts of problems.


The actual code in the kernel is well documented (and different, lines 103-149): https://github.com/torvalds/linux/blob/master/include/linux/...


You don't need to check !!indirect. There are no cases where the indirect pointer is NULL:

  while (!!(*indirect) && (*indirect) != entry)
    indirect = &(*indirect)->next;
Also I would rather use *indirect instead of !!(*indirect).


correct. I was just being overly pedantic


Why the !! ? I thought that only makes sense in JS.


> I thought that only makes sense in JS.

In C any nonzero value is considered "truthy", and on most architectures NULL is defined to be (void *)0 or similar. The logical not operator, AKA the bang operator, will replace truthiness with 0 and falsiness with 1. So applying it twice collapses all nonzero values to 1.
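A tiny illustration:

    /* !! collapses any nonzero ("truthy") value to 1 and leaves 0 alone. */
    int to_bool(int x) { return !!x; }   /* to_bool(42) == 1, to_bool(0) == 0 */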


Well, yes, but that does absolutely nothing in this code, as it is immediately used as a condition anyway.


Indeed, I was only commenting on the JS vs C part.


I'm not sure about the current state of compiler optimisations, but IIRC in the olde-days, a not not in an if() statement would assemble to JZ, saving a clock cycle (& an opcode?), and wouldn't need to stall to load in the full width of the register. Today's branch predicting compilers are beyond me.

I still use it as a clarification that it's a deliberate boolean operation, rather than implicit.


I highly doubt that any modern compiler would generate different code with or without the !!. Not even sure why an old compiler would do that? It's the exact same semantics, surely.


That had nothing to do with pedantry.


Thank you, I found that much easier to read than the article posted.


Much clearer when read as Linus intended.

Reminds me of the editor(s) who helped "fix" Bukowski's poems


I’m not a fan of everything Bukowski wrote, but I wouldn’t try to censor him so that little Jimmy could read it, and I liked the movie Barfly.

Similarly, I’m not a fan of everything Linus wrote, but I wouldn’t enforce bad CS101 code on him so that little Jimmy could read it, and I like Linux.


Who is little Jimmy?


Jimmy is a diminutive form of James. It's the sort of thing where parents would call a kid "jimmy" and when they grow up they will ask people to call them "Jim" or "James."

Adding "little" in front just stresses the fact that we are talking about a kid, so the references are about dumbing it down for kids.

Johnny is probably a bit more common than Jimmy, but I've seen both. See e.g. [1]

1: https://en.wikipedia.org/wiki/Why_Johnny_Can%27t_Read


The poster-child for the movie "Think of the Children!"


It refers to an old American comic strip with a little kid that didn't know much. (I think, anyway.)

It just sort of means any below-average person.


A 1992 poem Bukowski originally entitled “stone tiger, frozen sea,” was completely reworked with just two lines left unchanged. Its new title: “like a dolphin.”

https://www.pbs.org/newshour/arts/poetry/bukowksis-poems-wer...


It's a small class in Good Coding 101: concise, well commented/documented, good variable names; verbose in explanation, terse in code.


Hacker news would benefit from code blocks.


Hacker news has code blocks. Have the block indented by four spaces.

    this line has 4 spaces in front of it.
What Hacker news could use instead are quote blocks. Now people often use the code blocks as quote blocks.


HN codeblocks are just two spaces indented. Any extra spaces indent further.

  This has no superfluous indentations.


What if entry doesn't exist in the list? You need a special case to handle that too.


Normally with an identity-based API like this, you can put it as a requirement that you must not remove items other than those which actually exist in the collection. Another example of an API that doesn't check for presence: c++ iterator-based removal. If you ask a C++ list to remove an iterator element outside its range, it may crash the program. See the exception safety section of this documentation: http://www.cplusplus.com/reference/list/list/erase/ If you look at most "remove this object from yourself" APIs in low-level code, it behaves this way, because it's unnecessarily complex to check for presence when that is a fact the program can condition on instead.


As the other person said, it's not really about complexity. Complexity isn't all that different here; personally I'd find something like the following to be simpler (I imagine others might disagree):

  remove_list_entry(entry)
  {
      for (p = &head; *p; p = &(*p)->next)
      {
          if (*p == entry)
          {
              *p = entry->next;
              break;
          }
      }
  }
However, what the choice does objectively impact is performance. Being able to assume the object exists allows you to avoid checking against NULL, which removes instructions in the loop. I'd guess they anticipate this code might be called from performance-critical code, and that might be why they coded it like this.


Not just an instruction, but a branching instruction.

Branch predictors will probably do pretty well on this, but still.


With linked lists, memory latency dominates, since you're chasing pointers all over the heap. I guess the null check costs nothing most of the time. On the other hand, removing random objects from lists that you don't know are actually in those lists is a smell.


> I guess [...] most of the time

Two terms that would not fly in kernel-level programming; with kernel programming and low-level libraries, you want both optimized speed and predictability.


If you want speed, you don't use a linked list in the first place.


Yeah you do - the kernel uses linked lists to store everything from running tasks to memory pages. Insertion and deletion of entries are much faster in a linked list than in an array, and that's a very performance-sensitive operation in the kernel.


> Insertion and deletion of entries are much faster in a linked list than in an array

That's a really broad statement. Arrays can get surprisingly big while being faster. Performance-wise, you'd usually want to use arrays up to a certain size and then link between entire arrays. I'd say the biggest advantage of a standard linked list is that it can be intrusive.


> I'd say the biggest advantage of a standard linked list is that it can be intrusive.

That's one big benefit. I teach operating systems and, when the semester has been good, we dig into physical memory dumps with a few Python scripts and show the students how to parse the Windows process list by following _EPROCESS structures.

That's one of the big things they find themselves surprised by every year: that the process list is a chain of LIST_ENTRY structures embedded within the _EPROCESSes[0] (or any other structure that is doubly-linked in memory). And LIST_ENTRY is just 2 pointers going hand in hand, the list_entry->flink and list_entry->blink attributes, and nothing more.

[0] You can read a bit about these structures here https://www.nirsoft.net/kernel_struct/vista/index.html but beware this is based on Vista and a bit old. Kernel data structures vary with versions of the OS (and not even major versions at that) and target processor (x86, x86-64, ARM too would be reasonable).
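For readers who haven't seen intrusive lists, a rough sketch of the idea (field and type names are illustrative, loosely modeled on LIST_ENTRY and the kernel's list_head):

    #include <stddef.h>

    struct list_entry {
        struct list_entry *flink;   /* forward link */
        struct list_entry *blink;   /* backward link */
    };

    struct process {
        int pid;
        struct list_entry links;    /* the links live inside the object itself */
        /* ... many other fields ... */
    };

    /* Recover the containing object from a pointer to its embedded links. */
    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    /* Given some struct list_entry *e in the chain:
       struct process *p = container_of(e, struct process, links); */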


Unnecessary is not really the correct term; it's one of a huge number of tradeoffs. You wouldn't want it for safety-critical systems, but Linux isn't designed to be used for such things.


Well, both implementations will crash in that case anyway. You can take that case as out of scope of this discussion.

The purpose of the OP and Linus's talk is to show a better way to walk through a linked list, not to show a full implementation.


Perfect. I almost commented to complain that the “elegant” version is harder to read and understand. This is much much better. Comments are important!


I don't think it's the comment.

Usually the variable name in a loop refers to the object being manipulated, not to the address of the pointer to it, so here p isn't a pointer to the object, which makes it completely confusing.

If the article used pp, which is the typical way you name a pointer to a pointer, it'd help.


This comment seems wrong to me:

      // The "indirect" pointer points to the
      // *address* of the thing we'll update
The "indirect" pointer points to the thing we'll update. See at the bottom, it's updating *indirect, so "indirect" points to the thing being updated. On the other hand, "indirect" points to the address of the thing we'll remove. There's a specific item being removed, and there's a specific thing that will be updated; the thing being updated comes immediately before the thing being removed; they're not the same thing.


The comment is correct. The "thing" is the variable that holds the "entry". The entry is removed by updating the "thing".


Agreed. The 'indirect pointer' points to the memory address of the previous 'next' (or the head). So, as long as neither are NULL, then dereferencing the pointer is the actual head (or the previous 'next').


> The 'indirect pointer' points to the memory address of the previous 'next' (or the head).

I don't think so. I think the 'indirect pointer' points to the previous 'next' (or the head). It doesn't point to the address of the previous 'next' (or the head). What you say is adding an additional level of indirection that doesn't exist.

Reality:

indirect -> previous next -> first element

What you're saying:

indirect -> address of previous next -> previous next -> first element

In reality indirect contains the address of the previous next, but it doesn't point to the address of the previous next.


The "indirect" variable does not point to a list entry directly, like what "head" and "next" variables do, that's why it's named "indirect".


I agree:

indirect -> previous next -> first element

"indirect" points to "previous next". "previous next" isn't a list entry. "previous next" points to "first element", which is a list entry.


> The "thing" is the variable that holds the "entry".

What does "holds" mean? If holds means actually contains the memory, then I don't see the distinction between "thing" and "entry". They both refer to the exact same piece of memory.


What happens at the end of the list? Is next null? And if so, how do either of these methods behave?


This is a circular linked list, where the ->next pointer points back to the head of the list. It's helpful for iteration tasks, since you can always walk the full list when you get access to one node.

In most cases you use a linked list like this, you don't care about the order of things, just that you can iterate over all items in the list.


No, it's not, as can be seen from the diagrams in the article.


Question for experienced C programmers (I'm not one). The comments for remove_list_entry strike me as fluff, only suitable for a didactic piece.

Would you find the comments in the second version helpful, or should they also be removed?

Edit: let me lay my cards on the table. If the comments really are necessary, it doesn’t seem elegant. I’m pro-comments, but that’s because not all code can be readable and elegant all the time.


While I wouldn't write this much comment, I can see a use for this: navigation. If I'm trying to quickly find the part I want to modify, the comments do help me skim through quickly and zoom in on the part I care about. It's the same reason a blog post is often broken up into section headings. Without the comments, I'd have to spend a few extra seconds trying to map chunks of the code with my mental model of how linked lists work. Those few seconds can interrupt the flow of work.

So I'll disagree with the notion that only illegible or unreadable code needs comment.


For my tastes there is way too much whitespace, and I usually only use multi-line comments to describe the high-level algorithm for a full function. I prefer to pepper in single-line comments which describe the sequence of events in a human-readable way.

I've been writing C/C++ for about 25 years.


My taste is that function level comments describe the "what" and inline comments describe the "why".

So at a function level the comments are giving a high-level description to the reader of the functionality contained, and at the line level comments exist only to describe the programmer's intent: why it was implemented this way rather than some other way.

I have generally found (there are always exceptions) that if you find yourself needing "what" comments within a function you should be considering breaking it into smaller pieces.

I've been paid to write C/C++ for about 25 years.


The main takeaway here might be that two 25 year veterans both make function level and line level comments. The particulars are somewhat of a matter of flavor preference :)


That is not a particularly useful takeaway, especially when one of them is pointing out that particular comments in a function are problematic.

The particulars matter.


The comments certainly don't detract from the code in my opinion. They may not be _helpful_ per se to an experienced developer, but there may be a chance that a more inexperienced developer will read this code in the future, and it could benefit them.


The top pattern is extremely common so comments are somewhat unnecessary since everyone would know what you're doing. In the second one, comments would be less necessary with a type annotation and perhaps a better name. Maybe "next_ptr" instead of "indirect"---all pointers are indirection, so that name is redundant. But the comments are necessary, to me at least.


All pointers are pointers too, so I don't see much of a difference between "indirect" and "next_ptr", other than the "next" part. In similar situations at my job, I've drawn a small ASCII diagram in the comments, and named the variable something like "splice_point", with a corresponding label in the diagram.


Well "next_ptr" is also a bad name, I agree, but I'm trying to name it something that indicates "this is a pointer to the 'next' pointer in the linked list".


How about next_node?


I would expect to see barely any comments in the "in production" version, other than a comment like:

    // indirect is either &head or &(node->next), so *indirect is equivalent to "next node"
Someone who knows their C should be able to figure it out in a few minutes by just reading the code.


The article refers to p about 30 times, including in a diagram, so I can understand why they'd pick a shorter name.


I'm kinda confused on how this will behave if we want to remove `head`—if `head` is in the stack frame (assuming it's a parameter to `remove_list_entry`), wouldn't `*(&head) = entry->next;` be a no-op as far as the caller is concerned? (Sorry for the n00b question.)


No, this copies the value from next into the memory occupied by head.

For example if the linked list was:

&next: (100, 0xFFFF)

0xFFFF: (101, 0xDEFF)

0XDEFF: (102, 0)

Then calling remove on head would leave us with:

&next: (101, 0xDEFF)

0xFFFF: (101, 0xDEFF)

0XDEFF: (102, 0)

(It would be on the caller to free the memory address 0XFFFF)


Then you will escape from the while ((*indirect) != entry) check in the first place and ignore *(&head) = entry->next;.


Both of these implementations seem to be missing a call to `free`, or is that something that the caller should be taking care of?


The caller could want to move the element from one list to another, in which case the optimum impl would not free or copy the element.


It's pseudocode. The code as-is is not valid C anyway.


Now you are assuming "remove_list_entry" is not a macro!


As others have said, often you don't want to free the node, you want to do something else with it. E.g., put it in a free list. However, usually if you're not freeing it, you'd null out the next field so that you don't accidentally access it when it's not part of the list (at least I would).
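Something like this, say (a sketch built on the article's indirect pointer):

    /* Unlink the node and clear its link so stale traversals through
       'entry' can't silently wander back into the list. */
    static void unlink_entry(struct node **indirect, struct node *entry)
    {
        *indirect = entry->next;
        entry->next = NULL;
    }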


Depends on whether the list owns the nodes. I tend to write mine such that it does. Perhaps this code is demonstrative rather than complete.


Why do you think the values were allocated in the heap?


The author also misunderstands what "head" of a list is, adding more bad clutter to the code.

(Granted, the code in the original TED talk is broken, as it doesn't define "head".)

A linked list in C isn't a struct with a "head"; "head" is just a pointer pointing to the head element.

The OP article should be retracted.


How likely it is for code to be misunderstood depends on the code readability as well, not just on the person who gets it wrong. Personally, I find the style of Linux lists unintuitive. To me it seems a case of sacrificing code readability to workaround limitations of the C language.


I don't see what you're saying.. the OP added that bit (the IntList struct) himself but it doesn't really change the scenario or take away from what he is trying to explain.

But it should be retracted? Really?


I understand the general point (and value) of reframing the problem or the solution in a way that removes special cases ... but in this case I would actually prefer the first solution over the second. The second solution reminds me of the old-school Perl culture, and JavaScript culture, where 'cleverness' (which always manifests itself as terseness as if lines of code were expensive) takes precedence over maintainability and understandability. Your code is going to be potentially maintained years in the future by developers of all levels - make it easy on them by practicing the principle of least surprise. In this case, this means using a traditional implementation of a linked list that most developers would be familiar with.


I generally agree about avoiding cleverness in code.

But this particular finesse isn't necessarily about extra cleverness. Here the situation is that all the linked-list nodes look the same and can be/should be treated the same. The reason the first bit of code has to treat "head" differently is that head is the only node that the rest of the program keeps a pointer to. But that should be incidental: by doing the operation indirectly you do things uniformly, and you should do things uniformly, since the linked list has a uniform structure.

Here it's not so much about combining cases overly cleverly, not leveraging "while (x=y){...}"-type stuff, but about removing an unnecessary kludge: that you have to treat head differently 'cause it's the value you happen to have and you need to update it while preserving it.


I'd also add that in this particular case, it's not so much cleverness as performance; readability should be a priority for "business" level code, but performance for, well, performance level code.

Think of how often the code will be executed; this particular code, or kernel level code, will be in the order of millions in a regular working day. Most code that I for example write will be tens, maybe hundreds; it's mostly glue code between a database and REST api for a relatively low-traffic internal management application.

But it runs on a Linux machine which, for every HTTP request between the browser and my back-end, will run thousands of lines of code and iterations (I guess?). I really don't mind if the code looks clever if it means it's ten times faster than the readable code.

Besides, clever does not exclude clarity.


> you should do things uniformly, since the linked list has a uniform structure

Except that it actually isn't uniform, is it? A linked list has three kinds of nodes: the first node (to which nothing points), the middle nodes, and the last node (whose next pointer is null). Nulls vs populated values really do change the shape of data. Any code to handle a list needs to be well aware of those three cases, even if there's no IF statement for them. I would argue that having code that is organized around these actual differences is better than trying to normalize those differences into a single thing, at which point those meaningful differences risk being overlooked in code maintenance. At a minimum, this kind of code needs to really clearly explain why it is doing what it's doing in comments, and how the multiple cases are handled in the streamlined solution.


Looking at the code there are not just fewer lines, but fewer conditionals too. It's much simpler to read and understand. I got it at a glance, whereas I skimmed the typical approach and would still need to go over it more carefully to be sure it is correct.

I think Linus is correct on this one.


Linus’ solution seems to add a layer of indirection, which I think is harder to grok than a simple condition.


There is no extra layer of indirection.

The idea here is to make the "p" pointer point at the "next" field of the current element instead of at the structure itself. There is the same amount of indirection in both solutions.

The first solution does pointer->structure->pointer, the "elegant" solution does pointer->pointer->structure. The reason the first one may be easier to read is because the C syntax "prefers" it, as it is a more common construct.


But there's comments that explain it, so I think it's okay. I wouldn't leave a thing that has a mix of &, *, and -> without saying what it was supposed to do, but he does explain it.

I'm ok with clever code when it's explained in words.


There's no accounting for taste ;)


But there should be ;)


IMO harder to initially acquire but easier to grok at scale. Imagine combing through the behavior of blocks of code with conditionals in a larger system.


The CS101 solution adds an unnecessary extra variable "prev" that you have to study to figure out what it means. That's harder for me to grok.


Conditionals are so simple it’s basically the first thing anyone learns. Anyone can understand the first approach. The latter requires knowledge that is specific to lower level languages,concepts that are certainly not taught universally.

The latter approach is definitely not the easiest to understand. It is, however, what Linus considers «good taste in coding.»


>Conditionals are so simple it’s basically the first thing anyone learns.

That's neither here, nor there though.

The simplicity of a concept does not directly translate to how hard it makes the code (assembly is simpler in the sense of having fewer and simpler constructs than e.g. Python but much much harder to code in).

Conditionals, and the exponential increase in state space they bring, are still one of the biggest sources of errors and complexity in code.


Yes, excellently put. That's exactly why Linus' code is easier to understand. The extra control flow operations really increase the time needed to understand whether the code is correct, and likewise introduce many opportunities for error.


If you're coding in Linus's world you're going to understand indirection, the example is excellent given the context


There's arguments to be made on both sides, but I think the problem with Linus's solution here is that it doesn't quite clearly establish the assumption being made, which gives it a bit too much of the 'cleverness' flavor that you allude to. A better implementation would be one that does establish why the use of pointers make sense:

   void remove_entry(node_t *entry) {
     // curr_ref is the address of the link pointing to curr
     node_t **curr_ref = &head;
     node_t *curr = *curr_ref;
     while (curr != NULL && curr != entry) {
       // Advance curr_ref and curr
       curr_ref = &curr->next;
       curr = *curr_ref;
     }
     if (curr) { *curr_ref = curr->next; }
   }
Choosing names somewhat more rationally, and making it clear that "curr" always points to the current node, and that "curr_ref" is the address of whatever pointer we followed to arrive at curr, makes it easier to establish the invariant that updating curr_ref is sufficient to insert or remove curr into a list, no matter if it's referring to the head link of a list or the next link of the prior node.


That does make it a bit clearer, but I do hope the compiler optimizes away the redundant curr variable.

Now, on a different note, I am a bit puzzled because I don’t see a free(*ptr) call in Linus’ or anyone else’s code. The code, as-is, would cause a memory leak.

There’s a need to capture the curr_ref before it’s overwritten, and free it after it’s overwritten.


Usually those lists are intrusive -- their nodes are meant to be part of larger objects. They will be freed in other contexts, e.g. if protected by RCU, memory reclamation is postponed until all threads have quiesced. Or the nodes are allocated by one amongst many allocators (i.e. for performance, or better concurrence), so it is best for the structure itself not to make any assumption about where the memory should be returned to.

So generally, list implementations in C will not free the node, only remove references to it and return it.


The assembly code produced by my variant and Linus's variant is exactly equivalent, assuming you write the guards the same way. This effectively means that both curr_ref and curr will be live, although the compiler will note that it can substitute entry for curr and shorten curr's lifetime so that it only exists within the loop body in the form written here.


At the end of the function curr == entry, so the caller already has a separate reference to the object. Also, removing a node from a list does not necessarily mean that you want to delete it. For example, you might want to put it in a different list.


The Linux kernel uses intrusively linked lists a whole lot - that means memory management is a separate concern from managing the list links.


I have to disagree. This code is simply concise and elegant.

Generally less code is better, since there are fewer lines that can have a bug.

While pathologically terse code can be obscure - written to prove that the author is clever, and then it really is a pathology - simple terse code is just beautiful.


One way to define "clever" is something you can do but that relies on something you don't expect most readers to have already loaded into their head. Like a riddle, it makes sense if you know the trick it relies on but is baffling if you don't. The difference between "clever" and "smart" then is based in large part on what you expect your readers to already know. Different people have different expectations there and they reasonably vary across teams and time.

Pointers and addresses are pretty fundamental to C, but many C programmers only use them in certain fixed patterns: An address is what you get back from malloc() and pointers are variables to heap-allocated objects. Or pointers are the type you use for parameters when you don't want to copy the value.

There is a deeper understanding you can have: pointers make storage locations first class entities. A common refrain in programming is that if you want to increase the expressiveness and flexibility of your code, you make some concept first class because then you can abstract over it.

In C, storage locations are first class because you can take the address of anything. This lets you abstract over which storage location an operation modifies. Linus's "trick" relies on understanding that using a pointer (to a pointer in this case) lets you abstract over whether the node pointer is the head pointer or a next pointer.
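A tiny illustration of that mental model (the names are mine):

    /* Any int-sized storage location can be handed to the same function,
       because &x turns the location itself into a value you can pass around. */
    static void bump(int *slot) { *slot += 1; }

    static int g;
    static struct { int field; } obj;
    static int arr[4];

    static void demo(void)
    {
        int local = 0;
        bump(&g);          /* a global         */
        bump(&local);      /* a local          */
        bump(&obj.field);  /* a struct field   */
        bump(&arr[2]);     /* an array element */
    }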

If you have Linus's mental model of C, what he's doing isn't clever. It's smart. It uses a fundamental concept of the language to avoid error-prone control flow and edge cases. But if you're only used to using pointers within a handful of prescribed patterns, it likely seems very strange.

I won't make any judgements as to whether thinking of C in this way should be something that people do. Anecdotally, a while back I wrote a blog post on writing a garbage collector from scratch: http://journal.stuffwithstuff.com/2013/12/08/babys-first-gar...

The mark-sweep algorithm isn't super sophisticated, but this is fairly tricky low-level stuff. A lot of people have read it over the years and basically the only negative feedback I've gotten is around where I use the same pointer-to-a-pointer to remove from a linked list. It confuses a lot of people.

So, in my own personal code, I'm fine with stuff like this. But when coding with others, I tend to avoid it.


What really made monads click for me is when I read Monads for Go Programmers [0], drawing a comparison between pointers and functors:

> We can think of a functor as a container, which contains one type of item. ...
> A pointer: *T is a container that may be empty or contain one item;

Despite that comparison probably making seasoned FPers groan, that was a big clicking moment for me.

Thinking along the lines of nullary-function:pointer::return:dereference, frames Linus' abstraction in an interesting light. You're manipulating the "function which yields the struct", not the struct itself. In fact it looks much closer to a map than the cs101 blurb in that now all nodes can be treated symmetrically.

[0] - https://awalterschulze.github.io/blog/post/monads-for-goprog...


Yes, I think that's the key.

I'm not a C programmer, and though I do have a basic understanding of pointers, I still find them rather confusing because of the indirection they introduce. I could follow the first example just fine (because I know how to deal with linked lists from Lisp), but the second gave me rather a headache (two levels of indirection!).

However, I think that once you've really understood the concept of pointers - and I'd expect that from all kernel developers - the second example really wouldn't be any harder to understand than the first. And yes, then it is the cleaner way of doing things, and therefore to be preferred.


I started reading the article agreeing with you. Hearing Linus talk about "taste" made me expect something obscure and esoteric, with a marginal performance benefit, and a subjective/objective debate about whether the value of micro-optimizations at scale throughout the linux kernel adds up to meaningful benefits, and if they are traded-off with less readable code.

Surprisingly, I ended up in a different spot.

I actually think this is a low-level C-version example of the best practice of "use idiomatic language concepts".

When C# first got LINQ, when Java first got Streams, and when both got inline anonymous Lambda functions a lot of old-school developers resisted both.

"The syntax is confusing!". "The behaviour is hidden!". "What's wrong with a good ol'd for/while loop?"

I know because that was my first instinct too.

But I quickly embraced these ideas because they allowed me to write more terse declarative code and spend a larger percentage of my TLOC on business logic. There was a small learning curve for new developers to understand the libraries, but after that everyone was more productive.

I feel the same way about C and pointers and address manipulation magic. No, it has no place in Java or C#. But for C developers, these are their idiomatic principles and concepts. Pretending otherwise, and not leveraging all the capabilities possible with direct pointer and address manipulation, is not utilizing the benefits of C to its full potential.

NOTE: I am not a C developer. This is just how this comes off to me. I'd love to hear from actual C developers if they would say that this is the equivalent of running around with a loaded shotgun and languages like Rust have been designed with solving these things in mind (or something).


I think that you're on to something here.

I write C full time, and have done so for many years. It didn't occur to me that the pointer to a pointer aspect was the source of any of the confusion until just now. Actually, I don't think that I consciously registered its presence when I read the code. There is only one interpretation of the code that makes any sense, and that's at least easy for an expert to see quickly.

I am sure that people that think that the terser variant is overly clever don't get it. To be fair, it's hard to generalize from this one example. But I think that I know exactly what Torvalds intended to draw attention to.


> "which always manifests itself as terseness as if lines of code were expensive)"

They are expensive! With your code maintained years into the future, every developer has to potentially read every damn line.

The alternative to "clever" code isn't Java and FactoryFactoryFactories, it's short clear code, which is different from short codegolfed code. The main difference is nicely designed libraries and abstractions.


For example, right now it's Advent of Code, day 6. The puzzles are still reasonably simple. Have a look around today's Reddit thread where people can post their answers:

https://old.reddit.com/r/adventofcode/comments/k7ndux/2020_d...

These are self-selected people, free to use a language of their choice. Ask yourself some questions:

- Which of these do I think is "most readable" vs "least readable"?

- Which of these am I happy is correct, and bug-free as far as it goes?

- Which of these would I like to make a change to, confident that it won't break?

- Which took the author most / least time?

- If I could only pick one to run on my puzzle input and submit the one answer which came out, which of these would I pick?

- Which of these would I like to maintain long term?

Another way to see the same ideas is to pick a task on RosettaCode - https://rosettacode.org/wiki/Category:Programming_Tasks - and see how much / little code people write in various languages to solve the same problem.

IMHO it isn't "the longest one" that is clearest, by a long shot. Nor the golfiest one. But it tends to be the shorter ones done in languages that have higher levels of abstraction, more libraries, nicer looking naming, more standard patterns.


Your comment about "short" vs "code golf" gets to the point of the comment you are responding to, which is one that Linus is downplaying - not all lines of code have an equal "cost".

Potentially, using a code-golf implementation where you're using pointer-to-pointer to get rid of a single, easily understandable "special case", is more "expensive" in coder-time than just using the standard off-the-shelf linked-list that (even Linus admits) everyone knows from their data structures 110 class. Because now every developer who ever reads that bit of code for years into the future, now has to understand your code-golf solution, how it works, why it was done, and the implications for the rest of the code. Whereas they could probably skim the "standard" solution and say "yes, that is a standard linked list" and move on to solving the actual problem instead of trying to understand the code golf.

In many cases: your clever solution isn't solving a difficult enough problem that it's worth the cleverness, because cleverness is often expensive.

edit: I agree elsewhere that this is probably a question of "vocabulary" and whether a particular bit is "clever" or "code golf" depends on your particular team.


Lines of code are expensive, because we expect human beings to reread them with almost no help from tools. Concise code is a temporary problem for novices, until they grow into experts. Bulky code is an ongoing problem for everyone and we don’t have a solution.


While I mostly agree with you, I think that use of double indirect pointers is rare enough that in effect each use of one probably counts as a "line of code" when trying to understand what's going on.

The one Linus doesn't like likely runs faster (at a microarchitectural level). It's also the thing I've done for 40 years now; it's what comes out of my fingers when I code linked lists. For me at least it's more understandable, because I already understand it.


It's very unlikely it is faster.


A good compiler should keep *p in a register because there are no aliases that might modify it, but a poor compiler might follow p a second time.


Human beings are bad at case analysis, so removing case analysis makes the code much more comprehensible. It's a huge improvement. Dijkstra wrote extensively on the subject and his arguments are compelling[1].

[1] https://www.cs.utexas.edu/users/EWD/transcriptions/EWD07xx/E... and https://www.google.com/search?q=site%3Ahttps%3A%2F%2Fwww.cs.... for more


I think your problem is with C as a language, which can look really ugly to the untrained (or trained) eye, and is way too permissive in terms of what you can do with pointers and memory in general.

But I see this as a paradigm that you can apply in more programming languages: you should aim at writing code that removes edge cases. This is not about code reduction in my opinion; I would have been equally happy if the code had been longer. It's about more secure code, because there are no edge cases to think about.


I find the "elegant" version easier to read and understand. But I think it also requires the reader to understand pointers pretty well, which, frankly, the majority of developers (who tend to work with interpreted or GC'd languages that don't have pointers / pointer arithmetic) probably don't have much familiarity with.

If a beginner C programmer reads this code, they might not understand until they have a little more experience under their belt. But I think that's ok; if we limited ourselves to writing code that beginners in the language can easily understand, we're going to miss out on a lot of important, useful techniques.

I do absolutely agree that cleverness should be avoided. We should optimize for later readers of our code. But I don't really see this as falling into the "clever" camp, at least not in way we mean "unmaintainable code".


Fewer lines aren't automatically better, but fewer lines means less code to maintain. More often than not, fewer lines are easier to read too.

The example here is a good one. The shorter version reads much better and is more obviously right.


Having been programming for almost forty years, in everything from machine code over C to functional languages, I do not find the shorter version "more obviously" right. It requires more commentary to explain here and depends on pointer manipulation above and beyond understanding the structure of the list and the actions of the algorithm. It may be easier if you think about pointer manipulation all the time, but I would consider such code a land mine, just waiting for someone to accidentally change something without enough understanding.


I have one to bring up: "for loops".

I think they are infinitely worse than while loops. (I ran into this specifically with Pandas dataframes/series)

You save 1 or 2 lines and have ambiguity on wherever "each" is.

Maybe static typing solves this, but I've decided I hate syntax sugar and dynamic typing.


Although for loops might all be while loops at heart, I think the distinction is more than just syntactic. Properly used, the use of a for loop conveys particular information: the code block is intended to run a set number of times or over a particular group of objects. This allows while loops in general to be used specifically for running the code block an indefinite number of times until the criteria is met. (Certainly both types of loops can be misused and abused.)

It also splits apart two different kinds of criteria. "Has this run ten times" is a different sort of question than "is the error within the given margin" or "does the temperature now read 60 degrees". (At least, if you're intentionally running the thing 10 times. If it's running incidentally so you don't even know if it will run 10 times then it can be the same sort of question, but in that case I'd argue you should use a while loop….)

> You save 1 or 2 lines and have ambiguity on wherever "each" is.

Like the sibling poster, I'm not certain what you mean by this.


You might have a good point, but you haven't given enough code to express your idea.

What is "each" ?


It seems to me that what is at issue here is the difference between easy to understand and simple. They're not always the same thing. Simpler code tends to be easier to audit for bugs (as a practical matter).


It's not cleverness for cleverness sake, it reduces complexity, and yet the code is still self explanatory. The extra level of indirection means one variable can be used to access what was previously stored in two. The only confounding part of it is that this didn't become the canonical way to manipulate a linked list over the naive implementation.


I think a lot of the reason it looks more "clever" than elegant is that C's syntax makes the "get the address of this field in a struct" operation so hard to read at a glance.


It's a lot easier to "get the address of this field in a struct" in C than in Java or Python or JavaScript.


Not really, but the real question is why would you need or want to in those languages?


in a lot of other languages you would be passing things by reference; the concept of an address would not be needed to implement a linked list.


Sure, in those other languages you could just define a single-field class to wrap the reference to the next node, since you can only pass objects by reference and references themselves are (usually) not objects. Unfortunately this adds a lot of overhead, both in lines of code and runtime. Instead you would probably just use the less efficient algorithm with more special cases to work around the lack of indirect references.


Only in this case the second solution is clearer, and not clever in the "old school Perl" way...


I was about to post a comment, but I think someone probably posted this view already, and yes. When I quickly take in the visual shape of:

  block
    prev = cur
  /block

  if(prev)
it immediately signals bad taste, especially to those who come from writing expressions rather than statements.


On the other hand, "clever code" repeated enough times is a "design pattern."


I totally agree, it’s about proving you’re clever rather than communicating ideas. Maybe for something hard like Linux kernel development this is a good thing but in most cases it just leads to messy code that people can’t understand or follow.


While there certainly are such cases, I think here it’s just about being proficient in a language. Pointers, referencing and dereferencing are the bread and butter of any C code, and applying them in a way to reduce complexity is certainly something to strive for - if this isn’t readable then I’d argue the reader shouldn’t be touching the codebase anyways.


You are right, for a C programmer this is bread and butter. The readability of the example would increase a lot with better-named variables, however.


Fortunately or unfortunately you don’t always get to decide who touches the code, so unless performance is a concern, as it is in kernel development, optimising like this in the day job will eventually lead to people making mistakes. Basically I say be as explicit and clear as you dare, even at the cost of some CPU cycles.


Adding extra dereferences is rarely performant. The example is written this way because it's more elegant, not because it's faster.


I've written code like the first example many times, and I always felt a little "dirty" for doing it - I could tell there had to be some way to avoid treating head as a special case, but I was too lazy to actually work out how. The author of the second example may be "clever", but he also didn't give up.


I was only ever taught the second, more concise one, so this discussion is kind of confusing. How is it not shorter and more explicit?


Many programmers who don't use C are afraid of pointers and prefer superficially simpler (but overall more complicated) things like Java's abstraction towers, despite the extra expense.


To be fair, many programmers who do use C have that character flaw as well.


Lines of code are expensive! They're the only thing that's ever been shown to be correlated with the number of bugs in a given codebase.


I completely disagree. Lines of code, long variable names, and meaningless hierarchies are some tasteless taxes. Some people like each of them(!), and I don't get it.


Why are long variable names a tax when there are such good auto completion tools? I get short variable names in algorithm code but in business logic long variable names can make code vastly more self-documenting.


A tool can write the name for you once, but you have to reread it many times and we don’t have a tool to help with that.


That's not how reading works. You don't parse every character like a computer does. You look at the starting letter (maybe the ending one too) and then recognize the shape of the word used. There isn't too much difference in reading a long or short variable name as long as there aren't variable names that are too similar to one another.


Actually what you need to recognize are expressions, and expressions with short variable names are much easier to recognize. For example, if I write:

  theDependentVariable = theCoefficient * theIndependentVariable + theIntercept
it is much harder to recognize than if I write:

  y = a*x + b
So, longer variable names might be "autodocumenting" but they also make code harder to read.


I think you're offering a very particular case with a well-known idiomatic presentation and trying to apply it to the general case. In this case a, b, x, and y are single letters and also good variable names for their purpose, because of their long, familiar use in mathematics. But calling your main GUI window "m", your data "d", and your server connection object "b" doesn't make for readable code.


Yeah, letters don’t have much of a cost, but extra words do, which is why people complain so much about needing to plow through SimpleBeanFactoryAwareAspectInstanceFactories.


Yes, "as long as there aren't variable names that are too similar to one another"...

It isn't safe to trust that to be true. That way lies bugs, including security holes. You have to carefully check the names.

EthAccHdlrSubsMacPhysRegWrite and EthAccHdlrSubsMacPhyRegWrite

Shorter is better, within reason.


Maybe slightly off topic, but..

I don't think this is really about taste. It's about "common ground" between programmers. If you've seen and used the second pattern many times before, it becomes part of your vocabulary. You're then able to express yourself more succinctly. It is then obviously a better solution to you.

If you see another programmer use the same pattern, it is common ground between you, and you like it. As long as every or most programmers on the team share the same common ground, you are more effective for using it. If you don't share it, you're less effective.

Everyone here commenting that they like the first solution better probably doesn't share the necessary common ground with Linus.

The big question is: what common grounds should you expect when writing your code?


Also it's the other way around! Using common idioms builds a culture of how we structure our code. Then that common culture helps make the idioms part of the shared vocabulary.

Working with such a large piece of code as the kernel, it's invaluable to have a certain sense of shared ideals about how the code should read, and it's probably a good thing to have maintainers who nitpick about such details.

People sometimes say things like "just a matter of taste" as if that somehow made it less important, but taste is probably the most important trait we can share when programming.


this is the big point. when coding alone I strongly prefer Linus's version.

but it seems to cause friction and get called out whenever I use it in a group environment. so I don't. or I change it.

the code itself isn't important. it's about the functioning of the group, the velocity, and the ownership.


Well, yeah. This sort of code acts as a natural filter.

Someone who finds it difficult to learn or to use may not be ready for kernel development. So it works as expected on more than one level.

The same goes for intrusive containers and a few other things that many people learn in school, but get confused by at first in practice. If this creates friction at the group level, then the problem is with the average group skill rather than the code. And it's the former that (ideally) needs addressing. Just like you wouldn't use bubble sort because the "group" has trouble understanding the alternatives.


> the code itself isn't important. it's about the functioning of the group, the velocity, and the ownership.

It isn't either/or. It can be all of the above, to varying degrees, that change over time.


The same argument was made recently about the reduce function: it's a known pattern for some and a hard to understand trick for others.


Yes, and you get used to reduce. map, filter and reduce were hard for me to follow… until I used them two or three times and now I understand them easily and they naturally come to my mind when I need to solve a problem which they can help solving.

They are good tools (especially in codebases shared with people who are allergic to loop statements!)


Agreed for map, filter etc. but reduce is the exception. There has been a fad of overusing it, until that article popped up which called it out.


I "got" reduce not long ago. It's occasionally handy, and it makes sense, but sure, it should not be overused. It is the most complicated between the three.

A good old loop, in codebases that are not afraid of them, is totally fine and probably clearer in many cases.


How I wish more programmers would understand this. It's like learning to eat broccoli. At first, it's bitter and unfamiliar, but if you are exposed to it, you learn to recognize it and like it, and it's actually good for you despite the initial unpleasant taste.


Like others here, I prefer the first.

The first one -- I read it, I know what it does, it seems intuitive to understand, and I expect it to be bug-free especially because the edge case is explicitly accounted for. If someone else has to modify it later, I'm not particularly worried they'll mess it up.

The second one -- it took me about 4x longer to understand what it does. It works too, but it doesn't match how my brain naturally thinks about it, so it leaves me with the uneasy feeling "what if there's a bug?" and I'd be more worried someone else would introduce a bug if they had to modify it later.

I don't want "elegance" or "good taste" in code. Unless it has a good reason to be hand-tuned for performance, I want code that is written the way you'd expect an average programmer to write it. Nothing clever, just a straightforward translation of requirements into code. Not "transforming" them into something more "elegant".


From watching the Ted talk, I think that Linus was using a cs101 example to effectively communicate to a large audience of programmers about good design for system level work such as for an Operating System.

His example, I think, can be extrapolated to explain the design of Linux’s “clone” system call for threads, which creates a new process that uses the same virtual address space as the parent process, with a different stack location; but more importantly, those “threads” are scheduled by the OS’s scheduler like any other process is. I’m unaware of any other OS which implements threads this cleanly.

From his talk, “Sometimes you can see a problem in a different way and rewrite it so that a special case goes away”

https://eli.thegreenplace.net/2018/launching-linux-threads-a...


> I’m unaware of any other OS which implements threads this cleanly.

I believe this comes from Plan 9.


> so it leaves me with the uneasy feeling "what if there's a bug?"

The second solution takes me a bit longer to wrap my head around, but once I do, I like it more, because there's fewer conditionals, and so there's less chance that the function breaks on unusual conditions.

Using fewer local variables is also a nice plus.


Consider though you would need to extend both versions to use doubly-linked lists.

I would much rather extend the latter than the former. (Though it could be I've used a similar solution in the past.)


In case of the optimized version, I would have liked to see comments explaining how much faster it was vs. the obvious one in a profiling experiment, and see it accompanied by a unit test to take care of the "what if there is a bug" concern.


I've been teaching CS101 for two decades. I've never shown (nor seen another instructor show) a two pointer implementation for list traversal, except perhaps as an example of bad practice (probably by someone who doesn't understand the syntactic sugar of the -> operator). "Every pointer to node is a list, including the trivial case of a null pointer being an empty list" is the way I was taught in CS101 in the early 80s, including a one-pointer implementation of this algorithm in Pascal.

I suspect this might be a case similar to that of the "Waterfall Method", where we have this unexamined belief that in olden times or academia they just weren't capable of comprehending really obvious things. CompSci instructors, and most of their students, are quite capable of recognizing that two pointers aren't needed, if only because every algorithms textbook they've ever read discusses it.

I think that this might be like Bubble Sort: it's discussed in class as a way of illustrating a point. In students' fuzzy memories of school, they remember that bubble sort and 2-pointer traversal were taught, but they forget that they were taught only to show why selection sort and single pointer ("look ahead") traversal are better algorithms.

As a side note: I don't like the "elegant" version's use of double indirection. It's not necessary if you return a pointer instead of void, which is also nice because you can treat calls to your list functions as lists themselves and chain them together. It also allows you to avoid warts like `p = &(*p)->next;` at the modest cost of needing `p=remove(p,t)` instead of `remove(p,t)`.

Finally, the use of two structs is unnecessary. IntList is just a wrapper around a pointer to IntListNode. Why not just use a naked pointer to IntListNode like the gods intended?


> As a side note: I don't like the "elegant" version's use of double indirection. It's not necessary if you return a pointer instead of void, which is also nice because you can treat calls to your list functions as lists themselves and chain them together.

Agreed.

> It also allows you to avoid warts like `p = &(*p)->next;`

Does it? Isn't having `p = &(*p)->next;` in the while loop necessary?

> at the modest cost of needing `p=remove(p,t)` instead of `remove(p,t)`.

I may disagree since an API is used many more times than the API function is written.


The & isn't needed if you're returning a pointer and no longer need a pointer to pointer to return the list as an argument. Also, if you get rid of the struct containing a single pointer and just use pointers, you can have the following instead…

  p = p->next;

As for the API, returning the pointer allows a function to be an argument for another function, which I find more flexible:

  p = remove(p, find(p, value));

I also like the option to be able to do stuff like:

  p = remove(remove(p, t), t); // remove two

or:

  if (!remove(p, t)) { /* that was my last node */ }

But that may be an artifact of my preference for functional programming.


Nod, nod, nod. Thanks for clarifying :)


> Finally, the use of two structs is unnecessary. IntList is just a wrapper around a pointer to IntListNode. Why not just use a naked pointer to IntListNode like the gods intended?

So Say We All.


Without the container struct, you'd have to pass the pointer to the pointer of the first element, remove(&head, elem), or assign the return value back to it, head = remove(head, elem). A container also allows you to maintain a counter and a tail pointer, for fast append and insert. Above all, a container makes it harder to return a pointer to just somewhere in the middle of the list and treat that as the head.


Semi-relevant side note: Within some constraints it's possible to remove an element in a singly linked list without walking the list. The purpose of walking the list is not to find the element (which you already have) but to find its previous element so you can fix the "next" pointer on the previous element. What you can do instead is to copy the data from the next element onto the one to delete, making it a "copy" of the next one. And then delete the next one after fixing the pointers. Hard to say in English but here's the idea:

    remove_list_entry(entry)
    {
        next = entry->next;
        entry->data = next->data;
        entry->next = next->next;
        free(next);
    }
You do need to handle the case of deleting the last element in the list. That can be done by keeping an EOL element at the end just for this purpose. The real caveat is that it breaks external references to list elements. If you are managing the list yourself however and can deal with these issues you can avoid list traversal (or save a pointer on every list element vs a doubly linked list). Just something to keep in your bag of tricks.


Neat idea, but yeah, like you say, you break referential integrity pretty hard with this solution. And tbh I feel like everyone prefers tons of local links in their data; since we have memory to spare for wasteful representations, we pick the convenient, link-heavy one, and in that world this algorithm variant does not apply.


To relate this topic back to Computer Science and Math, the linked list traversal problem can be related back to Proof by Induction -- There's a base case (a starting condition) and then the inductive steps: "...proves that if the statement holds for any given case n = k, then it must also hold for the next case n = k + 1"

We can prove using proof by induction that Linus's implementation works because of the base case "indirect = &head", and the inductive step which is the rest of the program.

It'd take a lot more work to mathematically prove that the branching program "works" due to the if statement -- you'd have to construct a proof of each branch and prove that the combination works.

I'm not a mathematician, but this was a major point in CS courses at university. Being able to prove an algorithm works can extend into automated program checking / auditing, etc.

It's not just "taste" -- there's a lot of practical theory to be applied.

https://en.wikipedia.org/wiki/Mathematical_induction


I think this hits the nail on the head. Lots of comments here seem to dislike option 2 on the basis that it’s a “clever trick” and that those should be avoided for reasons of readability and maintainability. But this second approach would be the _right_ answer in an exam about algorithms, and that’s what Linus is really hinting at here.


Indeed. It also shows us that relying on "good taste" or "code smell" arguments is wrong, because there should always be a good technical reason to justify the choice. Sometimes we have the right intuition and the technical reason is hard to express, but we have to try to understand why we think something is good taste and express it in a technical way.

Arguments on good taste will always finish in ego fight when people disagree.


Although I liked the 'elegant' code after reading it, in general I would try to shy away from 'elegant' solutions that are actually harder to understand. Unless you are working with high-performance programs, most of the time you don't care about one or two extra branches. It's usually more beneficial to write code that can be easily understood by a future-you or by another person in your team: less time invested in understanding code, more time dedicated to actually fixing issues that matter. If performance becomes an issue, you'll catch that in profiling easily (because everybody here is profiling before trying to improve the performance of a piece of code, right?).


I think this particular case has two reasons the optimization makes sense; first, it's some core kernel code that possibly gets called a ton - so high performance code. And then second, profiling this kind of code is not easy; I don't know what tooling there is nowadays but when I was playing with Linux in the 90s each iteration would involve a reboot, and staring at printk output. So there is a tendency to write code the author thinks is faster - which may or may not be true but is probably likely for experienced developers.


There's usually a trade-off between ease of understanding and elegance, because elegance usually means "works very well given advanced understanding of the domain, task and tools". You can usually shift the problem a bit and get a bit more of both elegance and easy understanding with a good comment giving the general idea. That isn't to say things shouldn't be laid out and named sanely to make things obvious where possible, but any time you make assumptions about what someone else looking at your code knows or is thinking, you're opening up the future for more bugs. It makes sense to attempt to limit that in some respect.

The fact that we have all looked at code and not known WTF is going on, stepped away, come back a little later, and found it obvious should be all the evidence needed not to assume too much about what some other programmer will understand just by looking at the code itself, especially elegant code.


The more succinct version of that code is easier to read and review, and it makes the code surrounding it easier to read and review. The extra conditional blocks are superfluous and take up a lot of lines. This is most definitely not about performance but about coding style.


In general, this is good practice for working on a team.

However, given that we're talking about Linus Torvalds' opinions on what makes code 'elegant' or 'good taste', I'd say it's implicit that we're talking within the domain of low level, high performance code, such as kernel code.


I think what's missing in the mental model of the elegant solution is that the IntList pointer does not point to a complete list or an element of the list, but to a tail of some list (which might be the complete list). This explains why the head is not really a special case: it just points to the tail that happens to be the complete list. And you can just exchange a tail.

So I think the image with the blue boxes is misleading and it should be

->[4->[12->[3->[6->[2->[]]]]]]

It is now obvious that you can always point -> to a different [...]

And as it turns out, that is basically the Cons/Nil view of a list from functional programming or lisp, if you're so inclined. And in those languages you would pattern match on the constructor once and do the tail-recursive call.
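For what it's worth, a rough C sketch of that view (an invented "cell" type, not the article's code): every pointer is "a list", NULL plays the role of Nil, and removal recurses on the tail with no head case. The caller writes list = remove_cell(list, target), which is exactly the "exchange a tail" step.

    struct cell { int value; struct cell *next; };   /* NULL is the empty list / Nil */

    struct cell *remove_cell(struct cell *list, struct cell *target)
    {
        if (list == NULL)
            return NULL;                               /* Nil: nothing to remove  */
        if (list == target)
            return list->next;                         /* Cons(target, xs) -> xs  */
        list->next = remove_cell(list->next, target);  /* recurse on the tail     */
        return list;
    }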


Right, I think the key insight is that the special case is simply unnecessary if you have the abstract model of the data structure correct. A linked list is just x::xs.

The C code being weird is more of an implementation detail that shouldn’t get much focus.


These functions can be thought of as two steps:

1. Find the thing to update

2. Update it

Linus's version does just that.

But the first version is more complex because step 1 instead emits something that may or may not be the thing to update, and so step 2 has to reason about that. Additionally, it introduces a bookkeeping variable -- cur -- which becomes redundant before step 2 (by which point it's equal to target).

IMO Linus is right here. The form of his solution directly matches the problem and allows you to look at the code as two clean steps -- no reasoning about the output of the first step and no bookkeeping cruft variable that you have to ignore or remind yourself that it's the same as another variable after some point in the function. At least once you're used to working with double pointers I think that's a much easier function to read.


Most people prefer the first over the second. But I think that Linus really should prefer the second over the first. Let me try to explain why.

There is a well-known saying attributed to David Wheeler, "All problems in computer science can be solved by another level of indirection." Except the problem of having too many layers of indirection. Also, both quotes are often seen with "abstraction" instead of "indirection".

There is a not well-known saying that you can attribute to me, "Any method of abstraction that you have internalized becomes conceptually free to you." (So for the sake of others, choose wisely which you will expect maintenance programmers to have internalized!)

The key to the elegant solution is understanding how to manipulate data through pointers.

That makes the elegant solution inappropriate to use in CS 101. It involves a method of indirection/abstraction that is emphatically NOT free to beginning programmers.

It also makes the elegant solution inappropriate for most people on HN. We do not directly deal with manipulating things through pointers very much. Therefore most of us have not internalized how to do that, and the technique is very much not free to us.

However Linus is a kernel developer. Anyone maintaining his code will also be a kernel developer. Kernel developers have to internalize how to handle manipulating data through pointers because their APIs require it. For example look at https://www.kernel.org/doc/html/v4.14/filesystems/index.html and see that pretty much every function gets a pointer to a data structure, and then manipulates that data structure through the pointer.

Therefore every kernel developer should internalize how to manipulate data through pointers. And the elegant solution therefore becomes not just less code, it becomes conceptually simpler! And yes, any time you can replace a block of code with less code that is conceptually simpler, this shows good taste.

BUT, and this is very important, it is only conceptually simpler if you've already internalized concepts around manipulating data through the indirection of a pointer. Which makes it conceptually simpler for kernel developers like Linus, but not for most programmers in other languages.


You make a very good point: it has to do with the relative foundational conceptual ability of the respective maintainers.

That's great because it kind of speaks to the crux of the problem.

But the first solution does use pointers :)

The second uses double pointers.

I'd argue that 'even kernel maintainers' may not be so easy with the second in reality.

It's probably worth it if there is a performance gain, because it's so low level. But not otherwise.

But good point.


The simple solution is faster.


Which solution do you think of as simple?


The CS101 version is faster (and smaller); you can try it yourself with gcc.


The second version seems more elegant but will scare non-C people away with all those pointers ;-)

What I do not understand is why one should use an "IntList" struct in the first place? As the explanation of the second method suggests, a List is the same thing as a pointer to its first element, so why not do this?:

typedef struct IntListItem* IntList;

Also, could it be that both methods fail terribly (infinite loops?) when they are given wrong input such as elements not in the list at all?


> Also, could it be that both methods fail terribly (infinite loops?) when they are given wrong input such as elements not in the list at all?

You could just say "has undefined behavior unless target is an element of the list".

Whatever your choice though - this is slide code. It should be obvious that it's not production ready.


Typedefing pointers was done so Pascal programmers didn't have to be scared by C syntax. This is why Microsoft went overboard with this practice. It unfortunately hides the language for no real benefit.


Um, yes. That jumped out at me. If there's a no-find, the code will de-reference null and crash. That's just not acceptable. Try to write that in Rust, using Some(ref) for the forward link, and the compiler will force you to test for None and detect the end of the list.

This is pre-1990s programming style. I've seen such code in assembly programs. Because I was reading crash dumps where it failed.


It's 100% acceptable. You should not remove something from a list unless you know it is there. Since you won't attempt to remove something that doesn't exist, any code to check for that situation is an unjustified performance loss.

(separate code exists for searching a list)

This... is not Python. C programmers, particularly kernel developers, have that style even in 2020.


I hope not. Kernel programmers need to be paranoid.

This is a classic idea for doubly linked lists. The empty list has the head element linked to itself in both directions. Buffer rings are sometimes organized that way. The cases for doubly linked lists are messier.

Bear in mind that in modern CPUs, branches to nearby code are almost free, but indirection to far memory is expensive.
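A minimal sketch of that self-linked arrangement (invented names, not the kernel's list.h): the head is a node that starts out linked to itself, so insertion and removal never branch on emptiness.

    struct dnode { struct dnode *prev, *next; };

    void dlist_init(struct dnode *head)      /* empty list: head points at itself */
    {
        head->prev = head;
        head->next = head;
    }

    void dlist_insert_after(struct dnode *pos, struct dnode *n)
    {
        n->prev = pos;
        n->next = pos->next;
        pos->next->prev = n;
        pos->next = n;
    }

    void dlist_remove(struct dnode *n)       /* works for any node, no head case  */
    {
        n->prev->next = n->next;
        n->next->prev = n->prev;
    }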


I used to use Linux's list.h quite a bit; that is the "good taste" implementation, where the head is the same as the elements. My only problem with that implementation is the fact it is non-typed. Heads are generic, and the code using them has to use container_of() macros to recover the containing type.

I've since discovered bsd/queue.h [0], which is very similar in purpose but is not "good taste" (which I don't mind at all); on the other hand it is type-safe, has quite a few variants for single and double lists, and, oh, also it's not GPL.

[0]: https://github.com/freebsd/freebsd/blob/master/sys/sys/queue...


It's also worth noting that in sys/queue.h double-linked containers have an elegant trick where 'prev' is not the traditional pointer to the previous element. Instead, 'prev' is a pointer to a pointer, it's an address of 'next' pointer in the previous element (or the head).

As a result, remove_element() is as simple as *(e->prev) = e->next;
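Roughly, the layout looks like this (field names invented here, not the actual sys/queue.h macros; if I remember the macros right, the real LIST_REMOVE also patches the successor's back-link, which the one-liner above glosses over):

    struct elem {
        struct elem  *next;    /* next element, or NULL                        */
        struct elem **prev;    /* address of the 'next' pointer (or the head
                                  pointer) that currently points at us         */
    };

    void remove_elem(struct elem *e)
    {
        if (e->next)
            e->next->prev = e->prev;   /* successor now hangs off our slot     */
        *(e->prev) = e->next;          /* whoever pointed at us skips ahead    */
    }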


The following is the code I wrote before reading the example pieces of code.

    void remove(IntList* l, IntListItem* target) {
      if (l->head == target) {
        l->head = l->head->next;
        return;
      }
      IntListItem* prev = l->head;
      while (prev->next != target) prev = prev->next;
      prev->next = prev->next->next;
    }
Skimming the comments here, I was surprised not to see an equivalent piece of code mentioned. To me my code is more readable than both of the first and second examples presented in the article. Does that mean my taste is peculiar?


I kinda like this version best. It makes clear we're updating a ->head pointer in one case and ->next in the others. I like the elegant version, too, since it can update both kinds of pointers in one fell swoop, but you have to grok that p starts as a head pointer and later becomes something's next pointer.

I'd say C syntax for double pointers is a lot less kind than the syntax for single pointers. Your version thankfully lacks any line like so, without making me think about parens

    p = &(*p)->next;


This is exactly the right answer.

There are two cases here (1. when the target is at the front of the list and necessitates changing the head of list, and 2. when the head doesn't need to be changed.) You handle both the cases separately.

Both the classical and the "elegant" versions are worse than this one.


Linus' point is that this way of thinking produces this "edge case".

There is no "head case", all members of the list are the same. The head isn't a special element.


Your version does a lot more memory writes.


The trick employed by the "good taste" solution is to ignore the type of the object containing the pointer, which is different for the head, and focus only on the type of the object pointed to, which is the same for all the pointers. This frame of reference shift makes the solution look tricky to people unaccustomed to working with pointers, or languages in which pointers are not explicitly represented. It is also not quite true, as the pointer in the last element in the list does not point to a list element.


I've seen the "better" version a few times before. I wrote something equivalent from first principles.

It is NOT better. It is much more difficult to understand. Software engineering is about making the code intelligible for the people who follow you. The simple two pointer with conditional is MUCH easier to read and understand.


Author here.

To add context that seems lost on some: this is a cleaned-up version of my own notes from when I tried to understand the technical detail behind what Linus called "good taste" in the TED talk.

The main contributions of the writeup (if any) are the two conceptual insights that using an indirect pointer yields a homogeneous data structure and a convenient handle to the list item and its predecessor.

The article is not intended to show an example of clean code, there is no checking for NULLs, there are implicit expectations (the target needs to exist). It's just not the point.

I also strongly agree with the sentiment in the discussion that simple is often better than elegant. If it takes an entire article to figure out what's happening, that says something about how careful you should be with putting it in production code.

Anyways, thanks everyone on HN for a great discussion and for all the insights, comments and suggestions!


FWIW, I think you get way more correct than some of the comments here give you credit for.


I gave a lot of thought on that example when I first watched the video many many years ago. And I think it all comes down to experience and it has far less to do with good taste. To a junior developer the first solution is perfectly valid and easy to read for everyone. And it is true to a certain degree. But it takes some experience and sooner or later you start getting this feeling that doing something like this probably has a simpler and easier solution, for after all, this is a common thing. Once people wrap their head around that concept(the "surely I'm not the only one that's faced that" thought), they start finding it very easy to jump between languages and learn new ones with ease.


Another aspect is that as a codebase develops, memes inevitably appear; patterns and strategies that replicate themselves throughout the codebase.

The elegant utility may be difficult to deal with standing alone, but when you see it in 2, 3, 4 different places (by the same author, or mimics) it becomes normalized, and once normal there’s really no reason not to use it.

The main problem with clever code is when it stands alone. Which is also the context in which these debates occur.


Can't forget the Linux kernel VFS interface of 20 years ago, designed by Linus. It was one of the worst abstractions I ever worked with. Linus is a genius, but not the kind of genius that is able to make things obvious and clean.


Some would argue that the git UI falls into the same category.


I used to ask during interviews to just describe what a linked list was and less than half the candidates for mid/senior positions could. I think it's a really revealing question. Same with hash tables.


Switch to B-Trees if you ever need to reject all candidates :P (And astonishingly, a comprehensive description of them including the tools for complexity analysis in a form practical for real-world implementers - not just CS student writing toy exercise projects - is very hard to find.)


The "avoid special cases" argument comes up a lot in math too. For example in combinatorics, some people define 0^0 = 1, so that lots of formulas with "if x == 0 then this else that" just collapse into "that". (Other people insist on inventing a different symbol for this "special exponentiation", but I think that's unnecessary myself.)

For the same reason, an empty sum is defined as 0 and an empty product as 1, so your base cases of some inductions don't require an extra "if".
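For a concrete instance (just to spell the argument out): taking 0^0 = 1 makes the binomial theorem

  (x + y)^n = C(n,0)*x^0*y^n + C(n,1)*x^1*y^(n-1) + ... + C(n,n)*x^n*y^0

hold verbatim at x = 0, where every term with k >= 1 vanishes and the k = 0 term contributes exactly y^n; likewise the empty-sum and empty-product conventions make the n = 0 base case come out right without an extra "if".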


I have an odd structure that links objects into "boxes" where each object can be in more than one box and each box can contain multiple objects. I made a struct to link an object to a box via a pointer to both the object and the box. These links are each part of 2 lists, one from the object and one from the box. The main job is to find all the objects in a given box, which is done by following the list from the box. To delete an object involves following the list off the object and removing all the links - each of which is in a list from a box. Each link needs two next pointers since it's part of 2 lists. It also needs a PREV pointer for the list from the box for easy deletion. I originally had an empty link object within the box object to be the previous node, but then realized a better way was to have the PREV pointer point to the previous NEXT pointer which means a box only has a pointer instead of a link. This pointer is why I decided to post.

Lastly my link object contained 5 pointers, so I XORed the object and box pointer to cut it down to 4. I always start traversal from one of those, so the XORed value can always be used to reach the other. This did not really impact performance much.
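If I've followed the description, the link struct ends up roughly like this (a sketch with invented names, not the actual code):

    #include <stdint.h>

    struct Link {
        uintptr_t     obj_xor_box;   /* (Object* XOR Box*): starting from either
                                        endpoint, XOR with it recovers the other */
        struct Link  *next_in_obj;   /* next link on the same object's list      */
        struct Link  *next_in_box;   /* next link on the same box's list         */
        struct Link **prev_in_box;   /* address of whichever pointer points at us
                                        in the box list, so the box itself needs
                                        only a plain Link*, not an embedded Link */
    };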


This certainly falls under Rich Hickey's definition of "simple", which is one virtue. But:

> it is not immediately evident how the more elegant solution actually works

another virtue is ease of comprehension, and the more elegant solution lacks that in my (and seemingly Linus') opinion. Maybe if you're used to working with pointers to pointers you might have an intuition for them, but I at least had a difficult time gaining an intuitive grasp on the second solution, which could potentially nullify the bug-resistance of having fewer branching cases. In short, calling the second one objectively better is overstating it I think.

It's worth noting that in a language with union types, you can have the best of both worlds (using Rust here because it's the one I'm most familiar with):

  enum LinkedList {
    Null,
    Node { value: i32, next: Box<LinkedList> }
  }
In the same way that the pointer to a pointer homogenizes the head-case with the rest, a union type means that any given linked list "is just a node", and the head can therefore be treated the same way as any later node


Wouldn't you end up with a Box of Null? Wouldn't it be better to:

    struct Node {
        value: i32,
        next: Option<Box<Node>>
    }


Yeah that's true, but the problem with your version is it loses the homogeneity, because an empty list can't simply be a Node; it has to be represented in some other way (as an Option<Node> or something), so we're back to square one. Maybe I was wrong and this can't be done perfectly even with union types.


In that case wouldn't you just:

    struct List {
      head: Option<Node>
    }

    struct Node {
      value: i32,
      next: Option<Box<Node>>
    }


Yep, but this replicates the original inelegancy (from the article): Option<Node> and Option<Box<Node>> are not the same type, and so they can't be treated the same way.


I guess you could just allocate the head node on the heap too.

I think rust probably has better tools for this.

Probably some sort of

   // assuming an outer `node: &Node` with `next: Option<Box<Node>>`
   let mut node = &node.next;
   while let Some(n) = node {
       node = &n.next;
   }


Which one required writing an article about to explain?


In most cases clarity should win out over succinctness. (Sometimes succinct is more clear).

I absolutely prefer the first one in almost all cases, and would probably reject the second one on a code review.

Unless we're dealing with such a core, hyper-sensitive part of the system wherein the compiler would not find rough equivalence anyhow, and the material gains from supposedly 'fewer instructions' would be better.

i.e. a pragmatic performance optimization that was realized in the real world, due to the pervasive utilization of the code ... this would be acceptable.

But for the vast majority of what we do, this won't be the case.

Double-pointers are like flame throwers - they are very 'cool' to some, and technically, they do 'burn things faster', but they are just excessively dangerous and almost assuredly not the right tool. Unless they actually are, in which case you get to be the dude who uses the flamethrower, but again, that's rare.

Reading this, it seems more clear to me why git has such tremendous - and mostly unnecessary - UX problems. There's a dimensionality of the craft being ignored.


Then there was the day I wrote a function that started off like this:

    char *fn(char **foo, char ***bar){
        char *baz = *++*foo ? *foo : *++*bar;
Double-pointers are just normal. The strtol function has one.
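For a familiar reference point, strtol's endptr parameter is exactly such a double pointer; a small usage sketch:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *s = "42 and change";
        char *end;                        /* strtol writes the stop position here */
        long v = strtol(s, &end, 10);     /* pass the address of our char*        */
        printf("value=%ld rest=\"%s\"\n", v, end);   /* 42, " and change"         */
        return 0;
    }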


While that function may have utility in some context, I would immediately assume that there's something existentially wrong with the context in which such a function was needed.

Ok, it's possible the entirety of it makes sense, but given that C/C++ is full of absurd shenanigans, I think odds are something is wrong with a system that needs that kind of function in the first place.

That such things are common enough doesn't make them a good practice.


I knew I had seen this code explained somewhere years ago and suspected Slashdot. Search for "favorite hack":

https://meta.slashdot.org/story/12/10/11/0030249/linus-torva...

Here actually using "pp" ;)


This sounds like a classic case of "simplicity is in the eye of the beholder".

If you're someone who has experience with, or can easily grok concepts such as "pointer to a pointer", then the second snippet seems obviously simpler. After all, there are fewer branches to consider.

Unfortunately, as someone who stopped working with pointers a long time ago, my mind has to work in overdrive to understand any implementation that relies on a "pointer to a pointer". Hence why the first implementation is far easier for me to understand.

I'm sure we'll see many debates around which solution is simpler, but these debates will never reach a resolution. Depending on the skills and experiences you bring to the table, different people will objectively benefit from very different implementations.

https://software.rajivprab.com/2019/08/29/abstractions-are-i...


I've seen this presentation, and if I remember correctly, this example was more an attempt to convey what good taste can relate to, more than a definitive answer on whether the second is actually better than the first.

Many comments here are arguing that the first answer is actually better because it is clearer. I think it feels clearer to many of us not because it is actually simpler, but because it is the one we learnt at school / are used to seeing and therefore know by heart.

Linus is probably aware of this and may have meant to surprise the public with the second solution.

Why is the second solution better? Not because it does less branching and is more efficient. This misses the point. Not because there are fewer lines of code and it is terser. This also misses the point.

Fewer special cases means fewer ways to screw up, and code that is easier to follow. In the general case, not only in this specific linked list example. And also clearer code.

It's just that in this specific case, we are so used to the first solution that we are able to recognize it at a glance (we "pattern-match" it).

Do you remember when you had to grasp this first solution the first time you encountered it or tried to write it? Many of us probably screwed it up and wasted time making it work. We might have forgotten the exact edge case Linus Torvalds was pointing out in this presentation. At least for me, I remember it was hard. I probably would have had an easier time understanding the second solution, by the way. The difficulty is a pointer indirection, but you had better really understand pointers when you are manipulating linked lists in C anyway.

Comments here also speak about leaving maintainable code for future developers on the project and avoiding clever solutions to make their lives easier, but that is the whole point of the second solution: spare them from having to think about edge cases as much as possible.

Don't stop on this linked list example. We are all used to the first solution and Linus Torvalds probably picked this example because many people know linked lists. The message is: fewer edge cases is better. The goal is not to be "clever", in the negative sense.

Also see the original code with comments, which is way more readable: https://news.ycombinator.com/item?id=25327066


> [...] I don't want you to understand why it doesn't have the if statement. But I want you to understand that sometimes you can see a problem in a different way and rewrite it so that a special case goes away and becomes the normal case, and that's good code. [...] -- L. Torvalds

I think the idea of this extends much beyond a linked-list implementation, into software design and architecture.

Sometimes, you find more elegant solutions to something, that inherently do away with edge cases. I think this is the original intent, to show that you can find beauty, much as chessplayers do in chess. These solutions may be harder to understand completely, but you can actually encapsulate them in a function or use them as patterns!

A point is also made that there's often a rewrite involved. You usually won't find this stuff on the first try.


Back in the 80's, I learned Pascal, and learned about its dynamically allocated records, then I went on to learn C, and got used to its pointers and arrays.

Then I went back to Pascal, and designed a program in my head with some dynamically allocated linked list data structures, and another data structure that had a member that pointed to the head of the linked list.

Then I started typing in the Pascal code, and hit a wall, because Pascal has ^ which is like C's * operator to dereference a pointer, but doesn't have anything like C's & operator to make a pointer to an arbitrary field in memory, so you can't actually make a pointer to anything except the beginning of a record that you dynamically allocated!

That was when I gave up on Pascal.

Programming Pascal is like riding a bicycle with only one leg.


Most programming languages don't have an "address of" operator. That's because it's a potentially unsafe operation. I think your usage of Pascal sounds more like trying to operate a motorcycle like a bicycle.


Pascal has what it calls pointers, but doesn't have what C calls pointer arithmetic, or a way to take the address of any field of a structure or any element of an array.

So Linus's elegant linked list solution is possible in C, but not Pascal.

I don't get your metaphor that Pascal is a motorcycle and C is a bicycle. Just the opposite.


That must have been a very old Pascal

Nowadays there is the @ operator


Yes, very old Pascal, not even TurboPascal! I started with Apple ][ UCSD Pascal, but that program was for HP3000 Pascal, which might be even older.

There was a trick for doing PEEK and POKE in Apple Pascal, using a union that contained a pointer to some type you could dereference to read and write, and also an integer so you could set the pointer. I had the hardest time understanding it, and just could not get my head around the "record case boolean of", but I just typed in the magic code with a bunch of weird ^'s and it worked somehow.

You can use type punning tricks with "union" to implement arbitrary unsafe pointer arithmetic in Pascal, since it lets you convert between pointers and integers.

https://en.wikipedia.org/wiki/Type_punning#Pascal

Here's an article about that trick, in German (which makes it sound even cooler):

https://www.robert-tolksdorf.de/book/Tolksdorf-UCSD-Pascal-C...

p. 63: 3.1 "PEEK" und "POKE" auch in Pascal

[...]

    type byte = 0..255;
    spchrinhalt = packed array [0..0] of byte;
    spchrstelle = record case boolean of
                    true: (adresse:integer);     (* pointer *)
                    false: (inhalt:^spchrinhalt) (* contents *)
                  end;

    procedure poke(adresse:integer; inhalt:byte);
    var dummy:spchrstelle;
    begin
      dummy.adresse := adresse;
      dummy.inhalt^[0] := inhalt;
    end;

    function peek(adresse:integer): byte;
    var dummy:spchrstelle;
    begin
      dummy.adresse := adresse;
      peek := dummy.inhalt^[0];
    end;


> A particularly beautiful outcome is that the implementation has consistent semantics for the edge cases

Not sure if this is actually a benefit or not. Edge cases are notoriously hard to debug, so it's sometimes actually nice to have a branch that specifically handles edge cases. Conceptually, it's also much more difficult to wrap one's head around. I would be interested to see how much of the cs101 solution is compiled away and if there are any tangible benefits of being clever here.

PS: If the linked list is stored in contiguous memory (if you're using a slab allocator, for example), you can actually be even more clever (I'll leave that as an exercise to the reader).


> PS: If the linked list is stored in contiguous memory (if you're using a slab allocator, for example), you can actually be even more clever (I'll leave that as an exercise to the reader).

This might be a dumb question, but if list elements are stored contiguously, is there any advantage to using a linked list instead of a data structure that is designed for contiguous storage (something like C++'s std::vector)?


  > if list elements are stored contiguously, is there any advantage to using a linked list
Storing them contiguously probably implies that you consider "freed" space in the middle to also be part of the contiguous area. Otherwise you can't remove in O(1) time.

It is straightforward to maintain a list of freed nodes which you can add back later.

If you don't mind not being able to remove in O(1) time, you still have the advantage of passing the handicapped (contiguous) linked list to interfaces that expect a linked list, while still getting the cache locality of a plain vector.
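A small sketch of that idea (illustrative only): keep the nodes in one array and thread the unused slots onto a free list, so "remove and recycle" is O(1) while the storage stays contiguous.

    #define POOL_SIZE 128

    struct pnode { int value; struct pnode *next; };

    static struct pnode  pool[POOL_SIZE];
    static struct pnode *free_list;

    void pool_init(void)
    {
        free_list = NULL;
        for (int i = 0; i < POOL_SIZE; i++) {   /* push every slot onto the free list */
            pool[i].next = free_list;
            free_list = &pool[i];
        }
    }

    struct pnode *pool_alloc(void)              /* NULL when the pool is exhausted */
    {
        struct pnode *n = free_list;
        if (n)
            free_list = n->next;
        return n;
    }

    void pool_free(struct pnode *n)             /* O(1) recycle                    */
    {
        n->next = free_list;
        free_list = n;
    }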


>The standard CS101 approach makes use of two traversal pointers cur and prev

Times have changed. Back then the indirect pointer was the standard approach (in particular, you wouldn't want to waste registers). It requires just a bit more complicated reasoning, and CS and the people in it were just a bit closer to math back then. Today it is basic engineering/craft, and thus the standard is the approach that is much simpler to reason about, with simple pointers and simple explicit special-case handling - i.e. our modern enterprise C code.


Solutions that feature extra levels of pointing naturally arise in many cases. E.g. you search some data structure for an element. Initially you may do this so that the search returns the element itself or NULL if it's not found, but then you realize you'll also need to use the same searching logic to insert and delete elements. In this light you rewrite the search to return a pointer to the location instead; this way you can use it as a subroutine for all the operations.
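A small sketch of that refactoring for a singly linked list (invented names, assuming malloc'd nodes): the search returns the address of the link, and deletion (or insertion) just writes through the returned slot.

    #include <stdlib.h>

    struct node { int value; struct node *next; };

    /* Returns the address of the link that points at the first node with
       this value, or the address of the terminating NULL link if absent. */
    struct node **find_slot(struct node **head, int value)
    {
        struct node **p = head;
        while (*p != NULL && (*p)->value != value)
            p = &(*p)->next;
        return p;
    }

    void delete_value(struct node **head, int value)
    {
        struct node **p = find_slot(head, value);
        if (*p != NULL) {                 /* found: unlink and free it */
            struct node *victim = *p;
            *p = victim->next;
            free(victim);
        }
    }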


This seems like the classic argument of whether approaches like Duff's Device[1] are a good implementation idea.

I would offer that there is no shame in doing something a bit more advanced, as long as there are test cases and documentation proportional to the advanced nature of the technique available.

[1] https://en.m.wikipedia.org/wiki/Duff's_device


Duff's device is a bad idea in modern code, because it's effectively a sign to compilers saying "Hi, please don't optimize my code or this loop in any way." In general, microoptimization of code to produce particular assembly sequences is a bad idea, because the actual assembly the compiler generates is only loosely related to the actual code.


This is great; I often run into this type of thing in Rust, where an unwrap() is used when there doesn't need to be one if you refactor the code correctly.

EDIT: interestingly, I see a lot of comments here focusing on the reduction of lines of code (LOC). But IMO, the solution would still have been deemed "better code" if it had increased the LOC instead of reducing it. The idea is about eliminating edge cases, not about reducing the number of LOC.


I agree that the if-less solution is more elegant, but I was surprised at the way it was achieved. I was expecting to see head implemented as an IntListItem. I.e., an empty list would be just the head IntListItem, with next = NULL, and value undefined.

I believe that this approach has the advantage of being clearer.

Admittedly, one drawback of this approach is that it uses more space, which could be an issue in an application where you have many lists, nearly all empty.
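A sketch of that sentinel-style layout (type names loosely follow the article's, but this is not the article's code, and it assumes the target is present):

    typedef struct IntListItem {
        int value;
        struct IntListItem *next;
    } IntListItem;

    typedef struct {
        IntListItem head;   /* dummy item: head.next is the first real element,
                               head.value is unused                             */
    } IntList;

    void remove_item(IntList *list, IntListItem *target)
    {
        IntListItem *prev = &list->head;
        while (prev->next != target)      /* assumes target is in the list */
            prev = prev->next;
        prev->next = target->next;        /* the first element is not special */
    }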


There shouldn't even be an IntListItem type, just IntList, or it could be typedef-ed to make the API clearer. In that case there is no need for a special case, and NULL is the empty list.


Linus isn't the first nor only person to discover this simplification, although I've usually encountered it as a "virtual head" using only a single indirection:

https://news.ycombinator.com/item?id=18997420

It's interesting to see how divisive the opinions are. I see it as the difference between the "growth mindset" and not.


Isn't it in Knuth, vol. 1?


Good article. The key insight was a bit buried though:

By using the pointer to the current element, you lose information (that element's predecessor), forcing you to introduce a "prev", and to track and update 2 variables.

By using a pointer to the pointer of the current element, you have access to all the information you need -- the "prev" and the "cur" -- just by following the pointer trail one or two steps.


  item** find_parent(item** list, item* target) {
    item** potential_parent = list;
    while (*potential_parent != target) {   /* assumes target is in the list */
      potential_parent = &(*potential_parent)->next;
    }
    return potential_parent;
  }

  void remove_item(item** list, item* target) {
    item** parent = find_parent(list, target);
    *parent = target->next;
  }


This keeps getting reposted/rehashed regularly. And the code is there for those wanting to look into system include headers, like sys/queue.h (I haven't actually checked for a chicken-and-egg scenario in the commit history of the various libc implementations, but I would hope that the double pointer has always been there).


The most elegant way, imho, to join collection elements into a comma separated string: http://avodonosov.blogspot.com/2012/01/maybecomma.html


Why not the concise version?

for(p=NULL,q=head; q!=entry;p=q,q=q->next); *(p?&p->next:&head) = q->next;


Because we have big, high resolutions screens and we don't "list" (i.e. print) our source code anymore?


The elegant solution has two fewer instructions, but they are outside the loop. The performance difference is most likely minimal; no branch is eliminated.

https://c.godbolt.org/z/fdh461


I remember reading Sedgewick's algorithms book long ago. As I recall, he always suggested using a dummy node as the first entry in a list, which removed basically all head-based edge cases. But yeah, Linus's approach is pretty neat, esp. for fans of C.


The second solution may not be obvious at first, but the idea is that in the first solution you are finding the node that needs to be removed, whereas in the second solution you are finding the pointer (the head, or some node's "next") that points to the node to be removed.


Computerphile has a video showing a nice visual explanation of this technique, albeit for inserting values into a linked list rather than deleting them.

https://youtu.be/0ZEX_l0DFK0
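
For completeness, here is a sketch of the same pointer-to-pointer idea applied to sorted insertion (my own illustration, not necessarily the video's code); again there is no special case for the head:

  struct node {
      int value;
      struct node *next;
  };
  
  void insert_sorted(struct node **head, struct node *new_node)
  {
      /* advance until *p is the pointer (head or some ->next) that should
         point at the new node */
      struct node **p = head;
      while (*p && (*p)->value < new_node->value)
          p = &(*p)->next;
  
      new_node->next = *p;
      *p = new_node;
  }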


Been programming since 1965. Strongly object to #2.

I respect the cleverness of Linus' solution. However, it really has no place in production code where not all journeymen are at the same lofty level.

The fact that it requires a detailed explanation exposes its impracticality.


Off topic, but I've enjoyed the talk referred to:

https://www.ted.com/talks/linus_torvalds_the_mind_behind_lin...



So, as I gathered from this, Torvalds's solution takes advantage of the fact that the element scan is always started from the head, while the cs101 solution treats the head as a special case?

Both are smart IMO.


List of nodes vs. list of pointers; data node with pointer vs. pointer node with data.

Sometimes I improved my code a few days after I wrote it, because I found another interpretation of the original problem.


2013 - stop using linked lists: https://news.ycombinator.com/item?id=5751702


Your knee-jerk reaction is misplaced. There are valid use cases for intrusive linked lists [1]:

a) When your nodes can belong to multiple lists. You can't do this with arrays.

b) When you need fast removal from the middle of the list and you already have a pointer to the list node [2].

[1] In an intrusive linked list, the prev/next pointers are members of the payload node itself (see the sketch below), as opposed to a "simple" linked list, in which the list nodes contain a pointer to the payload.

[2] This happens all the time in kernels. For example, you receive an interrupt and need to remove the corresponding task from the IDLE queue and append it to the RUNNING queue.
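
To make [1] concrete, a rough sketch (made-up field names, but the same shape as the kernel's struct list_head): the links live inside the payload, so one object can sit on several lists at once:

  struct list_head {
      struct list_head *next;
      struct list_head *prev;
  };
  
  struct task {
      int pid;
      struct list_head all_tasks;   /* membership in the global task list  */
      struct list_head run_queue;   /* membership in an idle/running queue */
  };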


I seriously don't understand why this elegant example counts as 'clever' code at all. It is just much easier to understand.


Good code is code that everyone in a team can understand, safely modify and maintain.


Both algos will crash if the target is not on the list, so both are equally bad :)


What Torvalds made is not a good taste argument for linked lists — he made a linked list argument for good taste. Or rather, a linked list argument about what good taste means.

I assume the author of this article had spent too much time being confused by the ‘Windows Subsystem for Linux’ nomenclature.


Good post. Not a "show hn" though, is it?


Correct—reading material doesn't qualify for Show HN. Otherwise every submission could have "Show HN" on it. We've taken that out of the title now.

Submitters: before putting Show HN on a title, please read the rules: https://news.ycombinator.com/showhn.html.


Author here. Thanks for fixing the title, I should have seen that!


Thank you, very well written.


I don't disagree that it is well written. I just don't feel that using C and pointers is helpful for getting the point across.

After reading for a minute I realized it's all pointer- and C-specific stuff, and I'm not going to revisit that just for an article...


When I interview prospective employees I sometimes ask them to explain the rudiments of linked lists in C.


Pointless and bad article.

I compiled the presented code out of curiosity on arm gcc 8.2 on godbolt.org with -O3, and the results are:

(the original remove is even faster than the so-called elegant one, or the elegant one with inline)

  remove_cs101:
          ldr     r2, [r0]
          cmp     r2, r1
          bne     .L3
          b       .L9
  .L6:
          mov     r2, r3
  .L3:
          ldr     r3, [r2, #4]
          cmp     r1, r3
          bne     .L6
          ldr     r3, [r1, #4]
          str     r3, [r2, #4]
          bx      lr
  .L9:
          ldr     r3, [r2, #4]
          str     r3, [r0]
          bx      lr
  
  
  remove_elegant:
          ldr     r2, [r0]
          cmp     r1, r2
          bne     .L12
          b       .L13
  .L14:
          mov     r2, r3
  .L12:
          ldr     r3, [r2, #4]
          cmp     r1, r3
          bne     .L14
          add     r0, r2, #4
  .L13:
          ldr     r3, [r1, #4]
          str     r3, [r0]
          bx      lr
  
  remove_elegant_with_inline:
          ldr     r2, [r0]
          cmp     r1, r2
          cmpne   r2, #0
          bne     .L17
          b       .L18
  .L19:
          mov     r2, r3
  .L17:
          ldr     r3, [r2, #4]
          cmp     r1, r3
          cmpne   r3, #0
          bne     .L19
          add     r0, r2, #4
  .L18:
          ldr     r3, [r1, #4]
          str     r3, [r0]
          bx      lr

EDIT: Hmmm, so many downvotes without any reply.

Well, if your code needs an article to be explained while the plain solution doesn't, and it brings nothing (it's even one instruction slower, since the compiler can't produce the same, faster result it does for the plain solution), then your code is pointless. And stating that this is the elegant solution is, IMO, bad practice.


First, you haven't demonstrated at all that the "elegant" solution is slower. Between `remove_cs101` and `remove_elegant`, it looks like they are equivalent (in particular, the inner loop has the same number of instructions) except the former uses more instructions.

Second, I think you're missing the point by focusing on the code generated by the compiler. Clearly the two algorithms are equivalent, and a good compiler can generate equivalent code for them, up to an instruction more or less. The bigger concern here is the source code: how can we express the algorithm in the most elegant (think: clear, concise, and correct) way? The exact generated machine code is irrelevant, so long as the compiler is able to generate something reasonable, which is clearly the case here, as you've demonstrated.


The key here is "it looks like they are equivalent".

In fact, I am not sure they are equivalent (in general, without taking the context into account), and even if they were, proving it would be science fiction for a compiler today (2020). Even coming up with such an optimisation without hardcoding it does not seem plausible.


I don't see the so-called "elegant" solution as anything special at all, but if the level-1 programmer in your org needs more time to understand it, and maybe introduces bugs or mistakes because of not fully understanding the code, then it's a problem. See Occam's razor.

The "elegant" solution tries to be smart, or to optimize where there is no need at all. Yes, the simple "elegant" solution is just one instruction slower, and longer only at the end, but the inline version is one instruction slower inside the loop, and that is the version the article calls more elegant.

EDIT: Or take the same principle in another form than Occam's razor: KISS, keep it simple, stupid (or silly, or straightforward).


You are probably right regarding speed, but the article is definitely not pointless and bad.


Let's draw a list

  IntListItem -> IntListItem -> IntListItem
and then draw the position of the list head:

  IntList -> IntListItem -> IntListItem -> IntListItem
You can see that the if-branch in the cs101 answer comes because there is a ('virtual') element of the list (the IntList head) that is different from the other elements of the list.

If we were to make them the same (C# code):

  interface IListItem 
  {
     IntListItem Next {get; set;}
  }

  // note: the implementing members must be public to satisfy the interface
  class IntListItem : IListItem
  {
     public int Value {get; set;}
     public IntListItem Next {get; set;}
  }

  class IntList : IListItem
  {
     public IntListItem Head {get; set;}
     public IntListItem Next 
     {
        get { return Head; }
        set { Head = value; } 
     }
  }
then our cs101 code simplifies itself, folding into a prettier algorithm:

  void remove_cs102(IntList l, IntListItem target)
  {
     IListItem p = (IListItem) l;
     while (p.Next != target) 
     { 
        p = (IListItem) p.Next;
     }
 
     p.Next = p.Next.Next;
  }
the use of indirect pointers was masking the real issue: some algorithms look better if you add a virtual head (or a virtual tail) to your linked list. Emphasis on look though -- they work almost the same.


That looks interesting. The code is not simpler to understand nor is it more efficient, but it seems to work, assuming target is in the list. How would you represent the empty list?

Not sure why you are being downvoted.


The representation of the empty list does not change, it's still an IntList object with a null Head field.

The downvotes probably come from my blatant disregard of the real-world performance in the search of what I considered the real take-away from the article.


You've just introduced the extra indirection of virtual function pointers, perhaps through an extra indirection layer of interfaces, which will be a big performance hit, so you've hardly hit upon the "real issue".


How does an interviewer measure good taste? You, the interviewer, could yourself have been selected on a variety of metrics, none of which have anything to do with good taste. Bad taste is endemic in corporations and corporate startups (once you get into the millions in funding) for the simple reason that adequate taste is more reliable.

Edit: Within 30 seconds this got downvoted by cowards with no response. Enough with lurker culture. Say something.


This is actually really easy imo.

1. For people who have worked on open source - just look through their code, their commit and you'll see how they think and operate.

2. If 1 isn't applicable, give them homework, not a test. Give them a very simple but very well-documented task: something along the lines of an authentication system with a time-restricted password reset, basic encryption and security, and that's it. This can be easily achieved in just about any language in several hundred lines of code. But given an adequate amount of time to think it through and develop it, you'll see whether they come up with clever solutions to simple problems or a pile of duct-tape hacks.



