
Applying the Linus Torvalds “Good Taste” Coding Requirement - Jerry2
https://medium.com/@bartobri/applying-the-linus-tarvolds-good-taste-coding-requirement-99749f37684a#.7axbqnfqm
======
jprzybyl
I'm reminded of a quote from Moore in "Thinking Forth":

"A lot of conditionals arise from fuzzy thinking about the problem. In servo-
control theory, a lot of people think that the algorithm for the servo ought
to be different when the distance is great than when it is close. Far away,
you’re in slew mode; closer to the target you’re in decelerate mode; very
close you’re in hunt mode. You have to test how far you are to know which
algorithm to apply."

"I’ve worked out a non-linear servo-control algorithm that will handle full
range. This approach eliminates the glitches at the transitioning points
between one mode and the other. It eliminates the logic necessary to decide
which algorithm to use. It eliminates your having to empirically determine the
transition points. And of course, you have a much simpler program with one
algorithm instead of three."

"Instead of trying to get rid of conditionals, you’re best to question the
underlying theory that led to the conditionals."

That's part of a _chapter_ of the book called Minimizing Control Structures.
Forth guys are crazy about taste, and if I've learned anything from reading
their stuff, it's that chasing tasteful programming to its end gets very hard.

OP is right on the money. The hard thing is that it is a creative process, and
takes a real understanding of the problem you're solving to do it. Worst of
all, aside from the feeling of solving a puzzle well, the benefits only begin
appearing much later. I'm glad the kernel team takes it seriously.

~~~
bdrool
> That's part of a chapter of the book called Minimizing Control Structures.

Since the book is creative commons and freely available, I decided to download
the book and have a look. Figure 8.1, the "automatic teller" example is just
downright hilarious. The author presents the code and throws a challenge at
the reader: "Easy to read? Tell me under what condition the user’s card gets
eaten." Here's the code:

    
    
      IF card is valid DO
        IF card owner is valid DO
          IF request withdrawal DO
            IF authorization code is valid DO
              query for amount
              IF request <= current balance DO
                IF withdrawal <= available cash DO
                  vend currency
                  debit account
                ELSE
                  message
                  terminate session
              ELSE
                message
                terminate session
            ELSE
              message
              terminate session
          ELSE
            IF authorization code is valid DO
              query for amount
              accept envelope through hatch
              credit account
            ELSE
              message
              terminate session
        ELSE
          eat card
      ELSE
        message
      END
    

(hopefully no transcription errors from copying out of the PDF…)

Yes, it's easy. It gets eaten if the owner is not valid. Took one glance.

The thing that makes me laugh is that the example is contrived. This is the
sort of thing code editors solve. Even though I could quickly glance at it and
decipher the conditionals by eye, if I were working with this I'd still throw
it into an editor that shows me code scope. That makes it trivial and removes
any chance of error. This is a problem that has been solved.

I'm sure I will get a lot of responses saying I've missed the point, and yes
I'm aware that I've dodged it, but the author acts like the problem of
navigating nested conditionals is somehow _impossible_ , which I think is
ridiculous. Not only do lots people do it every day, not only are there tools
that help us do so, but there's pretty much no way to exist in this world
without depending on software that is written this way (ever looked at the
source for GCC, to pick one example?) Not saying it's right, but it's clearly
not a deadly problem.

~~~
kazinator
Just the raw formatting of that could be improved, by cuddling up ELSE IF from
separate lines.

Aren't there some END tokens missing? Let me put them in:

    
    
      IF card is valid DO
        IF card owner is valid DO
          IF request withdrawal DO
            IF authorization code is valid DO
              query for amount
              IF request <= current balance DO
                IF withdrawal <= available cash DO
                  vend currency
                  debit account
                ELSE
                  message
                  terminate session
                END
              ELSE
                message
                terminate session
              END
            ELSE
              message
              terminate session
            END
          ELSE
            IF authorization code is valid DO
              query for amount
              accept envelope through hatch
              credit account
            ELSE
              message
              terminate session
            END
        ELSE
          eat card
        END
      ELSE
        message
      END
    

Okay, now:

    
    
      IF card is valid DO
        IF card owner is valid DO
          IF request withdrawal DO
            IF authorization code is valid DO
              query for amount
              IF request <= current balance DO
                IF withdrawal <= available cash DO
                  vend currency
                  debit account
                ELSE
                  message
                  terminate session
                END
              ELSE
                message
                terminate session
              END
            ELSE
              message
              terminate session
            END
          ELSE IF authorization code is valid DO
            query for amount
            accept envelope through hatch
            credit account
          ELSE
            message
            terminate session
          END
        ELSE
          eat card
        END
      ELSE
        message
      END
    

Okay, just one ELSE IF; not much opportunity for that.

~~~
kazinator
On that topic, I did a somewhat weird thing in C yesterday. I took code like
this:

    
    
      for (;;) {
        /* original big loop */
      }
    

and turned it into this:

    
    
      if (compatibility_with_old_version) for(;;) {
        /* original big loop: same indentation level! */
      } else for (;;) {
        /* rewritten new loop */
      }
     

I.e.

    
    
      if (...) for (...) {
    
      } else for (...) {
    
      }
    

Works for me.

------
akkartik
I dunno, the first example seems unsatisfying. The original has that ugly
condition, the "good" version seems overly clever. And for all the talk of
taste and aesthetics, both versions ignore an elephant in the room:
defensively dealing with `entry` being absent from the list.

Not that I've never abused addresses like this. But having written this
multiple times, I currently prefer something like this:

    
    
      remove_list_entry(entry)
      {
          if (head == entry) {
              head = head->next;
              return;
          }
    
          for (prev = head;  prev->next;  prev = prev->next) {
              if (prev->next == entry) {
                  prev->next = prev->next->next;
                  return;
              }
          }
      }
    

There's still an `if`, but it isn't so bad since it's an early exit. What is
"tasteful" about hiding the fact that one of the cases is _far_ simpler than
the other? There's conditionals and then there's conditionals. There's cases
and then there's cases.

The other benefit of this approach: you eliminate segfaults by construction,
because the `while` has been replaced with a much more _normal_ `for` loop
iterating through the list. Keeping it normal is 80% of the secret to avoiding
security holes.

Rather than appealing to something nebulous like "taste", I prefer to focus on
concrete functional benefits. What is a situation where someone using or
modifying this function might be subtly led astray? That seems a better guide.

 _(Earlier version posted on
reddit:[https://www.reddit.com/r/programming/comments/59cq8r/applyin...](https://www.reddit.com/r/programming/comments/59cq8r/applying_the_linus_tarvolds_good_taste_coding/d97pcrl)
)_

~~~
rukuu001
Screw 'taste', go for _obvious_.

If you're at BigCo and your codes going to be maintained by disinterested
drones/ random contractors/etc, obvious code is better.

And lack of checking on _entry_ pissed me off too.

~~~
agentgt
I say screw 'taste' and 'obvious' and instead go for _it has high coverage
unit tests and lots of them_.

You can always go back and make it look pretty safely with good tests.

I just love code snobs that think their code is so good they don't need tests.

~~~
deong
> You can always go back and make it look pretty safely with good tests.

Yeah, but you won't. I think that's part of the reason why people care about
"taste". I want to maximize the chances that essentially the first draft is
good enough to be maintained for as long as possible, because years of hard-
won experience has taught us that it will.

~~~
agentgt
> Yeah, but you won't. I think that's part of the reason why people care about
> "taste".

Ahh but you eventually will (at least in my experience). If you don't need to
the code either doesn't matter (dead code) or works just fine.

I think I have looked at my own companies entire code base a couple of times.
Do I see nasty crap... all the time. And I often fix it for good taste when it
is really bad but I have to say the value prop isn't very good compared to
adding tests or other automated validation (including automated profiling and
dead code analysis).

Kernel code and in particular C are sort of special cases because they are
hard to test and performance is generally a high concern. But for many other
higher level languages and problem domains this is not the case.

I have been meaning to look how at how much code bases change overtime. In the
case of Linux I wonder how much of the Kernel code has stayed the same (lets
say over a 5 year period). Probably not a ton but I bet for other domains
particularly business software the code either just dies or changes
dramatically over time.

Then there are I suppose other domains where the code really can't physically
change often (ie space probes and embedded devices).... for those systems I
really really hope they have lots of tests.

------
johan_larson
For my money, Linus's example of "good taste" gives up rather a lot of clarity
to achieve succinctness. The original is simple and clear. His preferred
version is shorter, but also harder to understand because of its use of a
complicated indirection. And that's not good taste. It's just showing off.

“Programs must be written for people to read, and only incidentally for
machines to execute.” ― Harold Abelson, Structure and Interpretation of
Computer Programs

~~~
TickleSteve
I completely agree (tho you're getting downvoted by others).

Linus's code here is 'clever'... but not good, simple code.

~~~
kwhitefoot
It's a long time since I wrote much C but Linus' version seems like idiomatic
C to me. The use of pointers in C is an ordinary thing and those who write a
lot of C should be fluent in their use.

It's interesting to apply the same technique to other languages. I have to use
VB.net most of the time so here are implementations of the tasteless and
tasteful versions in VB (untested so there might be bugs). Even in VB the
tasteful version is shorter and, I think, clearer.

    
    
        Module Module1
    
          Public Class ListEntry
            Public value As String
            Public [next] As ListEntry
          End Class
    
          Public Head As ListEntry
    
          ''' <summary>
          ''' Straight translation of Torvalds' tasteless version.
          ''' </summary>
          ''' <param name="entry"></param>
          Sub RemoveListEntry(entry As ListEntry)
    
            Dim prev As ListEntry = Nothing
            Dim walk = Head
    
            ' Walk the list
            While walk IsNot entry
              prev = walk
              walk = walk.next
            End While
    
            ' Remove the entry by updating the head or the previous entry.
            If prev Is Nothing Then
              Head = entry.next
            Else
              prev.next = entry.next
            End If
          End Sub
    
          ''' <summary>
          ''' Straight translation of Torvalds' tasteful version.
          ''' </summary>
          ''' <param name="entry"></param>
          Sub RemoveListEntry1(entry As ListEntry)
    
            Dim indirect = New ListEntry
            indirect.next = Head
    
            ' Walk the list looking for the thing that points at the thing that we
            ' want to remove.
            While indirect.next IsNot entry
              indirect = indirect.next
            End While
    
            ' ... and just remove it.
            indirect.next = entry.next
    
          End Sub
    

End Module

------
PJDK
I was given a piece of advice very early on in my career that I've always been
grateful for, which is fundamentally the same as this. IF and FOR are both
code smells.

One case of this is just simplifying loops with some functional goodness

    
    
      var listOfGoodFoos = new List<Foo>();
      for(var i = 0; i< listOfAllFoos.Count; i++)
      {
         if(listOfAllFoos[i].IsGood)
             listOfGoodFoos.Add(listOfAllFoos[i]);
      }
      return listOfGoodFoos;
    

VS

    
    
      return listOfAllFoos.Where(x => x.IsGood);
    
    

But perhaps a more interesting point is it can also be a a sign of DRY gone
wrong - two things that aren't actual repeats of each other, but just similar
code, are smushed together with lots of IFs hang around to make it actually
work.

A connected piece of advice was "it is easier to push things together than
pull them apart" so err on the side of assuming two bits of code are not the
same, knowing you can refactor them together later (probably only a few hours
later) if it turns out they are in fact pretty similar.

~~~
twwl
It's not simplifying it though, it's just hiding the complexity in syntactic
sugar. I prefer the first way of doing things vastly over the second. Yeah,
it's more code, but it's also more or less "what is actually happening",
instead of an euphemism which has to be unpacked.

~~~
RubyPinch
> it's just hiding the complexity in syntactic sugar.

And that complexity is only written once, instead of multiple times for a
single project

You can have as many FOR and IF branches that you want, but only if they are
all bugless!

~~~
twwl
Actually, it's written exactly as often as the original construct.

The idea that writing more than 5 letters, or a for loop here and there, is
this giant cesspool source of errors is totally alien to me. Of all the ways
in which my programming sucks, this was never among them. So thanks for you
permission, I'll use it wisely :P

------
AceJohnny2
These are great examples, and they hint to, but do not mention, the big
counterpoint: development time. In his own examples, the author admitted that
though the code was ugly, it worked. He then _spent extra time_ reworking the
existing code to make it, well, prettier.

"If I had more time, I would have written a shorter letter." \-- Voltaire

The problem is that, in many (most?) professional settings, the developer is
under huge pressure to _get shit done_. Not get it done perfectly, not done
beautifully, but just done in the first place. We rarely have the leeway to
spend extra time refining existing code, existing _features_ or _bugfixes_ ,
to make them prettier. It's gotta be done, and it's gotta be done yesterday
because maybe some other part of the project is blocked because of it or some
customer paid for it and it was supposed to be done last month or it broke and
why the hell isn't it fixed yet!?

Some projects, I daresay mostly open-source projects, can afford to be
detached from the pressure of deadlines that corporations require, and reject
code that isn't to their quality standards. Sadly, that's not the case for
most of us.

~~~
palunon
"If I had more time, I would have written a shorter letter." \-- Voltaire

You're quoting Pascal, not Voltaire ;)

~~~
majewsky
"The worst thing about the internet is how quotes are always misattributed."
\- Albert Einstein

------
troydj
Linus' "good" version has a McCabe cyclomatic complexity of 2, whereas the
"bad" version has a value of 3. So, objectively, one could argue there is
improvement there (albeit small). Validation of the "good" version will be
easier (e.g. code coverage testing) with fewer paths through the code.
Additionally, a lower cyclomatic complexity typically implies less stress on
the developer's working memory while reading code (since you don't have to
consume an internal brain "register" holding the result of a conditional while
following the flow of control).

~~~
k__
I thought the same.

On the other hand it took me a bit of time to figure out what's going on.

But I'm not C dev, so that could be the issue here.

~~~
dwc
Serious question: after figuring it out, did you have a somewhat better grasp
of linked list mechanics than before?

~~~
rocqua
I had a better grasp of how to work with them (am not who you replied too).

I never had to build my own linked list though. I never took those basic
courses.

------
no_protocol
> it only performed 256 loop iterations, one for each point along the edge

 _alarm bells_

There are only 252 points along the edge. This code will act on each corner
twice. If you were performing an operation like `+= 1` on each edge element,
this code would be wrong. When you copy and paste it later and change all the
`= 0` to something else, you might end up with an unfortunate surprise.

Once I saw this mistake in the 2nd code example, I guessed it would also
appear in the 3rd one. Sure enough, it does. This is just as _unsavory_ to me
as the original code.

I'm not sure if the author is here or not, but if you are, can you see the
fix?

~~~
tomerv
Here's my fix:

    
    
      for (i = 0; i < GRID_SIZE - 1; ++i) { // Note the "-1"...
      
          // Top Edge
          grid[0][i] = 0;
      
          // Bottom Edge
          grid[GRID_SIZE - 1][GRID_SIZE - i - 1] = 0; // Reverse the order: from left to right
      
          // Left Edge
          grid[GRID_SIZE - i - 1][0] = 0; // Reverse the order: from bottom to top
      
          // Right Edge
          grid[i][GRID_SIZE - 1] = 0;
      }
    

This way, for every row/column, the loop acts on all but the last cell, which
is actually the first cell of some other column/row.

~~~
danielsamuels
I feel like for the each side you could actually handle two items per
iteration, one forwards _and_ one backwards. That way your iteration count
could be halved.

~~~
tbabb
That doesn't reduce the number of cells touched, and would make the algorithm
more complicated. It would also probably hurt cache locality. Going backwards
is also pointless complication, you could just as easily cut into two sections
and go forward in both.

And furthermore why split in two? Why not split the range into three sections,
or four, or N and completely unroll the loop?

Probably not a good-tasting construct at all.

------
tempestn
Not being used to C syntax, I took this as a typo at first:

    
    
        indirect = &(*indirect)->next;
    

The position of the parentheses make it look like you're dereferencing only (*
indirect). But on second look, you need the parens there so that it's (*
indirect)->next as opposed to indirect->next. Then the & operates on the whole
thing. I'd be tempted to wrap it in a second set of parens for clarity, but
perhaps it's entirely obvious to a programmer who's actually used to using
pointers. IE:

    
    
        indirect = &((*indirect)->next);

~~~
TickleSteve
To me, its 'clever' code.... i.e. it should be simplified.

This is not good code, it requires too much thought to realise what its doing.

~~~
tempestn
It's busier to look at due to the syntax of the language, but if you're
comfortable with pointers one could argue that it's conceptually simpler than
the first example. And since linked lists are all about pointers, operating on
the pointer directly seems logical.

That said, for someone unused to the syntax (as I am!), I agree that it is
definitely harder to mentally parse. So which is objectively better probably
depends on your audience.

~~~
palunon
>So which is objectively better probably depends on your audience.

I guess when your audience is kernel developers, this isn't that much of a
problem...

~~~
TickleSteve
not true at all.... kernel developers are not mythical beasts, they're just
regular low-level developers. Nothing special.

I've got 20 years experience developing this kind of stuff and to me, its not
as simple as it could be, therefore it could be better.

~~~
RandomOpinion
You have 20 years of experience on this kind of stuff and don't recognize the
textbook example code for handling the head pointer case of a linked list?
Well, okay...

(It's at least 23 years old because I learned from textbooks that have this
example code way back then. It probably dates all the way back to K&R.)

~~~
TickleSteve
What textbook would that be exactly?? (not that it matters)

The point isn't "can I understand it"...

the point is "Would someone who hadn't seen it before have to do another
mental operation to understand it?"

...and yes... dereferencing _is_ another operation you have to do in your head
to understand it.

~~~
RandomOpinion
Well, I'm at work so I can't dig around my library but it was easy enough
using date range restrictions to find an academic example in Google that dates
back to '99:
[http://cslibrary.stanford.edu/105/LinkedListProblems.pdf](http://cslibrary.stanford.edu/105/LinkedListProblems.pdf)
. (It was actually the first hit.) Searching further backwards in time is left
as an exercise to the reader.

Multi-level pointer dereferencing is a fundamental C / assembly language skill
and Linus' example was just about the most basic example there is.

~~~
TickleSteve
We will have to agree to disagree on that point.... And I have just as much
experience in c as Linus.

------
deanCommie
I don't know how to reconcile Linus's track record is of indisputable
brilliance and success (I use Linux and Git on a daily basis and am eternally
grateful for both), with the fact that I would absolutely DESPISE working with
a peer with the kinds of attitudes and opinions on coding that Linus has.

I think the problem I have is that a lot of people use Linus's examples and
stories as justification for their own suboptimal interpersonal skills,
overly-clever code that only they understand, but without the same level of
brilliance or track record to justify it.

Ultimately I think Linus's contributions to Software are pantheon, but he
should not be looked to for imitation or lessons. His lessons for success are
detrimental to the vast majority of software engineers.

~~~
Noseshine
You make claim after claim, and I think they _do_ need substantiating
evidence. After all, I find it highly unlikely that if he were even 10% as bad
as you make it sound he would be the head of one of the most successful
projects - especially since everybody working with him does so voluntarily.
There would have been breaks long ago. The fact that the Linux kernel held
together under the original author loudly speaks against assumptions of "bad
interpersonal skills" on the side of Linus.

~~~
majewsky
I won't argue with your main point, but a fair share of kernel devs are
probably not working there voluntarily, since they write kernel code for their
employer.

~~~
ptomato
I'd expect that most devs capable of writing reasonable Linux kernel code
would not have difficulty finding employment somewhere else doing something
else.

------
sverrirs
I don't know why I'm even replying, this will get so much hate here, oh
well...

1\. Just because things aren't done "the way you would do them" doesn't mean
they're "bad" or "wrong".

2\. If you're not on a solo project, I've found writing correct but less
"clever" code to help shorten ramp-up time for new devs and be more beneficial
to future maintainability of the code-base and system.

TL;DR; Swapping values by XOR'ing may look elite and clever but hurts you in
the long run.

~~~
eps
Re: #2 - in Linux case, double-pointer list removal succinctly illustrates the
minimum C proficiency needed for tinkering with kernel code.

That is, not every project has "shortening ramp-up time" as a priority. In
quite a few cases non-trivial code doubles as filter for less skilled devs.

~~~
phee
>In quite a few cases non-trivial code doubles as filter for less skilled devs

Right, until the less skilled dev it's you, months later after you moved on to
other stuff, and have to get back and fix some clever shit you wrote but don't
understand anymore.

Unless you're doing numeric or performance critical work, code should be
optimized for readability and maintainability.

~~~
smitherfield
If this example is too "clever" for someone, even sight unseen, they aren't
(yet?) cut out to be a kernel hacker. I don't disagree with your general
principle, but the linked example is idiomatic and understandable C for
someone with a decent knowledge of the language, which is (I'd imagine/hope)
the standard expected of contributors to Linux.

~~~
phee
Actually I believe this particular example is basic and pretty understandable
C.

I was talking about the more broad idea of knowingly writing clever code to
raise the entry barrier for new devs. Or just writing clever code for whatever
the reason. It's almost guaranteed to backfire, to new devs who will have to
maintain it or to yourself, when you will have to read it in a less _smart_
moment.

~~~
smitherfield
Entirely agreed.

------
Smaug123
I've been told many times that it's better to eliminate edge cases. I still
don't really believe it as a universal law.

When explaining the algorithm "remove an element from a linked list" to
someone else, I would say "starting at the head, go along the list until you
find the element; then delete the element". When explaining the algorithm
"delete an element from a linked list", I would say "if you're at the head,
update the head to be the next element; otherwise update the `next` property
of the previous element to be the `next` property of the current element".
It's so naturally a two-case problem that I'd be really surprised if anyone
came up with the one-case answer first. Therefore, the two-case answer is the
more readable to someone who has not seen the code before, because it
corresponds to their intuition about how the algorithm should work.

As a bonus, it's clear to me that the naive algorithm is correct, but I have
to do mental work to convince myself that the one-liner algorithm is correct.
(I'm saying nothing about the code; only the algorithms.)

I agree much more with the reduced complexity of the later example
(initialising the edges of a square), and I actually think it's worth double-
counting the corners in this instance because the code is so much cleaner.
(EDIT: Although I suppose the easy fix doesn't detract from the ease-of-
reading, so actually it's avoidable.) In that instance, there's doubt about
which is the natural algorithm to pick: one could conceivably come up with
"initialise the top and bottom, then initialise the left and right", or
"initialise the zeroth element of each side, then the first of each side,
then…" as one's first attempt at solving the problem. Therefore I like this
particular simplification.

~~~
apk17
> It's so naturally a two-case problem that I'd be really surprised if anyone
> came up with the one-case answer first.

[Raises hand] I always did it Linus' way in C (way before him); I don't
remember if I ever had the need in assembly before.

~~~
Smaug123
Fair enough. I stand surprised! How would you describe the algorithm to a
friend, then?

------
smnscu
I think "competitive" (i.e. solving algorithmic challenges for fun) coding
really gives you some insight into how to write code that's short and to the
point. The user with the most reputation on LeetCode, for example,
consistently posts solutions that are surprisingly short, efficient, and
readable.

[https://discuss.leetcode.com/user/stefanpochmann](https://discuss.leetcode.com/user/stefanpochmann)

(some random examples)

[https://discuss.leetcode.com/topic/18731/7-lines-
c-c](https://discuss.leetcode.com/topic/18731/7-lines-c-c)

[https://discuss.leetcode.com/topic/16988/7-lines-3-easy-
solu...](https://discuss.leetcode.com/topic/16988/7-lines-3-easy-solutions)

[https://discuss.leetcode.com/topic/33430/6-lines-o-log-
min-m...](https://discuss.leetcode.com/topic/33430/6-lines-o-log-min-m-n-ruby)

~~~
sandos
The code might be short and to the point, but it definitely is not easy for me
to understand!

The code feels like a bit like clever perl one-liners.

~~~
fiveop
Or you could just loop n-1 times and adjust the array indices in the loop
body, so that each statement only deals with one corner.

------
Animats
Both versions suck. If "entry" is not found in the list, the code will run off
the end of the list, either de-referencing zero and faulting or going off into
junk, depending on how the list ends.

Writing low-level list manipulation more than once sucks. It leads to bugs.
Such manipulation should be encapsulated. That's why we have containers in
modern languages. The Linux kernel is still C, not C++, which leads to too
much of this sort of thing.

~~~
clarry
I think practicing C programmers are using the list implementations that were
given to us two decades ago. Or any of the newer alternatives. And if they
aren't, it's not because of the language.

------
sytelus
Here's even more "tastier" 2 lines version:

    
    
      remove_list_entry(entry)
      {
          for (indirect = &head; (*indirect) != entry; indirect = &(*indirect)->next);
          *indirect = entry->next;
      }
    

The big problem: both Linus's and above versions don't handle the case if
entry wasn't found, for example, if entry was null. This is the fundamental
issue with trying make code overly compact: sometime you _lose_ the sight of
important edge cases. Segfaults are not tasty. I generally prefer to write
code that reflects my thought process and avoid unnecessarily try making it
compact. As Knuth had said "Programs are meant to be read by humans and only
incidentally for computers to execute”. Uglier versions allows to explicitly
documents edge cases. This enables future maintainer to make sure these cases
are covered when s/he makes code changes. Obviously taking this to another
extreme would ruin this. The better taste lies somewhere between the compact
Linus's version and some zeolite's too verbose version.

Small problem: Another thing to think about is extra pointer redirection
required in Linus's code. This is fine for most cases but if I was dealing
with very large list over and over then that's unnecessary perf hit.

~~~
thomasahle
The 'entry wasn't found' case is an entirely different issue than what Linus
is showcasing. It may or may not be relevant for the particular function,
depending on its contract and documentation.

------
kccqzy
I would argue for the last example it is better to write split the loop into
four. I believe the intention is even clearer (because you are doing four
different things), and that it is also faster even when optimizations are off
due to better cache behavior. It may also make a smart but not "sufficiently
smart" compiler to transform two of the loops into simple memset.

------
stirner
The author doesn't mention that the grid being initalized is square, and
simply describes it as grid[rows][cols]. However, all his solutions are only
correct for a square grid.

One solution for arbitrary dimensions could be

    
    
        for (i = 0; i < rows; ++i) {
            grid[i][0] = 0;
            grid[i][cols - 1] = 0;
        }
    
        for (i = 0; i < cols; ++i) {
            grid[0][i] = 0;
            grid[rows - 1][i] = 0;
        }

~~~
dghf
You're updating the corners twice. Not really a problem with a simple
assignment, but as no_protocol pointed out elsewhere [0], it could be if you
ever wanted to change that to, say, an increment.

Replace your second for statement with

    
    
        for (i = 1; i < cols - 1; ++i) { ...
    

and you're golden.

[0]
[https://news.ycombinator.com/item?id=12794235](https://news.ycombinator.com/item?id=12794235)

~~~
stirner
You're right, my version is unoptimized because I was trying to keep the code
neat as in the article. I think that optimizing the code for hypothetical
future reuse is a bit premature. You could make a performance argument, but
this optimization is going to prevent just 2 memory accesses, which aren't
going to matter much compared to all the rest. If I was chasing performance I
would start by memset'ing the top and bottom edges since they're contiguous in
memory.

~~~
stirner
My mistake - the optimization would prevent four memory accesses. Still O(1).

------
sde
The code in the examples is not valid in any historical version of C. The use
of the -> operator with an implicit int as the left argument ("entry" in the
examples) was only valid from 1975 in Unix C (but not in GCOS C) until K&R C
in 1978. The use of C++ style comments ("//") is only valid in standard C
since 1999.

There's no point in arguing over pseudocode as if it's C.

------
solipsism
The root of the problem is that "the thing that points to an entity" is not a
consistent concept in this kind list. Sometimes it's "head", sometimes it's an
entity's ->next. I wonder what Linus would think of implementing the linked
list with a dummy head node, having no value and pointing to the first entity.
Personally, I think that would be tasteful. It allows for simple loops like
the "good taste" one, but you don't have to think so hard about how to
implement a simple operation like _remove_. You don't need the added
indirection.

~~~
majewsky
It's one more pointer that you have to resolve, everytime you do something
with the list. Sounds smelly to me.

~~~
dpark
Depends on where you store it. If you're just passing around node pointers,
then you could create the dummy node on the stack only when you need it and
it's the same indirection cost as the "indirect" pointer. If you've got a
"list" struct that you're passing around (pretty unlikely in C), you could
embed the dummy node directly.

------
jordigh
The example there about edges on an array is something I've had to directly
deal with myself when I implemented a multidimensional image-processing
function for GNU Octave. The problem is to find connected components in a
binary image, where voxels are either 0 or 1. You want to find all the islands
of ones, and you want to be flexible if diagonals count as being connected or
not.

The problem, of course, is that along the edges you don't want to check for
neighbours outside of the image boundary. I found no good way to do this for
an n-dimensional image, after a few attempts of writing very complicated code.
In the end, I ended up padding the whole image with a boundary of zeros,
iterating over the padded image with an "if(current_voxel)" conditional that
skipped checking for lit neighbours around the boundary, and when checking for
lit neighbours at the original image's boundaries would give no neighbours at
the padded zero boundaries.

The code was cleaner, but I incurred a big realloc, because N-dimensional
images in Octave are stored as contiguous Fortran-order (column-major) arrays.
I'm still looking for a better solution to this problem.

So, how do you do this cleanly?

~~~
arghbleargh
Isn't it simple enough to make three bounds checks to essentially do the same
thing you did with the zero padding? You can even make it a function

    
    
        is_in_bounds(x, y, z)

~~~
jordigh
It's n-dimensional, not 3-dimensional.

Also, it's really slow and wasteful to be doing bounds-checking for each voxel
of an image that is almost completely not a boundary, whereas you can't avoid
checking if each voxel is lit. :-)

------
snovv_crash
One anti-pattern I've seen a few times, across multiple codebase and
companies, is the for i in 1...cases: switch(i)

To me, his first example fits this perfectly, except he is using ifs instead
of a switch. Basically, if you are generating a big, trivial sequence and then
filtering it down afterwards, think harder and probably there is a way to
generate the sequence you are actually after directly.

~~~
Dylan16807
> One anti-pattern I've seen a few times, across multiple codebase and
> companies, is the for i in 1...cases: switch(i)

Also known as the "for-case" loop. Version 2 is a prime example of this, and
what not to do. It processes case 1, then case 2, then case 3, then case 4.
The author rewrites it as one loop of 64, but four loops of 64 each would also
be acceptable.

> To me, his first example fits this perfectly, except he is using ifs instead
> of a switch. Basically, if you are generating a big, trivial sequence and
> then filtering it down afterwards, think harder and probably there is a way
> to generate the sequence you are actually after directly.

Now you're talking about a different problem. The conditionals in the first
version do not iterate through like code doing "switch(i)". I agree that it's
better to directly generate than to filter, but that's a totally separate
anti-pattern.

------
leothekim
This article made me think of this Gordon Bell quote:

"The cheapest, fastest, and most reliable components are those that aren’t
there."[1]

That quote is a reminder to me that there is a certain artistry behind
engineering anything.

[1]
[http://quotes.cat-v.org/programming/](http://quotes.cat-v.org/programming/)

------
e19293001
Maybe I'm wrong about this but I noticed that the code didn't call free() to
the element that has breen removed. That would cause a memory leak. Am I
missing something here?

~~~
halomru
The caller still holds a pointer to the element and can free it. I would argue
that delegating the responsibility of freeing the element to the caller is
more performant, since that allows the code to reuse that list element and
insert it into a different (compatible) list.

That's the kind of performance/ease-of-use trade-off that sounds reasonable in
an open source project with strict code reviews (but in very few other places)

------
facundo_olano
All due respect to Linus, but the second example is much less readable. Proof
of that: the code was reduced in half, but the amount of comments was doubled.

I think concepts like "elegance" and "good taste" of code should exist, but
they are obviously subjective. My choice of good taste always considers
readability first.

~~~
dpark
I'm pretty sure that the comments are for people who Linus _doesn 't_ want
contributing to the kernel. This code would likely have no comments at all if
checked into the Linux kernel because it would be obvious to everyone allowed
to do commits.

------
tbabb
Oh good lord. That edge-initialization code made my jaw drop. Did he not think
about the problem _at all_ before putting code down?

------
saosebastiao
I know Sufficiently Smart Compilers are a running joke, and I know a lot of
people think the idea is a bad one, but come on...this should be the job of a
compiler. The first example is _way_ more understandable. Yes, it branches,
and yes it is more code and less efficient. But a compiler _should_ be able to
derive the more efficient version from the more understandable code. And it's
a shame that it doesn't.

~~~
zodiac
I think Linus argued that the second version is more understandable.

In fact if you read the article there's no claim being made that the second
version is more efficient, and I don't think it is

------
WesternStar
For the curious like me who wanted to see the assembly for the Linus examples
and compare [https://godbolt.org/g/sqdBrf](https://godbolt.org/g/sqdBrf). Most
importantly they have the exact same number of branches and their loops are
exactly the same size 4 instructions on all opt levels greater than 02. The
setup code for the first one is much longer.

~~~
WesternStar
Here is also a somewhat representative example of various grid initialization
pay attention to version 3 where the compiler unrolls the loop in O3 its
likely the fastest version.
[https://godbolt.org/g/mMBhuW](https://godbolt.org/g/mMBhuW)

------
nurettin
Often I find myself solving problems in a single performant SQL with
lateral/cross joins before hitting the business logic code in order to
simplify the logic part into a single loop with no conditionals/edge cases.

Turns out this is not the best practice because you will have to explain all
about your clever use of olap functions and lateral joins to someone less
knowledgeable who has to append to your code.

~~~
dragonwriter
I would argue that having developers that don't understand and won't bother to
learn SQL working on application dependent on SQL-based DBs is the not-best-
practice there, not proper and efficient use of SQL by developers who do know
what they are doing.

~~~
nurettin
You are right, of course, but they have "working knowledge" of sql which is
adequate for most reports and operations.

Just not the "high level" SQL someone armed with the knowledge of functional
programming could conjure when faced with the alternative of writing lots of
nested conditional statements and loops in an imperative language because it
would bore him to pieces.

------
licorna
You can think of your grid array as a contiguous piece of memory. I would do:

    
    
      bzero(grid, sizeof(int) * COLS);
      bzero(grid + COLS * (ROWS - 1), sizeof(int) * COLS);
    
      for (int i=1; i < ROWS-1; i++) {
        *(grid + (i * COLS)) = 0;  /* set left col to 0 */
        *(grid + (i * COLS) + COLS - 1) = 0;  /* set right col to 0 */
      }

------
sssilver
> To the best of my ability to discern, the crux of the “good taste”
> requirement is the elimination of edge cases, which tend to reveal
> themselves as conditional statements. The fewer conditions you test for, the
> better your code “tastes”.

 _looks at his Swift code and gulps_

------
stevebmark
Eventually you will realize Linus is a bad person to take advice from and stop
idolizing him.

~~~
sigjuice
Examples of good persons to idolize, please? :-)

~~~
stevebmark
I'm an anonymous internet poster, I don't have a suggestion for who you should
idolize. But picking Linus is a red flag. His man-child entitled attitude,
condescension, arrogance, etc, make him a parody of a toxic developer that
ruins teams from the inside out. Normalizing and uplifting that behavior makes
us all worse off.

------
delinka
Does it follow that attempting to write code that is closer to functional
style also improves the code's "tastefulness?" I'm under the impression that
better functional code contains fewer branches.

------
usefulcat
Is it just me or does the grid initialization example seem extremely
contrived?

~~~
vilya
I thought that too. I found it hard to believe that anyone would write either
of the first two versions without having thought of the 3rd - or, you know,
just having 2 or 4 separate for loops. Maybe there was something about the
problem's context that the author left out of the post though?

------
harry8
Both are correct, one is considered better based on a subjective assessment.
People who disagree with that the version a person selects is __objectively
__better have no taste.

------
LordHumungous
I'm a bit confused. If the entry is not in the list, does Linus's while loop
throw a seg fault? Or are we assuming that the list contains the entry?

------
k__
Well, the "good taste" example has a lower cyclomatic complexity, so it is at
least a measurable change.

------
jaakl
Every time you write an 'if' and do not stop for a moment to consider it, a
kitten dies.

------
vacri
Speaking of coding niceties, when I started branching out from bash into a
'real' language, Python, I couldn't find a switch statement. A colleague of
mine said that Python didn't have one, and that switch statements were a 'code
smell'. I didn't really understand that, and asked about those times that you
genuinely could use a switch statement, and the answer was "just use a long
if-else function". I can't remember his justifaction for it, so clearly it
didn't stick

I still don't see the real pragmatic difference between a switch statement and
a long if-else function - anyone feel like ELI5'ing it for me? Switch
statements just seem more concise and readable to me...

~~~
gknoy
Switch statements have fall-through behavior that you need `break` to avoid
(in most languages); sometimes this is what you want, but only rarely.

One trick I've seen people do in Python is to use a dict; define lambdas for
each case, and then key the dict on your value that you would switch on, and
invoke the lambda (or use existing functions).

    
    
        def fake_switch(self, thing):
            {
                'foo': self.do_foo,
                'bar': self.do_bar,
                'frog': self.blast_vent_core
            }[ thing ]()
    

I am doing a poor job of explaining _why_ one might want to do that, but one
advantage is that you can verify that the right callbacks were called in
tests, and also verify that each callback behaves right when invoked, since
they are just references to methods or functions. I'm not sure if doing this
would get marks for cleverness and testability, or would get one criticized
for excessive cleverness. As the reviewer of the code when a coworker wrote
it, I recall liking it, though.

~~~
vacri
Thanks to both of you for the info. I haven't played with lambdas much yet,
perhaps this is the time to start :)

~~~
vetinari
Lambdas in Python are somewhat limited - the syntax allows for only one
expression. You can use inner (nested) functions however.

------
kutkloon7
I think the main lesson that's there, is that in edge cases, it is sometimes
helpful to work with pointers which point to the structures you're actually
working with, instead of considering the structures themselves. This
especially seems to work for getting rid of null checks.

This takes some getting used to, but it allows you to write more efficient
code. I would do this in kernel development (smart people, high demands on
performance), but I would not recommend to write code like this is enterprise
software.

When you zoom out a bit, and describe the task with natural language, you can
also see that this might be a good idea: "Find the node that is equal to
entry, then, if the previous node exists, set the 'next' pointer of the
previous node to the 'next' pointer of the current node." sounds a lot more
tedious than "Find the pointer that points to entry, and set it to the 'next'
pointer of entry.".

This also works for more complicated tasks:

    
    
      // removes all instances from the linked list and return the number of entries removed
      int removeentries(valuetoremove, list)
      {
      	removedentries = 0;				// how many entries we've removed
      	prevpt = NULL;					// pointer to previous 
      	entrypt = list.head;			// get pointer to first node
      
      	// walk the list (be careful, head can be a nullpointer)
      	while (entrypt) {
      		if (*entrypt == valuetoremove)
      			if (prevpt) {
      				prevpt->next = entrypt->next;
      				removedentries++;
      			}
      		prevpt = entrypt;			// save current node as prev for next node
      		entrypt = entrypt->next;	// move to next node
      	}
      	return removedentries;
      }
    

can be re-written to:

    
    
      int removeentries(valuetoremove)
      {
      	removedentries = 0;				// how many entries we've removed
      	indirect = &head;				// points to the current node
    
      	// walk the list
      	while (indirect) {
      		// remove current entry from list if its value equals valuetoremove
      		if (indirect->value == valuetoremove) {
      			*indirect = *indirect->nextpt;
      			removedentries++;
      		}
      		indirect = indirect->nextpt;
      	}
      	return removedentries;
      }

------
rimantas
Sandi Metz has a talk where she speaks about her dislike for _if_
[https://www.youtube.com/watch?v=OMPfEXIlTVE](https://www.youtube.com/watch?v=OMPfEXIlTVE)

~~~
majewsky
Thanks for the link. I found the talk very insightful.

I would imagine Linus to disagree with Sandi's approach, though. She is never
eliminating the _if_ (as in: the conditional jump), just moving it from plain
sight into the magic of dynamic method dispatch.

