
Two Star Programming - rix0r
http://wordaligned.org/articles/two-star-programming
======
muyuu
I don't see how Linus' version is better or more readable. By adding an
indirection you have the equivalent of a dummy node, both ways are textbook
examples of singly linked list deletion.

Note that you save no memory by not naming the implicit (in this
implementation) dummy node. A pointer to a pointer remains a pointer to a
pointer whether you name the intermediate pointer or not. Personally I think
it's more readable to name it. Also, I prefer curr != NULL to simply putting
*curr in a condition context. The compiler is going to do exactly the same.

I don't favour Linus' version because it's not readable. Most people who
haven't seen this before will need to take some time to check this code. Out
of the three basic ways of doing this I'd place this one dead last.

~~~
adamtj
This is exactly what Linus was complaining about.

Your argument is like trying to claim that twenty lines of string munging is
more readable than a straightforward regex. That's only true if you don't
really get regular expressions.

If you understood pointers well enough, you would see Linus's version as more
readable. The other one has so much more unnecessary fluff that makes it extra
complicated and less readable. If you don't understand pointers well enough,
and "haven't seen this before", of course you'll find it confusing.

Readability is in the eye of the beholder.

~~~
mtoddh
_If you understood pointers well enough, you would see Linus's version as more
readable._

I agree - and as someone who makes his living as a C programmer, I'm sure
someone could easily come up with a terse piece of Ruby code that I would view
as too clever, and therefore confusing, but which the vast majority of
programmers on here would look at and think, "a beginner may be confused by
this, but a competent Ruby programmer would not be."

I know some of this comes down to stylistic preferences, and likely some have
had bad experiences maintaining code that went overboard with this sort of
thing. But I also think that if these snippets of code were being discussed in
a forum with a bias towards low-level development, rather than HN, which seems
biased towards web development, you'd be seeing some very different responses.

~~~
muyuu
That's the Perl school of thought that basically finished Perl as a language
for big projects.

Personally, I have 20 years of C programming experience and I prefer to avoid
clever tricks when they add no effective advantage. Making code shorter is not
the end-all of programming.

Code is written for people to understand rather than for computers to execute
(quoting Sussman). It's fine to strive for efficiency since we're talking
about OS development here, but it's not the case in this bit of code really,
is it?

What is more likely to confuse a maintainer, an extra level of indirection or
an extra trivial "if"? However if you are committed to do all the maintenance
yourself, then you can pick as you wish.

~~~
mtoddh
_That's the Perl school of thought that basically finished Perl as a language
for big projects._

It's a great point, and I hated Perl for this very reason. And I'm sure some
of it is my defensiveness since this is how I program. But some of it is also
my concern that it might signal, during an interview say, that the candidate
might not really have a good grasp of pointers. For instance, given the
following snippet of code, candidates that didn't really seem to grok pointers
wouldn't see that the append() function could not be modifying the list in
main():

    
    
      struct node {
        int val;
        struct node *next;
      };
    
      static void append(struct node *list, int val);
    
      int main(int argc, char **argv)
      {
        struct node *list = NULL;
    
        append(list, 4);
        ...
      }
    

And my preference for fixing such an implementation would be to change append
to take a pointer-to-a-pointer:

    
    
      static void append(struct node **list,  int val);
    

with an implementation like:

    
    
      static void
      append(struct node **list, int val)
      {
        struct node **ppn, *pn;
    
        if ( (pn = malloc(sizeof *pn)) == NULL) {
            perror("malloc");
            exit(1);
        }
    
        pn->val = val;
        pn->next = NULL;
    
        for (ppn = list; *ppn; ppn = &(*ppn)->next)
            /* find tail */;
    
        *ppn = pn;
      }
    

and update the call in main() to append(&list, 4).

But as I said, you make a good point, and I agree that your method is the
safer of the two options from a maintainability point of view.

------
stiff
With one of my friends at the university we used to compete for a
shorter/nicer implementation of each data structure in the algorithms class we
were taking, and after using pointers-to-pointers in some previous cases
(linked lists, binary search trees etc.) and each time ending up with an
implementation I really liked, I tried to implement AVL trees[1] the same way,
for example an AVL rotation would be done by a call like this:

    
    
      avl_tree_rotate(avl, &t->left->right->left, &t->left->right, &t->left, t->left->right);
    

Where avl_tree_rotate looked like this (avl_tree_update just recomputes the
height after left or right branch height changes):

    
    
      static void avl_tree_rotate(avl avl, V **from, V **to, V **parent, V *child) {
        if(*from) (*from)->parent = *to   ? (*to)->parent   : NULL;
        if(*to)   (*to)->parent   = *from ? (*from)->parent : NULL;
        avl_tree_swap(avl, from, to);
        child->parent = (*parent)->parent;
        if(!((*parent)->parent))
          avl->root = child;
        else if((*parent)->parent->left == *parent)
          (*parent)->parent->left = child;
        else if((*parent)->parent->right == *parent)
          (*parent)->parent->right = child;
        (*parent)->parent = child;
        *from = *parent;
        avl_tree_update(avl, *parent);
        avl_tree_update(avl, child);
      }
    

It was one of nicer programming workouts I ever got (the whole class was a
great experience and I think every professional programmer should go through
implementing all the basic data structures and algorithms in plain C one
time), but I never managed to get deletion right, despite spending quite some
hours on it, it just went beyond my head. Not that it matters for linked
lists, and in this case I made it deliberately harder on myself, but I think
the moral of the story is that when you are too clever with implementation
details, you might run out of your cleverness when starting to deal with the
actual problem domain.

[1] <http://en.wikipedia.org/wiki/AVL_tree>

~~~
ZoFreX
> I think the moral of the story is that when you are too clever with
> implementation details, you might run out of your cleverness when starting
> to deal with the actual problem domain.

That reminded me of this quote: "Everyone knows that debugging is twice as
hard as writing a program in the first place. So if you're as clever as you
can be when you write it, how will you ever debug it?"

<http://en.wikiquote.org/wiki/Brian_Kernighan>

~~~
koenigdavidmj
The corollary: code stoned, debug caffeinated, and document drunk.

~~~
SatvikBeri
But watch out for the Ballmer peak!

I had one particularly bad period of crunch time where I was trying to code a
3-month project in 4 weeks, all while learning a new language/framework.

One night, after several hours of being stuck I got sick of it and poured
myself a rather large amount of bourbon. Now, I was rather naive when it came
to alcohol. I thought the bourbon was 47 proof...turns out it was 47
_percent_.

All I know is that the next day, building maintenance reported hearing singing
all through the night, two weeks of snacks were gone, and I woke up in the
office at 7 AM, and I found some of the most beautiful and well-written code
I'd ever seen, accomplishing about 3 weeks of work in one weird night.

------
kabdib
The "two star" solution looks like it generates more memory traffic in the
case of many deletions, and may run afoul of load-hit-store performance
problems on some platforms, since the write to '*curr' in the remove case may
still be in flight on its subsequent read and cause a pipeline stall.

Then again, the 'rm' predicate may dominate, or removes may be infrequent, and
then who cares? Or your CPU may not care about LHS stalls (by virtue of being
very whizzy, or just so terrible that the optimization doesn't matter).

Which is to say, measure and be prepared to be flexible rather than statically
clever.

~~~
demallien
Yup, potential performance issues were the first things that came to my mind
on seeing that code too. Plus, it's just silly to say that someone that just
wrote a linked list doesn't understand pointers - pretty much by definition
they _do_ understand pointers, otherwise their linked list isn't going to
work.

My personal rule of thumb concerning pointers is that two-stars should never
be seen except in the case where you want a function to be able to modify a
pointer that was passed in as a parameter. That's it. If I get handled a
module that has two-stars in it under any other circumstance, I refactor it. I
do this because:

a) lots of people find it difficult to thing about pointers to pointers
(although I've never understood why empirically I note that this is the case).

b) Having to dereference two pointers to get to the information that you want
is not just slow because of extra work needing to be done, but also because
you've just increased the need for doing a page change in memory, which on
some architectures can be surprisingly expensive.

~~~
to3m
Where does the double dereference come from in the second case? curr->next and
(* curr) both perform a dereference. (If anything, (* curr) will be at least
as efficient, if not more, because there's guaranteed to be no offset.)

(Does anybody know how you type an asterisk?)

~~~
demallien
Most CPU architectures offer special optimised instructions for grabbing data
in a structure - it's done as an offset from the base pointer. A double star
dereference can not benefit from this optimisation.

~~~
to3m
I think there are the same number of dereferences in both cases? One to fetch
the next pointer, and one to update the previous next pointer if a removal is
required.

You could probably make the second bit of code worse by not holding the value
of star curr in entry.

------
chrisbennet
Unless I _really_ need the performance, I avoid being "clever" when coding.
The simple non-2 star example is going to be a lot faster for someone to
understand, maintain and debug in the future.

"Debugging is twice as hard as writing the code in the first place. Therefore,
if you write the code as cleverly as possible, you are, by definition, not
smart enough to debug it." --Brian Kernighan

~~~
nicholassmith
Being clever when coding is one of the biggest headaches for long term
maintenance unless you _really_ _really_ need that extra performance. I've had
to do a bug hunt on a piece of code recently that was very, very clever but
used all sorts language features that are marked 'use with caution' when a
longer function with more simplicity would have worked as well.

Readability and maintainability trump developers showing how clever they are
by far.

------
andrewcooke
i had to write very similar code (deleting from a linked list) a couple of
days ago. i wasn't the easiest code to write, and i needed several iterations
before my tests were ok, but i remember feeling happy that the final result
was clear and correct.

i just checked it and, yes, i used "two stars".

now i'm reading all the comments here, which are predominantly critical,
calling this "clever" code. and it struck me just how lucky i am to work with
decent programmers. where this kind of thing is considered quality work, not
"showing off".

what i don't understand is why the other commenters here use pointers at all.
couldn't you rewrite this to use arrays with integer indices? then you
wouldn't need any stars (or very few), and it would be even "easier".

well, that's not true. i do understand. everyone thinks their own level is the
correct one.

but still, i have sometimes been lucky enough to work with people who knew a
lot more than i did. when i was in that position i didn't think they were
"showing off". i thought they were awesome and tried to learn from them.

~~~
betterunix
"i had to write very similar code (deleting from a linked list) a couple of
days ago. i wasn't the easiest code to write, and i needed several iterations
before my tests were ok, but i remember feeling happy that the final result
was clear and correct."

How was this not easy code to write? I imagine it would have looked like this,
but in your language of choice:

    
    
      (defun remove-if (lst pred)
         (if (null lst)
             nil
             (if (funcall pred (first lst))
                 (remove-if (rest lst) pred)
                 (cons (first lst) (remove-if (rest lst) pred))
                 )
             )
          )
    

This is straightforward and basically follows from the definition of a list:
either null (empty) or a pair whose second element is a list. The C version is
uglier because of the need to deallocate removed nodes, track pointers, etc.,
but it still follows from the definition of a list.

~~~
andrewcooke
eh, what do you want me to say? everyone else here is calling me a smartass
and you're telling me i'm an idiot.

it was c. i made some mistakes. i fixed them.

~~~
betterunix
Well I don't mean to say that you are an idiot; I really am curious about what
made removing nodes from a list difficult. One of the things that makes linked
lists great to work with is how easily nodes can be removed or inserted.

Personally, I would be inclined to blame C as a language, rather than you as a
programmer. You basically did that anyway:

"it was c. i made some mistakes. i fixed them."

------
antirez
The idea that you can classify a programmer's understanding of pointers based
on doing linked list tricks is absurd. Usually Linus says things that make
sense, but not this time IMHO.

Also the code proposed may work for linked lists that is a very well
understood topic, but code like this in general is hard to understand and
debug, without actually providing some _serious_ advantage.

I remember some code on the Linux kernel implementing the inode caching in the
virtual file system layer that was a mess like that, but multiplied by 1000.
Very "Smart" indeed, but for the sake of what?

~~~
corresation
I am not even sure that this is what Linus was trying to say: I extracted a
very different intention from his comment than the author of this post.

------
jules
Is this really so amazing? I "invented" this long ago when I first implemented
a binary tree. First you code the ugly version, and then in your mind or on
paper you execute it and you see that there's nothing special about the root
node, yet it is handled by a special case. Then you remove the special case,
that's all. So keep in mind that contrary to what several people here are
saying, it does not cause extra pointer dereferences.

------
loup-vaillant
Something 's not right. Still too complex. Ah. Lists are recursive. Their
treatment should be recursive as well. Let's see…

    
    
      void remove_if(node ** head, remove_fn rm)
      {
          node * entry = *head;
          if (rm(entry))
          {
              *head = entry->next;
              free(entry);
          }
          else
              head = &entry->next;
          remove_if(head, rm);
       }
    

Ahh, much better: Now we can refactor:

    
    
      void remove_head(node ** head)
      {
          node * entry = *head;
          *head = entry->next;
          free(entry);
      }
    
      void remove_if(node ** head, remove_fn rm)
      {
          if (rm(*head))
              remove_head(head)
          else
              head = &entry->next;
          remove_if(head, rm);
       }
    

I'm sure GCC will be able to optimize away the tail call.

Edit: oops, I forgot to test for empty lists, which was implicit in the for
loop I set out to remove…

~~~
JoachimSchipper
Tail calls are considered very bad style, for good reason. gcc can indeed
optimize them away, at a sufficiently high level of optimization; but it's not
very nice to write code that, e.g., runs out of stack space _only_ when
compiled for debugging.

~~~
AngryParsley
"...very bad style"? That's quite an overstatement. You can enable individual
optimizations with gcc. Just add `-foptimize-sibling-calls` to the debug
CFLAGS and you're set. Tada! Straightforward code and debug-ability.

~~~
greenyoda
But that would be making your code dependent on a particular feature of a
particular compiler. If you then want to port your code to Windows (lots of
open-source code runs there too), you'll have to deal with the problem again
there. If you wrote the code with an explicit loop, it would just work
everywhere.

------
ambrop7
Just for fun I made a benchmark for "two-star" and "one-star" versions of
remove_if() and build() (to link an array of nodes). Code:
<http://ideone.com/D7bzmo>

Results with {gcc-4.7.2,clang-3.2} -march=corei7 -O3

    
    
      build_onestar+remove_if_onestar: 0m0.309s 0m0.357s
      build_onestar+remove_if_twostar: 0m0.302s 0m0.324s
      build_twostar+remove_if_onestar: 0m0.310s 0m0.361s
      build_twostar+remove_if_twostar: 0m0.302s 0m0.320s
    

The two-star version of remove_if() is a bit faster here, but both build()
versions are the same, across both compilers.

------
jiggy2011
Not a C programmer , but if I was asked to implement this I would be more
comfortable doing something like the first. The second took me a few moments
to figure out.

Perhaps this is a result of many people learning in languages like Java where
the idea of double indirection isn't quite so explicit so this style feels
somewhat unnatural?

I could easily imagine introducing very weird bugs by mixing up my pointers
and pointers-to-pointers. To mentally visualise a linked list I find it easier
to just imagine a stack of boxes attached by string and work on the basis of
cutting and reattaching the string to different points, this is more difficult
under the 2nd approach.

------
huhtenberg
The double-pointer trick makes for a more concise code, but it doesn't always
lead to the code that is _faster_.

That said, it's actually an excellent C interview question. Combined with
"write a binary tree iterator" it reliably measures one's exposure to
thoughtful C programming.

~~~
mtoddh
_That said, it's actually an excellent C interview question_

This was actually one of the very first questions we used to ask during
interviews at Tenable. I'm honestly kind of amazed it's now seen as some sort
of tricky technique. I would think (or hope) most C programmers would see it
as basic competence with pointers.

~~~
huhtenberg
"Hope" is more like it. I went to one interview where they asked to traverse a
list, and since it was for a senior C position I gave them the * * version.
There was this pause and then their lead dev said that this _cannot_ possibly
work. It was really awkward.

~~~
mtoddh
Wow. I tell you- reading through some of the responses on here, I'm feeling a
lot more confident about my job security as a C programmer.

------
csl
Reminds me of the very funny description of a Three Star Programmer over at
<http://c2.com/cgi/wiki?ThreeStarProgrammer>

EDIT: I see it's also linked in the article.

------
twp
A nice example of the old adage that "most problems in computer science can be
solved by adding an extra layer of indirection".

~~~
raldi
"...except too-many-layers-of-indirection."

------
bjoernbu
Why is this supposed to be such a big deal? The second example uses the
language's features to be able to use shorter code.

Usually there are countless possibilities for the exact same thing in every
program. Personally, I would never say those who produce code with unused
possibilities do "not understand" the corresponding language features.

Especially when writing example code or working with less experienced
programmers, I even think that using less language features for the sake of
making your algorithmic idea as obvious as possible is better practice.

For exmapel, I find it far more important that one decides wisely when to use
a linked list at all or, if there are multiple fuctions for removal, chooses
the right one for a particular use-case. Therefore everyone has to understand
whats happening.

------
ruv
Honestly as somebody who reads/ writes C every couple of months, I don't see
what's so clever about it. This level of understanding for a C programmer
should be the bare minimum, I think...

~~~
qu4z-2
Which is very much what Linus was saying (and I agree).

------
lsiebert
You can just use a dummy node and track the previous entry. Then you don't
have to ever pass back the front of the list, or treat the front of the list
as special at all.

It's more elegant then the double level pointer, imho.

~~~
huhtenberg
How is it more elegant if it effectively doubles the number of pointer
dereferences?

------
rjknight
This makes me miss programming in C.

