
The "Clockwise/Spiral Rule" in C - llambda
http://www.c-faq.com/decl/spiral.anderson.html
======
tptacek
In practice, nobody uses a "clockwise/spiral rule" to read C declarations;
they typedef complex declarations into bite-sized components.

C declarations are hardest to understand when they involve (1) "arrays"† of
"arrays" of "arrays", (2) function pointers, and (3) multiple layers of
indirection.

In systems and networking code and in most application code, (1) "multi-level
arrays" are uncommon, and they're uncommon in roughly the same way tuples of
tuples of tuples are uncommon in Python: they're a symptom that you're missing
a level of abstraction.

Function pointer declarations (2) are common and annoying to read. However, in
virtually all cases, the core of what you're trying to express with a function
pointer declaration is "what arguments does this function take" and "what does
it return". More importantly, most idiomatic C code typedefs function
pointers, so you're not looking at prototype decls with nested complex
function pointer decls inside of them.

So instead of

    
    
        void (*signal(int sig, void (*func)(int)))(int);
    

a modern C programmer would expect

    
    
         typedef void (*sig_t) (int);
    
         sig_t signal(int sig, sig_t func);
    

The qsort(3) man page is another example of an archaic declaration (its
authors presumably think this is the clearest way to convey qsort to an
experienced programmer); check out the man page for pcap_loop(3) for a better
example.

In most C code, multiple layers of indirection (3) are one-step pointers-to-
pointers used to work around call-by-value in C. If you store in your head the
notion that a pointer to a pointer is just there to provide a return-value
argument, that's probably all you'd need to remember about it. Pointers to
pointers to pointers are rare.

† _C pedants: I'm using "array" in an intentionally vague sense._

~~~
DannoHung
I like types. I do have a question though: Why do you call it "sig_t" as
opposed to "sig_h" or something. Based on its positional appearance, it's
pretty clear that it's a type. Or does t stand for something else?

~~~
wrl
It's a POSIX convention to indicate that it's a typedef.

~~~
stephencanon
User code built against POSIX should _never_ use the _t suffix; POSIX
explicitly puts *_t into the reserved namespace
([http://pubs.opengroup.org/onlinepubs/007904975/functions/xsh...](http://pubs.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_02.html)),
meaning that future revisions may add arbitrary type names of that form, which
will break your code if you use the same name and include POSIX headers.

~~~
tptacek
Thbthbthbthbthbththbthtbhtbhthbt! I say to POSIX. This convention is
practically universal.

~~~
stephencanon
Just so long as I don't see you filing any bug reports when POSIX changes a
header and your code stops compiling. =)

~~~
spc476
That's why I use *__t (two underscores).

~~~
obiterdictum
That's not ideal either.

<http://www.doc.ic.ac.uk/lab/cplus/c++.rules/chap5.html>

"The use of two underscores (`__') in identifiers is reserved for the
compiler's internal use according to the ANSI-C standard."

~~~
Someone
Are you sure that statement is correct? In my memory, it's only double
underscores at the start of identifiers. My final draft of ISO C 11 seems to
confirm that:

 _"7.1.3 Reserved identifiers 1 Each header declares or defines all
identifiers listed in its associated subclause, and optionally declares or
defines identifiers listed in its associated future library directions
subclause and identifiers which are always reserved either for any use or for
use as file scope identifiers. — All identifiers that begin with an underscore
and either an uppercase letter or another underscore are always reserved for
any use. — All identifiers that begin with an underscore are always reserved
for use as identifiers with file scope in both the ordinary and tag name
spaces. — Each macro name in any of the following subclauses (including the
future library directions) is reserved for use as specified if any of its
associated headers is included; unless explicitly stated otherwise (see
7.1.4). — All identifiers with external linkage in any of the following
subclauses (including the future library directions) and errno are always
reserved for use as identifiers with external linkage.184) — Each identifier
with file scope listed in any of the following subclauses (including the
future library directions) is reserved for use as a macro name and as an
identifier with file scope in the same name space if any of its associated
headers is included. 2 No other identifiers are reserved. If the program
declares or defines an identifier in a context in which it is reserved (other
than as allowed by 7.1.4), or defines a reserved identifier as a macro name,
the behavior is undefined."_

~~~
obiterdictum
You are right, it appears to be C++-specific and looks like a typo in the
link.

C++98, 17.4.3.1.2 Global names

\- Each name that contains a double underscore (__) [...] is reserved to the
implementation for any use.

------
stephencanon
The "spiral rule" makes for pretty pictures, but unfortunately it isn't
correct. One simple example:

    
    
        int *foo[3][4];
    

The spiral rule would have us read this as "foo is an array [3] of pointers to
array [4] of int". What it actually is is "foo is an array [3] of arrays [4]
of pointers to int".

The correct rule* for parsing declarations is "Start from the thing being
declared, and read right until you hit a closing parenthesis, then read left
until you reach the matching paren, then go back to reading right, and always
skip over any tokens you've already read." This is sometimes called the
"right-left rule"; it's not as pretty as "spiral", but it actually works.

(*) the really correct rule is "follow the syntax and semantics specified in
the language standard", but for some reason that confuses people.

~~~
tedunangst
I think it works if you treat all the [] as a single token. foo is a 3x4 two
dimensional array of pointers to int.

~~~
revelation
But its not, there are no two dimensional arrays. There are arrays of arrays.

~~~
tedunangst
The sizeof operator begs to differ.

~~~
sltkr
How does it differ? It just gives the size of the whole object -- that doesn't
mean it's not an array of arrays...

~~~
tedunangst
Hmm, I'm not sure what to say. A two dimensional array is an array of arrays.
I'm objecting to the idea that two dimensional arrays are some mythical
construction.

For instance, the only way to construct a ragged array is with pointers, and
then sizeof doesn't "work" to give the complete size. Also, the [][] syntax is
one of the places where pointers and arrays are less equivalent than usual.
You can't pass a [][] array via star star.

------
malkia
<http://cdecl.org/>

~~~
binarycrusader
Cool, but it doesn't handle the last example:

    
    
      void (*signal(int, void (*fp)(int)))(int)
    

...claims there's a syntax error, and there might be, but I don't know off-
hand.

~~~
cnvogel
It's confused by the parameter name fp.

    
    
        cdecl> explain void (*signal(int, void (*)(int)))(int) ;
    
        declare signal as function (int, pointer to function (int) returning void) returning pointer to function (int) returning void

------
lomendil
I can't imagine using this for real. Perception is so much more parallel than
that. If I see

    
    
      char *(*fp)(int)
    

I see that it matches the pattern

    
    
      returntype *(*function)(args)
    

It's the "* (* )()" that one sees all at once, and that signifies "this is a
function pointer".

This actually reminds me of some research where kids are asked to trace the
direction of interlocked gears. They start by looking at each one, but soon
they move to more general strategies. Edit - this is the one, in case anyone
is curious: Dixon, J. A., & Bangert, A. S. (2002). The prehistory of
discovery: precursors of representational change in solving gear system
problems. Developmental Psychology, 38(6), 918.

~~~
yshklarov
But before the kids could use general strategies, they had to work out the
details.

You're right, the spiral rule is probably unwieldy in practice. But until a
programmer can recognize the patterns, the rule can be used as a learning aid.

------
PeterisP
This is why most programmers are more productive with newer languages. Syntax
and readability matter.

~~~
tptacek
"Readability" has virtually nothing to do with why people are more productive
in higher level languages. Pointer declarations address a problem most high
level languages don't even allow developers to engage.

------
tedunangst
This is great. Functions like _signal_ are rare, so I just memorize what they
do without reading the decl every time. Also, typedefs for function pointers
go a LONG way towards readability.

The point at the end about const is good too. I see a lot of people screwing
that up and consting the wrong piece.

------
dryman
Great post. But it's lack of explanations in type qualifiers and storage
specifiers. C declaration is not easy for beginners but surely is one of the
most important subjects towards optimisations. Correct usage of const and
restrict keywords let compiler optimise your code super fast for free. Check
my post for example usage of type qualifiers and how the syntax is structured
in C: <http://www.idryman.org/blog/2012/10/29/type-qualifiers/>

------
abcd_f
"signal"... pfft... entry level kindergarten stuff. Try parsing array of
pointers to array of function pointers taking pointer array of arrays _with
some const and __restrict sprinkled on_. And that'd be a warm up. Then name
some of the function pointer arguments _const, _restrict and _ - and now we
are talking. Spiral out of that!

* This is loosely based on an interview question I got few years ago. I did answer it, then asked similar clusterfuck of a question back. Needless to say it didn't go anywhere.

------
ajwerner
<http://ieng9.ucsd.edu/~cs30x/rt_lt.rule.html>

------
deltasquared
This is why I like S-Expressions

~~~
ajross
S-expressions don't declare types, though. A sexp grammar for the same thing
would be basically isomorphic to a fully-parenthesized C type expression.

C has some glitches: the fact that the pointer type decoration goes on the
left while the array decoration goes on the right is one. The precedence rule
that requires parentheses for function pointers is unfortunate. And the
putting the decoration on the variable and not the type is just simply a bug
IMHO.

But the complexity here is inherent. Complicated types require complicated
expressions no matter what grammar you use.

~~~
deltasquared
I think the gist of your argument is this: both C languages and s-expression
based languages are touring complete and thus isomorphic.

For any given C compiler I can write a lisp program that gives the same
answers for input. Likewise for any Lisp program I can make input for a c
compiler that yields the same behavior. I know this because both c and lisp
are touring complete. Yay formal languages !!!

But touring completeness and isomorphism are not exactly the same thing. When
you run into undefined behavior in your c/c++ program, there does not exist
exactly one s-expression to cover the possible interpretations of every
compiler. You may need to write several compiler specific s-expressions to
cover all the behaviors from the same c expression.

These compiler specific difference break the mappings required for
isomorphism.

Lisps also have two different ways of evaluating input. Lazy and strict. So
one s-expression may require several c expressions to deal with edge cases.
Nothing is perfect.

Sadly, either way you look at it, c expressions are not quite isomorphic to
s-expressions.

I agree with you about left to right vs right to left precedence. That's
actually why I prefer s-expressions to order of operations and infix notation.

Also some complexity is inherent, some is self imposed. Sometimes static
typing is nice and catches errors. Sometimes dynamic typing can save a lot of
keystrokes.

------
bproctor
This is supposed to be a joke, right?

