
C/C++ Pointer Declaration Syntax – It makes sense - princeverma
http://hamberg.no/erlend/2012/02/21/cc-pointer-declaration-syntax-it-makes-sense/
======
Suncho
Yes. This is how I first learned pointers when I was learning C. It makes
perfect sense in C so long as you don't get crazy with typedefs or function
pointers. It really breaks down in C++ where you have operator overloading
that results in a situation where you have multiple types of iterators and
smart pointers that all dereference to the same type.

Once you add references into the mix, it gets really confusing. I'm not
really sure why they reused the ampersand to represent a reference type. Were
there no other symbols left? It doesn't even follow the same rule from C. "int
*a" means that *a is of type int. "int &a" does not mean that &a is of type
int. Blah.

The cruft is ever building, but it started out very intuitive. Oh well.
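
To make the asymmetry concrete, here is a minimal C++ sketch. The function name `read_both_ways` is made up for illustration; the `static_assert` lines just ask the compiler what type each expression has.

```cpp
#include <type_traits>

// Hypothetical helper: reads one int through a pointer and through a reference.
int read_both_ways() {
    int value = 7;

    int *p = &value;  // "*p is an int": the declarator mirrors the expression *p
    int &r = value;   // "&r" is NOT an int: here & only marks r as a reference

    // The expression *p is an int lvalue, which decltype reports as int&.
    static_assert(std::is_same<decltype(*p), int &>::value, "*p is an int");
    // The expression &r is a pointer to int, not an int.
    static_assert(std::is_same<decltype(&r), int *>::value, "&r is an int*");

    r = 35;         // writes through the reference, no dereference syntax needed
    return *p + r;  // both name the same object, so this is 35 + 35
}
```

Calling `read_both_ways()` yields 70, since `p` and `r` refer to the same `int`.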

~~~
eru
My cynical guess for all the re-using of existing C syntax in new and confusing
roles for C++ is that the creators of C++ shied away from writing a new
parser as much as possible.

~~~
Suncho
At least in the case of dereferencing, I think it's more likely that C++
generalized something that C took for granted as having one specific meaning.
C was created with the idea that the only thing that could ever be
dereferenced was a memory address. C++ added flexibility to the system thereby
violating this precept.
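
A small sketch of that generalization, assuming C++11 and only standard library types. The function name is invented for the example. A raw pointer, a smart pointer and an iterator are three unrelated types, yet applying `*` to each gives an `int`.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical helper: adds up one int reached through three pointer-like types.
int sum_via_three_derefs() {
    int raw_target = 1;
    int *raw = &raw_target;                          // built-in pointer
    std::unique_ptr<int> smart(new int(2));          // overloads operator*
    std::vector<int> values{3};
    std::vector<int>::iterator it = values.begin();  // also overloads operator*

    // All three dereference expressions have the same type.
    static_assert(std::is_same<decltype(*raw), int &>::value, "raw pointer");
    static_assert(std::is_same<decltype(*smart), int &>::value, "smart pointer");
    static_assert(std::is_same<decltype(*it), int &>::value, "iterator");

    return *raw + *smart + *it;  // 1 + 2 + 3
}
```

So "`*x` is an `int`" no longer tells you that `x` is an `int *`, which is where the C reading of declarations stops being a reliable guide.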

------
FaceKicker
Understanding this little subtlety of the pointer declaration syntax was the
epiphany that led me to feel like I actually, truly understood pointers a
couple years ago (never been much of a C programmer). This syntax always
confused me - I couldn't figure out why it wouldn't be something like "int& p"
to declare an int pointer, but rather includes the dereferencing symbol, which
made it seem anti-pointy to me at the time. Pretty much the instant I figured
out why my thinking was wrong - that "int * p" should be read as "(* p) is an
int" rather than "p is an (int *)" (as this post explains nicely) - pointers
just completely "clicked" for me.

~~~
eru
If you had started with something more sane, like Forth or I guess even flawed
Pascal, you would have understood pointers much earlier. C is such a horrible
language to learn first, second or even third.

~~~
dbattaglia
What about learning C first compared to languages that don't have the concept
of real pointers (like, say, Java)? I started out with C++ (and still write
C/C++ from time to time), and I feel like it's given me a better understanding
of memory management and reference types that a lot of programmers I meet and
work with don't seem to really grasp.

In C# code I constantly see the "anti-pattern" of new-ing up a variable at
declaration, only to immediately overwrite it on the next line with the
results of some function. I can't help but think that the person writing it
has no concept of what "new" is actually doing (with regards to the heap and
the GC, for example).

~~~
eru
> What about learning C first compared to languages that don't have the
> concept of real pointers (like, say, Java)?

If you want to learn pointers, you have to learn a language that provides
them. So C, or like I said, Forth or Pascal. C is fine as a first language, if
you don't mind the pain. C++ is a horrible language, no matter when you learn
it. Never cripple the minds of beginners with it.

About your C# anecdote, does the language allow something like the following?:

    
    
        type variables = someFunction ();
    

Nowadays I usually program in Haskell for my day job. There are lots and lots of
nice concepts to be understood there. You could even use pointers, if you are
desperate enough.

~~~
dbattaglia

      does the language allow something like the following?:
    

yes, exactly. The problem is you see stuff like this:

    
    
      var obj = new SomeType();
      obj = GetSomeType();
    

Basically creating 2 objects, the first of which is immediately garbage since
it no longer has any roots. Anyone who understands pointers/references and
memory allocation should understand that "new" is actually creating a new live
object on the heap, rather than just initializing a pointer to null (or a
stack-based value type to zero).
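
A rough C++ analogue of the pattern, as a sketch only. C++ has no garbage collector, so `std::shared_ptr` stands in for the C# reference here, and the bodies of `SomeType` and `GetSomeType` are invented so that constructions can be counted.

```cpp
#include <memory>

// Hypothetical stand-in for the thread's SomeType; counts every construction.
struct SomeType {
    static int constructed;
    SomeType() { ++constructed; }
};
int SomeType::constructed = 0;

// Hypothetical factory standing in for GetSomeType().
std::shared_ptr<SomeType> GetSomeType() {
    return std::make_shared<SomeType>();
}

// The pattern from the thread: create an object, then immediately replace it.
int wasteful() {
    SomeType::constructed = 0;
    auto obj = std::make_shared<SomeType>();  // object #1 is allocated
    obj = GetSomeType();                      // object #2 replaces it, #1 is freed
    return SomeType::constructed;
}

// Initializing directly from the factory creates only one object.
int direct() {
    SomeType::constructed = 0;
    auto obj = GetSomeType();
    return SomeType::constructed;
}
```

Here `wasteful()` reports two constructions and `direct()` reports one, which is the extra allocation being described.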

These days C# is actually a decent language to code in, it has a nice
functional aspect to it with linq/lambdas, and you can still go down to C
style pointers/"unsafe" code if necessary (with lots of restrictions of
course). It's really the Microsoft dependency that kills it more than anything
else.

~~~
eru

       var obj = new SomeType();
       obj = GetSomeType();
    

You could almost be tempted to write an automatic converter to

    
    
       var obj = GetSomeType();
    

but you can't in general, because you don't know whether `new SomeType()'
has side effects. But if it has some that are relied on, that's
probably a bug.

I know that there's a free .net runtime called Mono. What about the compiler
side of free C#?

------
jhrobert
Nice point. Yet...

What about "references"? int& a, b; or int &a, &b; ?

So, I think that I'll stick to int* pa; int* pb;

i.e. one variable per line.
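
For what it's worth, the `&` binds to the individual declarator just like `*` does. A minimal sketch (the function and variable names are only for illustration; references need initializers, so `target` is added):

```cpp
#include <type_traits>

// Hypothetical helper: shows which names in a combined declaration are references.
int write_through_references() {
    int target = 0;

    int& a = target, b = 5;        // & applies to a only: b is a plain int
    int &c = target, &d = target;  // repeating & makes both c and d references

    static_assert(std::is_same<decltype(a), int &>::value, "a is a reference");
    static_assert(std::is_same<decltype(b), int>::value, "b is a plain int");
    static_assert(std::is_same<decltype(d), int &>::value, "d is a reference");

    a = b;   // stores 5 in target through a
    c += 1;  // target becomes 6
    d += 1;  // target becomes 7
    return target;
}
```

The function returns 7 because `a`, `c` and `d` all alias `target`, while `b` is an independent `int`.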

Additionally, it makes it easier to look up a variable's type when reading
the source code because one does not need to walk through a list to find the
variable; it is always after the type. I actually use "auto" to make it even
easier to find variable declarations, going as far as grouping such
declarations to some extent to make them easier to find.

Styles vary.

~~~
eru
Which meaning of `auto' do you use there?

~~~
ehamberg
In C++11 you can use ‘auto’ when defining a variable and the compiler will
figure out its type for you.
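
A short sketch of the C++11 meaning, with an invented function name. The older meaning of `auto` was a storage-class specifier, and C++11 dropped it in favour of this one.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical helper: lets the compiler deduce three different types.
int auto_demo() {
    std::vector<int> values{1, 2, 3};

    auto it = values.begin();  // deduced as std::vector<int>::iterator
    auto first = *it;          // deduced as int: a copy, not a reference
    auto *p = &values[1];      // deduced as int*; the star is optional here

    static_assert(std::is_same<decltype(it), std::vector<int>::iterator>::value,
                  "iterator type is deduced");
    static_assert(std::is_same<decltype(first), int>::value, "plain int");
    static_assert(std::is_same<decltype(p), int *>::value, "pointer to int");

    first = 100;                      // changes the copy only, values[0] stays 1
    return first + values[0] + *p;    // 100 + 1 + 2
}
```

The result is 103, which also shows that plain `auto` drops the reference when deducing from `*it`.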

~~~
eru
Thanks. Your use of `actually' made me think that you were using `auto' in the
old sense just to make grepping easier.

------
Create
<http://www8.cs.umu.se/~isak/snippets/rtlftrul.txt>

The "right-left" rule is a completely regular rule for deciphering C
declarations.

It can also be useful in creating them.
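
As a worked example of reading a declaration this way, here is a sketch in C++. All names are made up, and the `static_assert` checks the reading against a type built up step by step with an alias.

```cpp
#include <type_traits>

static int evens[2] = {0, 2};
static int odds[2] = {1, 3};

int *pick_even(int i) { return &evens[i]; }
int *pick_odd(int i) { return &odds[i]; }

// Start at the name and alternate right, then left:
//   table        "table is"
//   [2]          "an array of 2"
//   *            "pointers to"
//   (int)        "functions taking int"
//   int *        "returning pointer to int"
int *(*table[2])(int) = {pick_even, pick_odd};

// The same type assembled piece by piece.
using PickFn = int *(int);
static_assert(std::is_same<decltype(table), PickFn *[2]>::value,
              "array of 2 pointers to functions returning int*");

int call_through_table() {
    // Both call syntaxes work on a pointer to function.
    return *table[0](1) + *(*table[1])(1);  // evens[1] + odds[1]
}
```

`call_through_table()` returns 5 here (2 from `evens[1]` plus 3 from `odds[1]`).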

please help: <http://blog.gitorious.org/2011/12/08/private-repositories/>

------
scott_s
I learned C++ before learning C, so I tend to use the C++ convention of

    
    
      type* ident;
    

The reasoning in K&R (also presented in the article) has always bothered me.
When the compiler builds a symbol table, it has "ident" as the identifier, and
"type*" as its type. When the compiler emits type related error messages, it
uses that identifier and type to communicate with you.

~~~
slowpoke
My thoughts exactly. Actually, I'd go as far as to say that using the same
character for both the _type declaration_ of pointers and the _operator for
dereferencing pointers_ was a mistake in the first place.

Imagine they'd have used this instead, for example:

    
    
        int. a;     // declares a pointer to an int named a
        *a = 42;    // dereference the pointer
    

Yea, looks weird on first sight (because we're used to current style), but the
difference is _glaringly obvious_ : it's visually apparent that the two do
_completely different_ things. It doesn't need to be the dot, by the way. My
point is that any symbol that's different would have been better than using
the same one.

------
grn
This is part of the philosophy that _declaration should look like use_. The way
of declaring arrays follows from it too.
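
A minimal sketch of the idea in C++ (the function name is invented). In each line the declarator has the same shape as an expression that yields an `int`; the one place the analogy does not hold is the initializer, which sets the declared name itself rather than the expression in the declarator.

```cpp
#include <type_traits>

// Hypothetical helper: each declarator looks like the expression that yields int.
int use_mirrors_declaration() {
    int a[3] = {1, 2, 3};  // a[i] is an int
    int *p = a;            // *p is an int (the initializer sets p, not *p)
    int (*row)[3] = &a;    // (*row)[i] is an int

    static_assert(std::is_same<decltype(a[0]), int &>::value, "a[i]");
    static_assert(std::is_same<decltype(*p), int &>::value, "*p");
    static_assert(std::is_same<decltype((*row)[0]), int &>::value, "(*row)[i]");

    return a[2] + *p + (*row)[1];  // 3 + 1 + 2
}
```

The function returns 6; `decltype` reports `int &` because each of those expressions is an `int` lvalue.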

~~~
bo1024
int *p = &a;

?

------
feralchimp
The best advice I received about understanding C pointer (usage) syntax was to
translate to English via the following substitutions:

* == "at location"

& == "the address of"

There are cases where this doesn't work well (or at all), but it works for
many of the cases that initially look like gibberish to a C noob.
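
A tiny sketch applying those substitutions in the comments. The function name is made up, and the code is plain enough to compile as either C or C++.

```cpp
// Hypothetical helper: each comment reads the line using the two substitutions.
int read_in_english() {
    int a = 5;
    int *p = &a;   // p holds "the address of" a
    *p = 42;       // "at location" p, store 42 (so a is now 42)
    int b = *p;    // b gets the value "at location" p
    return a + b;  // 42 + 42
}
```

This returns 84, since writing "at location" `p` changed `a` itself.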

For declarations, just put variables on separate lines.

------
ithkuil
The same goes for more complicated type signatures like pointers to functions:

    
    
      void (*func)(int, int) = f1;
    

where f1 is

    
    
      void f1(int a, int b);
    

Another aspect that might be confusing, but which becomes apparent once you
see it through this light, is typedefs.

At the beginning it might seem that typedefs alias some A to some B.

    
    
      typedef int myint; // OK
      typedef myint int; // WRONG
    

This could lead to confusion about which side is the name being defined and
which is the existing type.

But a typedef basically works like a variable declaration, except that it declares a type, so:

    
    
      typedef void (*myfunc)(int, int);
    

this will declare myfunc as a type which is a pointer to a function expecting
two integers and returning void.
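
Putting the pieces together as a sketch: the body of `f1` and the `last_sum` variable are assumptions added so that the calls have a visible effect, and `same` and `call_both` are illustrative names.

```cpp
#include <type_traits>

static int last_sum = 0;

// Assumed body: the thread only gives the prototype of f1.
void f1(int a, int b) { last_sum = a + b; }

// Declares a variable: func is a pointer to a function (int, int) -> void.
void (*func)(int, int) = f1;

// Same declarator with typedef in front: myfunc is now a type, not a variable.
typedef void (*myfunc)(int, int);
myfunc same = f1;

static_assert(std::is_same<decltype(func), myfunc>::value,
              "the variable and the typedef describe the same type");

int call_both() {
    func(1, 2);            // last_sum == 3
    int first = last_sum;
    same(3, 4);            // last_sum == 7
    return first + last_sum;
}
```

`call_both()` returns 10, and the `static_assert` confirms that `func` has exactly the type named by `myfunc`.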

~~~
gsg
I've always found it unsurprising that people find typedefs confusing.
Syntactically, typedef is a storage-class specifier, which allows for some
questionable constructs:

    
    
        int typedef x;
        typedef what;
        struct { int x; } typedef *x, y;
        enum { x } typedef;
    

Storage specifiers are pretty weird. This nonsense is allowed at function or
parameter list scope, but not file scope:

    
    
        void f() { enum { x } register; }
        void f(enum { x } register);
    

I've never understood why Ritchie designed C's declaration syntax to have so
many dumb edge cases.

------
syaz1
Why do I get a blank page with "Wow. Wow." as the only text?

~~~
dekz
I never really liked the way pointers are declared in C/C++:

    
    
        int *a, *b, *c; // a, b and c are pointers to int
    

The reason is that I am used to reading variable declarations as MyType
myVar1, myVar2, myVar3; and I always read “int *” as the type “integer
pointer”. I therefore wanted the following

    
    
        int* a, b, c; // a is a pointer to int, b and c are ints
    

to mean that a, b and c all were of type int*, i.e. pointers to int, and I
therefore found it slightly annoying to repeat the asterisk for every
variable. This also meant that the symbol * had two slightly different
meanings to me: (1) It declares a pointer or (2) it dereferences a pointer.

I usually don’t declare a whole lot of pointers in one line, but still, this
is a (minor) annoyance I have briefly discussed with a few fellow programmers
over the years.

Today I started reading C Traps and Pitfalls by Andrew Koenig and after
reading one sentence, in chapter two, the pointer declaration syntax suddenly
makes sense:

    
    
        […] Analogously,
        
        float *pf;
        
        means that *pf is a float and therefore that pf is a pointer to a float.

Of course! If, instead of looking at it as a variable a of type int *, we
read it as *a, i.e. “a dereferenced”, it makes sense. That is indeed an int,
and that also means that * always means “dereference”.

------
loup-vaillant
Yet another example of why C type declaration syntax could be better. First, C
should have followed this golden rule:

    
    
      <scope> <type> <name>;   // variable declaration
      typedef <name> = <type>  // typedef declaration
    
      (with <scope> = "static", "auto", or nothing)
    

That would be a first step. A second one would be to use parentheses to denote
grouping, such that extraneous parentheses do not screw up the whole type
declaration:

    
    
      // the following two lines are equivalent
      []*int   p; // array of pointers (but this is not clear)
      [](*int) p; // Ah, now this is more obvious.
      
      int*[]   p; // alternate syntax, with postfix notation
      (int*)[] p; // (personally, I prefer the prefix one)
      *int[]   p; // mixing postfix and prefix does no good.
    

Same rule for const:

    
    
      const *int i; // constant pointer to mutable int.
      *const int i; // mutable pointer to a constant int.
      *(const int) i; // again, we can clarify.
      const (*int) i;
    

(By the way, I think we should make const the default, and use a "mutable"
keyword instead. But that's another fight.)

Functions could be declared in simpler ways:

    
    
      bool (int i, float x) f;
      (int i, float x)    bool f; // alternate syntax
      (int i, float x) -> bool f; // alternate syntax 2
      bool <- (int i, float x) f; // alternate syntax 3
    

C doesn't do currying by default, so I don't really care whether the return
type goes before or after the arguments. But the name of the function should
definitely be on the right of its own type, so we still follow the golden
rule. Function definition would be equally simple:

    
    
      bool (int i, float x) f =
      {
        return 2 * i + x;
      }
    

Note the similarity with

    
    
      int i = 42;
    

Now you want a pointer to a function? Easy: you just prefix (or postfix,
depending on your ultimate choice) the type with a star:

    
    
      *(bool (int i, float x)) fp = f;
    
      // Note: this one is ambiguous without precedence rules:
      *bool (int i, float x) fp = f;
      // And this one clearly denotes a function which returns
      // a pointer to bool
      (*bool) (int i, float x) fp = f;
    

Structure (and class) declaration could also use a bit of makeup:

    
    
      typedef Foo = struct {
        int   i;
        float x;
        bool  b;
      };
    

There, the syntax of types is much nicer, and easier to parse (it doesn't
really matter for C, but C++ could really use some love).
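
For comparison, C++11 already offers pieces that resemble this proposal: alias declarations put the new name on the left of an `=`, and trailing return types put the result after the parameters. A hedged sketch follows; it is standard C++11, not the hypothetical syntax above, and the body of `f` and the alias names are illustrative.

```cpp
#include <type_traits>

// "typedef Foo = struct {...}" written as a C++11 alias declaration.
using Foo = struct {
    int   i;
    float x;
    bool  b;
};

// Function with a trailing return type, close to "(int i, float x) -> bool f".
auto f(int i, float x) -> bool {
    return 2 * i + x != 0;  // explicit comparison instead of float-to-bool
}

// Alias for the function type, then a pointer formed by adding a star to it.
using Predicate = bool(int, float);
using PredicatePtr = Predicate *;
PredicatePtr fp = f;

static_assert(std::is_same<PredicatePtr, bool (*)(int, float)>::value,
              "same type as the classic declarator spelling");
```

With these definitions, `fp(1, 0.5f)` is true and `fp(0, 0.0f)` is false, and a `Foo` can be brace-initialized as `Foo foo{1, 2.0f, true};`.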

Also worth mentioning: the syntax above is less restricted than the ANSI
one. However, the semantic restrictions still hold. This declaration for
instance would be syntactically valid, but semantically bogus:

    
    
      void (int i, struct { float f; bool b; }) f;
    

It could work if C were duck typed. :-)

~~~
eru
You can probably hack together a nice language around C. Sort of like
CoffeeScript around JavaScript. While you are at it, might as well throw the
option of significant indentation into it---like in Haskell, but not
mandatory like in Python.

~~~
groovy2shoes

        > You can probably hack together a nice language around C.
    

That's been done at least twice already -- for some (very debatable) values of
'nice'. See C++ and Objective-C.

~~~
eru
Oh, but they tried to introduce lots of new stuff. Just stick to giving a
nicer syntax.

------
alinajaf
This is identified in K&R C on Page 1 about pointers, though it's subtle.

I'm paraphrasing, but I believe the explanation is that `int *a;` is a
mnemonic, i.e. that the expression *a will evaluate to an int.

Must have read that section five or six times over the years but only a few
months ago did it sink in.

------
chmike
C -> C++ -> D (problem solved)

------
shmerl
It's just better to avoid pointer declarations like:

int* a, b, c;

Just use: int* a; int* b; int* c;

The type is really int* but C legacy treats b and c as int in int* a, b, c;
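
A quick sketch that asks the compiler directly. The function names are invented, and initializers are added only to keep the example warning-free.

```cpp
#include <type_traits>

// Counts the pointers declared by the combined form "int* a, b, c;".
int pointers_in_combined_declaration() {
    int* a = nullptr, b = 0, c = 0;  // the star belongs to a only
    return std::is_pointer<decltype(a)>::value
         + std::is_pointer<decltype(b)>::value
         + std::is_pointer<decltype(c)>::value;
}

// Counts the pointers when each variable gets its own declaration.
int pointers_in_separate_declarations() {
    int* a = nullptr;
    int* b = nullptr;
    int* c = nullptr;
    return std::is_pointer<decltype(a)>::value
         + std::is_pointer<decltype(b)>::value
         + std::is_pointer<decltype(c)>::value;
}
```

The first function returns 1 and the second returns 3, which is the difference the one-declaration-per-line style avoids having to remember.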

