

Typedefs & Linus - signa11
http://lkml.indiana.edu/hypermail/linux/kernel/0206.1/0402.html

======
jrockway
This is fair. In C, you can't ever let yourself forget whether you have a
pointer or an integer or a signed value or an unsigned value. The compiler
will not protect you from mistakes, and in the kernel, mistakes are called
"remote root exploits". (Remember the bug where someone was dereferencing a
null pointer, except that 0 is a valid memory address in kernel land? Yeah.)

But don't do this in high-level code. There, you should use types that are as
descriptive as possible, as that will greatly improve readability and
maintainability.

~~~
pvg
_In C, you can't ever let yourself forget whether you have a pointer or an
integer_

A C compiler will generally keep you from forgetting that just fine.

~~~
Locke1689
No this is very, very wrong.

Consider:

    
    
      unsigned long addr_t = &x;
    

This is not only completely legal ANSI C but is also idiomatic in kernel
development.

~~~
pvg
'generally'. It's the idea that you can easily mix up pointers and integers in
C (and need to carefully remember types, etc) without getting told off by the
compiler that's 'very, very wrong'. In your case, gcc will happily tell you:

warning: initialization makes integer from pointer without a cast

Without explicit casts, the compiler won't let you confuse pointers with
integers without a warning or an error.

~~~
Locke1689
Sorry, this post wasn't meant to be about the cast. You can cast it -- it's
not that line itself that matters. It's that pointers and integers can take
arbitrary values in C. After you do the cast is what really matters. In C you
have to be aware of the _semantics_ of your code at all times because the weak
typing will not save you.

~~~
pvg
Well, to be honest, I really don't exactly know what we're talking about
anymore. What the parent was saying is that you have to be super careful with
the types of pointers and integers because C will let you use them
interchangeably. This is obviously not true, despite the rain of inexplicable
upvotes. He then has a pointer arithmetic example which is really not so much
about typing but about understanding pointer arithmetic.

Of course C will also let you treat pointers as integers and vice versa,
especially if you explicitly tell it to, by casts. But in the general case, it
will whine and, in fact, reducing unnecessary whining is one of the reasons to
not go hog-wild with useless typedefs. I'm not sure how we got here, I think
one person here doesn't know C very well and the rest of us got into a little
and completely tangential C nerdfest.

~~~
Locke1689
_Of course C will also let you treat pointers as integers and vice versa,
especially if you explicitly tell it to, by casts. But in the general case, it
will whine and, in fact, reducing unnecessary whining is one of the reasons to
not go hog-wild with useless typedefs. I'm not sure how we got here, I think
one person here doesn't know C very well and the rest of us got into a little
and completely tangential C nerdfest._

Hmm that's fair. I also think I started entering kernel-dev mode where you
have to carefully watch everything that you do, whereas most people probably
have C experience with the benefit of things like the C standard library.

For whatever it's worth, I was seeking to reference the idiom where you store
an address as an integer in order to do some hardware operations. You have to
keep semantic knowledge of that integer as really being a pointer even though
it's, for the moment, stored as an integer. In cases like this the C compiler
is no help whatsoever.
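
A minimal sketch of that idiom, assuming userspace C with `uintptr_t` from
`<stdint.h>` standing in for the kernel's `unsigned long`; the function names
`addr_of` and `load_via_integer` are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: stash an address in an integer, the way code that
   talks to hardware sometimes has to. */
static uintptr_t addr_of(int *p)
{
    return (uintptr_t)(void *)p;   /* explicit cast, so no compiler warning */
}

/* From here on, nothing in the type says "this is really an int *". */
static int load_via_integer(uintptr_t addr)
{
    assert(addr != 0);             /* the only check we can cheaply make */
    return *(int *)(void *)addr;   /* only the programmer knows this is valid */
}
```

`load_via_integer(addr_of(&x))` gives back the value of `x`, but passing any
other nonzero integer compiles just as cleanly.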

------
Cushman
That's wacky. When I was first getting into OSS, one of the most confusing
things about code was typedefs. I knew the basics of C, and looking back on it
a lot of what confused me was actually pretty straightforward code that I
probably would have been able to piece together, but the use of arbitrary type
names (which I didn't even know was possible) stopped me from even trying on
more than one occasion.

That's a point in favor of Ruby's "the most readable solution is the most
elegant" philosophy over, say, Perl's inverse: It's cool to feel like a member
of the club, but if writing that extra line means a beginner (or even a child)
can more easily figure out what it does, and she gets interested in
programming because of it, that's a lot better for the hacking community as a
whole.

~~~
phaylon
You're misinformed about Perl's philosophy. "There is more than one way to do
it" doesn't mean they're all good, and all well respected by the community.
There is always more than one way to do it and I have yet to see a single
production-grade programming language where that isn't the case. If it were as
you said, there wouldn't be a Modern Perl movement, there wouldn't be tons of
syntax extensions, and extensions to the core to allow better and more
powerful syntax extensions.

~~~
Cushman
I didn't intend any slight against Perl (nor an endorsement of Ruby, for that
matter), and I know I shouldn't speak on a language I don't know so well. I've
always understood the general trend in Perl to be towards economy rather than
expressiveness, but I'm totally willing to admit that may no longer be the
case.

~~~
phaylon
I'm sorry if I misinterpreted your statement. I just wanted to clarify that
"the most unreadable solution is the most elegant" is not Perl's philosophy by
far :) More the opposite.

Most of the time when you see Perl hackers celebrate some kind of complex
syntax expression, it's not the syntax they're enjoying but the principle it
expresses. You could still hide it behind a DSL, which most of the time makes
sense anyway in production code; DRY and all that. But that would also hide
the principles and implementation that were used. So, in general, the Perl
community will communicate ideas and concepts via short Perl snippets, and
Perl's freedom of syntax makes that easily possible even on Twitter or IRC.

In Perl, expressiveness was always very important. It was just done by the use
of symbols and syntax instead of words. It's always a trade-off between being
able to read an algorithm without knowing the language, and being able to
express your ideas concisely. Personally, I have more trouble with sigil-less
variables than with any implementation that uses any kind of symbol to
identify them. Others find them distracting. It's a personal choice.

If you have complex nested expressions in Python that do lots of things, you
should put them in a function and call it by name. It's the same thing in Perl.

~~~
Cushman
No, "Perl's inverse" was unnecessary snark and you're right to call me out on
it. Thanks for the explanation :)

------
papaf
I think OpenBSD has a similar culture:

[http://marc.info/?l=openbsd-cvs&m=117270339530912](http://marc.info/?l=openbsd-cvs&m=117270339530912)

~~~
16s
For those who don't click the link... it's from Theo:

Log message: the_t world_t would_t be_t a_t better_t place_t if_t some_t
people_t did_t not_t feel_t the_t need_t to_t typedef_t everything_t

------
lelele
So, instead of fixing the language, you fix yourself. Linus was right when he
basically said that whenever you are dealing with a broken language, you have
to develop broken habits too.

------
leif

        > And never _ever_ make the "pointerness" part of the type. People who 
        > write 
        >         typedef struct urb_struct * urbp_t; 
        > (or whatever the name was) should just be shot. I was _soo_ happy to see 
        > that crap get excised from the kernel USB drivers.
    

Amen. And people who use ampersand pass-by-reference in C++ ( foo(int &i) ).

~~~
jamesaguilar
> And people who use ampersand pass-by-reference in C++ ( foo(int &i) ).

Why?

~~~
leif
Because it gives you no more power than passing a pointer. If you're handling
pointers, it's better to be totally explicit about what you're doing.

I don't want to read your function call that looks like it passes a whole
object and have to check the function header to see that it's actually a
reference getting passed, I'd rather see the address taking right at the
function call.
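
To illustrate the call-site argument in plain C, where passing a pointer is
the only option; `increment` and `demo` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

/* The callee modifies its argument through an explicit pointer. */
static void increment(int *i)
{
    assert(i != NULL);
    *i += 1;
}

static int demo(void)
{
    int n = 41;
    increment(&n);   /* the & at the call site signals that n may change */
    return n;
}
```

With a C++ reference parameter the call would read `increment(n)`, and
the reader would have to look at the declaration to learn that `n` can be
modified.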

~~~
jamesaguilar
A decent point. Google uses the pointer-ness or const-reference-ness of a
parameter to signal what the parameter is going to be used for [1]. But at the
end of the day reference modification is the kind of basic language feature
that you should expect to see people using.

[1] [http://google-styleguide.googlecode.com/svn/trunk/cppguide.x...](http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml)

~~~
leif
This actually bothered me to no end at google. At the API level, it's great, I
can learn a lot about what your function is going to do just by looking at the
signature, but when reading code that uses an API I don't know, it's hell. For
me, the extra help you get by looking at a function signature really isn't
that great, when I should have the same information available in the
documentation that had damn well better accompany the function signature in
your header.

~~~
jamesaguilar
Header documentation and pointer-as-output conventions are both typically
available in the Google libraries I've used.

I really don't understand what you are complaining about. Earlier you said you
didn't like references being modified by functions to which they are
parameters. Now you are complaining about the use of pointers as parameters?
In a language like C++, it's pretty difficult to do neither.

~~~
leif
No, I'm not opposed to references being modified by functions, I'm opposed to
the "ampersand" syntax of C++. I am fully in support of just passing a pointer
and being straightforward about it.

------
nkurz
I liked this reply further into the thread: "inlines, when used
properly, are _not_ larger than not inlining"

[http://lkml.indiana.edu/hypermail/linux/kernel/0206.1/0843.h...](http://lkml.indiana.edu/hypermail/linux/kernel/0206.1/0843.html)

My usual rule is to not inline anything that calls a full fledged function,
and to only explicitly inline things that are truly on high performance paths.
Basically, trust that the compiler will inline or not inline the rest as
appropriate. But this is another interesting metric to consider.

~~~
Locke1689
_My usual rule is to not inline anything that calls a full fledged function,
and to only explicitly inline things that are truly on high performance paths.
Basically, trust that the compiler will inline or not inline the rest as
appropriate. But this is another interesting metric to consider._

That's pretty much correct. Keep in mind, though, that inline has different
consequences in a header file. There are good and bad reasons for that kind of
usage.

------
praptak
_"... and then use "counter_t" all over the place. I think that's not just
ugly, but stupid and counter-productive. It makes it much harder to do things
like "printk()" portably, for example ("should I use %u, %l or just %d?")
..."_

Just do this: printk("count: " COUNTER_T_FORMAT_SPECIFIER "things", c);

~~~
gjm11
So instead of

printk("subtotals: a %d, b %d, c %d, total %d\n", a,b,c,a+b+c);

you have

printk("subtotals: a " COUNTER_T_FORMAT_SPECIFIER ", b "
COUNTER_T_FORMAT_SPECIFIER ", c " COUNTER_T_FORMAT_SPECIFIER ", total "
COUNTER_T_FORMAT_SPECIFIER "\n", a,b,c,a+b+c);

and soon you (1) can't read your code at all because 60% of it is shouty stuff
in your calls to printk and (2) even the printk calls are awkward to read
because your eyes have to skip over all the shouty stuff to work out what the
output's actually going to look like.

(Perhaps that was your point, in which case I apologize for thinking it needed
spelling out more explicitly.)

~~~
praptak
I was trying to be sarcastic.

~~~
silentbicycle
If you're going to be sarcastic, be so heavy-handed about it that you could
sell it as a Methodology later. :) Otherwise, some people will take it at face
value.

------
kia
One more from Linus earlier in the thread

[http://lkml.indiana.edu/hypermail/linux/kernel/0206.1/0398.h...](http://lkml.indiana.edu/hypermail/linux/kernel/0206.1/0398.html)

------
paufernandez
Just imagine having Linus as your teacher in a "Programming C" class...

~~~
leif
oh, to be so lucky...

~~~
gryan
Don't be so sure that all of Linus' opinions are valid for all areas of
programming, or even in all situations within the kernel.

~~~
leif
They aren't, but it would be nice to see all his opinions laid out at once so
I could absorb the good ones.

~~~
AndyKelley
This is the perfect attitude with which to face the Internet.

------
snorkel
Mostly agree with Linus that typedef abuse is annoying except one example:
size_t is platform-dependent and not the same as unsigned int.

~~~
Someone
I also do not understand his

"We actually have real _problems_ due to this in the kernel, where people use
"off_t", and it's not easily printk'able across different architectures (we
used to have this same problem with size_t)."

I see the problem with printk, but giving up the ability to typedef
architecture-dependent types such as size_t because of it? The proper
reaction, IMO, would have been to fix printk. Given that gcc already checks
compatibility between format strings and arguments, it should not be that hard
to add a special format character (say %?), and have the compiler replace it
by a correct character for the argument passed.
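
For what it's worth, C99 did add a length modifier for size_t (`%zu`). off_t
has no standard specifier, so a common userspace workaround is to cast to
intmax_t and print with `%jd`. A minimal sketch, assuming the printf family
from a hosted C library rather than printk; the helper names are made up:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>   /* off_t (POSIX, not ISO C) */

/* Hypothetical helper: format a size_t portably using C99's %zu. */
static int format_size(char *buf, size_t len, size_t n)
{
    assert(buf != NULL);
    return snprintf(buf, len, "size=%zu", n);
}

/* Hypothetical helper: off_t has no dedicated specifier, so widen it to
   intmax_t and print that with %jd. */
static int format_offset(char *buf, size_t len, off_t off)
{
    assert(buf != NULL);
    return snprintf(buf, len, "off=%jd", (intmax_t)off);
}
```

The cast costs one extra token per argument, but the format string stays the
same on every architecture.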

------
Jach
I agree except for the part about typedef'ing pointers. If you never typedef
a pointer, then you can't ever have compile time and run time encapsulation
(better than C++ gives with "private"!) through opaque pointers...

~~~
pieter
Why not? Just don't ship the definition of your struct in your headers, only
declare them.
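
A minimal sketch of what that looks like, squeezed into one file for brevity;
the `counter` API is hypothetical, and in real code the first part would live
in the header and the rest in the .c file:

```c
#include <assert.h>
#include <stdlib.h>

/* --- what the public header would contain --- */
struct counter;                              /* declared, never defined here */
struct counter *counter_new(void);
void counter_bump(struct counter *c);
int counter_value(const struct counter *c);
void counter_free(struct counter *c);

/* --- what stays private in the .c file --- */
struct counter {
    int value;
};

struct counter *counter_new(void)
{
    return calloc(1, sizeof(struct counter));   /* value starts at 0 */
}

void counter_bump(struct counter *c)
{
    assert(c != NULL);
    c->value++;
}

int counter_value(const struct counter *c)
{
    assert(c != NULL);
    return c->value;
}

void counter_free(struct counter *c)
{
    free(c);
}
```

Callers that only see the declarations can pass `struct counter *` around but
cannot dereference it or take its size, and no typedef is involved.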

~~~
jedbrown
The API is more confusing if they always have to lug around the extra * and
type out struct if they can never dereference it. I am strongly in favor of
typedefing pointer types that are strictly private.

------
willv
Regarding something specific Linus said:

"We should also have some format for printing out "u32/u64" etc, but that's
another issue and has the problem that gcc won't understand them, so adding
new formats is _hard_ from a maintenance standpoint. "

Do they not use inttypes.h for that?
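
For reference, the userspace approach looks roughly like this. It is a minimal
sketch that assumes the printf family from a hosted C library, not printk, and
`format_counts` is a made-up name:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* PRIu32 and PRIu64 expand to string literals, which the compiler pastes
   together with the neighbouring pieces of the format string. */
static int format_counts(char *buf, size_t len, uint32_t a, uint64_t b)
{
    assert(buf != NULL);
    return snprintf(buf, len, "a=%" PRIu32 " b=%" PRIu64, a, b);
}
```

Because the macros are ordinary string literals, gcc's format checking still
works, at the cost of the broken-up format strings complained about elsewhere
in this thread.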

<http://linux.die.net/man/3/priu32>

~~~
mfukar
Linux _is_ the implementation; if anything, _it_ should provide an inttypes.h.

~~~
Locke1689
Well, no. inttypes.h is defined in the ISO C standard (since C99) and is
provided by glibc. If it were unistd.h you would be right.

~~~
cdavid
I doubt the kernel can use anything that is defined in the glibc...

~~~
adbge
You're correct. The kernel does not include a C standard library.

------
16s
I primarily use typedefs with C++ containers, as doing so allows you to switch
from a vector to a map without changing a lot of other things.

------
buster
Odd to be reminded of Transmeta and Linus working there.. that's 8 years ago
already?

