
Linus on compiler warnings and code reviews - lawl
https://lkml.org/lkml/2015/9/3/428
======
kybernetyk
I tend to treat warnings as errors. (Even set up my IDE to do so).

In the short run it maybe is annoying but in the long run it really pays off.
(Have been burned by a huge pile of crap code base with hundreds of warnings +
64bit transition).

~~~
MereInterest
Agreed. New code shouldn't be committed unless it compiles without error. The
build server should then compile with -Werror, with all relevant shame being
healed on the developer who breaks the build by letting a warning through.

~~~
ryandrake
Sadly, you folks seem to be the minority out there. I can't even remember how
many times I've had conversations along the lines of:

Me: Hey, that code generates 3 compiler warnings, and you shouldn't be casting
that pointer here.

Them: Oh, ok, whatever, change it if you want.

Me: It's not about what I want.. the code isn't correct. It'll break on some
64-bit platforms, for instance.

Them: Well, it works on our target platform, and that's the one in the spec.
If the spec changes, we fix it as a change request.

Me: But it just happens to work! Even a different compiler could result in
different behavior. Can't we just put in a little effort up front to avoid
trouble later?

Them: Go for it if it bothers you so much. We have real bugs to fix. Nobody
cares about hypothetical problems on systems we don't support and compilers we
don't use.

Me: [This is going to drive me to alcoholism]

~~~
blazespin
The biggest issue with lettings warnings go is that they pile up and you start
missing the important ones that point to real bugs. I have seen that happen
many many times.

------
Animats
Biggest flaw in the design of C: no real array parameters. Source of most
buffer overflow vulnerabilities for thirty years. It's a fixable problem [1]
but it's not going to get fixed. We need to move forward to Rust, or something
like it, and get rid of C for anything for which security matters.

[1]
[http://animats.com/papers/languages/safearraysforc43.pdf](http://animats.com/papers/languages/safearraysforc43.pdf)

~~~
kybernetyk
>We need to move forward to Rust

That's the second time in two days I've come to read offensive ranting about a
language + "let's move to Rust" as the proposed silver bullet solution here on
HN.

If this is the way the Rust community tries to evangelize their language I'm
inclined to keep myself as far away as possible from it.

Now don't get me wrong: I think Rust is fine. But that "C/C++ is dumb and
everyone who continues to use it is also dumb and/or ignorant" sentiment won't
win you any friends. There are valid reasons why people keep using C++ and
can't switch to any other language in the near future.

~~~
klodolph
I don't see how this is offensive. It is well established that most
programmers will produce defects at a relatively high rate when writing C or
C++. Rust "or something like it" does effectively eliminate certain categories
of defects. Java evangelists have been claiming the same thing since the late
90s, and they were right... buffer overflows in Java programs have most
commonly occurred in parts of the Java library written in C, and Java has
given us productivity + quality gains in various fields (mostly server-side
web applications).

Nobody's saying that C/C++ or its users are dumb and/or ignorant. The cost of
switching languages is high, and Rust is not very mature.

~~~
ArkyBeagle
Using a language to solve a practices problem is the same with any other
attempt to use technology to correct behavior. It's not completely
unjustified, but don't expect good results.

If you have to write 'C', learn 'C'. I'd say the same of C++, but by the time
you learn it, it's become something different :)

~~~
klodolph
Array bounds errors are a common source of problems in C. Fixing these errors
with behavioral changes is expensive. It increases the amount of training and
increases the cognitive load on programmers, and yet errors will still slip by
because we are imperfect. On the other hand, if I add bounds checking to my
programming language, then I've just transformed serious vulnerabilities into
simple runtime errors. Computers are easier to reprogram than brains are.

~~~
sangnoir
How about "taking heed of compiler warnings" as suggested by Linus? it's
niether expensive nor mentally taxing

~~~
nickpsecurity
How about eliminating the need for them in the common case like other systems
languages did? Why should you have to worry every time you do a common
operation if the alternative has acceptable performance and is actually more
productive?

And I'll add that most of the older, safe languages let you turn off safety
for performance if you have to. You just then have to use you brain like you
would with C or whatever. Just for that module, though, along with calls into
it.

------
meesterdude
I often feel a bit guilty when I spent time looking through something instead
of just giving it a quick glance; but knowing that Linus has to fight the same
fight really gives me confidence that it's better to know whats going on than
to just ship code out the door.

------
Someone1234
That was actually darn polite for Linus.

~~~
fit2rule
Yes, I was going to say something along those lines too .. seems our
Benevolent Leader has gotten a bit soft. Frankly, the use of array function
arguments in code like this would set me off into a minor rage .. I'm
impressed at Linus' restraint this time around.

Whats interesting though is that we're still dealing with these kinds of
issues, even now in the 21st Century, from C programmers who should know
better. Seems that the more things change the more they stay the same ..

~~~
exelius
No, the C programmers who should know better have retired. Everyone else is
just trying to patch code written 10-15 years ago while modifying as little of
it as possible...

~~~
pjmlp
We still have UNIX so it won't change.

------
vmorgulis
Thorvalds still runs Fedora (22) and the GCC warning is a new one coming from
clang.

[https://lkml.org/lkml/2015/9/3/494](https://lkml.org/lkml/2015/9/3/494)

[https://lkml.org/lkml/2015/9/3/499](https://lkml.org/lkml/2015/9/3/499)

------
zwp
"I tried - and failed - to come up with a reasonable grep pattern"

Is there a nice, light, better-than-betterthangrep[1] static analysis tool
that would help with this sort of question? (I suspect the decay to pointer
would, for example, elide this detail from llvm's IR?).

[1] [http://beyondgrep.com/](http://beyondgrep.com/)

~~~
nitrogen
There is/was a refactoring DSL for parsing and modifying C code accurately,
but I don't recall the name. If I remember correctly, it was produced by a
French organization or university.

~~~
krajaratnam
You're probably thinking of
[http://coccinelle.lip6.fr/sp.php](http://coccinelle.lip6.fr/sp.php)?

~~~
nitrogen
Yes, I think that's right.

------
anthay
I've made this mistake in the past. In C++ I used a reference to fix it like
so:

    
    
        #include <cassert>
    
        typedef char array_xyz_type[42];
    
        void test(array_xyz_type & a, array_xyz_type b, char c[42], char (&d)[42])
        {
            assert(sizeof(a) == 42);
            assert(sizeof(b) == 4);
            assert(sizeof(c) == 4);
            assert(sizeof(d) == 42);
        }
    
        int main()
        {
            array_xyz_type x = {1,2,3};
            test(x, x, x, x);
        }
    

I'd add these comments:

1\. The reference makes the code somewhat less flexible in that you can't call
the function with anything other than a char[42] type, e.g. test(new
char[42]...) won't compile. (This may or may not be what you want.)

2\. As Linus said, don't ignore compiler warnings.

3\. Write some code, then watch every line execute in the debugger so you see
if what you wrote is what you thought you wrote. Repeat.

------
pcvarmint
C does have pointers to arrays:

    
    
      static bool rate_control_cap_mask(struct ieee80211_sub_if_data *sdata,
                                       struct ieee80211_supported_band *sband,
                                       struct ieee80211_sta *sta, u32 *mask,
                                       u8 (*mcs_mask)[IEEE80211_HT_MCS_MASK_LEN])
    
       for (i = 0; i < sizeof(*mcs_mask)/sizeof(**mcs_mask); i++) {
         // Use (*mcs_mask)[i]  or  mcs_mask[0][i]
      }

------
acconsta
Does anyone know _why_ array arguments decay to pointers?

~~~
syncsynchalt
Because that's how the machine sees C arrays. It would take an extra
(invisible) parameter on the stack to implicitly pass the array size into the
function, a thing which is very much against the spirit of C.

The only real surprises/magic in C are things like:

    
    
      - addition/subtraction on typed pointers has an implicit multiply by sizeof(type)
      - type conversion between floats and ints

~~~
maximilianburke
> Because that's how the machine sees C arrays. It would take an extra
> (invisible) parameter on the stack to implicitly pass the array size into
> the function, a thing which is very much against the spirit of C.

It would only require a separate parameter if you needed to pass arbitrary-
sized arrays. In this example, if the prototype contains the array size:

    
    
        int foo(char array[30]);
    

It would not be difficult for the compiler to only allow arrays of that length
to be passed as parameters, and no invisible parameters are necessary,
assuming also that the function implementation made the same assumption of
array length.

Developers who handle arrays of arbitrary length tend to already encode a
length parameter, it's this case where it looks to the developer like they are
encoding a length value that is the deceptive/problematic case.

~~~
mortehu
This is one of the many uses of the static keyword in C99:

    
    
      void ten(char foo[static 10]) { }
    
      void generates_warning() {
        char foo[3];
        ten(foo);
      }
    

This causes clang to emit the following warning:

> warning: array argument is too small; contains 3 elements, callee requires
> at least 10

~~~
merlincorey
The warning is true - it is "at least" 10:

    
    
        char ten(char foo[static 10]);
    
        char ten(char foo[static 10]) { return foo[1]; }
    
        void generates_no_warning(void);
    
        void generates_no_warning() {
            char foo[30];
            ten(foo);
        }
    
        int main() {
            generates_no_warning();
        }
    

Modified also so it would build cleanly with -Weverything -Werror

~~~
uxcn
It avoids stepping on memory elsewhere in the program though. My personal
preference for C99 and above is to pass the length as a parameter.

    
    
        int do_stuff(size_t len, int64_t arr[len]);
    

It still suffers from the pointer decay problem, but it at least forces
whoever is reading/writing the code to acknowledge the length.

------
zamalek
In the age and day of static analysis, someone actually took the time to
question the rudimentary analysis the compiler does?

It boggles the mind. I'm with Linus on this one.

------
chengiz
What is the warning?

~~~
myth_buster
It's in the subsequent response [0] from Linus.

[0] [https://lkml.org/lkml/2015/9/3/494](https://lkml.org/lkml/2015/9/3/494)

