
GCC proves an uninitialized variable must be 0 and doesn't warn about it - tbodt
https://lkml.org/lkml/2019/2/25/1092
======
floatingatoll
To summarize various replies in the thread:

This is a variation of an unfixed 2004 GCC bug. Clang detects the issue when
the right warnings are enabled. Those warnings are disabled currently. A
developer is auditing and fixing all such instances found by Clang so that
they can reenable those warnings.

~~~
dgellow
What are the right warnings?

~~~
KindOne
I think -Wsometimes-uninitialized ?

[https://github.com/ClangBuiltLinux/linux/issues/381](https://github.com/ClangBuiltLinux/linux/issues/381)

------
rcdmd
Explanation here (from 2004):
[https://gcc.gnu.org/ml/gcc/2004-12/msg00681.html](https://gcc.gnu.org/ml/gcc/2004-12/msg00681.html)

It's not that GCC proves the uninitialized variable must be 0, it's just that
CCP sets it to 0 and everything happens to work out in specific cases.

~~~
DannyBee
This is not a correct interpretation (I worked on CCP and related
optimizations for years :P)

It does in fact prove that it is zero, because it is the meet of lattice
values (undefined, 0).

You can actually solve this particular case by interpreting phi nodes
differently than CCP does. CCP does not generally care about what blocks
things occur in - it doesn't have to, all definitions dominate all uses, so it
is safe to evaluate things in any order. The only thing you stand to lose is
optimality because things only go one direction on the lattice (to ensure that
it reaches a fixed point).

However, in this particular case, you can see if that if you treated the phi
nodes as the equivalent form (assignments occurring on the edge of the
previous block), and then processed the loop blocks in dominance order, it is
obvious the use is uninitialized.

It is only when looking at phi nodes as an unordered black box (as ccp does)
that you get this particular variant of the issue.

This is, of course, just a very cheap form of flow sensitivity.

You can also get the same effect by using SSI form in a lot of cases.

Most compilers do not bother doing real expensive flow sensitive analysis to
try to analyze stuff like this, because almost everything you can do comes
with its own fun source of false positives and negatives.

~~~
TheAsprngHacker
I had no idea that the meet/join operations are used in compiler analysis, and
I'm curious to find out more. Where can I find out how meet and join are
defined on C values, in the context of GCC?

~~~
DannyBee
Compiler dataflow, is at its core, lattice operations. If you read papers from
the 70's, you will find it's very direct about it.

For this specific optimization, you want [https://github.com/gcc-
mirror/gcc/blob/master/gcc/tree-ssa-c...](https://github.com/gcc-
mirror/gcc/blob/master/gcc/tree-ssa-ccp.c)

------
ncmncm
What I don't understand is why it doesn't optimize away the check on
(inode->i_nlink), since it is assuming the code in the dependent block always
runs.

It has to be assuming that, because it is the only path that doesn't lead to
undefined behavior.

~~~
firethief
The optimizer isn't _required_ to eliminate code paths encountering dynamic
UB; it is _allowed_ to. It also doesn't have to be consistent in its
"assumptions", although inconsistency may be a sign of missing efficiency
somewhere

~~~
metaphor
> _The optimizer isn 't required to eliminate code paths encountering dynamic
> UB; it is allowed to._

If I understood correctly, Linus had some colorful thoughts[1] on that:

 _If you know something is undefined, you warn about it. You don 't silently
generate random code that doesn't match the source code just because some
paper standard says you "can"._

[1]
[https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89501#c12](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89501#c12)

------
dandigangi
Random question - what are the _underscores_ that show up frequently in
Linus's responses?

~~~
icedchai
It's for emphasis. Some people use asterisks. He uses underscores.

~~~
dullroar
Note he uses both, even in the same paragraph - asterisks typically mean
"bold," whereas underscores typically mean "underlined."

~~~
saagarjha
Except in Slack, which somehow interprets this to be italics.

~~~
floatingatoll
They were always meant for emphasis, which takes on variable definitions over
time. Terminals were bad at italics due to character box layouts and so the
literary approach of emphasis through italics mutated into emphasis through
underlines. Now that we’re all free of monospaced terminal grids, it looks
like we’re drifting back to emphasis through italics.

~~~
fao_
Italics were still used though.

Like /this/.

------
wyldfire
Aside: static analysis is a great initial effort but to really get confidence
you should use techniques like sanitizers (MSan in this case) or valgrind.
(This is general advice, not always applicable to kernels).

~~~
mhh__
You should really use both, however they are only a tool: I don't have any
numbers but I imagine Sanitizers have a good detection rate, but one should
still write code defensively. There are too many horror stories, be they
lethal or expensive, to be risky about memory.

Not a kernel developer: Can you test Kernel code with a sanitizer (if that
sanitizer doesn't work in a compiled kernel) by compiling into and using the
code as blob in userspace code with mocks (Or whatever, unless it's just
utility that doesn't need anything other than an interface)?

~~~
muricula
Linux has support for kernel ASAN and I think UBSAN too. User mode linux is a
also thing and does roughly what you described, and I think you can run it
under valgrind too.

~~~
monocasa
To be fair, UML runs at such a low level (raw syscalls rather than calls to,
for instance, malloc and free) that it really limits what valgrind can tell
you.

------
pfdietz
A simple way to avoid this problem is prohibit, in that code base, local
variable declarations without initializers. If the initializer truly was
unneeded the compiler would be able to eliminate it in almost all cases.
Risking undefined behavior for hypothetical microspeedups is just dumb. If a
case occurs where leaving out the initializer is both useful and safe then the
initializer could be removed there, after deliberate review.

------
zzo38computer
It look like a bug to me, since inode->i_nlink may be zero, and then ret is
uninitialized (optimizing the final "return ret" to "return 0" should be safe
(but is not necessarily worth it, depending on the target instruction set and
ABI), although if warnings are enabled, it seem like the warning should still
be displayed).

~~~
tux3
The compiler is allowed to do that. Since it would be undefined behavior to
read the undefined variable, the compiler gets to pick whatever value it wants
for it.

So after the if, the compiler sees that ret is either 0, or an undefined
value.

It picks 0 as the undefined value since that's a good optimization, so ret is
now always 0, and (in this case) there is no warning.

~~~
kazinator
> _It picks 0 as the undefined value since that 's a good optimization._

No it isn't; it costs an extra instruction to initialize that register or
memory location to zero.

How C compilers optimize situations involving uninitialized variables is by
leaving those variables alone. They do so with the standard's blessing, which
grants them that uninitialized automatic variables are "indeterminately-
valued" and that if the program uses an indeterminate value, its behavior is
undefined. Correctly written programs that perform their own delayed
initialization by later assignment benefit from the fact that the compiler
didn't add wasteful code to initialize those objects, only to have that
compiler-generated initial value be overwritten.

> _ret is now always 0, and (in this case) there is no warning._

It's not correct for diagnostics to pertain to versions of the program that
were logically altered by the compiler, rather than to the original program.
It may be ISO-conforming, only because ISO C doesn't require those diagnostics
in the first place, let alone requiring them to be reliable.

~~~
tux3
>No it isn't; it costs an extra instruction to initialize that register or
memory location to zero.

If you're talking about the "xor eax, eax" that was emitted, then _absolutely
not_. GCC is doing the most efficient thing possible with eax here.

That xor is not inserted because GCC kindly initializes the variable for you
to save you from your mistakes, but because it _has_ to initialize eax before
returning (look, there's a function call before in the code example). And
after deciding the return value to be 0 it knows "xor eax, eax" is always
going to be faster than actually loading the real value of "ret". (In fact
it's a zero-idiom, even a NOP instruction is slower than that!)

What I was really trying to talk about is the SSA semantics and not the actual
emitted code. Now in fairness I'm not familiar with GCC's IR, but I'm going to
blithely assume what GCC does with unitialized variables somewhat ressembles
LLVM's undef semantics.

When I say that picking 0 is a good optimization, I mean that at an SSA level,
the compiler is going have a Phi between Undef and the range [0, 0]. If by
"leaving those variables alone" you mean that the compiler should have
returned Undef as the result of that Phi, that would be a horrible
miscompilation! And of course picking something else than 0 would just be
silly.

>It's not correct for diagnostics to pertain to versions of the program that
were logically altered by the compiler, rather than to the original program.

Agreed. As a user, I am not happy about this lack of diagnostics and I'm happy
to call _that_ a bug if you want. And furthermore, I don't think performing
the optimizations above is fundamentally incompatible with good diagnostics.
Probably just hard to implement for GCC.

Disclaimer: I am not actually a compiler engineer and this probably contains
mistakes.

~~~
kazinator
> _GCC is doing the most efficient thing possible with eax here._

It seems that the most efficient thing with %eax is not to mention it in an
instruction.

There may be some quirk/feature of modern Intel processors that clearing a
register will dis-entangle it from considerations of prior hazards. So that is
to say, when we execute 'xorl %eax, %eax', the processor knows that any prior
value in `%eax` is no longer required by subsequent code. A new lifetime has
begun for that register. Therefore, there is no need to execute a pipeline
stall to wait for that prior value, if it so happens that that prior value is
not yet ready. Internally to the processor, the ISA-level register `%eax` can
be assigned to a different internal register at that point, under the
discipline of "register renaming".

In the absence of any such hardware considerations, the clearing of %eax is
pure waste.

> _it has to initialize eax before returning_

Where is that requirement coming from? Not from ISO C, certainly. The behavior
is undefined. It was the programmer's job to ensure that the return value is
calculated by a well-defined expression, not that of the compiler. From that
standpoint alone, it's perfectly conforming to emit code that just leaves the
existing content in %eax and returns.

~~~
tux3
>It seems that the most efficient thing with %eax is not to mention it in an
instruction.

I think what you're missing is that UB is a property of the __execution of the
program __, and not just of the code that was compiled.

If in practice the if branch is always taken, then there is NO undefined
behavior (surprisingly, perhaps)!

That means GCC has to be prepared to handle that case, and has to set eax to 0
when the if branch is taken.

>Where is that requirement coming from? Not from ISO C, certainly. The
behavior is undefined.

Not so! You don't know until the program is run.

-

Edit: (removed some speculation about assuming the branch is always taken,
that was not relevant)

>There may be some quirk/feature of modern Intel processors that clearing a
register will dis-entangle it from considerations of prior hazards. So that is
to say, when we execute 'xorl %eax, %eax', the processor knows that any prior
value in `%eax` is no longer required by subsequent code. A new lifetime has
begun for that register.

That's absolutely right, by the way. I think this was introduced with Sandy
Bridge, in 2011.

~~~
kazinator
No, I mean the behavior is undefined in that case when the function returns
the indeterminate value, which was initialized anyway.

Look, GCC (what I have here: 7.3.0) is doing this even for the following
trivial function:

    
    
        int undef(void)
        {
          int ret;
          return ret;
        }
    

With -O2 this turns out:

    
    
      xorl %eax, %eax
      ret
    

do we still know until run-time that this has UB?

~~~
tux3
>Look, GCC (what I have here: 7.3.0) is doing this even for the following
trivial function:

You're absolutely right that _in this case_ , it's wasteful. In fact, if you
try Clang you will see that it only emits a ret.

Maybe GCC is doing you a courtesy (or maybe nobody taught it to completely
omit an Undef return value, so it just picks zero!). Either way, it's
unconditionally UB, GCC can do whatever it likes.

>do we still know until run-time that this has UB?

Well, there is no branch here, is there? I'm clearly not going to disagree
with you on this one =]

To be clear: UB is still a property of the execution of the program, but here
it's not hard to statically determine that all executions of this program are
Undefined Behavior.

~~~
kazinator
Moreover, it still does it with "-m32 -mtune=i386". :)

------
saagarjha
Unrelated, but it's interesting to see the "new" Linus: no cursing, and no
calling people morons.

~~~
sfadm
New Linus? He is a guy who sends dozens of mails every month and has cursed in
a few handpicked ones. That doesn't make him especially "cursy". I'd say he's
about average.

~~~
AdmiralAsshat
And Howard Dean was a fairly soft-spoken politician until one particularly
excited outburst at a single rally happened to have been captured on mic.

The "actual" frequency of the outbursts is somewhat secondary to the
perception of frequency. Every time Linus writes a profanity-laced rant, even
if it's email number 83/175 for the day, that particular piece is preserved,
archived, and passed around the internet for giggles, further reinforcing
whatever pre-existing conception the person referencing the email is arguing
for.

