

What Every C Programmer Should Know About Undefined Behavior Part 1 (2011) - adamnemecek
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html#

======
kazinator
The light side of undefined behavior is not optimizations but extensions.
Undefined behavior is necessary so that a language can have certain kinds of
conforming extensions. The language standard cannot define every possible
program, because programs can do platform-specific things. For instance, a
program might need to map some hardware to memory and access registers.

"Undefined behavior" in the ISO C sense only means that the ISO C standard
doesn't provide requirements about how an implementation should react to a
given situation. The implementation can still provide its own requirements,
documented and all. ISO C undefined behavior can still be "GNU C defined" or
"glibc defined".

Using some unknown conversion specifier in printf is undefined behavior
according to ISO C, but the one being used could be documented by the given C
library as an extension.

Using undefined behavior to assume that X < X + 1 is always
true (and can be optimized away) is not a good example of undefined behavior
being useful or beneficial. If the programmer wrote that, it was probably with
the expectation that X + 1 _could be_ less than X due to particular overflow
semantics. Or maybe even X was once unsigned and someone changed it without
considering that expression. This deserves a warning diagnostic like "warning:
expression always true whenever its behavior is defined". And then, who cares
how it is optimized; let the programmer fix it so the diagnostic goes away.
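
As a rough sketch of what that fix could look like: if the intent behind X < X + 1 was to detect overflow, the test can be written so that no overflowing addition is ever evaluated. The helper names below are hypothetical, made up for illustration; the only real facility used is INT_MAX from <limits.h>.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true when x + 1 is not representable in an int.
   Comparing against INT_MAX never evaluates an overflowing expression,
   so the behavior is defined for every value of x. */
static bool increment_would_overflow(int x)
{
    return x == INT_MAX;
}

/* Hypothetical helper: stores x + 1 in *out and returns true,
   or returns false and leaves *out alone if x + 1 would overflow. */
static bool checked_increment(int x, int *out)
{
    if (increment_would_overflow(x))
        return false;
    *out = x + 1;
    return true;
}
```

Written this way, there is nothing for the optimizer to assume away, and the diagnostic suggested above would have nothing to complain about.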

~~~
chc
I thought this was the purpose of implementation-defined behavior rather than
undefined behavior.

~~~
dllthomas
I had the same thought. The difference may be that the standard says
implementations _must_ define "implementation defined" behavior. On
reflection, it would seem uncharacteristic for the standard to insist that
undefined behavior _never_ be defined by the implementation.

~~~
marvy
Undefined means: the implementation can choose to define it, or can choose to
leave it undefined. The standard prefers implementation defined when it's
cheap, but sometimes it's not.

------
ajarmst
I've long believed that all compliant compilers should implement all undefined
behaviour as an attempt to format the hard drive. It makes bugs much easier to
detect in testing. Bugs that ship will mask themselves from customers, and
many critical security bugs will be self-limiting and reduce the amount of
data exposed.

~~~
DSMan195276
That's not really how undefined behavior works, though. In general, undefined
behavior is used to optimize code by assuming it never happens and thus ignoring
that case. "Implementing" undefined behavior to format the hard drive would
mean that compilers would have to insert checks for every possible type of
undefined behavior whenever you do something that has an undefined case (e.g.,
every time you dereference a pointer). The slowdown would be fairly
noticeable.
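
To give a feel for those checks, here is a minimal sketch of what two of them might look like if written by hand. The function names are hypothetical, and this is not how any particular compiler instruments code. Note that the null test catches only one kind of invalid pointer; dangling or out-of-bounds pointers would need far more bookkeeping.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: signed addition with an explicit range check
   performed before the add, so the add itself never overflows. */
static bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;            /* a + b would overflow */
    *out = a + b;
    return true;
}

/* Hypothetical helper: a load guarded by a null check. This extra
   branch would be paid on every dereference. */
static bool checked_load(const int *p, int *out)
{
    if (p == NULL)
        return false;
    *out = *p;
    return true;
}
```

Every plain `a + b` or `*p` in a program would turn into a compare and branch like these, which is where the slowdown comes from.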

~~~
Dewie
I think it was a joke, so not something that should be taken literally.

------
ghshephard
It would have never occurred to me that:

    
    
       for (i = 0; i <= N; ++i) { ... }
    

has a potential bug in it depending on compiler optimizations (where
N = INT_MAX, and the result of ++i may or may not be defined, with the
compiler using the undefined behavior to assume that the above loop runs
exactly N+1 times).

~~~
kazinator
If N is INT_MAX, and the loop does not have a break statement in it anywhere,
then the ++i will evaluate on the last iteration and overflow past INT_MAX.
That is undefined; there is no may or may not about it.

Behavior which depends on compiler optimizations isn't a potential bug, it's
almost always a bug. (The one exception might be programs whose specification
is that they explore behaviors that depend on optimization!)

The compiler can help here by diagnosing the suspicious comparison i <= N, if
N is a constant known to be INT_MAX.

    
    
        warning: comparison always true
    

The programmer can shut up the diagnostic by removing the expression, if it
really is the intent that the loop guard is always true:

    
    
       for (i = 0; ; i++) {
         /* conditional break;  somewhere in the loop
            prevents i++ overflow. */
       }
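
A minimal sketch of that last form, assuming the goal is to visit every value from a lower bound up to and including an upper bound that may be INT_MAX. The function name and its parameters are made up for illustration.

```c
#include <limits.h>

/* Hypothetical helper: counts the iterations of a loop over lo..hi
   inclusive. The break is taken before i++ can run on the last pass,
   so i is never incremented past hi, even when hi == INT_MAX. */
static unsigned count_inclusive(int lo, int hi)
{
    unsigned count = 0;

    if (lo > hi)
        return 0;                /* empty range */

    for (int i = lo; ; i++) {
        count++;                 /* stand-in for the loop body */
        if (i == hi)
            break;               /* leave before the increment */
    }
    return count;
}
```

For instance, count_inclusive(INT_MAX - 2, INT_MAX) goes through the body three times and never evaluates an overflowing i++.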

