
The Dangers of Super Smart Compilers - ingve
http://hacksoflife.blogspot.com/2015/12/the-dangers-of-super-smart-compilers.html
======
dklsf
It's not really surprising that invalid code doesn't play well with compiler
optimizations.

The correct way to deal with this here is probably either _NOT_ using nullptr,
and instead using a special null object (similar to the end iterator); or
using pointers instead of references (because you really want pointer
behaviour).
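
A minimal sketch of the first suggestion (hypothetical names, not from the
blog's codebase): compare handles against a dedicated sentinel object, the way
you compare iterators against end(), so a reference is never formed from a
null pointer.

    
    
      struct node;
    
      struct node_handle {
          node *ptr = nullptr;
          bool operator==(const node_handle &rhs) const { return ptr == rhs.ptr; }
          bool operator!=(const node_handle &rhs) const { return ptr != rhs.ptr; }
      };
    
      // The sentinel plays the role of an end iterator: "no node".
      const node_handle null_node{};
    
      int main() {
          node_handle h;                   // refers to nothing
          return h == null_node ? 0 : 1;   // no null reference is ever created
      }
    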

Their fix is insane, and not something I'd want in production code. Unless you
carefully vet your code and compiler version, you should never rely on
undefined behaviour.

(There are some cases in the C standard that are technically undefined but
that all compiler vendors have essentially agreed to handle in the same way,
so those are fine. IIRC unions are a common example.)

~~~
Kristine1975
_IIRC unions are a common example._

If you mean type punning: That is fine in C (chapter 6.5.2.3 of the C11
Standard even mentions "type punning" in footnote 95), but not in C++, where
it's undefined behavior.

I would just use memcpy. It works even when C code is compiled as C++, and
modern compilers know what memcpy does and can optimize the call into a simple
move operation:
[http://blog.regehr.org/archives/959](http://blog.regehr.org/archives/959)

~~~
barrkel
Type punning is not fine in C99 - you need to use a correctly typed pointer or
a char pointer.

~~~
Kristine1975
You're right. Type punning via pointer cast, e.g.

    
    
      int i = ...
      float f = *(float*) &i;
    

is not fine (violates strict aliasing). Type punning via union, e.g.

    
    
      union { int i; float f; } u;
      u.i = ...
      float f = u.f;
    

is, though (but only in C).

Just use memcpy :-)
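
For completeness, a minimal sketch of the memcpy approach (it assumes int and
float have the same size, which is not guaranteed but is true on common
platforms):

    
    
      #include <stdio.h>
      #include <string.h>
    
      int main(void) {
          float f = 3.14159f;
          unsigned int bits;
          memcpy(&bits, &f, sizeof bits);  /* well-defined in both C and C++ */
          printf("0x%08x\n", bits);        /* compilers typically lower the
                                              memcpy to a plain register move */
          return 0;
      }
    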

------
rsp1984
I really like The Hacks of Life and think it's one of the best blogs on the
Internet, but this time they really got it wrong.

The title should not be _The Dangers of Super Smart Compilers_ but rather _The
Dangers of Engineers Trying to be Super Smart with C++_. Yes that's right.
Having operator== and operator!= overloads for stuff like pointers in your
code means that you're either a 0.01% Super Engineer who 100% of the time
knows what the f* he's doing and knows that it's actually necessary for the
problem at hand (overall not so likely), or it means that your code is
unnecessarily complicated and there's an easier and more elegant and more
readable solution if you take a step back and re-factor some stuff (more
likely, and yes sometimes it means taking a lot of steps back).

But please when stuff goes wrong, don't blame the compiler first.

~~~
crazy2be
What? These are overloads for smart pointers, not regular pointers, and thus
require special code to get the semantics of regular pointers that are usually
expected and desired by developers. Even the standard library unique_ptr
overloads these operators
([http://en.cppreference.com/w/cpp/memory/unique_ptr/operator_...](http://en.cppreference.com/w/cpp/memory/unique_ptr/operator_cmp)).
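
A rough sketch of what those overloads buy you (simplified; the real
unique_ptr comparisons are free-function templates over both operand types):

    
    
      #include <memory>
    
      // The smart pointer compares the way a raw pointer would: nothing is
      // dereferenced, only the stored pointer value is compared.
      bool is_empty(const std::unique_ptr<int> &p) {
          return p == nullptr;  // operator==(const unique_ptr&, nullptr_t)
      }
    
      int main() {
          std::unique_ptr<int> p;
          return is_empty(p) ? 0 : 1;
      }
    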

Of course, you could argue that smart pointers shouldn't be a thing, but I
think most C++ developers would disagree given the significant advantage of
essentially automatic ownership management, at least for tree-like structures
(cycles are more complicated, as always).

What would you propose as a simpler solution? Many languages have GC, which is
arguably the simpler solution you speak of, but would be unacceptable in C++,
since the large majority of applications it is used for require predictable
low-latency responses. Manually managing pointers is how you do things in C,
but that has its own set of problems, since it can't be checked in any way by
the compiler, so things like forgetting to free, double free, and use after
free are common, while with smart pointers these errors are impossible.
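
A minimal illustration of that last point (a sketch using unique_ptr, not code
from the article):

    
    
      #include <memory>
      #include <utility>
    
      int main() {
          auto p = std::make_unique<int>(7);  // allocation owned by p (C++14)
          auto q = std::move(p);              // ownership transferred; p is now empty
          // There is no explicit free anywhere: q's destructor releases the int
          // exactly once at the end of scope, so forgetting to free and double
          // free are ruled out by construction.
          return *q == 7 ? 0 : 1;
      }
    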

------
tines
I don't know about you guys, but as a reader unfamiliar with their codebase,
when I see

    
    
        return &*rhs == NULL;
    

I think exactly what the optimizer thought: "Well, the address of an object
can't ever be NULL, so this expression always evaluates to false."

I don't think code that does this is "semi-legit"; it's completely counter-
intuitive to abuse the expected semantics of unary `operator*` that way, even
if the compiler had no problem with it. You can't expect a new member of the
team to read this and not have fire alarms go off in their head.

As a side note, I also think that this optimization would be done by any
compiler capable of basic pointer analysis. This isn't really GCC being out-
of-the-ordinary "super-smart."

~~~
Animats
Right. C++ does not allow null references. I've seen this before from the "I
can use null references because I know what I'm doing" crowd. It's really a
problem of C++, though, that it's possible to create null references.
References were an afterthought in C++ (Stroustrup admits "this" should have
been a reference) and it shows. The abstraction is leaky.

Rust is stricter about references not being null, as in you get a compile time
error if you try.

~~~
makecheck
Invalid references are a GOOD thing. The problem is that the C++ language does
not allow sane behavior for the common case of objects going away.

The C++ language creates the impression that holding a reference is stable,
when in fact it is still possible for a reference to point to bogus memory
(same as a dangling pointer). The C++ reference starts out on the right track
with some nice compile-time guarantees like required initialization but then
messes everything up by being immutable. The memory can be deallocated, and
people with references are screwed; they _cannot possibly change their own
state_ to reflect the fact that the object is now gone. And C++ has some
inconsistent hacks, such as the fact that a "const" reference to a temporary
object will prevent object deallocation until scope exit but this is not the
case for other uses of references.

Compare that behavior to something like Objective-C's "weak" properties, where
ANY weak pointer to an object is set to 0 when the target object is destroyed.
This guarantees that the property will have one of two values: a pointer to
valid memory, or "nil"; _never_ a bogus pointer. In addition, Objective-C
ignores messages sent to "nil". While clearly this can still create hurdles
for debugging, it is fantastic for end users because the program itself is
quite stable. The language created facilities for making invalid references
_hard_, and mostly eliminated the penalty for screwing up.
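
For comparison, a rough C++ analogue of that behavior (a sketch assuming shared
ownership, not a drop-in fix for the blog's code): std::weak_ptr observes an
object without keeping it alive, and lock() yields null once the object is
gone, never a dangling pointer.

    
    
      #include <cstdio>
      #include <memory>
    
      int main() {
          std::weak_ptr<int> observer;
          {
              auto owner = std::make_shared<int>(42);
              observer = owner;
              if (auto p = observer.lock()) std::printf("alive: %d\n", *p);
          }  // owner goes out of scope here and the int is destroyed
          if (!observer.lock()) std::printf("gone: observer sees null\n");
          return 0;
      }
    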

------
lazyjones
The various C abuses don't really merit further comment, but compilers could
help avoid many similar issues by always printing warnings about undefined or
unexpected behavior (when warnings are turned on at all), not only when the
problem will actually manifest itself due to a high optimization level. That
they don't is (or was) probably due to the high CPU cost of the required
analysis steps.

~~~
mikeash
This function can invoke undefined behavior:

    
    
        int f(int x) { return x + 1; }
    

So can this one:

    
    
        int get(struct Something *s) { return s->field; }
    

Figuring out when undefined behavior needs a warning and when it's unnecessary
because it's known by the programmer to be impossible is a Hard Problem.
Especially since the compiler has no idea whether the programmer wrote f()
knowing that they'll never pass in INT_MAX, or whether the programmer actually
does expect to pass in INT_MAX and expects to receive INT_MIN as the result.

~~~
Spivak
In practice, does anyone really assume that their system doesn't have two's
complement overflow behavior?

Hell, during the intro to C course I took at university, they told me I should
actually detect overflow by seeing if the signs change.

~~~
mikeash
I sure hope a lot of people assume their system doesn't have two's complement
overflow behavior, because that is absolutely not guaranteed to be the case,
and there are real-world C compilers which will not exhibit that behavior in
all circumstances.

For example, since overflow is undefined, the compiler can assume that an
expression like x + 1 > x is always true. Testing with the version of clang
that comes with Xcode 7.2, it does this when compiling with -O3.

(Note that this is only true of _signed_ overflow. Unsigned overflow is
defined behavior, and all unsigned arithmetic is done modulo 2^n where n is
the number of bits in the type.)

The underlying machine is, of course, two's complement. But nothing says the
compiler must compile code to match how the underlying machine behaves.
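
A quick way to see this for yourself (a sketch; the exact outcome depends on
the compiler and flags):

    
    
      #include <limits.h>
      #include <stdio.h>
    
      /* Because signed overflow is undefined, an optimizer may fold this whole
         function to "return 1" without ever performing the addition. */
      int never_overflows(int x) {
          return x + 1 > x;
      }
    
      int main(void) {
          /* Typically prints 0 at -O0 (wraparound to INT_MIN on two's
             complement hardware) but 1 at -O2/-O3. */
          printf("%d\n", never_overflows(INT_MAX));
          return 0;
      }
    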

Unfortunately, whoever taught your intro to C course apparently didn't
understand C. This is a frighteningly common occurrence, of course. Most
people seem to learn C in a very inconsistent way, and imagine a lot of things
about it rather than learning how the language actually works.

------
buster
To me, depending on undefined behavior sounds more like a bug and unclean code
than a fault of the compiler. Just because C gives you the power to shoot
yourself in the foot doesn't make it the compiler's error if you do.

------
to3m
I claim this is all highly unhelpful. Both the C and C++ standards explicitly
permit an implementation to do something sensible in the face of undefined
behaviour, the sort of thing that a regular user of the system in question
might expect. It's a bit unclear to me why compiler authors don't just do
that. Undefined means exactly what it says, nothing more, but people seem to
love this idea that it has to be some kind of here-be-dragons nightmare
scenario.

Mandatory reading: [http://blog.metaobject.com/2014/04/cc-
osmartass.html](http://blog.metaobject.com/2014/04/cc-osmartass.html),
[http://robertoconcerto.blogspot.co.uk/2010/10/strict-
aliasin...](http://robertoconcerto.blogspot.co.uk/2010/10/strict-
aliasing.html)

~~~
mikeash
Doing "something sensible" sounds good, but it's hard to actually do. You can
point out the sensible approach to specific examples, but writing the
optimizer to take a "sensible approach" in all cases is hard.

For every example like this, where the programmer wants a NULL check to be
done even when the value "can't be NULL," there's an opposite example where
the programmer wants a NULL check to be elided because the value actually
can't be NULL.

Taking advantage of undefined behavior isn't done because compiler writers are
sadists who enjoy writing perverse implementations which rely on the letter of
the law. It's done because it allows the compiler to generate better, faster
code.

I think a lot of the trouble is that too many programmers learn C in a
completely ad-hoc, informal fashion and never take the time to actually become
familiar with the fundamentals of the language. People simply don't know how
the language is specified, what operations are legal and what operations
aren't, why there are illegal operations that the compiler won't (and can't)
warn you about, etc.

C is definitely a dangerous language and is a difficult language to use well.
But that's all the more reason people who plan to use it need to ensure they
understand it.

More mandatory reading: [http://blog.llvm.org/2011/05/what-every-c-programmer-
should-...](http://blog.llvm.org/2011/05/what-every-c-programmer-should-
know.html)

~~~
andrewflnr
> It's done because it allows the compiler to generate better, faster code.

Why? Why is this OK? You say doing "something reasonable" is hard, but it
obviously doesn't include entirely ignoring code I wrote. That's basically
saying "I see you did something weird there, but I'm going to assume the rest
of your code and all the libraries you link to are perfectly standard-
compliant, which lets me generate faster code here. I'm so smart!"

Default safe, not fast, wherever possible.

~~~
mikeash
Sure it does. A big part of optimization is specializing code to be fast by
removing unimportant bits.

Here's a quick bad example:

    
    
        int computation(int *a, int *b) {
            if(a == NULL && b == NULL) return 0;
            if(a == NULL) return *b;
            if(b == NULL) return *a;
            return ...something with *a and *b...;
        }
    
        ...elsewhere...
        int a[1024];
        int b[1024];
        ...fetch data into a and b...
        int c[1024];
        for(int i = 0; i < 1024; i++)
            c[i] = computation(a + i, b + i);
    

Would you want this code, compiled with full optimizations, to run slowly
because it does four NULL checks every time through the loop? A good compiler
will probably inline the computation and eliminate all the checks because the
condition they're checking for can never happen. Is that a bad thing?
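
Roughly the shape the loop can take once computation() is inlined and the
never-true checks are removed (a sketch of the idea, not actual compiler
output, with the "...something..." body stood in for by an addition):

    
    
        for(int i = 0; i < 1024; i++)
            c[i] = a[i] + b[i];  /* the four NULL checks are gone from the loop */
    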

~~~
nkurz
This is a good example, and I think it shows persuasively that it's good for a
compiler to make logical assumptions about inputs when inlining code. But I
think this is different from making optimizations by presuming the absence of
undefined behavior when compiling the function itself.

Consider a programmer who feels it's clearer to dereference a and b once,
rather than writing *a and *b throughout the complicated math after the
preliminary checks. And because they don't want to rely on C99 or distract by
opening a new block, they put the dereferences at the top of the function. Oh,
and the new guy decided to add the "missile launch" code here.

    
    
        int computation(int *ptr_a, int *ptr_b) {
            int val_a = *ptr_a, val_b = *ptr_b;
            if(ptr_a == NULL && ptr_b == NULL) return 0;
            if(ptr_a == NULL) return val_b;
            if(ptr_b == NULL) return val_a;
            launch_missiles(); /* OK to put this here? */
            return ...something with val_a and val_b...;
        }
    
    

I think there is a reasonable argument that there is a difference between
optimizing when a code path can be shown to never occur because of constrained
inputs, and optimizing because behavior supposedly can be inferred from the
absence of undefined behavior.

I'm fine with a literalist compiler that produces code that crashes when it
immediately tries to dereference a NULL pointer, and with one that produces
code that does the safety checks but defers the load, but even if it's allowed
by the spec[1] I don't think it's good behavior for a compiler to silently
omit the safety checks and launch the missiles when passed a NULL.

[1] Is this undefined behavior according to the spec? I consider myself an
expert C programmer, and I think it's undefined in C99, but I wouldn't bet
that I've correctly understood the issues for the other dialects of C,
much less C++. Is a conforming compiler "allowed" to skip the safety checks and
launch the missiles? I don't know, and thus defensively wouldn't write the
code like this, but it would be nice to think that long-standing mission-
critical code written by others won't suddenly stop working with a new release
of the optimizing compiler.

~~~
mikeash
I agree that's a pretty good example the other way.

I'm pretty sure your snippet invokes nasal demons in all dialects.
Dereferencing NULL is a pretty classic case where anything can happen once you
do it. "Calling launch_missiles even though you don't want it to happen" is a
subset of "anything," of course.

Consider what happens if you rearrange that example a bit, though. Let's say
it's more like this:

    
    
        int computation(int *ptr_a, int *ptr_b) {
            if(ptr_a == NULL && ptr_b == NULL) return 0;
            if(ptr_a == NULL) return *ptr_b;
            if(ptr_b == NULL) return *ptr_a;
            launch_missiles(); /* OK to put this here? */
            return ...something with *ptr_a and *ptr_b...;
        }
    
        int otherComputation(int *ptr_a, int *ptr_b) {
            int val_a = *ptr_a, val_b = *ptr_b;
            int intermediate = computation(ptr_a, ptr_b);
            return ...something with val_a, val_b, and intermediate....;
        }
    

Then you call otherComputation in a tight loop. Should the compiler elide the
NULL checks here, if computation gets inlined? otherComputation clearly
requires that its parameters are non-NULL, so it should be fine to remove the
checks. Failing to do so would be pretty annoying. It's clear that the inputs
are constrained, and while you could fix it by factoring out the common code
so that the NULL checks are explicitly optional, it would be best if the
optimizer can handle it and not require us to rearrange our code to make it
happy.

That's really what I'm getting at. There are clear examples where it's bad to
assume undefined behavior cannot happen and optimize away checks. But there
are also clear examples where it's exactly what you want to do. Can the
optimizer reliably distinguish between these cases?

I think the best answer is to have tools available like the undefined behavior
sanitizer which can catch these problems at runtime. You want the optimizer to
make your code as fast as it legally can. Trying to tone it down is going to
be error-prone, and safety checks that can't be counted on are dangerous. You
might end up just moving the problem around: instead of code suddenly being
optimized out because the compiler proved it can't run without invoking
undefined behavior, you'll have code suddenly being optimized out because
something changed that made the compiler think you intended it. By checking at
runtime, you recast the problem from "does the programmer intend to pass in
values which will result in undefined behavior?" to "does this code _actually
get called_ with such values?" That's way easier to check, should have zero
false positives, and should have as few false negatives as possible with good
test coverage.

~~~
nkurz
Your otherComputation() example doesn't strike me as a clear-cut case. My
preference (and I agree that this is not required by the spec) would be that
the checks be omitted only if the dereference has actually been executed, and
not merely because the initializing dereference appears in the source.
Essentially, I think I'd like dereferencing a NULL pointer to be defined
behavior as far as the generated code is concerned (the code must behave
identically to how it would behave if the load was issued) while possibly
undefined for the processor (crashing is valid, but returning a constant would
be valid too). The compiler is allowed to reduce many checks to one by
assuming consistency, but not allowed to ignore the checks altogether.

 _I'm pretty sure your snippet invokes nasal demons in all dialects.
Dereferencing NULL is a pretty classic case where anything can happen once you
do it._

Is the dereferencing itself undefined in all dialects, or is it only undefined
if one later "evaluates" the dereferenced value? Or do some define evaluation
differently? For example (to my surprise) a comment by 'lkasndasd' on the
corresponding Reddit thread points out that &*ptr is never undefined behavior
in C, since the address operator and the dereference operator cancel out
(explicitly) even if ptr is NULL:

    
    
      6.5.3.2 Address and indirection operators, Semantics, paragraph 3:
      The unary & operator yields the address of its operand. 
      [...] If the operand is the result of a unary * operator, 
      neither that operator nor the & operator is evaluated and 
      the result is as if both were omitted, except that the 
      constraints on the operators still apply and the result is 
      not an lvalue.
      and footnote 102:
      Thus, &*E is equivalent to E (even if E is a null pointer) [...]
    

[https://www.reddit.com/r/programming/comments/3xk6zj/the_dan...](https://www.reddit.com/r/programming/comments/3xk6zj/the_dangers_of_super_smart_compilers/cy5gi1t)

[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf](http://www.open-
std.org/jtc1/sc22/wg14/www/docs/n1570.pdf)

~~~
mikeash
That quote is interesting, but the fact that they have to explicitly call it
out implies to me that the mere act of writing the dereference is undefined
behavior for NULL in all other cases.

I'm pretty sure the dereference is immediately UB (other than the case you
quote here). Consider that when optimizations are off, that line will probably
immediately execute a load instruction, generating a segfault.

~~~
nkurz
I thought it would be interesting to see how some different compilers actually
behave here. This was my test program:

    
    
      nate@haswell:~/tmp$ cat deref.c
      /* gcc -fno-inline -O3 -g -Wall deref.c -o deref */
      /* (or as desired to show particular outcome) */
    
      #include <stdio.h>
      #include <stdlib.h>
    
      void deref_and_print(int *ptr) {
        int i = *ptr;  /* undefined behavior if ptr is NULL */
        if (ptr == NULL) printf("NULL\n");
        else printf("%d\n", i);
      }
    
      int main(int argc, char **argv) {
        int *ptr = NULL;
        if (argc <= 1) {
          printf("Usage: %s [arg]\n", argv[0]);
          printf(" Demonstrates undefined behavior if 'arg'");
          printf(" is not a positive integer\n");
          exit(1);
        }
        int i = atoi(argv[1]);
        if (i > 0) ptr = &i;
        deref_and_print(ptr);
        exit(0);
      }
    

As you guessed, icc, gcc, and clang all crash with a segfault for 'deref 0'
when compiled with -O0. But more interestingly, when compiled with -O1, -O2,
or -O3, clang prints "NULL", while icc and gcc continue to crash.

~~~
mikeash
Nice test. Surprising results for clang. I take it that it doesn't crash for
you when it prints NULL? That's the case for me. I'd have thought it would
either remove the NULL check and crash, or leave it in but continue to
dereference NULL, and thus crash.

~~~
saurik
The code when it doesn't crash has been reorganized by the compiler so the
indirection doesn't even happen until it is preparing the arguments for the
printf, which is in an else after the NULL check. If you don't need the value
of a computation in one of two sides of an if statement and you know there are
no legitimate side effects (I say "legitimate", as crashing does not count) to
the calculation, it makes sense to delay the execution of that logic until and
unless you actually need it done. I therefore don't find either result
surprising here: it is just a demonstration of the different optimization
decisions and abilities of the two compilers.

------
fizixer
Interactive optimizers?

Take what the author wrote as the "dialog and thinking" of the optimizer and
turn it into prompts.

There should be at least some (aggressive) optimizations that require the
user's permission. And if the user gives permission once or twice (or
whatever), then it happens automatically.

And since we're talking about LLVM/Clang, the project that is supposed to take
the compilation process to the "next level", why not?

------
Too
He doesn't mention which type of smart handle it is, but shouldn't null in a
shared_ptr be compared after calling get(), or simply via the bool conversion
operator?
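
For reference, a minimal sketch of the two comparisons being suggested
(assuming std::shared_ptr; neither one ever forms a reference from a null
pointer):

    
    
      #include <memory>
    
      bool null_via_get(const std::shared_ptr<int> &p)  { return p.get() == nullptr; }
      bool null_via_bool(const std::shared_ptr<int> &p) { return !p; }  // explicit operator bool
    
      int main() {
          std::shared_ptr<int> p;
          return (null_via_get(p) && null_via_bool(p)) ? 0 : 1;
      }
    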

------
sklogic
It was LLVM, not Clang. And it is a really great idea to treat UB as "it never
happened". Just do not rely on UB and always use static analysis.

~~~
Kristine1975
And turn on the Undefined Behavior Sanitizer
([http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html](http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html))
with

    
    
      -fsanitize=undefined
    

It would have caught this bug at run-time.
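
For example, building nkurz's deref.c test from earlier in the thread with the
sanitizer enabled (a sketch; the exact diagnostic text varies by compiler
version):

    
    
      clang -O3 -g -fsanitize=undefined deref.c -o deref
      ./deref 0   # the NULL dereference is reported at run time instead of
                  # being silently optimized around
    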

------
nothrabannosir
How does this not segfault on -O0 if the pointer is NULL?

~~~
jzwinck
Conceptually, dereferencing a pointer and then taking the address of the
result is a no-op: you get back what you put in. There's no _requirement_ for
dereferencing a null pointer to segfault, only an expectation of many modern
commodity-system programmers. Some systems allow you to dereference NULL and
just return a bunch of zeros there. It's allowed by C and C++.

The real question is, how does anyone compile without "-Wall -Werror" in 2015?
Ignoring your compiler when it tells you your code is wrong is negligence.

~~~
gizmo686
My understanding of the C standard is that it explicitly does not allow you to
dereference NULL, even if you really want to know what is at memory address 0.
Of course this behaviour is only undefined, so an implementation is free to
let you do it if it wants to.

~~~
Kristine1975
Minor nitpick: A NULL pointer need not point to memory address 0 (e.g. it
could point to the highest possible address, or have a special "this is a NULL
pointer" bit set). The C11 Standard does a little dance with "null pointer"
and "null pointer constant" there, c.f. chapter 6.3.2.3:

 _An integer constant expression with the value 0 [...] is called a null
pointer constant. If a null pointer constant is converted to a pointer type,
the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function._

