
Dangerous Optimizations and the Loss of Causality in C and C++ (2010) [pdf] - pmarin
https://pubweb.eng.utah.edu/~cs5785/slides-f10/Dangerous+Optimizations.pdf
======
Contero
> Consider the following example:
    
    
      if (cond) {
        A[1] = X;
      } else {
        A[0] = X;
      }
    

> A total license implementation could determine that, in the absence of any
> undefined behavior, the condition cond must have value 0 or 1.

> If undefined behavior occurred somewhere in the integer arithmetic of cond,
> then cond could end up evaluating to a value other than 0 or 1

I don't even follow this very first example. Neither C nor C++ has any
requirement that the condition of an if statement evaluate to 0 or 1.

In C:

> the first substatement is executed if the expression compares unequal to 0.

In C++:

> The value of a condition that is an expression is the value of the
> expression, contextually converted to bool

Where conversion to bool is defined as:

> A zero value, null pointer value, or null member pointer value is converted
> to false; any other value is converted to true

I'm sure such a thing could occur if cond had a boolean type but contained
uninitialized data. That would be similar to the situation talked about in
this link: [https://markshroyer.com/2012/06/c-both-true-and-false/](https://markshroyer.com/2012/06/c-both-true-and-false/)

~~~
Pinus
I, too, scratched my head over that for a while, and eventually realized that
I had probably misunderstood what the authors are trying to say.

I think what they mean is that _if_ the compiler has analyzed the code that
comes before the if statement (not shown in the example) and concluded that,
unless something undefined happens, cond can only be 0 or 1, _then_ it can
optimize out the condition.

They are not saying that the code actually shown is enough to conclude that
cond must be 0 or 1.

~~~
Contero
Yeah re-reading the wording now I think you're right. It's this part that
throws me off:

> could determine that, in the absence of any undefined behavior

"could determine that" based on the code example shown

vs

"could determine that" based on static analysis performed on some preceding
code

It would have been a lot easier to wrap my head around if it were an example
where cond could be 0 or 4, or something along those lines. That would really
underscore the compiler's desire to reuse cond as the index.

------
coliveira
I don't get why people get so upset about C++ optimizations. They are by
nature optional. If you think an optimization is dangerous, just disable it.
That's why compilers have optimization levels, so you can select what you're
comfortable with.

~~~
keldaris
Exactly. It's become fashionable to decry the lengths to which C/C++ compilers
are able to go in the pursuit of performance because of "safety" issues, but
the advantage of letting the programmer fully control every knob in the chain
is immense.

~~~
vvanders
> but the advantage of letting the programmer fully control every knob in the
> chain is immense.

That's actually not completely true.

Take the restrict keyword for instance. Super painful to use in C/C++ and very
dangerous. In Rust, since the language has constraints around single mutable
references, they can turn it on globally and you get it "for free". If I recall
correctly it's going live in one of the upcoming stable releases.

Sometimes constraints can actually let you get better performance than a wild-
west of pointers everywhere.

~~~
shepmaster
> going live in one of the upcoming stable releases

If I remember correctly, it was originally enabled but had to be disabled
because LLVM had some bugs around how it was handled. These arose because
`restrict` is comparatively rarely used in C / C++ code.

[https://github.com/rust-lang/rust/issues/31681](https://github.com/rust-lang/rust/issues/31681)

~~~
steveklabnik
Turning it back on landed in master a little over a month ago, meaning it will
be in Rust 1.28, the next release.

------
petermcneeley
The simplest and best example in this presentation:

(x * 2000) / 1000 can be reduced to x * 2, but only if we make some assumptions
about x (namely, that x * 2000 cannot overflow)

The reason why this is such a problem in C languages is that most of the
time one is working in a mixed-abstraction environment. The underlying model
for many parts of the language is assembly. In order to get away from these
issues and to produce more optimal code, we should move away from mixing
models of abstraction.

~~~
andromeduck
The value of (X * 2000) / 1000 will always depend on the type. For example,
this could result in overflow if it were a fixed-point integral type, or loss
of precision if it were a floating-point type. X could be a non-associative
type with overloaded operators, like a digest/summary type.

IMO the weirdest thing to read in C++, though, is X / A * A. It's roughly the
same as X -= X % A. The values may differ if X or A are negative (historically
implementation-defined; since C99 and C++11, division truncates toward zero
and the two agree).

------
kulu2002
I had a very bad experience with the -O flag, especially in the case of ISRs.
I don't use any optimisation levels... even at -O0 GCC seems to do something.

Good practice, in my experience, is to check MC/DC and code coverage at the
unit test level.

------
eru
I guess, if you want to know what your code will be doing when executed, don't
use C (and don't even dream of C++).

Those are reserved for people who value speed over correctness.

~~~
tomnj
False. Like for any language, C and C++ programmers (should) put very high
value on correctness. Incorrect code can be written in safe languages too, but
clearly C/C++ make it much easier to kill your program spectacularly.
Discipline, knowledge, and tools are required to work in C/C++.

~~~
pjmlp
The usual statement.

First time I heard it was about 1993.

~~~
mort96
That doesn't mean it's not true, though? People have been claiming the earth is
round since 500 BC, yet it's still as true as ever.

~~~
pjmlp
In this case, the CVE database's recorded history proves otherwise.

