

Incorrect optimization in 1963 - StylifyYourBlog
http://arcanesentiment.blogspot.com/2015/01/incorrect-optimization-in-1963.html

======
StefanKarpinski
> Floating-point users today are accustomed (or resigned, sometimes) to
> compilers that make invalid optimizations by assuming all arithmetic is
> mathematically correct instead of rounding.

Huh? Modern compilers don't do that. They will only reorder or re-associate
operations if they know that it will produce the exact same result. Anything
else is considered a severe compiler bug. The exception is if you explicitly
ask for such sketchy optimizations with the `-ffast-math` flag, of which the
GCC man page says: "This option is not turned on by any -O option besides
-Ofast since it can result in incorrect output for programs." This option is
off by default. The rest of the article is interesting, if correct –
unfortunately, it's a little hard not to be skeptical when the first sentence
is so flagrantly wrong.

~~~
aidenn0
Plenty of modern compilers will optimize e.g. a/c + b/c to (a+b)/c when
dealing with floating point. There is nothing in the C standard which makes
that invalid.

~~~
kps
> There is nothing in the C standard which makes that invalid.

C only _permits_ ¹ optimizations that do not change semantics, so changing the
result of a floating point expression is invalid. Example 5 on p16 of C11²
makes this explicit: “ _Rearrangement for floating-point expressions is often
restricted because of limitations in precision as well as range. The
implementation cannot generally apply the mathematical associative rules for
addition or multiplication, nor the distributive rule, because of roundoff
error, even in the absence of overflow and underflow. Likewise,
implementations cannot generally replace decimal constants in order to
rearrange expressions._ ”

¹ Of course many compilers have options that invoke something other than
standard-conforming C.

² C11 final draft: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

~~~
aidenn0
Thanks for that; I've purchased the C11 standard, but haven't gotten around to
reading it yet since I'm not likely to write anything in C11 for a few years
yet.

------
Someone
Undesirable and surprising, yes, but incorrect? The manual clearly describes
the behaviour and, AFAIK, there wasn't a Fortran standard yet.

I guess the reordering was done to simplify later compiler passes, in
particular the common subexpression elimination one.

~~~
acqq
Yes. Computers at that time had as little as 4K words of RAM in the whole
machine, and the compiler had to fit in that. Anything that could be
simplified in a single pass made a huge difference.

------
kazinator
The "not grouped by parentheses" is already where things go wrong. Parentheses
should determine the syntactic parse of the expression, not the order of
evaluation. Optimization works with an abstract representation of the program
where the parenthesis tokens are long gone.

That is to say (a + b) + c should not have any influence relative to a + b +
c, at least if addition is already left associative. Because then a + b + c
_already means_ (a + b) + c. Adding the parentheses to make (a + b) + c
doesn't change the parse whatsoever.

If parentheses are needed to establish the precise meaning, it means that the
compiler writer has neglected to document, and commit to, an associativity for
the operator; the compiler writer has not specified whether a + b + c means (a
+ b) + c or a + (b + c). That is a mistake: a gratuitous ambiguity in the
syntax of the programming language.

If there is such a mistake in the design, a stray ambiguity, then you cannot
call the optimization wrong. If it is not specified whether a + b + c means (a
+ b) + c, or a + (b + c) or something else like add3_in_any_order(a, b, c),
then the compiler is in fact free to use any of the possible parses as a guide
for the operation order.

~~~
Joky
Indeed: "The ANSI Fortran standard is less restrictive than the C standard: it
requires the compiler to respect the order of evaluation specified by
parentheses, but otherwise allows the compiler to reorder expressions as it
sees fit."

~~~
kazinator
That piece of text, if taken literally and perhaps out of context, means that
if we have a * b + c, the Fortran compiler is allowed to add b + c first and
then multiply by a, because there are no parentheses. It's just a question of
whether the compiler writer "sees fit" to do such a thing.

In C, the order of operations is not ambiguous, and is distinct from the
evaluation order. In a * b + c, three operands a, b and c have to be evaluated
(the variables are reduced to the values they hold). This evaluation
takes place in an unspecified order (possibly leading to undefined behavior).
On these values, however, the arithmetic order of operations is clear:
multiplication takes precedence and so is done first.

The evaluation order makes a difference if the expressions a, b and c
themselves have side effects.

------
InfiniteRand
If you want really accurate math, the best idea is for a program to keep a
symbolic representation of an expression until it absolutely has to produce a
number, since every time the math is condensed into actual numbers there is
usually some loss of precision. It makes sense even for humans: we rarely
write 0.33333 instead of 1/3 unless circumstances force us to (or at least
you shouldn't).

------
hamburglar
They say never optimize too early. 1963 is definitely too early in my
experience.

