

Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)? (2013) - Duplicate
http://stackoverflow.com/questions/6430448/why-doesnt-gcc-optimize-aaaaaa-to-aaaaaa

======
tikhonj
Because, of course, floating point addition and multiplication are not
associative. This turns out to be _surprisingly_ easy to demonstrate:

    0.1 + (0.2 + 0.3) = 0.6
    0.1 + 0.2 + 0.3 = 0.6000000000000001

and the same for multiplication:

    0.1 * 0.2 * 0.3 = 6.000000000000001e-3
    0.1 * (0.2 * 0.3) = 6.0e-3

It actually isn't "surprising" if you understand how the format works. It
essentially uses scientific notation but in binary, with a set number of bits
for both the mantissa and the exponent as well as a few changes thrown in for
better behavior at its limits (like denormalization). This means that it can't
directly express numbers which are very easy to write in decimal form, like
0.1, just like we can't express 1/3 as a finite decimal. It's designed to
manage this as well as possible with the small number of bits at its disposal,
but we still inevitably run into these issues.
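
A quick way to see this, assuming an ordinary IEEE 754 double and a C
compiler, is to print 0.1 with more digits than its shortest representation
and to reproduce the sums above:

    #include <stdio.h>

    int main(void) {
        /* the double nearest to 0.1, shown with more digits than usual */
        printf("%.20f\n", 0.1);               /* 0.10000000000000000555 */
        printf("%.17g\n", 0.1 + 0.2 + 0.3);   /* 0.60000000000000009    */
        printf("%.17g\n", 0.1 + (0.2 + 0.3)); /* 0.59999999999999998    */
        return 0;
    }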

Of course, most programmers only have a vague idea of how floating point
numbers work. (I'm certainly among them!) It's very easy to run into problems.
And even with a better understanding of the format, it's still very difficult
to predict exactly what will happen in more complex expressions.

A really cool aside is that there are some relatively new toys we can use to
model floating point numbers in interesting ways. In particular, several SMT
solvers including Z3[1] now support a "theory of floating point" which lets us
exhaustively verify and analyze programs that use floating point numbers. I
haven't seen any applications taking advantage of this directly, but I
personally find it very exciting and will probably try using it for debugging
the next time I have to work with numeric code.

A little while back, there was an article about how you can test floating
point functions by enumerating every single 32-bit float. This is a great way
of thinking! However, people were right to point out that this does not really
scale when you have more than one float input or if you want to talk about
doubles. This is why SMT solver support for floating point numbers is so
exciting: it makes this sort of approach practical even for programs that use
lots of doubles. So you _can_ test every single double or every single pair of
doubles or more, just by being clever about how you do it.
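
That enumeration idea is easy to sketch in C: iterate over all 2^32 bit
patterns, reinterpret each as a float, and check a property. The property
below (round-tripping through sqrtf) is only a placeholder:

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        uint64_t failures = 0;
        for (uint64_t i = 0; i <= UINT32_MAX; i++) {   /* all 2^32 patterns */
            uint32_t bits = (uint32_t)i;
            float x;
            memcpy(&x, &bits, sizeof x);               /* reinterpret bits  */
            if (!(x >= 0.0f) || isinf(x)) continue;    /* skip NaN, inf, <0 */
            float r = sqrtf(x) * sqrtf(x);
            if (fabsf(r - x) > x * 1e-6f) failures++;  /* crude tolerance   */
        }
        printf("failures: %llu\n", (unsigned long long)failures);
        return 0;
    }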

I haven't tried using the floating point theories, so I have no idea how they
scale. However, I suspect they are not significantly worse than normal
bitvectors (i.e., signed/unsigned integers). And those scale really well to
larger sizes or multiple variables. Assuming the FP support scales even a
fraction as well, this should be enough to practically verify pretty non-
trivial functions!

[1]: [http://z3.codeplex.com/](http://z3.codeplex.com/)

~~~
mark-r
I've always thought it would be helpful to have floating point values that
tracked the upper and lower bounds of the possible error. That would make
analysis of the results much easier.

~~~
robzyb
But what about the accuracy of the error bounds?

~~~
pascal_cuoq
It is easy to bracket the real result of the computation with floating-point
intervals: round down when computing the lower bound, and up when computing
the upper bound of each interval.
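
A minimal sketch of that in C, using fesetround() from fenv.h to pick the
rounding direction per endpoint (how faithfully the compiler honors the mode
switch depends on its FENV_ACCESS support):

    #include <fenv.h>
    #include <stdio.h>

    #pragma STDC FENV_ACCESS ON    /* we change the rounding mode at runtime */

    typedef struct { double lo, hi; } interval;

    interval interval_add(interval a, interval b) {
        interval r;
        fesetround(FE_DOWNWARD);   /* lower bound: round toward -infinity */
        r.lo = a.lo + b.lo;
        fesetround(FE_UPWARD);     /* upper bound: round toward +infinity */
        r.hi = a.hi + b.hi;
        fesetround(FE_TONEAREST);  /* restore the default mode */
        return r;
    }

    int main(void) {
        /* the literal 0.1 is itself rounded; a real implementation would
           widen the endpoints so the interval brackets the decimal 0.1 */
        interval x = {0.1, 0.1};
        interval s = interval_add(interval_add(x, x), x);
        printf("[%.17g, %.17g]\n", s.lo, s.hi);
        return 0;
    }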

------
Khaine
Completely relevant podcast discussion [http://edgecasesshow.com/083-floating-
point-numbers-are-a-le...](http://edgecasesshow.com/083-floating-point-
numbers-are-a-leaky-abstraction.html)

~~~
mahmud
Wow! Thank you. I didn't realize there were such legit programming podcasts
around. Any other similar ones I need to know about?

~~~
mietek
Check out The Haskell Cast:

[http://www.haskellcast.com](http://www.haskellcast.com)

------
octo_t
Basically:

floating point is _hard_. Programmers get it wrong, compilers get it right.

(normally).

~~~
pascal_cuoq
In the case of C compilers, they get floating-point right by re-defining what
the language definition means:
[http://arxiv.org/abs/cs/0701192](http://arxiv.org/abs/cs/0701192)

There has been some progress since the time this report was an accurate
description of the situation, but the situation is still mostly terrible:
[http://stackoverflow.com/questions/17663780/is-there-a-
docum...](http://stackoverflow.com/questions/17663780/is-there-a-document-
describing-how-clang-handles-excess-floating-point-precision)

And soon, fused multiply-add will be common enough that compilers will start
generating the instruction by default for code that contains additions and
multiplications, causing a regression with respect to the current situation.

~~~
pm215
Using fused-multiply-add unless the user has given the compiler option to
deviate from strict IEEE754 results would be a compiler bug. I'm pretty sure
gcc gets this right -- it will use FMAC on ARM (for instance) only if you pass
it the fast-math option.

~~~
pascal_cuoq
A C99 compiler pragma, FP_CONTRACT, lets the programmer decide locally whether
using an FMA instruction for multiplication and addition is allowable.

The debate is whether the pragma should be on or off by default. I am a strong
proponent of the “off” default. Some argue for the “on” default. Even if the
“on” camp wins at some point for some compiler, you can always use the line
below to make a standards-compliant compiler that otherwise respects IEEE 754
keep multiplication and addition separate:

#pragma STDC FP_CONTRACT OFF

But the point is that you will have to use the incantation explicitly.
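
For illustration, this is roughly what using the incantation explicitly looks
like (the function names are made up, and compiler support for the pragma
varies):

    double dot2_strict(double a, double b, double c, double d) {
        #pragma STDC FP_CONTRACT OFF  /* separate, individually rounded * and + */
        return a * b + c * d;
    }

    double dot2_relaxed(double a, double b, double c, double d) {
        #pragma STDC FP_CONTRACT ON   /* the compiler may emit an FMA here */
        return a * b + c * d;
    }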

If you are an ordinary programmer who does not know about the incantation, you
will eventually encounter a situation where, say, (a * b + c == a * b + c)
evaluates to false despite the results being finite, and it will reinforce the
superstition that floating-point is black magic. And since this example is
remarkable, you will tell others about it, the example will be repeated and
distorted, and the superstition will never die.

~~~
stephencanon
I think the most compelling argument for “off” being the default is x*x - y*y.
If FP_CONTRACT defaults to “on”, most uses of this expression will become
fma(x, x, -y*y), which will result in the expression not being exactly zero
when x == y.
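
The effect can be previewed by calling fma() from math.h explicitly, which is
roughly what a contracting compiler would do to the first expression (a
sketch; compile with contraction disabled, e.g. -ffp-contract=off, so the
first line is not itself contracted):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double x = 1.0 / 3.0, y = x;
        double separate   = x * x - y * y;     /* exactly 0 when x == y        */
        double contracted = fma(x, x, -y * y); /* what contraction may produce */
        printf("separate   = %.17g\n", separate);
        printf("contracted = %.17g\n", contracted);
        return 0;
    }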

The most compelling argument for “on” being the default is that most users
don’t care, and “on” allows compiler optimizations that save energy. Numerical
consistency or independence of Russian gas[1]?

[1] Yeah, right. "Numerical consistency or 20 seconds longer playtime of
flappy bird on a charge?” is a way better argument.

------
sheetjs
Always a great read: "What Every Computer Scientist Should Know About
Floating-Point Arithmetic"

PDF rendering:
[http://www.cse.msu.edu/~cse320/Documents/FloatingPoint.pdf](http://www.cse.msu.edu/~cse320/Documents/FloatingPoint.pdf)

HTML rendering:
[http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.ht...](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)

------
TeMPOraL
One thing I always wondered about is _why_ are we using floating point
arithmetic at all, instead of fixed point math with explicitly specified
ranges (say, "here I need 20 bits for the integer part and 44 for the
fractional part")? What is the practical value of having a floating point that
would justify dealing with all that complexity and conceptual problems they
introduce?

~~~
pascal_cuoq
Have you used fixed-point arithmetic much? You need to specify the width and
the position of the dot for each intermediate variable (for instance,
multiplying two numbers that have 20 bits for the integer part may require 40
bits for the integer part of the result). Since the operands in 20.44 format
had only 64 bits of precision, it only makes sense to represent the result to
64 bits, so the format of this result should probably be 40.24.

If you don't do this, your program may still work but you are wasting space.
For instance, if the multiplication of your two 20.44 numbers does not
overflow the 20.44 format, good for you, but it means that between the two
operands, you were wasting 20 bits in total for leading zeroes that carried no
information.

Consider floating-point as a way not to have to make all these decisions at
compile-time and still use space extremely efficiently to represent as much
precision as possible.
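
A concrete sketch of that bookkeeping (the type names are hypothetical; the
128-bit intermediate uses the GCC/Clang __int128 extension): multiplying two
Q20.44 values gives a Q40.88 product, and keeping 64 of those bits lands in
Q40.24.

    #include <stdint.h>

    typedef int64_t q20_44;   /* 20 integer bits, 44 fractional bits */
    typedef int64_t q40_24;   /* 40 integer bits, 24 fractional bits */

    /* The full product has 88 fractional bits; shifting right by 64
       keeps 24 of them, so the result is in Q40.24 format. */
    q40_24 mul_q20_44(q20_44 a, q20_44 b) {
        __int128 wide = (__int128)a * (__int128)b;   /* Q40.88 intermediate */
        return (q40_24)(wide >> 64);
    }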

~~~
stephencanon
Also: when people screw up with FP, 99% of the time the answer is still vaguely
sane or makes it very obvious that something went wrong. When people screw up
with fixed point, 99% of the time the result is indistinguishable from noise.

------
pkulak
> Um... you know that a*a*a*a*a*a and (a*a*a)*(a*a*a) are not the same with
> floating point numbers, don't you?

Why would so many people upvote such a condescending comment?

~~~
kawliga
The exact same reason I stopped asking/answering questions on SO years ago -
too many condescending trolls trying to act smart.

------
willvarfar
Dup:

Previously on HN
[https://news.ycombinator.com/item?id=2684254](https://news.ycombinator.com/item?id=2684254)

~~~
octo_t
About 3 years ago...

~~~
chrisBob
3 years is a loooooong time for the internet. I am going to put a note on my
calendar to post this again in 3 years. What is the median age of an active HN
account anyway?

------
d23
Random sidenote, but I've never seen an SO answer be upvoted in realtime until
now. Didn't even know they programmed such functionality.

~~~
bashcoder
It's a pretty impressive system. If you're interested in their architecture,
here's a YouTube video worth watching: "Marco Cecconi The Architecture of
Stack Overflow - Developer Conference 2013" [0]

[0]
[https://www.youtube.com/watch?v=OGi8FT2j8hE](https://www.youtube.com/watch?v=OGi8FT2j8hE)

------
username42
The non-associativity of addition is obvious, and for multiplication I
understand why it does not always give the same answer, but I do not see why a
change of order could change the accuracy.

~~~
Ono-Sendai
The result depends on the number of multiplications, and hence roundings, that
are done.
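
In the expression from the title, the groupings also do a different number of
rounded multiplications: a*a*a*a*a*a does five, while t = a*a*a; t*t does
three. A small sketch that prints both, plus pow(a, 6) (the exact digits
depend on the platform):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double a = 1.1;                           /* not exactly representable */
        double chained = a * a * a * a * a * a;   /* five rounded multiplies   */
        double t = a * a * a;
        double grouped = t * t;                   /* three rounded multiplies  */
        printf("chained = %.17g\n", chained);
        printf("grouped = %.17g\n", grouped);
        printf("pow     = %.17g\n", pow(a, 6.0)); /* accuracy depends on libm  */
        return 0;
    }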

------
ww2
How would a functional language compiler, like GHC, deal with this case?

~~~
pascal_cuoq
It doesn't. If we forget about FPU flags, floating-point multiplication is
pure. That does not change the fact that a * (b * c) is different from
(a * b) * c.

The problem is not side-effects in multiplication. The problem is that * is
not associative.

------
pygy_
I already knew about the lack of associativity of IEEE floats, but TIL that
`pow(x,n)` is more precise than chained multiplications.

Good to know.

~~~
rbonvall
Technically it is not a matter of precision but of accuracy.

~~~
pygy_
Interesting, thanks for the correction.

I'm neither a native English speaker nor an engineer, and I didn't know about
the distinction.

[https://en.wikipedia.org/wiki/Accuracy_and_precision](https://en.wikipedia.org/wiki/Accuracy_and_precision)

------
jevinskie
There is a discussion from today and yesterday on llvm-dev that deals with
floating point guarantees and optimizations:
[http://thread.gmane.org/gmane.comp.compilers.clang.devel/358...](http://thread.gmane.org/gmane.comp.compilers.clang.devel/35858/)

------
woodchuck64
floating point in short: computer representation of scientific notation(1)
with sign bit, exponent(2) and coefficient(3) crammed in the same word.

1.
[http://en.wikipedia.org/wiki/Scientific_notation](http://en.wikipedia.org/wiki/Scientific_notation)

2. base 2, biased by 127, 8 bits (IEEE float)

3. base 2, implicit leading 1, 23 bits (IEEE float)

[http://en.wikipedia.org/wiki/Single-precision_floating-
point...](http://en.wikipedia.org/wiki/Single-precision_floating-point_format)
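
A small sketch of pulling those three fields out of an IEEE single in C
(0.1f is just an arbitrary example value):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        float f = 0.1f;
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);            /* grab the raw 32 bits   */
        uint32_t sign     = bits >> 31;            /* 1 bit                  */
        uint32_t exponent = (bits >> 23) & 0xFFu;  /* 8 bits, biased by 127  */
        uint32_t mantissa = bits & 0x7FFFFFu;      /* 23 bits, implicit 1    */
        printf("sign=%u exponent=%u (unbiased %d) mantissa=0x%06X\n",
               sign, exponent, (int)exponent - 127, mantissa);
        return 0;
    }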

------
picomancer
It actually does, if a is an integer.

