
Mean of two floating point numbers can be dangerous - Scea91
http://blog.honzabrabec.cz/2016/06/05/mean-of-two-floating-point-numbers-can-be-dangerous/
======
laughinghan
This behavior can actually be turned around and exploited to do some
calculations to full precision, using algorithms that _look_ like they would
infinite loop: [http://www.shapeoperator.com/2014/02/22/bisecting-floats/](http://www.shapeoperator.com/2014/02/22/bisecting-floats/)

~~~
Scea91
Thanks, it is interesting. I have encountered something related too:
[http://people.eecs.berkeley.edu/~jrs/papers/robust-predicate...](http://people.eecs.berkeley.edu/~jrs/papers/robust-predicates.pdf)

------
Confusion
Related: How do you compute the midpoint of an interval? [1]. PDF [2]

[1] [http://dl.acm.org/citation.cfm?id=2493882](http://dl.acm.org/citation.cfm?id=2493882)

[2] [https://hal.archives-ouvertes.fr/hal-00576641v1/document](https://hal.archives-ouvertes.fr/hal-00576641v1/document)

~~~
akkartik
This really should be the top comment of the thread. I've been reading the
paper slowly since yesterday.

Tl;dr - the best way to compute mid-points is:

    
    
        round_to_nearest_even((a - a/2) + b/2)
    

Why?

It can't be (a+b)/2 because (a+b) can overflow.

It can't be (a/2 + b/2) because a/2 or b/2 can underflow. However, this works
if you explicitly handle the case when a == b.

It can't be (a + (b/2 - a/2)) for similar but more complicated reasons.

However, if you switch things around, ((a - a/2) + b/2) works great without
the special case. (Though all three options need to handle a second special
case: when a == -b.)

Regarding the rounding method, you want it to be symmetric, and moreover you
want nearby numbers not to all be biased in the same direction (as might
happen if you always round towards zero). Hence rounding to the nearest 'even'
floating point number (i.e. the nearest bit pattern with a zero in the LSB).
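
To see the failure modes concretely, here's a quick demonstration of my own
(not from the paper):

    
    
      #include <cstdio>
      #include <limits>
    
      int main() {
        double huge = std::numeric_limits<double>::max();
        double tiny = std::numeric_limits<double>::denorm_min();  // smallest subnormal
    
        // (a+b)/2 overflows when a and b are near the maximum.
        std::printf("%g\n", (huge + huge) / 2);            // inf
        // a/2 + b/2 underflows to 0 for the smallest subnormal.
        std::printf("%g\n", tiny / 2 + tiny / 2);          // 0, not 4.94066e-324
        // (a - a/2) + b/2 handles both cases.
        std::printf("%g\n", (huge - huge / 2) + huge / 2); // 1.79769e+308
        std::printf("%g\n", (tiny - tiny / 2) + tiny / 2); // 4.94066e-324
        return 0;
      }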

\---

In the case of OP, I think the paper would argue that the mid-point was
working just fine, and that the code surrounding it isn't considering all
possibilities (symmetrically). But then the discussion of infinities in the
paper suggests that no matter what you do, there are issues for surrounding
code to consider. Which in turn suggests that floating point is absolutely
busted as an abstraction.

------
Someone
_" and on my hardware it caused the splitValue to be equal to point2"_

The fix is the right one, but in C/C++ it is not the hardware that dictates
the rounding mode; the language implementation does, and it can be changed at
runtime
([http://www.cplusplus.com/reference/cfenv/fesetround/](http://www.cplusplus.com/reference/cfenv/fesetround/))

I think C/C++ default to 'round to nearest', with ties going to 'even'. GNU
libc certainly does
([http://www.gnu.org/software/libc/manual/html_node/Rounding.h...](http://www.gnu.org/software/libc/manual/html_node/Rounding.html))
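
As an illustration, a minimal C++ sketch of switching the mode at runtime (the
volatile keeps the optimizer from constant-folding the divisions; strictly
speaking you'd also want '#pragma STDC FENV_ACCESS ON', which not every
compiler honors):

    
    
      #include <cfenv>
      #include <cstdio>
    
      int main() {
        volatile double three = 3.0;  // defeat constant folding
    
        std::fesetround(FE_DOWNWARD);
        double down = 1.0 / three;
        std::fesetround(FE_UPWARD);
        double up = 1.0 / three;
        std::fesetround(FE_TONEAREST);  // restore the default
    
        std::printf("%.17g\n%.17g\n", down, up);  // differ in the last bit
        return 0;
      }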

------
rwmj
I had a fun floating-point related bug the other day: I wanted to sort a list
of doubles in C, and my comparison function was in essence:

    
    
        static int compare (const void *dpa, const void *dpb)
        {
          return *(const double*)dpa - *(const double*)dpb;
        }
    

The doubles represented times in nanoseconds, and the program failed to sort
properly when the total runtime was larger than (IIRC) 2 seconds. I wonder if
you can work out why.

(Spoiler alert) the fix was:
[https://github.com/libguestfs/libguestfs/commit/1f4a0bd90df3...](https://github.com/libguestfs/libguestfs/commit/1f4a0bd90df35df56eb86c2386d801b51828cb22)

~~~
CamperBob2
Your fix requires a lot of mental CPU cycles on the part of whoever reads the
code. Are you sure it wouldn't have been better to do it the simple, obvious
way?

    
    
       double a = *(double *) dpa;   
       double b = *(double *) dpb;
    
       if (a > b) return 1;
   if (a < b) return -1;
       return 0;

~~~
rwmj
Yup I think you are right -- the (fixed) code is very obscure now.

~~~
huhtenberg
I think it's actually quite elegant. The issue is that it always does two
comparisons, even when one would suffice.

~~~
Someone
Depending on how the compiler computes _a > b_, it may be branchless, though,
and I expect that to be more important on modern CPUs.
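
For what it's worth, the classic branchless way to get the three-way result is
to subtract the two comparisons; a sketch in the same comparator shape as
above:

    
    
      static int compare (const void *dpa, const void *dpb)
      {
        double a = *(const double *) dpa;
        double b = *(const double *) dpb;
        // Each comparison yields 0 or 1, so the difference is -1, 0 or 1,
        // and there are no branches for the CPU to mispredict.
        return (a > b) - (a < b);
      }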

------
wronskian
I saw this exact failure mode in the wild: a C++ Newton-Raphson implementation
that relied on the ordering coming out of the mean calculation. In this case,
the ordering threw the N-R algorithm off and instability resulted. I seem to
remember the fix was to put some simple bisection around the N-R guess and, if
it went outside that range, to truncate to the bisection range. It worked! (If
anyone really wants the details, I can probably find the exact fix in our
source control history.)
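
I don't know the actual code, but the safeguard described is a standard
pattern (cf. 'rtsafe' in Numerical Recipes); a hypothetical sketch:

    
    
      #include <cmath>
    
      // Hypothetical sketch: Newton-Raphson with a bisection safeguard.
      // A bracket [lo, hi] with a sign change is maintained; if the Newton
      // guess leaves the bracket, it is truncated to the bisection midpoint.
      template <typename F, typename DF>
      double safeguarded_newton(F f, DF df, double lo, double hi) {
        double flo = f(lo);
        double x = lo + (hi - lo) / 2;
        for (int i = 0; i < 100; ++i) {
          double fx = f(x);
          if (fx == 0.0 || hi - lo < 1e-15 * std::fabs(x)) break;
          // Shrink the bracket so it still contains the root.
          if ((fx < 0) == (flo < 0)) { lo = x; flo = fx; } else { hi = x; }
          double next = x - fx / df(x);   // the plain Newton step
          if (next <= lo || next >= hi)   // left the bracket?
            next = lo + (hi - lo) / 2;    // fall back to bisection
          x = next;
        }
        return x;
      }
    

For example, calling it with f(t) = t*t - 2 and df(t) = 2*t on [0, 2]
converges to sqrt(2) without ever leaving the bracket.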

~~~
anon4711
I'm sufficiently interested to ask for further details :)

------
kazinator
I don't understand this article. If you know the index, or other indicator of
location, of the second point in the sequence, then just split the sequence
before that point. No arithmetic involved; the type of the object is not even
relevant.

~~~
Scea91
You are right that for this isolated problem it would make sense. However,
this is actually a classifier, so the idea is to maximize the margin of the
split.

------
jrg123
That's (one reason) why standard containers use < instead of <=.

------
anon4711
Even though this is not what this article is about, quite a few comments
suggest better ways to compute the midpoint of two doubles, where "better" is
supposed to mean "such that unnecessary overflow is avoided".

What I haven't seen is the suggestion

    
    
      0.5*x + 0.5*y
    

Yes, we now have two multiplications, but floating point multiplication is
considerably faster than division.
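
A quick check of my own that the multiplication form dodges the overflow
(multiplying by 0.5 is exact, since it only decrements the exponent):

    
    
      #include <cstdio>
      #include <limits>
    
      int main() {
        double x = std::numeric_limits<double>::max();
        double y = x;
        std::printf("%g\n", (x + y) / 2);    // inf: the sum overflows
        std::printf("%g\n", 0.5*x + 0.5*y);  // 1.79769e+308: correct
        return 0;
      }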

------
bnolsen
To be robust you need to deal with overflow and underflow, and also with pesky
differences in magnitude.

Overflow breaks (x + y) / 2. where x and y are large; underflow breaks (x /
2.) + (y / 2.) where x and y are small.

And of course there's the problem where x's exponent is far outside y's, so
one of the values gets treated as though it were insignificant and effectively
0.
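
The last point in a two-line demo (values mine):

    
    
      #include <cstdio>
    
      int main() {
        double x = 1.0, y = 1e-20;  // exponents far apart
        // The exact mean is 0.5 + 5e-21, but y's contribution is smaller
        // than half an ulp of the result, so it vanishes entirely:
        std::printf("%.17g\n", (x + y) / 2);  // exactly 0.5
        return 0;
      }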

~~~
Scea91
I think it is necessary to make a compromise between absolute robustness and
readability, maintainability, etc.

I hope that the input to the program is sensible enough that floating point
overflow won't be an issue, since there are more complicated formulas in the
program than this simple one.

------
peter303
Some numerical packages have an Almost class to deal with numerical
imprecision. For example, you test with Almost.equal with a tiny tolerance.

~~~
panic
How would that help here?

------
d33
Obligatory: "What Every Programmer Should Know About Floating-Point
Arithmetic"

[http://floating-point-gui.de/](http://floating-point-gui.de/)

------
dingo_bat
I think the more important lesson here is to not write algorithms in a way
that goes into infinite recursion because data types aren't infinitely
precise.

~~~
thedufer
The original didn't require infinite precision; rather, it required that the
average of two different numbers be between the two or equal to the smaller
one. This is true in integer arithmetic due to rounding down, but apparently
not true for floating point.
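
A small demonstration of my own (assuming the default round-to-nearest-even
mode):

    
    
      #include <cmath>
      #include <cstdio>
    
      int main() {
        // Two adjacent doubles: nothing is representable strictly between
        // them, so (a + b) / 2 must round to one of the endpoints. Which
        // one depends on the bit pattern; here it rounds UP to b, which is
        // exactly what broke the article's splitting loop.
        double a = std::nextafter(0.1, 1.0);
        double b = std::nextafter(a, 1.0);
        double mid = (a + b) / 2;
        std::printf("mid == b: %s\n", mid == b ? "true" : "false");  // true
        return 0;
      }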

------
PDoyle
The same happens with integers. Try computing the mean of 3 and 4 using
integer math.

The only difference, really, is that floating point can lull you into
believing it has unlimited precision. With integer math, the problem would
have been more obvious in the first place.

~~~
dskloet
It's not the same. With integer division you know the division rounds down, so
using <= always splits between the two points.

That said, it's always safer to compute the mean of two integers as

min + (max - min) / 2

to avoid integer overflow.
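
A quick illustration (values mine):

    
    
      #include <climits>
      #include <cstdio>
    
      int main() {
        int min = INT_MAX - 10, max = INT_MAX - 2;
        // (min + max) / 2 would overflow (undefined behavior for signed
        // int); min + (max - min) / 2 stays in range.
        std::printf("%d\n", min + (max - min) / 2);  // INT_MAX - 6
        return 0;
      }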

~~~
TimonKnigge
I like this one better: (min&max) + ((min^max) >> 1)

    
    
      #include <iostream>
      using namespace std;
    
      int avg(int a, int b) {
        return (a&b) + ((a^b)>>1);
      }
    
      int main() {
        cout << avg(int(2e9), int(2e9 + 10)) << endl;
        cout << avg(int(2e9), int(-2e9)) << endl;
        cout << avg(int(-2e9), int(-2e9 - 10)) << endl;
        
        return 0;
      }
    

Gives:

    
    
      2000000005
      0
      -2000000005

~~~
Retr0spectrum
Is there an explanation of how that actually works somewhere?

~~~
TimonKnigge
Yes there is! Right here:

It's actually really simple. We'll write a and b in binary notation, for
example:

    
    
      a = 1001101
      b = 0100111
    

Now what happens when you add two numbers in binary? We essentially add the
digits in each column together, and if a column overflows, we carry into the
next column (this is how you carry out addition in general).

So what are the columns where we need to carry, the ones that overflow? These
are given by (a&b): the columns where both a and b contain a one. To actually
carry we just move everything one position to the left: ((a&b)<<1). And what
are the columns where we don't need to carry, the ones that don't overflow?
These are the ones where exactly one of a and b contains a one, so: a^b.

In other words, a + b = ((a&b)<<1) + (a^b). To compute the average, we divide
by two, or in other words, we bitshift to the right by one place: (a + b)/2 =
(a&b) + ((a^b)>>1).
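
Carried through on the example above (a = 77, b = 39 in decimal):

    
    
      a&b      = 0000101   (carry columns:      5)
      (a&b)<<1 = 0001010   (carries applied:   10)
      a^b      = 1101010   (no-carry columns: 106)
    
      a + b   = 10 + 106 = 116
      (a+b)/2 = (a&b) + ((a^b)>>1) = 5 + 53 = 58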

If anything is unclear, feel free to ask :)

~~~
Retr0spectrum
Great explanation, thanks.

------
erikb
I would have upvoted if the title had been "Mean of two floating point
numbers can be mean".

