
If 1+x is 1, how much is 1-x? - gus_massa
https://gus-massa.blogspot.com/2015/04/if-1x-is-1-how-much-is-1-x.html
======
jasode
Nice post.

When people wonder why floating point calculations sometimes don't perfectly
match the expectations of a "correct" answer[1], someone will inevitably
respond with the famous paper, _"What Every Computer Scientist Should Know
About Floating-Point Arithmetic"_[2].

Unfortunately, the people who suggest that paper don't realize that it's a
very technical treatise and not an appropriate introduction for the types of
programmers who ask the question.

I've always thought a better answer was to have the programmer "construct" a
toy floating point format (e.g. 8 bits instead of 32 or 64). The programmer
would then notice that he has a finite representation of 256 possible
"states" to represent -inf to 0 to +inf. The programmer would have to _pick_
a range and a precision to "squeeze" the Real Numbers into that 8-bit
representation.

Design choice 1: I'll have my 256 possible states represent -1billion to
+1billion. Well, since you can't overlay 256 states/values across 2 billion
_unique_ values, you will have huge "gaps" between numbers.

Design choice 2: I'll have my 256 possible states represent -10 to +10. With a
smaller range, you can now increase the precision and represent fractional
numbers with digits after the decimal point. But the range is very small.
Also, you still have "gaps" where you can't represent most[3] fractional
numbers.

The programmer would quickly notice that he'd run into some contradictions.
No matter what he does, there will always be gaps. The lightbulb would then
go on, and he'd immediately scale up the limitations inherent in 8-bit
floating point all the way to 64-bit floating point and know exactly why,
say, 0.1 + 0.2 does not exactly equal 0.3.
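
To make this concrete, here's a minimal sketch of such a toy format in
Python. The layout (1 sign bit, 4 exponent bits with bias 7, 3 mantissa
bits) is a hypothetical choice for illustration, and zero, subnormals,
infinities and NaN are deliberately left out to keep it short:

    
    
        # Hypothetical toy 8-bit float: 1 sign bit, 4 exponent bits
        # (bias 7), 3 mantissa bits.  Zero, subnormals, infinities and
        # NaN are ignored to keep the sketch short.
        def decode(byte):
            sign = -1 if byte & 0x80 else 1
            exp = ((byte >> 3) & 0x0F) - 7     # remove the bias
            man = 1 + (byte & 0x07) / 8.0      # implicit leading 1
            return sign * man * 2.0 ** exp

        values = sorted(set(decode(b) for b in range(256)))
        print(len(values))                     # 256 distinct values
        print([v for v in values if 1.0 <= v <= 2.0])
        # [1.0, 1.125, 1.25, 1.375, 1.5, 1.625, 1.75, 1.875, 2.0]
    

Between 1 and 2 every representable value is an eighth apart; between 2 and
4 the gap doubles to a quarter, and so on. Those are exactly the "gaps"
described above.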

It had been on my vague "todo" list to write such a tutorial, but your blog
post already gets that "construction of floating point" idea across nicely.
The programmer can then explore the more advanced topics of "normalization"
and "bias" in designing a floating point format.

[1] http://stackoverflow.com/questions/588004/is-floating-point-math-broken

[2] http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

[3] http://math.stackexchange.com/questions/474415/intuitive-explanation-for-how-could-there-be-more-irrational-numbers-than-rati

~~~
thomasahle
Here is a nice challenge: Just like we have arbitrary precision integers
(allocating more memory as needed) and arbitrary precision "fractional
numbers", design a library providing arbitrary precision "computable numbers".

~~~
kenj0418
> design a library providing arbitrary precision "computable numbers"

I believe they call those "programming languages"

~~~
scarmig
Naive question:

Programs written in programming languages aren't guaranteed to terminate.
Computable numbers, however, are all real numbers that can be represented to
arbitrary precision by a computation that terminates. Wouldn't that mean we
don't need a fully fledged programming language to have the capability to
represent all computable numbers?

Somewhere in there is the idea we just need a program that determines whether
an arbitrary program terminates or not...

~~~
mkehrt
I can't tell how much of this is a joke, but, yes, it turns out you can't
write such a programming language for exactly the reason you allude to: you
can't tell whether a given number is computable in finite time, as you don't
know whether your program will terminate.

------
ColinWright
This is an excellent explanation - clear, concise, easy to work through, and
enlightening.

Thank you.

I think I've sent a collection of suggested corrections via the web site, I
include just the technical one here:

You wrote:

    
    
        Another example: -|2|0011 represents the number
        X = - 2^2 * 1.0011_{2}
          = - 4 * (1 + 0*1/2 + 0*1/4 + 1*1/8 + 1*1/16)
          = - 4 * 19/16
          = - 19/4
          = 4.75
    

This is missing a minus sign at the very end.

~~~
gus_massa
Thanks, fixed.

I'll probably fix all the other errors/typos you sent in the mail, but it
will take a few minutes.

~~~
ColinWright
Cool - and again, lovely post, nicely done.

------
bitL
This is one of the reasons I would be happier if real math/geometry
algorithms put more emphasis on "range" processing, i.e. treating each
"number" as a scaled "range" and figuring out how to combine these ranges at
every step to get a stable solution. Most often I see programmers just
bashing out analytic formulas as if they were computing something precisely,
and then being surprised that the results are way off. Imagine computing
curve intersections by a combination of analytical and iterative methods and
then figuring out whether such an intersection is unique or whether it is
some existing endpoint...

~~~
maho
Interval arithmetic [1] has been around since the 1950s, I believe, but for
some reason it never caught on.

In a nutshell, interval arithmetic stores each imprecise number as a lower
bound and an upper bound. In each calculation, the lower bound is rounded
down and the upper bound is rounded up. The existing libraries are well
developed, and the performance hit is often surprisingly small: close to a
factor of 2.

[1] http://en.m.wikipedia.org/wiki/Interval_arithmetic
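
For illustration, here is a bare-bones, hypothetical version of the idea in
Python, using math.nextafter (Python 3.9+) to nudge each bound outward by
one ulp in place of true directed rounding:

    
    
        import math

        # An interval is a (lo, hi) pair with lo <= x <= hi.  After
        # each operation, widen both bounds by one ulp so the true
        # result stays enclosed despite rounding.
        def widen(lo, hi):
            return math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf)

        def iadd(a, b):
            return widen(a[0] + b[0], a[1] + b[1])

        def imul(a, b):
            ps = [x * y for x in a for y in b]
            return widen(min(ps), max(ps))

        acc, tenth = (0.0, 0.0), (0.1, 0.1)
        for _ in range(10):
            acc = iadd(acc, tenth)
        print(acc)   # a small interval guaranteed to contain the exact sum
    

One ulp of outward rounding per operation is slightly pessimistic compared
to switching the hardware rounding mode, but the enclosure guarantee is the
same, and the factor-of-2 cost is easy to see: every bound is computed
separately.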

~~~
brandmeyer
That's because intervals tend to just keep growing. Without some function to
make the interval smaller, or something else to keep it bounded, cumulative
interval arithmetic tends to be useless.

Using the rounding mode to calculate once with rounding down, once with
standard rounding, and once with rounding up gives a much better indication
of whether or not an algorithm is sensitive to roundoff error.
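
Python can't easily switch the hardware rounding mode for its binary floats,
but the decimal module can switch its own, so here is a sketch of that
three-runs technique with decimal floating point standing in:

    
    
        from decimal import (Decimal, localcontext, ROUND_FLOOR,
                             ROUND_HALF_EVEN, ROUND_CEILING)

        def sum_thirds(mode):
            # Run the same computation under a chosen rounding mode.
            with localcontext() as ctx:
                ctx.prec = 8
                ctx.rounding = mode
                total = Decimal(0)
                for _ in range(3):
                    total += Decimal(1) / Decimal(3)
                return total

        for mode in (ROUND_FLOOR, ROUND_HALF_EVEN, ROUND_CEILING):
            print(mode, sum_thirds(mode))
        # 0.99999999, 0.99999999 and 1.0000001: the spread brackets
        # the true answer and hints at the roundoff sensitivity.
    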

------
meragrin
This is about floating point arithmetic in computers.

~~~
madez
This is specifically about IEEE 754.

Arbitrary-precision floating point arithmetic, such as GMP offers, doesn't
suffer from these issues.

~~~
brandmeyer
MPFR is sensitive in exactly the same way. Just because you can arbitrarily
increase the precision of results with that library doesn't mean that it isn't
still sensitive to roundoff error.

~~~
madez
Yes, you are right.

However, it's possible to use MPFR in a way that avoids these issues. One
could increase the precision every time there would be rounding.

~~~
ori_b
The problem is that the second you multiply by pi or e, which is fairly
common, your required precision for a perfect result becomes infinite.

Even if you don't, your required precision blows up very quickly, and your
performance drops.

~~~
madez
In finite precision arithmetic, "multiplying by pi or e" makes _no sense_.
And all arithmetic is finite precision. You could multiply by an
approximation, and accordingly adjust the precision to avoid rounding.

In a more general context, you could choose e as the base of your number
system, and then multiplying by e is a trivial shift. Then, however,
multiplying by 2 would make no arithmetic sense. The natural question that
arises is whether it's possible to represent all computable numbers
losslessly in one system. The answer is yes.

Besides that, I think you should re-evaluate your decision-making for votes.

Besides that, I think you should re-evaluate your decision-making for votes.

~~~
ori_b
> In finite precision arithmetic "multiplying by pi or e" makes no sense. And
> all arithmetic is finite precision.

Yes, that's kind of my point: either you have a finite precision cutoff and
lose accuracy, or the memory used for your arbitrary precision
representation blows up to incredibly large values (transcendentals will do
this to you quickly), or a combination of the two.

------
UhUhUhUh
That's the very interesting border between numbers as abstractions of
quantities and numbers as elements of a set (N, Z, R, C, etc.). The question
of 0 is particularly puzzling to me.

~~~
wyager
Most numerical sets you're likely to run into (including N, Z, R, and C) are
groups under addition, which (among other things) means `x + a = a ⇔ x = 0`
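
For the forward direction, the usual one-line argument (valid in any group
under addition) is to add the inverse of a to both sides:

    
    
        x + a = a
        (x + a) + (-a) = a + (-a)
        x + (a + (-a)) = 0          [associativity]
        x + 0 = 0
        x = 0
    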

~~~
sadfsdfsda
N is not a group under addition (lacks inverses).

------
pascal_cuoq
StackOverflow user tmyklebu provided an efficient algorithm to solve a more
general floating-point equation, C1 + x = C2, using only computations in the
very floating-point system that the equation is written in:

[http://stackoverflow.com/a/24223668/139746](http://stackoverflow.com/a/24223668/139746)
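
The linked answer is worth reading in full. As a toy illustration only (not
tmyklebu's algorithm), one can often find a solution by starting from the
obvious candidate C2 - C1 and nudging it one ulp at a time:

    
    
        import math

        def solve(c1, c2, max_steps=10):
            # Look for a double x with c1 + x == c2.  There may be a
            # whole interval of solutions, or none at all.
            x = c2 - c1
            for _ in range(max_steps):
                if c1 + x == c2:
                    return x
                x = math.nextafter(x, math.inf if c1 + x < c2 else -math.inf)
            return None    # gave up; possibly no solution exists

        print(solve(0.1, 0.3))    # 0.19999999999999998
    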

------
frozenport
> Mathematically, if we had infinite precision

Or just fixed precision? For example, the calculation is intuitive for
`int`s.

------
eccstartup
Mathematica gives 1.

    
    
        In[1]:= 1 - x /. Solve[1 + x == 1, x]
        Out[1]= {1}

------
Sevzinn
One.

------
t0mbstone
If 1+x is 1, then logically, 1-x should also be 1. Period.

If this is not the case, then you have a crappy hardware or software math
implementation that doesn't follow logic. Duh.

~~~
ColinWright
I'd be amazed if the hardware you're using and the software you're running
doesn't have the "feature" that for some value of x, 1+x=1 and yet 1-x != 1.
If you read the article you might understand why.

If you don't understand why, perhaps you could come back and ask specific
questions.

For the concrete and practical minded who don't want to bother with
understanding the theoretical arguments, here's some Python:

    
    
        #!/usr/bin/python3

        a = 1.
        x = 1.

        while a + x != 1.:
            x = x / 2

        print('x =', x)
        print('a+x =', '%.18f' % (a + x))
        print('a-x =', '%.18f' % (a - x))
    

Output:

    
    
        x = 1.1102230246251565e-16
        a+x = 1.000000000000000000
        a-x = 0.999999999999999889

~~~
eccstartup
No, no! If your program can't get things that even kids can do done, you are
writing rubbish.

~~~
ColinWright
I see you posted this on Facebook:

    
    
        There was a post on hacker news about floating numbers.
        It said "if 1+x is 1", then "1-x is not."  There is a
        long explanation of how this is true.  They are very
        proud of this outcome. They believe this wired outcome
        is "perfect logical". I just want to say "F**k you!"
        If your program can't handle those even kids can do,
        you are writing rubbish. F**k you!
    

It's not a case of being proud of it, it's an unavoidable consequence of using
a small number of bits (32, 64, 128, 65536, all these numbers are small) to
try to represent a very large range of numbers. In other words, it's an
unavoidable consequence of using floating point.

The point is that floating point numbers are not mathematics, and they don't
obey all the mathematical laws you've been taught. They're an excellent model,
provided you stay away from the edges. But if you do go close to the edges,
the inaccuracies of the model get exposed. To understand the difference is of
real value.

Let's use mathematical reasoning to show why it's true.

Using 64-bit (say) numbers to represent a range larger than 0 to 2^64-1, we
must choose either that the numbers we can represent are equally spaced, or
not.

If we choose them to be equally spaced, then we either cannot represent 0,
or we cannot represent 1. To see this, suppose we can represent both 0 and
1. The numbers being equally spaced means the gap between neighbours divides
1, so it is at most 1, and our 2^64 equally spaced values then span at most
2^64-1: at best we represent 0, 1, 2, 3, and so on up to 2^64-1, with
nothing left to go beyond. Thus we are not representing a range larger than
0 to 2^64-1. So if the representable numbers are equally spaced, then we
cannot represent both 0 and 1, which seems sub-optimal.

If we choose them not to be equally spaced then there will be consecutive
representable numbers r0<r1<r2, such that mathematically r1-r0 is not equal to
r2-r1. Now set:

    
    
        d0 = r1-r0 so that r0 + d0 = r1
        d1 = r2-r1 so that r1 + d1 = r2
    

Suppose that d0 < d1, and let h = (d0+d1)/4. Hence (d0/2) < h < (d1/2).

Note that d0, d1, and h might not be representable.

Now r0 and r1 are consecutive representable numbers. The number h is more
than half the gap between r0 and r1, so we would want r1-h to round down to
r0. But the number h is _less_ than half the gap from r1 to r2, so we would
want r1+h to round to r1.

Therefore:

    
    
        r1 - h = r0
        r1 + h = r1
    

Thus it is unavoidable that if the representable numbers are not equally
spaced then we can find a and x such that:

    
    
        a + x == a
        a - x != a
    

The above argument still goes through even if d0>d1, or if we want things to
round down, or if we want things to round up, and we leave it as an exercise
for the interested reader to check the details in those cases.
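
To see this with real IEEE doubles, take a = 1.0: the gap below 1 is
d0 = 2^-53 and the gap above is d1 = 2^-52, so h = (d0+d1)/4 = 3*2^-55:

    
    
        a = 1.0
        h = 3 * 2.0 ** -55         # (d0 + d1) / 4 around a = 1.0

        print(a + h == a)          # True:  a + h rounds back down to a
        print(a - h == a)          # False: a - h rounds down past a
        print('%.18f' % (a - h))   # 0.999999999999999889
    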

Now you have three choices:

* Reject this, because it contradicts your belief that floating point numbers must behave in the same way as real, mathematical numbers;

* Demonstrate a logical flaw in the argument; or

* Accept the conclusion, even though it contradicts your belief that floating point numbers behave in the same way as real, mathematical numbers, and accept that floating point numbers are just a model of mathematical numbers.

~~~
eccstartup
Thank you. I hope that one day I will not complain about this, because I
will have found a better approach.

