
Floating point error is the least of my worries (2011) - rbanffy
https://www.johndcook.com/blog/2011/11/01/floating-point-worries/
======
smogcutter
In a funny way, this reminds me of football: they'll bring out the chains to
precisely measure the yardage... to a spot that was arbitrarily chosen by the
ref. The refs do the best they can, but there's often no way they've spotted
the ball exactly where the play died. The whole ceremony with the chains is
ridiculous.

See also: economics.

~~~
wglb
Yes, I agree, and have thought about this a bit.

It seems like it effectively reduces to a coin flip choice.

~~~
jsight
I think that is the intent. If it's really close then the system turns it
into a random outcome with no apparent bias.

------
dnautics
Having been someone who has programmed alternatives to IEEE FP, I would say
that this is not a correct perspective, in this sense: floating point error
can crop up in places _you will never expect_, and when it does, the
magnitude can be very difficult to predict (and can be far worse than any
approximation or modeling error).

Here's a good example. Try solving this set of linear equations by hand (it's
nearly trivial):

0.25510582x + 0.52746197y = 0.79981812

0.80143857x + 1.65707065y = 2.51270273

The answer you should get is: {-1, 2}, btw, which should be fairly easy to
verify by hand.

Now try doing it with a computer.
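
Here's a sketch of the experiment with NumPy (a hypothetical check, not the
commenter's actual code; the exact digits you get depend on platform and
BLAS, but the near-singularity is the tell):

    import numpy as np

    # The two rows are very nearly parallel, so the matrix is ill-conditioned.
    A = np.array([[0.25510582, 0.52746197],
                  [0.80143857, 1.65707065]])
    b = np.array([0.79981812, 2.51270273])

    x = np.linalg.solve(A, b)
    print(x)                  # exact answer is [-1, 2]; expect lost digits
    print(np.linalg.cond(A))  # enormous: the matrix is nearly singular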

~~~
ajross
That's only a place you would "never expect" because you've carefully hidden
the gotcha. Stated less obtusely, you're finding the intersection between two
lines _which are parallel to within just above the precision of an IEEE
double_. This is a cute example, not a real-world situation. In the real
world, precision loss in this circumstance is acceptable, and any real system
that accepted a "parallel" condition like this would have a check for it.

For a similar situation: atan2() exists because of a coverage issue in the
problem area and not a bug in the idea of a tangent or a float.

~~~
ajkjk
I would think that atan2() exists because atan() doesn't invert tan()
correctly for some purposes -- if you're trying to find the angle associated
with a point then atan(y/x) is just the wrong answer.

The fact that it avoids the division by 0 is a convenient side effect of also
being more correct.

(I'm just guessing; I don't know the actual history of it.)
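
To see the wrong-answer case concretely (a quick Python sketch):

    import math

    x, y = -1.0, -1.0        # a point in the third quadrant
    print(math.atan(y / x))  #  0.785...: the first-quadrant angle, wrong here
    print(math.atan2(y, x))  # -2.356... (-3*pi/4): the correct angle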

~~~
ajross
The point was more that problems that need to use atan() in a context where
the line may point near the Y axis don't flip out over "floating point
precision"; they just use atan2().

Problems like the one in the grandparent that need to intersect nearly-
parallel lines (or subspaces) use special techniques better suited to the
problem and don't just feed it to a linear solver.

------
keithnz
As he says, if you understand the nature of floating point, it should be
fine. But having seen this misunderstood over and over and over and over
again on Stack Overflow, many don't.

While we are on floating point, I really liked this article that was on HN a
while back: https://lemire.me/blog/2017/02/28/how-many-floating-point-numbers-are-in-the-interval-01/

------
roenxi
He is right, but misses the big issue with floating point, which is that, as
far as I recall, it is not associative. Hence, you can't rearrange an
expression with ordinary algebra and then check equalities.
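
For instance, in Python (any IEEE-754 language behaves the same way):

    a, b, c = 0.1, 0.2, 0.3
    print((a + b) + c)                 # 0.6000000000000001
    print(a + (b + c))                 # 0.6
    print((a + b) + c == a + (b + c))  # False: addition is not associative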

Most capable programmers just don't think to use a == b for floating points,
because it isn't going to work a lot of the time, but for someone who isn't
aware of that minefield they can wander in and get some very stupid bugs.

~~~
bsder
> Most capable programmers just don't think to use a == b for floating points,
> because it isn't going to work a lot of the time, but for someone who isn't
> aware of that minefield they can wander in and get some very stupid bugs.

Well, there are two separate issues here:

1) The default for programming should be _decimal_ floating point instead of
_binary_ floating point. Suddenly, your strange "a == b" for floating point
works just fine, because 0.1 is exactly representable; addition, subtraction,
and multiplication then behave just like integer arithmetic. Computers are
now fast enough that any language in which you don't explicitly ask for a
floating point type should default to decimal. This covers 99% of the silly
cases for beginners and benefits people doing actual monetary calculations
(see the sketch below).

2) Programmers need to be able to specify when they want _binary_ floating
point for speed along with all the weirdness.
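
A rough illustration of point 1 in Python, using the decimal module as a
stand-in for a decimal-by-default language:

    from decimal import Decimal

    # Binary floats: 0.1 is not exactly representable, so "==" surprises.
    print(0.1 + 0.2 == 0.3)                                   # False

    # Decimal floats: 0.1, 0.2 and 0.3 are exact, so "==" behaves.
    print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True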

~~~
kccqzy
The default for programming should be arbitrary precision rational numbers.
Think of the rational numbers in Scheme or Haskell.
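
In Python terms, that would be something like the fractions module:

    from fractions import Fraction

    # Exact rational arithmetic: results are never rounded.
    print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True
    print(Fraction(1, 3) * 3 == 1)                               # True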

~~~
adwn
> _The default for programming should be arbitrary precision rational
> numbers._

And all is well – until you learn about _e_ and _pi_ and realize that your
arbitrary precision rational numbers are no better than floating-point.

~~~
kccqzy
I find that the typical developer’s typical work doesn’t require irrational
numbers at all. Of course, those specifically doing numeric computing would
be best served by traditional floating point, but I truly think most
developers aren't doing that. Maybe my experience is biased, then.

~~~
adwn
The typical developer's typical work mostly requires integers, and when they
need reals, 64-bit floats are nearly always sufficient. I would even say that
arbitrary-size integers are needed more often than true rationals.

------
wglb
I am struggling to see how this argument applies to things requiring serious
numerical analysis, such as finding the eigenvalues and eigenvectors of a
500×500 matrix, or analyzing some other complex control system with feedback
and many parts.

Great pains need to be taken for calculations of this sort. One test: why
would you sort the floating point numbers in an array before taking their
sum?
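
For anyone who hasn't seen that trick: summing in order of increasing
magnitude keeps small addends from being absorbed by a large partial sum. A
Python sketch:

    import math

    xs = [1e16] + [1.0] * 1000     # one huge value, then many small ones

    print(sum(xs) - 1e16)          # 0.0: every 1.0 is absorbed by 1e16
    print(sum(sorted(xs)) - 1e16)  # 1000.0: small values accumulate first
    print(math.fsum(xs) - 1e16)    # 1000.0: correctly-rounded reference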

------
deadcell
OP has never played KSP - never had the kraken rip whole colony ships apart
because of oscillatory wobbles coming in and out of physics modes :(

~~~
conistonwater
As far as I can tell, that supports, rather than contradicts, the post's
point about modelling, FP, and approximation errors. It's a little too easy
to ascribe KSP's bugs to floating-point error when its physics engine just
isn't trustworthy enough. How do you really know the oscillations aren't just
due to it solving a stiff system of ODEs with explicit Euler, or something
perfectly predictable like that?

------
Twisell
I just ran into this problem with an HR colleague who uses Excel.

Real financial data managed to produce three floats (with only 2 decimal
places, as usual in finance) such that a+b-c produced -0 (actually
-0.1336e-20 or something like that).

The result was a cascade of mismatching results in further calculations
checking for a 0 balance.

It's a known and identified "feature" with a correct workaround, which is
rounding (for finance), and a horrible evil twin, "formatting away and
praying", that can lead to a future inscrutable mess for naive users.

So I agree that floating point error is not so bad from a mathematical
standpoint; but from a technical standpoint, let us all remember that for
most of humankind 3/2 = 1.5 (and not 1, as integer division would have it).
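
The effect is easy to reproduce outside Excel. A Python sketch with made-up
two-decimal values (not the actual data):

    a, b, c = 1.10, 2.20, 3.30       # two-decimal "financial" amounts

    print(a + b - c)                 # ~4.4e-16, not 0.0
    print(a + b - c == 0)            # False: zero-balance checks fail
    print(round(a + b - c, 2) == 0)  # True: the rounding workaround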

~~~
Someone
_”with a correct workaround, which is rounding (for finance), and a horrible
evil twin, "formatting away and praying" ”_

Are you suggesting the “precision as displayed” setting in Excel is that evil
twin? Why?

~~~
Twisell
Because financial data (in our use case) have a fixed 2 decimal format. In
almost every other context you could impose this constraint by somehow
defining a data type and be done with it.

------
golergka
Floating point is a trade-off which you just have to consciously choose.
Sometimes you need more precision. Sometimes - even less: in my current
project, I encode horizontal object position (coordinates x and z) in 2 bytes,
while y coordinate (vertical) and all 3 euler angles fit into 1 byte each.
It's far from precise, but looks OK from the point of view of our top-down
camera, and all movement updates of players and NPCs fit into half of a
pessimistic IP packet (about 400 bytes) per frame.
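
A sketch of that kind of encoding in Python (the map bounds and exact bit
width here are made-up assumptions, not the project's actual values):

    WORLD_MIN, WORLD_MAX = -512.0, 512.0    # hypothetical map bounds

    def encode_u16(v: float) -> int:
        """Quantize a coordinate to 2 bytes: 65536 steps across the map."""
        t = (v - WORLD_MIN) / (WORLD_MAX - WORLD_MIN)
        return round(t * 0xFFFF)

    def decode_u16(q: int) -> float:
        return WORLD_MIN + (q / 0xFFFF) * (WORLD_MAX - WORLD_MIN)

    x = 123.456
    print(decode_u16(encode_u16(x)))  # ~123.46, within one ~0.016 step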

------
recursive
Floats are used in applications other than simulations and modelling.
Probably they're mostly used in applications other than simulations and
modelling. Modelling errors are nowhere to be found there, but floats can
still be troublesome. I once spent a day trying to determine why a filter
criterion of the form {numberA} + {numberB} >= 0.3 wasn't being met. The
answer, as you can guess, was that the system _wanted_ to be doing decimal
arithmetic, but was implemented (by me) using floats.
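
The comment's exact numbers aren't given, but the failure mode looks like
this in Python (illustrative values with a 0.8 threshold):

    # Decimally, 0.1 + 0.7 is exactly 0.8, so a filter "a + b >= 0.8"
    # should pass -- but the binary-float sum lands just below 0.8.
    a, b = 0.1, 0.7
    print(a + b)          # 0.7999999999999999
    print(a + b >= 0.8)   # False: the filter criterion is "not met"

    from decimal import Decimal
    print(Decimal("0.1") + Decimal("0.7") >= Decimal("0.8"))  # True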

------
fizixer
And I take exception to your lack of worry about floating point errors
(FPE).

Trefethen's statement was made only in response to the outrageous
mischaracterization of numerical analysis by non-numerical-analysts as
"people who are mainly concerned about errors in their calculations".
Trefethen wrote a paper debunking the claim that numerical analysis is all,
or mostly, about error analysis.

Now back to FPE. Let me give you a few arguments for why FPE is a major
headache:

\- Consider a random collinear triplet of points on a 2D plane. What I mean
by that is three (x, y) pairs (x1, y1), (x2, y2), (x3, y3) such that they
make a straight line. Do you know that if you generate such triplets at
random angles, 99.999% of your triplets will not be collinear if you're using
FP? Why? Because FP representation is not exact except in a hopefully small
number of scenarios. E.g., if your lines are perfectly vertical (x1 == x2 ==
x3), perfectly horizontal (y1 == y2 == y3), or a perfect diagonal (x1 == y1,
x2 == y2, x3 == y3), you'd have collinearity; otherwise it's a toss-up. For
any other lines, you could try to be very careful in picking x and y values,
but then you've probably compromised some other property of your points,
depending on the application (and btw, when you decide to carefully pick x
and y values, welcome to the world of 'being concerned about FP' already).
This is the well-known problem of 'robustness' in computer graphics, and
something CG researchers have to spend a lot of time on, "beyond" their work
of actually having created a correct algorithm, and "beyond" actually having
implemented a performant version of it. (See the sketch after this list.)

\- You compare floating-point error magnitudes with modeling error
magnitudes. I'm afraid that's a silly comparison. When you're working as a
numerical analyst, the modeling domain is the primary part of your research
activity; you're expected to spend time looking at modeling and its
associated errors. FPE may have much smaller magnitudes, but when it shows up
it's a serious headache and a distraction, forcing you to drop down to
systems programming and deal with the IEEE standard and how to circumvent its
negative effects. That's like saying "I'm not concerned about leaks in the
fluid systems of my car because they're insignificant compared to my worries
about driving the car in dangerous conditions like snowy weather, steep
hills, and rough terrain".

\- Also, by the way, good luck creating a reproducible system primarily
based on FP if it has to run on different architectures. Microprocessor
families (Intel x86, x86_64, ARM, POWER, etc.) have implemented their own
slightly differing ways of doing single and double precision operations like
addition, multiplication, and trig functions. Your FP numbers are pretty much
guaranteed to differ around the 10th or 11th decimal place. You might think
this is not a big deal. But then make sure your system doesn't run for hours
or days (and definitely make sure it isn't for critical applications like
medical or space), or you will see these errors accumulate and results
diverge from each other in more and more significant decimal places.
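
To make the first point concrete, here's a Python sketch with three points
that are exactly collinear in decimal (all on the line y = x + 0.1) but fail
the standard orientation test in doubles:

    x1, y1 = 0.1, 0.2
    x2, y2 = 0.2, 0.3
    x3, y3 = 0.3, 0.4   # decimally collinear: y = x + 0.1 for all three

    # Orientation determinant: zero iff the three points are collinear.
    det = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
    print(det)        # a tiny nonzero value (~1e-17), not 0.0
    print(det == 0)   # False: the naive collinearity test rejects them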

So to summarize: if you're not worried about FPE, I'm afraid it gives a
strong impression that you're only getting started in FP-based work (hands-on
numerical analysis, computational science, stats/ML, etc.) and have not
acquired enough experience to realize its importance.

edit: I see you have a PhD in applied math based on non-linear PDEs, and a
postdoc. So my question to you is: did your PhD involve implementing and
running numerical algorithms (meaning writing/compiling/testing actual
numerical code)? If the answer is yes, then I'm extremely surprised that you
don't worry about FPE.

~~~
jschwartzi
He's not saying that he's not worried about FPE. What he's saying is that
misunderstanding the physical system being modeled is far more likely to cause
problems, and that it's much harder to spot than an FPE. So your mathematical
model requires much more scrutiny for that reason. Basically if you don't
spend much time questioning your assumptions about the world your system
exists in, then don't even bother accounting for FPE because it's going to be
overwhelmed by errors that you actually chose to introduce by your modeling
assumptions.

