

My Bug, My Bad: Sunk by Float - Strilanc
http://twistedoakstudios.com/blog/Post3044_my-bug-my-bad-2-sunk-by-float

======
kevingadd
I had a bug along these lines in a C# application that was pretty fun to
troubleshoot, because it only occurred when the runtime gave a particular
float extra precision (by letting it live in full-size x87 registers).

I was using an unmodified version of one of those standard 'triangulation by
ear clipping' algorithms, transliterated directly into C#, using floats. In
practice, it did exactly what I wanted and I wasn't using it much, so I never
gave it much thought. Then it started going into an infinite loop in some
cases. Troublesome.

So, I'd get it to go into an infinite loop, I'd attach the debugger, and...
it'd leave the infinite loop. What?

Hm, identify the arguments that produce the infinite loop (good ol' printf
debugging!) and reproduce those in an isolated test case. Great, test case
enters an infinite loop, let's attach the debugger... hey, what? It's not
failing anymore.

As it turns out, the default behavior for attaching a debugger to a .NET
executable includes having the JIT deoptimize some of your method bodies in
order to make breakpoints and local variables easier to work with (in their
wisdom, MS did include an option to turn this off) - so any time I ran the
test case under a debugger, it passed, because deoptimizing the method had
caused the floating point math to flush to 32-bit float representation more
often, truncating the additional precision. The infinite loop ended up being
caused by that additional precision!

Best of all, the other platform I was writing the code for at the time
(PowerPC) didn't exhibit the bug, because its floating-point implementation
doesn't add additional precision.

------
EEGuy
Had to deal with accumulating noise and unpredictable rounding in an
application built on double-precision arithmetic.

Ended up scaling the amounts to integers, then converting those known and
stable amounts to _character strings_, then doing the multiplication and
division by character manipulation!

Don't laugh too hard; performance wasn't a consideration but _decimal_
accuracy was. It did the job.
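The digit-by-digit string arithmetic itself isn't shown, but the scaled-integer half of the idea can be sketched. This is an illustrative sketch under the assumption of three decimal places of accuracy; the names are hypothetical, not EEGuy's code:

```java
import java.math.BigInteger;

public class ScaledAmount {
    // Amounts stored as integer thousandths (3 decimal places), so
    // 1.250 is represented exactly as 1250. No floating point anywhere.
    static final long SCALE = 1000;

    // Multiply two scaled amounts; going through BigInteger avoids
    // overflow in the intermediate product before rescaling.
    static long multiply(long a, long b) {
        return BigInteger.valueOf(a)
                .multiply(BigInteger.valueOf(b))
                .divide(BigInteger.valueOf(SCALE))
                .longValueExact();
    }

    public static void main(String[] args) {
        long price = 1250;  // 1.250
        long qty   = 3500;  // 3.500
        System.out.println(multiply(price, qty));  // 4375, i.e. 4.375 exactly
    }
}
```

The division step truncates rather than rounds; a real implementation would pick a rounding rule deliberately, which is the whole point of doing it in integers.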

------
NateDad
Rules for using floats:

    1. Don't use floats.
    2. Never use big floats.
    3. WHY ARE YOU USING FLOATS!?

------
RyanZAG
Surely just casting to double is going to be a far more practical and
efficient solution than using a global volatile float?

The problem of extra precision provided randomly by the compiler for floats
should be fixed by just casting to a double, ensuring that you always use the
extra precision.

~~~
Strilanc
I mention at the start that casting to double is not allowed. The motivation
for that restriction is to allow the solution to generalize to double/int64 or
even quad/int128.

In a more practical sense, casting to a double is a good solution. Both int32s
and floats can be represented faithfully as a double, and that's what I used
to test my implementation against.

However, the unreliable extra precision is still a problem. Casting to double
may or may not remove it. You're still exposed to false positives and false
negatives, depending on where and when precision is available. You should
still run the float through a volatile float field, if you want consistent
behavior.

(Note: the volatile field doesn't have to be static. You can use a struct on
the stack. That just took more code, and seemed like a superfluous detail. In
C, instead of C#, you can just do a cast, instead of storing it in a field.)
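For illustration, here is a Java analogue of the round-trip-through-a-volatile-field trick (hypothetical names, not Strilanc's code). The store forces the value out of any wider register into a genuine 32-bit slot; on JVMs with strict float semantics the round trip is a no-op, which is exactly the consistency being bought:

```java
public class FloatRound {
    // Writing through a volatile field forces a real 32-bit store/load,
    // discarding any extra precision the value may have carried in a
    // wider register. (Single-threaded use only; volatile is for the
    // memory round trip, not for concurrency here.)
    private static volatile float sink;

    static float forceFloat(float x) {
        sink = x;
        return sink;
    }

    public static void main(String[] args) {
        float f = forceFloat(0.1f + 0.2f);
        System.out.println(f == 0.1f + 0.2f);  // true
    }
}
```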

~~~
nightowl03d
Unless this is a numeric simulation where performance really matters, the
correct answer is that you should not be using floats in the first place. Use
a rational type, or scaled integers instead. If this is a numerical
computation the comparison should be using an epsilon.

Floats are strictly a performance optimization for doing computations whose
solutions could be rational or irrational numbers. In all other cases you
should either use integers or rational numbers.

Bottom line: if you are not doing physics, graphics, signal processing, or
system/financial modeling, you should never use floats/doubles/quads.

If the need to do this comparison is because of a non-numerical third party
library giving you a float, you should consider dropping the library. The
library author, while not likely to be an idiot in the broader sense, is
definitely a numerical idiot.

~~~
Strilanc
I agree with everything you said.

Well, with one clarification. Using an epsilon only partially solves the
problem. It improves the situation, but there's still an unstable region where
compiler quirks will flip the results of comparisons. The only way to avoid
the issue is, as you said, using a more appropriate type like rational.

~~~
nightowl03d
I should have said a well chosen epsilon. If the computation is good to 0.001,
(based on a detailed error propagation analysis of the algorithm against the
expected input range), then I would use an epsilon of 1/512 when comparing
against an int. The idea is that with a coarseness at that level the compiler
rounding would likely have little effect. The comparison number would be (int
+ a power of two). But it sounds like for your scenario a scaled int or a
rational would be more appropriate.
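A minimal sketch of such a comparison, assuming the error analysis really does bound the computation's error well below 1/512 (names here are illustrative):

```java
public class EpsilonCompare {
    // 1/512 is a power of two, so the threshold itself is exactly
    // representable in binary and introduces no rounding of its own.
    static final double EPS = 1.0 / 512;

    // Is x within EPS of the integer n? Coarse enough that compiler
    // rounding quirks are unlikely to flip the result.
    static boolean nearInt(double x, int n) {
        return Math.abs(x - n) < EPS;
    }

    public static void main(String[] args) {
        System.out.println(nearInt(4.0009, 4));  // true: 0.0009 < 1/512
        System.out.println(nearInt(4.01, 4));    // false: 0.01 > 1/512
    }
}
```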

If you have access to a university library (most universities will let you
sign in and browse), see Chapter One of Stoer and Bulirsch's text Introduction
to Numerical Analysis. (ISBN 0-387-954452)

It is on my desk because I am working out the epsilon for an algorithm a
colleague designed. :-)

------
martinced
_"Floats are terrifying. Maybe not as terrifying as unicode, but pretty darn
terrifying."_

Good. At least someone other than Schneier realizes Unicode is a major PITA
from a correctness and security point of view. Schneier said it best:
_"Unicode is too complex to ever be secure"_.

Back to floats. It's nearly always a mistake to be using floats unless you're
working on scientific computation (and 3D and game physics are a kind of
scientific computation).

If you don't know how to compute error propagation: floats aren't for you.
There. As simple as that.

It's interesting to realize that most comments here do not address that
fundamental issue: it's not a particular bug in fp that's the problem. It's
the overuse of fp itself that is the problem.

I can give you one simple example of a f^cktarded use of floats: in a
gigantic spec (HTML) someone decided there would be a probability expressed as
a number between 0 and 1 with up to three decimal places. This is totally
f^cktarded, and lots of people complained that it was a retarded thing to do.
You should express this as a number between 0 and 999. Why? Because you know
that otherwise clueless programmers are going to make mistakes while using
floating-point numbers where they shouldn't.

And surely some programmers did. Some clueless monkeys in the Tomcat codebase
decided it was a good idea to parse a number which was known to have at most
three digits after the dot using... Java's built-in floating-point number
parsing library. That's silly of course: when a f^cktarded spec talks about a
number between 0.000 and 0.999 you parse it manually into integers and do
integer maths, but I digress.
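A minimal sketch of that manual parse, assuming well-formed input like "0.999", "0.5", or "1" (illustrative only, not Tomcat's actual code):

```java
public class MilliParser {
    // Parse a probability with at most three digits after the dot into
    // an integer number of thousandths (0..1000), with no floating
    // point involved. Minimal sketch: malformed input just surfaces as
    // a NumberFormatException from Integer.parseInt.
    static int parseMilli(String s) {
        int dot = s.indexOf('.');
        String intPart = dot < 0 ? s : s.substring(0, dot);
        String frac = dot < 0 ? "" : s.substring(dot + 1);
        if (frac.length() > 3) throw new NumberFormatException(s);
        while (frac.length() < 3) frac += "0";  // "5" -> "500"
        int value = Integer.parseInt(intPart) * 1000 + Integer.parseInt(frac);
        if (value < 0 || value > 1000) throw new NumberFormatException(s);
        return value;  // "0.5" -> 500, "0.999" -> 999, "1" -> 1000
    }

    public static void main(String[] args) {
        System.out.println(parseMilli("0.999"));  // 999
    }
}
```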

So what did happen? Some people realized they could throw Java's floating-
point number parsing library into an infinite loop, and that's it: the most
deadly Java remote DoS exploit for attacking websites. By faking one browser
request you could throw one thread on a webapp server into an infinite loop.
Rinse and repeat, and a single machine could DoS an entire server farm.

Due to _stupid_ programmers using floating-point numbers when they shouldn't.

The problem is exacerbated by clueless people everywhere thinking floating-
point numbers are a good idea. They're not. They should die a horrible death,
and the days when CPUs didn't have floating-point units were arguably better
days for 99.999% of our uses.

Thankfully there's one area where even f^cktarded monkeys aren't using
floating-point numbers: cryptography.

I'd say that unless you can _write_ Goldberg's _"What Every Computer Scientist
Should Know About Floating-Point Arithmetic"_ then you shouldn't be using
floating-point numbers. As simple as that.

