NaNs Just Don't Get No Respect (drdobbs.com)
39 points by stianan 611 days ago | comments


tikhonj 611 days ago | link

NaNs are annoying because, thanks to them, equality on floating point numbers is not an equivalence relation. In particular, NaN /= NaN.

This means that in Haskell, for example, you cannot really rely on the Eq class representing an equivalence relation. Code relying on the fact that x == x should be true for all x could not work as expected for floating point numbers.

I don't know if this has any practical ramifications in real code, but it certainly makes things less elegant and more complex than they have to be.

-----

joshAg 611 days ago | link

Attempting to use simple equality with any floating point type is a horrible idea to begin with, even without NaNs. You should instead declare two numbers equivalent if the absolute difference between them is less than some bound based on what you're doing, and not look for bit equality.

However, you should be able to examine two NaNs and declare them "equivalent" (for certain definitions of equivalence) by intelligently examining the bits based on the hardware that you're running the program on. In the case of a binary NaN [1] that would entail checking that the exponent fields are entirely ones (e.g. 0xFF == (a.exponent & b.exponent), assuming a standard 8-bit exponent) and that the mantissas are nonzero (e.g. a.mantissa && b.mantissa).

[1]: "Binary format NaNs are represented with the exponential field filled with ones (like infinity values), and some non-zero number in the significand (to make them distinct from infinity values)." --http://en.wikipedia.org/wiki/NaN

-----

shrughes 611 days ago | link

That's not true; there are plenty of cases where using equivalence is just fine. Integer arithmetic, and algorithms that are more reliably written not to contain any empty intervals, are two examples.

-----

noblethrasher 611 days ago | link

He was specifically talking about equivalence on floating point types. Integers don't have or need NaN.

-----

shrughes 611 days ago | link

I was talking about integer values (with floating point representation) being multiplied and added (and divided and floored, I suppose).

-----

joshAg 610 days ago | link

even just addition and multiplication with floats make simple equivalence a horrible idea, due to uncertainty.

For example:

    float a = 1.0;
    float b = 1000.0;
    for (int i = 0; i < 1000000; ++i)
        a+=1.0;
    b *= b;

There is no guarantee that a == b. Floats make everything more complicated, even simple addition: http://en.wikipedia.org/wiki/Kahan_summation_algorithm

-----

shrughes 610 days ago | link

There is no uncertainty. There is a guarantee that a == b (if we ignore the off-by-one error in your post), because IEEE operations are guaranteed to be accurate within half an ulp. You can safely perform addition, subtraction, and multiplication, and truncated or floored division, within the 24-bit integer range for single-precision floats and the 53-bit integer range for doubles. This is why people can safely use integers in Javascript.

-----

joshAg 610 days ago | link

i guess that's what i get for not double checking my math. here's a revised version that (as long as i haven't made any other math mistakes) still fits within a 32 bit signed int but doesn't guarantee simple equality:

    float a = 0.0;
    float b = 10000.0;
    for (int i = 0; i < 100000000; ++i)
        a+=1.0;
    b *= b;

why in the world would you use a float instead of an int for addition, subtraction, and multiplication, and truncated or floored division, within the 24-bit integer range? it seems like there's no benefit to offset the facts that floating point operations are slower than integer operations and that ints can store integers 7 or 8 bits larger.

and what happens when you go beyond 24 bits? since it's a float no error or warning will be thrown, but now equivalence won't work for numbers that are easily stored by an int.

-----

shrughes 610 days ago | link

Why aren't you capitalizing your sentences? Are you too lazy to write properly?

Where did I say I'd use floating point numbers for integer math? Yes, let's move the conversation to a direction it never existed so that you can pretend you were right.

(The place I'd use it would be in a Javascript implementation, or a Lua implementation, and other situations where I'm designing a programming language where I want the simplicity of having only one numerical type. And that would be a 53-bit range, not 24-bit.)

-----

joshAg 610 days ago | link

You said using equivalence for floats that store integers is fine. here is a link: [1]. The point of my example was to show that that is not the case for numbers that are easily stored by an int that's the same size as a float.

I've not used lua or javascript, but wouldn't it be better to choose an implementation that silently switches between ints, floats, doubles and big-nums as needed? That way you don't limit the speed of float and int operations by autoconverting up to doubles when double precision isn't needed.

[1]: http://news.ycombinator.com/item?id=4400424

-----

shrughes 610 days ago | link

I do not recommend using floating point numbers for integer math. I am saying that if you have integers stored in floating point representation, equality comparisons are fine.

> I've not used lua or javascript, but wouldn't it be better to choose an implementation that silently switches between ints, floats, doubles and big-nums as needed?

If you want to go through the engineering effort, sure, it might make sense to have separate int and double encodings. You have to check for the possibility of integer overflow and convert to double in that case. In particular, this is useful if you want to use Javascript's bitwise operators. See https://developer.mozilla.org/en-US/docs/SpiderMonkey/Intern... for a description of how SpiderMonkey does it. Lua just has one representation for numbers, double normally but it could be float or long depending on how you compile it. There would be no reason to have single-precision floating point or big-num representations.

-----

joshAg 610 days ago | link

If you mean arithmetic on floating point numbers that only store integers, then no, equivalence is not ok, because there are examples where bit equivalence fails. One example is repeatedly adding 1.0 to a float something like 1 trillion times versus multiplying 1000000.0 by 1000000.0.

What do you mean by

    algorithms that are more reliably written not to contain any empty intervals are two examples.

-----

shrughes 610 days ago | link

Obviously I'm referring to integer operations within a 53-bit or 24-bit range, not outside the ability of floating point representations to represent integers!

> What do you mean by

I mean exactly that sort of thing. Algorithms that, for example, divide a line into intervals, where empty intervals are not needed or desired, and are generally risky with respect to the probability of having a correct implementation. An example of this would be computing the area of a union of rectangles. You might make a segment tree -- and avoid having degenerate rectangles in your segment tree.

-----

adamzochowski 611 days ago | link

This looks very much like NULLs in the SQL world.

A null (and NaN) is like an unknown. One can't compare unknowns because they are exactly that, unknown.

Let's construct a language in which division by zero returns unknowns.

   a =  5 / 0;
   b = 10 / 0;

Now, both a and b are set to unknown state. If one were to compare a to b, should the expectation be that they hold same value?

I wish all languages had nullability like SQL does, where great care has to be given to dealing with nullable data, lest nulls nullify everything.

-----

malkia 611 days ago | link

In some languages, a and b would be +Infinity. Off the top of my head I can't remember whether +Infinity != +Infinity (I'd have to write a test and see). For NaNs it definitely is.

-----

tikhonj 611 days ago | link

One of those languages is JavaScript, so--assuming your browser supports JavaScript--you do have somewhere to test it :P.

The answer is that 5 / 0 is Infinity and (5 / 0) == (5 / 0), so it's all good :). Now, 0 / 0 is NaN and (0 / 0) != (0 / 0).

-----

Evbn 611 days ago | link

Would you prefer NaN be another type, like Maybe? And have to do all math in a monad or with chronically repeated case analysis? It's necessary complexity, whichever way you do it.

-----

tikhonj 611 days ago | link

Oh, I don't really have a good solution in mind--it's just annoying.

I guess one option would be to just declare that all NaNs are equal--I'm pretty sure that's how bottoms work in Haskell, and it seems that NaN is essentially a floating-point version of bottom.

-----

justincormack 611 days ago | link

There are however multiple bit patterns for NaN which complicates that.

-----

tikhonj 611 days ago | link

Yes. I was actually thinking about this. However, you have a similar problem with the current model: two NaNs can have the same bit pattern, but still have to be unequal.

This is somewhat simpler--as long as either argument to (==) is NaN the answer is False. However, you still have to figure out that at least one bit pattern corresponds to NaN. If you want them to be equal, you would have to figure out that both are NaN. This is certainly a little more difficult, but I think it isn't much worse than the current scenario.

-----

justincormack 611 days ago | link

I think though you currently just subtract them and test equality to zero, so it is all done by the hardware.

-----

Evbn 610 days ago | link

If you try to compare something to bottom (undefined), your program will crash. Try it.

-----

malkia 611 days ago | link

There are much worse things than NaNs.

They are called denormals. These appear when dealing at the same time with lots of big numbers (very far away from 0) and lots of small numbers (close to 0) in the same operations.

In such cases the FPU (or whatever deals with fp numbers) switches to a format that can be very inefficient, producing operations an order of magnitude slower.

For example when dealing with IIR filters in audio, your audio buffer might contain them. One of the solutions is to have a white noise buffer somewhere (or a couple of numbers) that are not denormalized and add it in - it will magically normalize things again.

I'm not a guy dealing with "numerical stability" (usually that's physics, audio or other simulation engine programmers), but I know this from simple experience.

-----

zurn 610 days ago | link

Denormals are part of IEEE fp. If your implementation is too slow, you can often trade correctness for speed by turning them off in the C/C++ runtimes.

They're also a sign you're skirting on the limits of FP precision (or worse) so a bit of numerical analysis might still be a good idea...

-----

malkia 610 days ago | link

You cannot simply turn them off everywhere. On certain platforms they are always produced, and nothing can be done but to deal with them openly (by expecting them to happen).

-----

saurik 611 days ago | link

The things this author likes about NaN are also properties of NULL in many environments (that NULL cannot be compared to NULL, that operating on NULL returns NULL, etc.); so while you might not see many languages default initializing things to NaN, you do see them default initializing things to NULL with similar effect.

-----

duaneb 611 days ago | link

Except this is actually worse, since there are many possible values which evaluate to NaN.

EDIT: I do not know how D implements NaNs; they may have magic to make them more sane to work with.

-----

WalterBright 611 days ago | link

D does not implement NaNs, it just relies on the IEEE FP hardware to do it.

What D does do is expose NaNs so the programmer can rely on their existence and use them in a straightforward manner.

-----

klodolph 611 days ago | link

I've gotten a bit pissed at the Microsoft C compiler for (1) having no standard way to generate NaN or Infinity and (2) having a good enough static analyzer that if you generate one by casting, it emits a warning saying that your arithmetic overflows.

Gee, thanks MSC. I didn't expect "x = INFINITY;" to overflow.

-----

malkia 611 days ago | link

0/0 should be NaN, 1/0 should be +Infinity, -1/0 should be -Infinity. (I haven't tried that in a while).

Also check the flags, like /fp:precise for MSVC

-----

roryokane 611 days ago | link

An alternative workaround to writing `float f = 0` in languages without NaN:

    float f;
    bool thingIsFoo = condition1; // store the result…
    if (thingIsFoo)
        f = 7;
    // ... code ...
    if (thingIsFoo && condition2) // and explicitly depend on it later
        ++f;

But this causes an extra `&&` to be computed at runtime, so it seems NaNs are still better for this case.

-----

mieubrisse 611 days ago | link

You've written quite the interesting and informative article, and your logic as for why you initialize to NaN was perfectly clear.

-----

tzs 611 days ago | link

URL for people without bionic eyes: http://www.drdobbs.com/cpp/nans-just-dont-get-no-respect/240...

-----

voyou 611 days ago | link

Stop trying to make D happen. It's not going to happen.

-----

malkia 611 days ago | link

Be afraid of QNaN the Barbarian!

-----



