
Why is NaN not equal to NaN? - DanielRibeiro
http://stackoverflow.com/questions/1565164/what-is-the-rationale-for-all-comparisons-returning-false-for-ieee754-nan-values/1573715#1573715
======
benjamincburns
The comments on this point to another question of which this is an exact
duplicate. Interested parties should definitely read the top answer there, as
it was written by a member of the IEEE-754 committee.

[http://stackoverflow.com/a/1573715/203705](http://stackoverflow.com/a/1573715/203705)

~~~
stephencanon
Hey, that’s my answer! /wave

Strangely, I was just adding some more material to this answer as I’ve never
really been completely satisfied with it; then I flip over to hacker news and
there it is. Anyway, if anyone has other IEEE-754 questions, I’ll try to
answer.

~~~
haberman
Wow, I never would have guessed that a significant reason was historical, so
that x != x could provide a robust way to detect NaN without requiring a
special API. Is there a similar test for infinity?
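
A minimal sketch of both tests in Python, using only comparison and subtraction. The helper names `is_nan` and `is_inf` are made up for illustration; the infinity test relies on `inf - inf` being NaN while `x - x` is `0.0` for any finite `x`.

```python
def is_nan(x: float) -> bool:
    # NaN is the only float value that compares unequal to itself.
    return x != x


def is_inf(x: float) -> bool:
    # For +inf or -inf, x - x is NaN; for finite x it is 0.0.
    # NaN itself is excluded first, because NaN - NaN is also NaN.
    return not is_nan(x) and is_nan(x - x)


print(is_nan(float("nan")), is_nan(1.0))
print(is_inf(float("inf")), is_inf(-float("inf")), is_inf(1.0))
```

In practice `math.isnan` and `math.isinf` do the same job; the point here is only that no special API is strictly required.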

~~~
jules
That reasoning is incorrect. If they had chosen NaN == NaN, then the test for
NaN could be simply x == NaN...

The other reason mentioned is also unconvincing. Surely preserving the law
x==x is more important than the law x==y <=> x-y==0.

Then we have:

> Regarding your comment "that doesn't mean that the correct answer is false",
> this is wrong. The predicate (y < x) asks whether y is less than x. If y is
> NaN, then it is not less than any floating-point value x, so the answer is
> necessarily false.

Clearly this reasoning is incorrect for the simple reason that the same
reasoning can lead to the opposite conclusion. If y is NaN then we can't say
that y < x is definitely true but neither can we say that y < x is definitely
false.

Here is my argument for why NaN == NaN should be true. The == operator for
floats should rarely be used. If you want to know whether two floats can be
considered the same, then you should always use some other condition like
|x-y| < epsilon, and NOT x == y. The reason is that floating point arithmetic
is inexact, so even a slight roundoff error will make x == y false. So if its
job is _not_ to test whether two floats are the same _numerically_ then what
is the job for the == operator? Its job is to test whether x and y are
precisely the same _float_. The x == y operator should not pretend to work on
the abstraction that x and y are kinda-sorta real numbers, because such an
operator is useless anyway because the criterion for that needs to be
application specific. It should compare floats while admitting that we are
comparing floats which can take on the value +infinity -infinity +0 -0 and
NaN, and we are not comparing real numbers which do not have those values.
That's why NaN == NaN should be true.

~~~
stephencanon
> If they had chosen NaN == NaN, then the test for NaN could be simply x ==
> NaN

Not in languages that didn’t have a name for NaN it wouldn’t.

> you should always use some other condition like |x-y| < epsilon, and NOT x
> == y.

This is a popular myth, but it is false. There are circumstances in which
exact equality comparisons in floating-point are perfectly appropriate.
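
As a rough illustration in Python (the specific values are my own choice, not from the thread): exact comparison fails after rounding error creeps in, but it is perfectly reliable when every operand and result is exactly representable.

```python
import math

# 0.1, 0.2 and 0.3 are not exactly representable in binary, so the
# rounded sum differs from the rounded literal 0.3.
rounded_sum = 0.1 + 0.2
print(rounded_sum == 0.3)              # False
print(math.isclose(rounded_sum, 0.3))  # True, within the default tolerance

# 0.5, 0.25 and 0.75 are exact binary fractions, so == is appropriate.
exact_sum = 0.5 + 0.25
print(exact_sum == 0.75)               # True
```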

> Its job is to test whether x and y are precisely the same float.

No; if it were we would have +0 != -0. You’re looking for totalOrder(x,y) or a
related predicate.
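
To make the distinction concrete, here is a small Python sketch comparing `==` with a bit-for-bit comparison. `same_float` is a hypothetical helper, and it only covers the "same encoding" idea; it is not an implementation of totalOrder, which also defines an ordering.

```python
import struct


def bits(x: float) -> int:
    # Reinterpret the 64-bit IEEE-754 encoding of x as an unsigned integer.
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def same_float(x: float, y: float) -> bool:
    # True only when x and y have identical bit patterns.
    return bits(x) == bits(y)


nan = float("nan")
print(0.0 == -0.0, same_float(0.0, -0.0))  # True False
print(nan == nan, same_float(nan, nan))    # False True
```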

~~~
jules
> Not in languages that didn’t have a name for NaN it wouldn’t.

...then use NaN = 0/0.

> This is a popular myth, but it is false. There are circumstances in which
> exact equality comparisons in floating-point are perfectly appropriate.

Can you give an example where == is appropriate, but where you want NaN == NaN
to be false?

> No; if it were we would have +0 != -0. You’re looking for totalOrder(x,y) or
> a related predicate.

Exactly, +0 == -0 should be false too. totalOrder(x,y) is what == should have
been (or at least the equality part of it).

p.s. An arguably more serious issue with IEEE floats is rounding modes. The
round up/round down modes are tantalizingly close to providing the ability to
do ironclad approximate real arithmetic in the sense that you could give
upper/lower bounds on the exact value. The problem is that the rounding modes
only involve giving an upper/lower bound on the output given an _exact_ input,
rather than an upper/lower bound on the output given an upper/lower bound on
the input. So it's not compositional which means you can't guarantee anything.

~~~
thedufer
> ...then use NaN = 0/0.

Many languages throw an error when you divide by zero. As far as I can tell,
if x != x didn't work you would have to resort to bizarre contortions like `x
== float("NaN")` for such tests in Python (the problems: there are no NaN or
Infinity constants and division by 0 throws).

Also, since you expect == to do bit equality on floats, you will be
distinguishing between the huge number of representations of NaN, so NaN = 0/0
isn't valid even if there is a NaN constant (and which one would it be?).
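
A quick Python check of the behaviour described above. `float_div_raises` is a hypothetical helper name. Note that `math.nan` and `math.inf` constants were only added in Python 3.5.

```python
import math


def float_div_raises(num: float, den: float) -> bool:
    # Report whether Python raises instead of returning an IEEE result.
    try:
        num / den
    except ZeroDivisionError:
        return True
    return False


nan = float("nan")
print(float_div_raises(0.0, 0.0))  # True: no NaN is produced
print(float_div_raises(1.0, 0.0))  # True: no Infinity is produced
print(nan == float("nan"))         # False: == cannot be used to detect NaN
print(nan != nan, math.isnan(nan)) # True True: both tests work
```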

~~~
derefr
X/0.0 is safe, and will always return Infinity or NaN rather than trapping, no
matter the language -- as long as 0.0 is a representation of an IEEE754
floating-point value, and as long as the expression "X/0.0" is interpreted as
coercing X into a floating-point value in order to do fpdiv(X, 0.0). Safely
returning Infinity here (or NaN in the case of 0/0.0) is required of the FPU
in order for it to be considered IEEE754-compliant.

On the other hand, X/0 ( _integer_ division-by-zero, where X is not a
floating-point value either) usually does cause an exception. In fact, in
bare-metal ASM without a trap-handler interrupt set up, it halts the processor
entirely.

~~~
mansr
> On the other hand, X/0 (integer division-by-zero, where X is not a floating-
> point value either) usually does cause an exception.

That's true for x86, but not for most other architectures.

------
karamazov
This is what I would expect from the underlying mathematics. NaN roughly means
the output of a function is undefined, and it does not make sense to say
functions are equal at some value just because they are both undefined for
that value.

For example, it's probably more sensible to say that 2x/x != 3x/x for all
values of x, rather than 2x/x != 3x/x for all x except 0, wherein both
functions are undefined and therefore equal to each other.

~~~
Fishkins
I agree it really doesn't make sense to say two NaNs are definitely equal, but
it doesn't seem necessarily true that they aren't equal, either. Semantically,
I would say NaN == NaN should return undefined (assuming JS here). Of course,
it would be weird for == not to return true or false, so in practice that
probably isn't a good idea. So I agree with you that false is the least bad
option.

~~~
yourad_io
Why wouldn't NaN==NaN return a NaN value, which also has the following
property:

-> Can't be used as a branching conditional.

Think about it: if (NaN) { a; } else { b; }

The pedantically-correct answer would be to execute neither, but that wouldn't
be very helpful (however fun debugging that might have been).

The pragmatically-correct answer, imho, would be to throw an error. Would I
rather have a; or b; executed, when I can't trust the conditional value?
Neither, by the time the NaN has hit branching code, it's time it escalated
into an error.

edit: I may have reinvented "Maybe"
[https://news.ycombinator.com/item?id=7747284](https://news.ycombinator.com/item?id=7747284)

~~~
Fishkins
I thought about that, but I like undefined better. Part of the reason is NaN
is of type number[0], and I think a number makes less sense as a return value
for a conditional statement than a general non-value (e.g. undefined or null).
Also, if someone asked me to evaluate the equality of karamazov's equations at
0, I'd literally say "that's undefined." So the JS undefined value seemed
perfect to me (perfect as in the most elegant, not necessarily the most
practical).

For some reason I hadn't thought about it till you mentioned it, but I do like
the idea of throwing an exception when a nonsensical comparison of NaNs
happens. Of course that would be inimical to the way JS normally operates.

I don't think what you're getting at is exactly Maybe. You're able to easily
programmatically determine whether a Maybe has a value, whereas the equality
of NaNs is basically unknowable.

0 - Talking in JS again, although it's similar for Java and other languages.

~~~
yourad_io
> You can ignore invalid data for as long as you want, but you can also draw
> lines and say "it can't have been ignored if flow gets to here". (from
> comment I linked above)

I was talking about this property of Maybe. I was thinking that assigning a
NaN value to a boolean should probably be "fine" (no exception), but _using
that when branching_ would mean that this carried-forward-error (which the NaN
represents, effectively) is just about to go beyond "infecting" just data, to
affecting code flow. So, throw.
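
A rough sketch of that rule in Python, assuming a hypothetical guard `truth_or_raise` that is called wherever a possibly-NaN value is about to steer control flow. Storing NaN stays legal; only branching on it throws.

```python
def truth_or_raise(value: float) -> bool:
    # Refuse to turn a NaN into a branch decision.
    if value != value:
        raise ValueError("NaN reached a branch condition")
    return bool(value)


carried = float("nan")  # assigning and passing NaN around is fine
try:
    if truth_or_raise(carried):
        print("a")
    else:
        print("b")
except ValueError as err:
    print("neither branch ran:", err)
```

For comparison, plain Python silently treats `bool(float("nan"))` as `True`, so an unguarded `if` would take the first branch.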

------
zenogais
The following is my take:

It seems like the desired result is to make it determinable when a NaN value
is produced in a language that doesn't have an is_nan function (such as an
assembly language lacking such a predicate) so that a simple boolean statement
does the trick, for example:

    def is_nan(val):
        return not (val == val)

This absolutely bizarre statement of non-identity gives us a surefire test. It
is from the perspective of higher level languages that the non-identity of NaN
with itself seems bizarre.

~~~
dllthomas
If NaN == NaN, we'd still have that test, it would just read

    def is_nan(val):
        return val == NaN

It's true that it can't just give NaN or you'd be stuck.

------
marcosdumay
There's the obvious warning that one shall not compare floating point numbers
for equality...

But then, it's indeed quite surprising. It gets worse in pure languages, where
the expressions (0 / 0) and (0 / 0) are equal before evaluation, but different
afterwards. Also, it trashes hash tables.

Yet, NaNs being different is the sanest mathematical definition. Maybe people
should have opted for the ugliest choice because of real world concerns (it
wouldn't be the first time), but this time, they didn't.
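
The hash table problem is easy to reproduce with a Python dict (a small sketch; the key strings are arbitrary). Each separately created NaN becomes its own key, and a fresh NaN can never find any of them.

```python
table = {}
table[float("nan")] = "first"
table[float("nan")] = "second"  # does not overwrite: the keys are unequal

print(len(table))               # 2
print(float("nan") in table)    # False: lookup with a new NaN always misses
```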

------
xenadu02
Some of the links I posted to stack overflow; the top-voted answer there (as
of this writing) was wildly incorrect, claiming it to be a mistake. It is not.

Any invalid calculation produces NaN as opposed to raising an exception/signal
because that would be non-portable (especially in the early 80s). The goal was
that a correct algorithm on one platform or in one language would translate
and be correct somewhere else. Further, some algorithms depend on being able
to probe a function to find its bounds, requiring them to be able to locate
the out of bounds condition without halting execution; doing that in a safe,
portable way is actually quite hard unless you do it in software as part of
the spec, hence NaN.

If NaN==NaN, then (NaN/NaN)==1, etc etc and you still end up in a world of
nonsense as far as naive algebra is concerned. This is just some of my fellow
programmers whining and crying about having to think about floating point, the
same sort of complaining I hear about Unicode (and full of just as much self-
sure awful advice that produces wildly incorrect results for everything except
the complainer's specific use case).

[http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.ht...](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)

[http://www.eecs.berkeley.edu/~wkahan/ieee754status/754story....](http://www.eecs.berkeley.edu/~wkahan/ieee754status/754story.html)

[http://cnx.org/content/m32770/latest/](http://cnx.org/content/m32770/latest/)

~~~
lbarrett
It's not just "having to think about floating-point" -- it's that a number of
other things break. Easy function memoization doesn't work with NaNs because
NaN!=NaN, so the function arguments look different, so the saved memoized
value is never used.
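
Here is a minimal sketch of the memoization failure and one possible workaround, assuming a hand-rolled dict cache. `make_memo`, `nan_safe_key` and the sentinel `_NAN_KEY` are hypothetical names, not part of any library.

```python
import math

_NAN_KEY = object()  # single stand-in key for every NaN argument


def make_memo(func, normalize=lambda x: x):
    # Wrap func with a dict cache keyed on normalize(argument).
    cache = {}

    def wrapper(x):
        key = normalize(x)
        if key not in cache:
            cache[key] = func(x)
        return cache[key]

    wrapper.cache = cache
    return wrapper


def nan_safe_key(x):
    # Collapse all NaNs onto one key; leave other arguments unchanged.
    return _NAN_KEY if isinstance(x, float) and math.isnan(x) else x


naive = make_memo(lambda x: x * 2)
safe = make_memo(lambda x: x * 2, normalize=nan_safe_key)
for _ in range(3):
    naive(float("nan"))
    safe(float("nan"))

print(len(naive.cache), len(safe.cache))  # 3 1
```

The naive cache grows by one entry per call and never hits, which is exactly the leak described above.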

------
shalmanese
NaN == NaN should return NaB, Not a Boolean.

~~~
shawnz
Yes, and if you try and branch based on that value, the CPU will jump to "Not
an Address" and execute the instruction "Not an Opcode" over and over.

------
tannerc
The comments give reason to believe this isn't the cause, but my first
instinct was that NaN cannot be equal to NaN as it isn't clear what the object
represents. As Mike C states in his answer: "Another way to look at it is that
NaN just gives you information about what something ISN'T, not what it is."

But is this an effective way of looking at the question? In reality what
you're trying to do is not compare a NaN to another NaN, you're actually
attempting to check if both objects are NaN. This hurts my brain.

------
malkia
I like the fact that NaN taints other numbers - early failure is better than
trying to hide it.

    
    
      #include <math.h>
      #include <stdio.h>
    
      int main(int argc, const char* argv[] )
      {
      	double nan = 0.0 / 0.0;
      	double x = 1.0;
      	double y = nan + x;
      	printf( "nan=%f\n", nan );
      	printf( "  x=%f\n", x );
      	printf( "  y=%f\n", y );
      	return 0;
      }
      
      /* Should print (gcc):
         nan=nan
           x=1.000000
           y=nan
      */

~~~
dllthomas
I _dislike_ the fact that NaN taints other numbers - for precisely the same
reason.

Much better, in terms of early failure, would be signalling NaN or something
like...

    
    
        ...
    
        maybenan_double maybe_nan = 0.0 / 0.0;
        double x = 1.0;
        if(is_nan(maybe_nan)) {
            fprintf(stderr, "maybe_nan is NaN\n");
            return 1;
        }
    
        double not_nan = from_maybenan(maybe_nan);
    
        double y = not_nan + x;
    
        ...
    
    

Of course, there are other dimensions in which this is _not_ better, but I
don't like silent propagation if it might stretch over a lot of code.
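
For what it's worth, the same idea can be sketched in Python, with the caveat that Python will not enforce it at compile time the way a real type system could. `MaybeNaN` and `unwrap` are hypothetical names mirroring `maybenan_double` and `from_maybenan` above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MaybeNaN:
    """Hypothetical wrapper: a float that has not yet been checked for NaN."""

    raw: float

    def unwrap(self) -> float:
        # The only way to get a plain float out is to pass the NaN check.
        if math.isnan(self.raw):
            raise ValueError("value is NaN")
        return self.raw


maybe_nan = MaybeNaN(float("inf") - float("inf"))  # inf - inf is NaN
x = 1.0
try:
    y = maybe_nan.unwrap() + x
except ValueError as err:
    print("caught before it could propagate:", err)
```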

------
letney
As a C/C++ and graphics programmer, I love NaN. It allows saving space by not
having to carry an extra is_valid boolean for uninitialized values or regions
of data that are masked.

For textures, they are great as they don't interpolate w/ nearby neighbors and
properly discard fragments & vertices without the aforementioned cost of an
extra boolean attribute.

------
quasque
That was a very interesting read. I always thought NaN != NaN was because it's
a class of values (there are many binary representations that are NaN, as it
just requires an all-ones exponent and a mantissa that isn't all-zeroes) but I
guess I was considering it too simplistically.
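
The encoding rule mentioned above can be checked directly. This is a small Python sketch for 64-bit doubles; `classify` is a made-up helper name.

```python
import struct


def classify(x: float) -> str:
    # Split the 64-bit encoding into its 11-bit exponent and 52-bit mantissa.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    if exponent == 0x7FF:
        # All-ones exponent: zero mantissa is infinity, anything else is NaN.
        return "nan" if mantissa else "inf"
    return "finite"


for value in (float("nan"), float("inf"), -float("inf"), 1.0):
    print(value, classify(value))
```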

------
zmmmmm
This might actually explain a very intermittent bug that I spent some hours on
and never solved. A sort that would never terminate, and the best I could come
up with was that for reasons I couldn't understand the comparison function was
returning false when an item was compared to itself ... TIL.
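
A small Python sketch of why NaN upsets comparison-based sorts, plus one possible workaround (the key function is my own assumption, not something from the thread). Python's sort does terminate on such input, but a hand-written sort that assumes a consistent ordering may not.

```python
import math

nan = float("nan")

# Every ordered comparison with NaN is false, so NaN fits nowhere in the order.
print(nan < 1.0, nan > 1.0, nan == 1.0, nan == nan)  # False False False False

values = [3.0, nan, 1.0, 2.0]
# Sort on (is_nan, value) so NaNs are pushed to the end by the first field
# and never take part in a float-to-float comparison with ordinary numbers.
ordered = sorted(values, key=lambda v: (math.isnan(v), v))
print(ordered)  # [1.0, 2.0, 3.0, nan]
```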

------
jamesbrownuhh
On the subject of Not A Number, is it a good time to recall Gary Bernhardt's
excellent "Wat" piece?

[http://www.youtube.com/watch?v=HAw2jU4de2A](http://www.youtube.com/watch?v=HAw2jU4de2A)

------
InclinedPlane
Not a number just tells you what something isn't, not what it is. A NaN could
be a string, or a potato, or an airplane, or the prime minister of Djibouti.
Is a potato equal to an airplane?

------
wglb
A very good post.

Most here don't remember the totally sad state of floating point before this
came to be.

------
personjerry
Before you join this discussion -- read the third answer posted, it is the
"most correct" one.

------
dsugarman
also anyone who works in SQL should know why NULL != NULL and NULL = NULL come
back as ? and never true
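
This is easy to try with SQLite from Python's standard library (other databases may display the unknown result differently). Both comparisons come back as NULL, which Python shows as `None`; only `IS NULL` gives a definite answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    "SELECT NULL = NULL, NULL != NULL, NULL IS NULL"
).fetchone()
conn.close()

print(row)  # (None, None, 1)
```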

------
leeoniya
tldr: cause !red !== !blue

------
tybenz
log(-1) -> exception
1/0 -> exception
Problem solved. WTF is NaN even a thing??

~~~
klodolph
Throwing exceptions for everything isn't a catch-all solution for writing
better software—take a look at the Ariane rocket failure, which was caused by
bounded numeric types throwing exceptions aggressively. Some modern
programming styles tend to avoid enthusiastic non-local returns anyway in
favor of things like monadic composition, such as Maybe in Haskell (which is
more like NaN than it is like an exception).

The nice thing about NaN is that you can just do a calculation as normal, and
check for NaN at the very end. This means you don't have to do an expensive
test/branch after every arithmetic or other numeric operation. The hardware is
much, _much_ easier to design if you don't have to make it branch, and the
code is much faster without the compiler inserting branches. People who work
with floating point numbers every day care very deeply about performance.

If you don't care as much about performance, why not write your code in a
language that does throw exceptions? Python, for example.

Those of us that _do_ use numerics love NaN, love signed zeros, and can live
with NaN ≠ NaN even though it's kind of dumb.
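
A toy illustration of the "check once at the end" style in Python. The `mean` function and the input values are invented for this sketch; the loop contains no NaN test at all.

```python
import math


def mean(values):
    # Plain arithmetic with no per-element validity checks.
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


result = mean([1.0, float("nan"), 3.0])

# A single check after the whole computation is enough, because any NaN
# input has propagated through every addition and the final division.
if math.isnan(result):
    print("invalid input somewhere in the data")
else:
    print(result)
```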

~~~
dllthomas
Maybe is more like NaN than an exception, but the nice thing about it as
opposed to NaN is that you can isolate the particular failure. Sure, I can
check for NaN in my doubles, but my types can't make sure I've checked for it
by point X. Running a huge program to solve a problem, and getting back NaN
with no explanation, is horrendously obnoxious.

~~~
klodolph
Hm, I have the opposite experience. Being able to ignore invalid data every
once in a while trumps having a hundred hours of computation cut short by an
unexpected exception bringing the whole thing down.

~~~
dllthomas
That's why Maybe (without fromJust!) is so nice. You can ignore invalid data
for as long as you want, but you can also draw lines and say "it can't have
been ignored if flow gets to here". I'm certainly not arguing in favor of
exceptions!

Edited to add: That said, I'd rather that 100 hours of computation be cut
short than take longer and be equally useless...

