
The secret life of NaN - zbentley
https://anniecherkaev.com/the-secret-life-of-nan
======
scarmig
Javascript doesn't have the same richness of NaN as other language specs,
FWIW. [0]

> the 9007199254740990 (that is, 2^53-2) distinct “Not-a-Number” values of the
> IEEE Standard are represented in ECMAScript as a single special NaN value.
> (Note that the NaN value is produced by the program expression NaN.) In some
> implementations, external code might be able to detect a difference between
> various Not-a-Number values, but such behaviour is implementation-dependent;
> to ECMAScript code, all NaN values are indistinguishable from each other.

> The bit pattern that might be observed in an ArrayBuffer (see 24.1) or a
> SharedArrayBuffer (see 24.2) after a Number value has been stored into it is
> not necessarily the same as the internal representation of that Number value
> used by the ECMAScript implementation.

[0] https://www.ecma-international.org/ecma-262/8.0/index.html#sec-ecmascript-language-types-number-type
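
For what it's worth, the figure in that quote can be checked by hand. A quick sketch in Python (my own illustration, not from the spec), assuming the usual binary64 layout of 1 sign bit, 11 exponent bits and 52 mantissa bits:

```python
# Count the binary64 bit patterns that encode a NaN.
MANTISSA_BITS = 52

# A NaN has every exponent bit set and a non-zero mantissa;
# an all-zero mantissa with that exponent is an infinity instead.
nonzero_mantissas = 2**MANTISSA_BITS - 1

# The sign bit is free, so each mantissa pattern appears twice.
nan_patterns = 2 * nonzero_mantissas

print(nan_patterns)  # 9007199254740990
```

That is 2^53-2, matching the number the spec gives.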

~~~
stcredzero
Could this be used to hide data in Javascript programs?

~~~
mikekchar
As per TFA, some implementations use NaN-boxing to represent other values (all
values are encoded into NaN). In those implementations, you will have pretty
strange behaviour. V8 doesn't use NaN-boxing so it will work there.

Of course NaN-boxing is the reason that the ECMA spec specifies that all NaNs
are treated equally. If you are encoding all of your values in NaN, you still
have to have one NaN value that is _actually_ NaN. The spec doesn't specify
which one that must be.

BTW, I highly recommend reading TFA and the links within, because it's really
fascinating (the "tagged-pointers" used in V8 and other languages like Guile
are also really interesting).

------
kodablah
Careful, some engines deal with NaN bits differently. E.g. the JVM doesn't
guarantee non-NaN bits won't move (I had trouble with this at one point [0]).
Specs like WebAssembly (and their conformance tests) expect the backends to
preserve the bits during some reinterpret ops [1], IIRC.

0 - https://stackoverflow.com/questions/43129365/javas-math-rint-not-behaving-as-expected-when-using-nan
1 - http://webassembly.org/docs/rationale/#nan-bit-pattern-nondeterminism

------
rdtsc
> the double-precision format has 52 bits for the mantissa, which means there
> are 51 bits available for the payload.

I have seen data passed through the NaN payload in C in a signal processing
application. It was both vile and genius at the same time. Vile because of how
hackish it was, genius because it avoided a larger redesign of the
application.

~~~
hermitdev
First thought I had when reading about this is that the unused bits would be
perfect for potentially reporting a cause of the NaN. A bit hackish, sure, but
potentially useful, especially in C where you don't have exceptions that can
carry human readable explanations of what went wrong to arrive at a NaN.
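
A rough sketch of what that could look like, in Python rather than C for brevity. The helper names and the cause codes are made up for illustration. It assumes binary64 doubles and a platform where copying a quiet NaN through `struct` leaves the payload bits alone (true on the common CPython builds I know of, but not something IEEE 754 promises for arithmetic on the result):

```python
import math
import struct

QUIET_NAN = 0x7FF8000000000000   # exponent all ones, quiet bit set
PAYLOAD_MASK = (1 << 51) - 1     # the 51 mantissa bits below the quiet bit

# Hypothetical cause codes, invented for this example.
DIVIDE_BY_ZERO = 1
BAD_SENSOR = 2

def nan_with_payload(payload: int) -> float:
    """Build a quiet NaN whose low 51 bits carry `payload`."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload must fit in 51 bits")
    bits = QUIET_NAN | payload
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def payload_of(value: float) -> int:
    """Read the low 51 mantissa bits back out of a double."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return bits & PAYLOAD_MASK

x = nan_with_payload(BAD_SENSOR)
print(math.isnan(x), payload_of(x))
```

Whether the payload survives once the NaN goes through arithmetic depends on the hardware and the language runtime, so this is only safe for values that are stored and passed along untouched.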

------
laythea
Incidentally, I found a cool way of checking whether NaN occurred:

Just do:

    void some_function(float number) {
        if (number != number) { /* number was passed in as NaN */ }
    }

Took me a little bit to get my head around it but if the number is not equal
to itself, then a problem occurred. I used this to stop camera code
(driven by floats) from crashing when receiving nonsense input coordinates.
It still doesn't work properly, but it doesn't crash now either! (at least not
for that :)

~~~
dvh
I thought JavaScript has an isNaN function

~~~
jcoffland
That's not JavaScript.

~~~
ranit
JavaScript does have an isNaN() function:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/isNaN

~~~
idbehold
Better to use Number.isNaN(), since the global isNaN() coerces its argument to
a number first (so isNaN("foo") is true):
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/isNaN

------
cryptonector
The x86_64 address-space size is currently 49 bits. There are two 48-bit
halves. It could get bigger eventually. So NaN-coding (or NaN boxing, or
whatever you want to call it) is risky.

I remember Solaris had a problem because it would put anonymous mmap()s in one
48-bit range and the heap and stacks in the other range, and this broke some
ECMAScript implementation that used NaN-coding with the assumption that all
data would be in the top 48-bit address space.

~~~
jwilk
No, there are two 47-bit halves. (But see my other comment about 57-bit
address space.)
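
To make the two halves concrete, here is a small sketch (my own, with a hypothetical `is_canonical` helper) of the canonical-address rule: with 48-bit virtual addresses, bits 63 down to 47 must all be equal, which leaves a 47-bit lower half and a 47-bit upper half.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    top = addr >> (va_bits - 1)                 # bits 63..47 when va_bits == 48
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits when va_bits == 48
    return top == 0 or top == all_ones

LOWER_HALF_MAX = (1 << 47) - 1            # 0x00007FFFFFFFFFFF
UPPER_HALF_MIN = (1 << 64) - (1 << 47)    # 0xFFFF800000000000

print(is_canonical(LOWER_HALF_MAX))       # True
print(is_canonical(LOWER_HALF_MAX + 1))   # False
print(is_canonical(UPPER_HALF_MIN))       # True
```

A NaN-boxing scheme that only stores 47 or 48 address bits has to assume where in this range its pointers land, which is the risk discussed above. Passing `va_bits=57` models the larger address space.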

~~~
cryptonector
Ay, sry.

------
rwmj
This is timely. There's some problem/difference between RISC-V and other IEEE
implementations around propagating NaNs. It causes issues with numpy and R.

https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/O7GOQmq80Dc

------
wolfgke
"for instance, the C standard defines and requires the SIGFPE signal to
represent floating point computational exceptions.".

The C standard does not define and/or require this. What you link and refer to
is the POSIX standard.

~~~
keithwinstein
Not quite -- the C standard does define SIGFPE, and sort of requires it. C99
says "C does require that SIGFPE be the signal corresponding to arithmetic
exceptions, if there is any signal raised for them." See clause 7.14 ("Signal
handling") and H.3.1.2 ("Traps").

------
zeveb
> Double precision NaNs come with a payload of 51 bits which can be used for
> whatever you want — one especially fun hack is using the payload to
> represent all other non-floating point values and their types at runtime in
> dynamically typed languages.

That is _genius_! It'd be a little weird using a system where the most
positive fixnum is 51 bits, but it wouldn't be _terrible_ — and of course
that's why bignums exist. And 50 or 49 bits are certainly more than large
enough for realistic RAM sizes for quite a while; the latter is roughly half a
petabyte.

It seems from the article that it's a pretty common technique; I'm surprised
that I've not heard of it before.
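
For anyone else who hadn't seen it, a toy sketch of the idea in Python. The layout here (3 tag bits above a 48-bit payload, and the particular tag values) is an assumption picked for illustration, not what any specific engine does:

```python
import struct

QNAN = 0x7FF8000000000000       # exponent all ones plus the quiet bit
TAG_SHIFT = 48
PAYLOAD_MASK = (1 << 48) - 1
TAG_INT, TAG_POINTER = 1, 2     # hypothetical tags; tag 0 means "a real NaN"

def box_float(x: float) -> int:
    """Store a double as its own bits, collapsing every NaN to one pattern."""
    if x != x:
        return QNAN             # the single NaN that is actually NaN
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def box_tagged(tag: int, payload: int) -> int:
    """Hide a tagged 48-bit value inside the payload of a quiet NaN."""
    if not (1 <= tag <= 7 and 0 <= payload <= PAYLOAD_MASK):
        raise ValueError("tag must be 1..7 and payload must fit in 48 bits")
    return QNAN | (tag << TAG_SHIFT) | payload

def unbox(word: int):
    """Return (tag, payload) for boxed values, ("float", x) otherwise."""
    tag = (word >> TAG_SHIFT) & 0x7
    if (word & QNAN) == QNAN and tag != 0:
        return tag, word & PAYLOAD_MASK
    return "float", struct.unpack("<d", struct.pack("<Q", word))[0]

print(unbox(box_float(1.5)))            # ('float', 1.5)
print(unbox(box_tagged(TAG_INT, 42)))   # (1, 42)
```

Every value is one 64-bit word, and ordinary doubles need no decoding at all, which is why floats stay cheap under this scheme.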

------
nixpulvis
NaN is one of the reasons IEEE floating point numbers don't have a fully
defined order.

~~~
hermitdev
Sort of like ANSI NULL in databases. They're purposefully not supposed to
equate (or not equate) to anything, even themselves.

------
undecidabot
A little "fun" fact about NaN: they are never equal to each other, even if
it's the exact same value and variable.

How can you check if a variable is NaN then? Well, if it's not equal to itself
then it must be NaN!
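
A minimal sketch of that check in Python (the `is_nan` name is just for illustration; the standard library already has `math.isnan`):

```python
import math

def is_nan(x: float) -> bool:
    # NaN is the only float value that compares unequal to itself.
    return x != x

nan = float("nan")
print(nan == nan)        # False
print(is_nan(nan))       # True
print(is_nan(1.0))       # False
print(math.isnan(nan))   # True
```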

------
jwilk
> 48 bit pointers (the current x86-64 pointer bit-width)

For completeness: Linux supports 57-bit address space on x86-64:
[https://lwn.net/Articles/117749/](https://lwn.net/Articles/117749/)

I don't know if there's any real hardware that supports this, though.

~~~
jwilk
Oops, wrong link. I meant this:
[https://lwn.net/Articles/717293/](https://lwn.net/Articles/717293/)

------
baybal2
My first encounter with Javascript: Not a Number is a Number.

In my youth, I spent days wondering how in the world numbers in animation
scripts are not "NaN" strings the moment something causes zero division, but
are still numbers called NaN.

~~~
scrupulusalbion
Can arithmetic be applied to NaN a la i? (e.g. 5 + 2 * NaN = 5 + 2NaN)

~~~
mort96
NaN is sort of viral. Any math operator (afaik) returns NaN if any of the
operands is NaN. That means `5 + (2 * NaN)` becomes `5 + NaN`, which just
becomes `NaN`.

Still, `5 + 2 * NaN == 5 + 2 * NaN` would return false, because it becomes
`NaN == NaN`, and NaN is not equal to NaN.
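
The same thing as a quick Python sketch, since Python floats are IEEE doubles too. The last line is one exception to the "viral" rule that I'm sure of in Python, where `1.0 ** x` is defined to be 1.0 for every x:

```python
nan = float("nan")

result = 5 + 2 * nan
print(result)                        # nan
print(5 + 2 * nan == 5 + 2 * nan)    # False

# Not every operation propagates NaN: 1.0 to any power is 1.0.
print(1.0 ** nan)                    # 1.0
```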

~~~
logicallee
what about 0 * instead of 2 * ? On one hand by the viral property it would
seem to also be unequal? On the other hand having zero NaN is something every
equation already has :D it seems fair to silently remove a 0 * NaN term.

~~~
Retra
"0 * NaN" makes as much sense as "0 * Blue". It's NaN because you're trying to
do something that is fundamentally not multiplication.

~~~
logicallee
thanks - makes sense and is a clear explanation.

------
skybrian
Go is sort of the opposite of this. If you do something like:

func sum(a, b float64) interface{} { return a+b }

It will allocate memory for the float, because an interface type always
contains two pointers. [1]

It's pretty crazy that the second word in a Go interface has to be a pointer.
But I suppose if dynamic types were used more, they would be optimized more.

[1] https://www.darkcoding.net/software/go-the-price-of-interface/

~~~
zaarn
I think it's more about not having exceptions. If the second word is always a
pointer then you can make code that blindly assumes such and optimize for
that.

If it may not be a pointer you have to branch, which can add some overhead
to the lucky path (every pointer deref will have to wait for type comparison
in machine code)

If the float has just been written, the data _should_ be in cache so the
performance penalty is probably not too bad in most cases.

~~~
skybrian
Well, the problem is when you have a []interface{} that contains a lot of
float64 mixed with other things. If the size of the slice is large, I don't
think you can assume it's all cached.

~~~
zaarn
If the slice is large, it's less likely to be cached either way.

~~~
skybrian
True, but there is also locality of reference (having everything in a
contiguous block uses less cache than pointer-chasing) and garbage collection
pressure.

In principle, using NaN boxing or a similar technique to store non-floats, you
should be able to store floats in an []interface{} as efficiently as in a
[]float64. Scripting languages do this, but not Go.

------
al2o3cr
Another similar NaN-boxing approach, available as an option in mruby:

[https://github.com/mruby/mruby/blob/d6cb4f9cf2027eb20f67238a...](https://github.com/mruby/mruby/blob/d6cb4f9cf2027eb20f67238aa6c051448602e7e6/include/mruby/boxing_nan.h#L31)

