
Floating Point Visually Explained (2017) - bmease
http://fabiensanglard.net/floating_point_visually_explained/index.html
======
0-_-0
Fabien Sanglard is the same guy who wrote two books doing deep dives into two game
engines, Wolfenstein 3D and Doom. They are great reading and I really recommend
them to the HN crowd (they can be downloaded for free):

[https://fabiensanglard.net/gebbwolf3d/](https://fabiensanglard.net/gebbwolf3d/)

[https://fabiensanglard.net/gebbdoom/](https://fabiensanglard.net/gebbdoom/)

~~~
wernsey
He also recently had a series of articles about the ports of Out of This
World/Another World to various systems, which are well worth reading if
you're into this sort of thing

------
Veserv
For another visual explanation in words, floating point numbers (ignoring
subnormals) are just a piecewise linear approximation of 2^x [1], with one
piece for each integer interval (x = 4 to x = 5, etc.). As an example, draw a
straight line between 2^4 (16) and 2^5 (32). The floating point numbers in
that range are evenly spaced on that line.
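
A quick way to see that even spacing (a C sketch; nextafterf is standard in math.h, link with -lm):

    #include <math.h>
    #include <stdio.h>
    
    int main(void) {
        /* The gap between consecutive floats is constant inside [16, 32): */
        printf("%g\n", nextafterf(16.0f, INFINITY) - 16.0f); /* 2^-19 */
        printf("%g\n", nextafterf(24.0f, INFINITY) - 24.0f); /* 2^-19 */
        /* ...and it doubles once the exponent ticks up to 5: */
        printf("%g\n", nextafterf(32.0f, INFINITY) - 32.0f); /* 2^-18 */
        return 0;
    }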

Another explanation, using the window + offset terminology from the post, is
that the offset is a percentage of the way through the window. So, for a
window of 2^x, the difference between an offset of y and y + 1 is 2^(x-23),
i.e. 1/2^23 of 2^x. Put another way, floating point numbers do not have
absolute error like integers (where each number is within 1 of a representable
value), but relative error (each number is within a relative distance of
1/2^23 of a representable value). Essentially, floating point numbers use
percentage error bars instead of absolute error bars.

Using this model you can even see how to create your own floating point
numbers. Just pick the relative precision you want; for single FP that is
1/2^23 and for double FP 1/2^52, which defines the size of your mantissa
(offset). Then pick the range of x values you want to represent, which is the
range of your exponent (window).
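
A sketch of that window/offset decomposition in C, pulling the fields out of a single-precision float:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    
    int main(void) {
        float f = 3.5f;
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);          /* grab the raw 32 bits */
        uint32_t sign   = bits >> 31;
        uint32_t window = (bits >> 23) & 0xFF;   /* biased exponent */
        uint32_t offset = bits & 0x7FFFFF;       /* 23-bit mantissa */
        /* value = (-1)^sign * (1 + offset/2^23) * 2^(window - 127) */
        printf("sign=%u window=2^%d offset=%u (%.0f%% through the window)\n",
               sign, (int)window - 127, offset, 100.0 * offset / (1 << 23));
        return 0;
    }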

As an aside, subnormal numbers do not respect this principle. They extend the
expressible range for very small numbers by sacrificing relative precision for
those numbers. In the very worst case of the smallest subnormal number you can
get 25% error (it might actually be 50%). As might be imagined, this plays
havoc with error propagation: if you ever multiply by a number that just so
happens to be the smallest subnormal, your multiplies might suddenly be off by
25% instead of the normal 100 * 2^(-23)%, which is about 2,000,000 times the
relative error and quite a bit harder to compensate for. This is why many
people consider subnormals to be a blemish.

[1] The approximation is actually offset in the x direction for the bias. If
you want to be more accurate, you are actually graphing 2^(x - 127).
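
For the curious, the gap at the very bottom of the subnormal range is easy to inspect (a C sketch; FLT_TRUE_MIN is C11):

    #include <float.h>
    #include <math.h>
    #include <stdio.h>
    
    int main(void) {
        float m = FLT_TRUE_MIN;   /* smallest subnormal, 2^-149 */
        printf("%g\n", (double)m);                    /* ~1.4013e-45 */
        /* The next float up is exactly 2*m, so the gap equals the value
           itself; that is what drives the huge relative error above. */
        printf("%g\n", (double)nextafterf(m, 1.0f));  /* ~2.8026e-45 */
        return 0;
    }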

~~~
titzer
On the other hand, if you don't have subnormals, then you have the funny
property that subtracting two unequal numbers can yield 0. This never happens
with subnormals, because they allow representing numbers closer to zero, below
the smallest representable exponent. The numerical stability of some
algorithms crucially relies on subnormals.

~~~
Veserv
I do not understand your statement. I can come up with two possible
interpretations, but neither seems true to me. Can you provide an example of
what you mean?

I am also interested in your statement that certain algorithms depend on
subnormals as defined in the floating point spec. Can you provide an example
of such an algorithm? I can intuit how it might be desirable to have a single
"too small/epsilon" value, but I do not see offhand how you can leverage the
full range of subnormals in any reasonably generic way that is not extremely
dependent on the specific problem and scale (i.e. if you multiply all inputs
by 2^30 and increase the maximum exponent by 30, do you get the same output
multiplied by 2^30?), so I would like to see how it is done.

In terms of my two possible interpretations, the first is that subtracting two
unequal floating point numbers can yield 0. I am pretty sure this is not the
case. It may yield no change, but I am pretty sure it cannot yield 0. The
other is that subtracting two unequal arbitrary precision numbers represented
as floating point numbers can yield 0. This is true, but it is a known
limitation of emulating arbitrary precision arithmetic with limited precision
and must always be accounted for. If this is what you meant, then we can just
choose numbers too small to be expressed by subnormals to make the problem
occur again, so all subnormals do is handle a few more cases at the cost of
complexity and non-uniformity. If you did not mean either of these two
interpretations, can you explain what you meant, preferably with a concrete
example?

~~~
nneonneo
If x and y are finite floats (not NaN or ±Infinity) with x != y, then x - y
!= 0. This holds for every pair of floats only because of denormal numbers. If
there were no denormals (or denormals were flushed to zero), then whenever
|x - y| < 2^-126 the subtraction would yield zero.
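
A concrete illustration (a C sketch; FLT_MIN is the smallest normal float, 2^-126):

    #include <float.h>
    #include <math.h>
    #include <stdio.h>
    
    int main(void) {
        float x = nextafterf(FLT_MIN, 1.0f);  /* one ulp above FLT_MIN */
        float y = FLT_MIN;
        float d = x - y;                      /* a denormal: 2^-149 */
        printf("d = %g, nonzero: %d\n", (double)d, d != 0.0f);
        /* Under flush-to-zero, d would be exactly 0 despite x != y. */
        return 0;
    }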

------
commandlinefan
Wow, that's a much easier way to convert from decimal to floating point than
any I had seen before. He doesn't mention why biased notation is used (i.e.
why the exponent is stored as 127+E): it's so that if you sort positive floats
as if they were integers, they still end up in the right order.
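
A sketch of what that buys you, in C: the bit patterns of positive floats compare the same way the floats themselves do.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    
    static uint32_t bits_of(float f) {
        uint32_t b;
        memcpy(&b, &f, sizeof b);   /* reinterpret the float's bits */
        return b;
    }
    
    int main(void) {
        float a = 1.5f, b = 2.25f;                 /* a < b as floats... */
        printf("%d\n", bits_of(a) < bits_of(b));   /* ...and as ints: 1 */
        return 0;
    }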

~~~
bogomipz
>"He doesn't mention why biased notation is used (i.e. why the exponent is
stored as 127+E): it's used so that if you sort positive numbers as if they
were integers, they'll still end up in the right order."

Could you elaborate on this? Maybe an example? This sounds interesting but I'm
failing to grasp it.

~~~
Someone
[https://wiki.c2.com/?IeeeSevenFiftyFour](https://wiki.c2.com/?IeeeSevenFiftyFour):

 _“IEEE 754 […] Has the interesting and useful property that two's complement
comparisons of the underlying bit pattern of any two IEEE 754 numbers will
have the same result as comparing the numbers that are represented”_

That means that, if you interpret the bits of a float/double as an
int32/int64, increase that integer by one, and then interpret the bits of the
result as a float/double, you get the smallest float/double that’s larger than
what you started with (with exceptions for NaN, infinities, +/- zero, and,
possibly, some categories I forget).

That can be useful if you want to iterate over all floats between 0.0 and 1.0,
for example, but may not be that efficient on some modern hardware, where
moving data from the “float side” of the CPU to the “integer side” is
expensive.
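
In C, the increment trick looks roughly like this (a sketch; memcpy type punning avoids strict-aliasing trouble):

    #include <math.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    
    int main(void) {
        float f = 0.5f;
        uint32_t b;
        memcpy(&b, &f, sizeof b);
        b += 1;                    /* next bit pattern */
        memcpy(&f, &b, sizeof f);
        printf("%.9g\n", f);                        /* next float up */
        printf("%.9g\n", nextafterf(0.5f, 1.0f));   /* agrees */
        return 0;
    }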

~~~
saagarjha
This has more to do with the mantissa than the biased exponent.

~~~
Someone
No, it has to do with both equally.

Changing a 0 bit to a 1 in an integer increases its value, and by the rule I
gave, should also increase the value of the floating point number with the
same bit pattern. It doesn’t matter whether that flipped bit is in the
mantissa or the exponent.

That requires the use of the biased number in the exponent. In particular, the
“all zeroes” bit pattern for the exponent must be the representation for the
lowest possible exponent, and the “all ones” one that for the highest. It
cannot be the ‘normal’ two’s complement representation of -1.

~~~
saagarjha
This is a different claim than the one above, which was about incrementing the
least significant bit.

~~~
Someone
But it trivially follows from it. If operation _foo_ (increasing _x_ by one)
makes something larger, repeated application of _foo_ (increasing _x_ by 2^n)
makes it larger, too.

------
SupriseAnxiety
This right here is the only way I pretty much figured out just now a more
precise understanding of the weird patterns going on in my head. I crave
precision and this explained more than a whole year covering Basic C. Thank
you so much! My head doesn’t have to hurt over that anymore. I never
understood floats either but this representation truly helps clear some fog!!

------
GolDDranks
Another visualization that blew my mind; I learnt this from a presentation
about doing floats on FPGAs:

Think of a 32-bit float as a 256-bit buffer interpreted as a fixed-point
number, with the binary point right in the middle, but with the limitation
that only a contiguous 24-bit window of that buffer can be non-zero.

Then the 8 bits of the exponent determine where in the bit buffer that window
sits, and the 23 (+1) bits of the mantissa are the contents of the window.

------
saagarjha
Related, [https://float.exposed/](https://float.exposed/) is a great resource:
both when trying to see how floating point is laid out, and when having to
convert between the bit representation and the number for "actual work" ;)

~~~
bkanber
That is great. It also helps remind you that for very large numbers, the gap
between adjacent representable values is in the hundreds of thousands.

------
augustt
Didn't everyone learn scientific notation in high school? It's pretty much
exactly that, and you could put the coefficient/exponent into whatever bit
pattern you'd like.

~~~
GuB-42
It is scientific notation with a twist. The twist is the implicit "1." in the
mantissa.

There is no easy way to make that work in decimal. So you need to learn how
fractional parts work in binary, then move up to scientific notation in
binary. Then you notice that all numbers start with 1, so you can knock it off
to save space. Oh, and add a special case for zero, because it is the only
number that doesn't start with 1, and, if you are feeling adventurous, there
are subnormal numbers.

That's why I find the explanation in the article absolutely brilliant. By
treating the exponent as a window, not only do you not need to bother with
scientific notation in binary, but zero and the subnormal numbers fit in very
nicely.

In fact, I've worked with floating point numbers for years and feel stupid for
not realizing they work just as described in the article. And feeling stupid
after reading something is usually a very good sign. I clicked expecting an
article I could dismiss in typical Hacker News fashion, but no, not this time.

~~~
anaphor
It's an implicit 1 or 0 (because it's binary floating point). The implicit 0
is for subnormal numbers; that's the part people usually don't explain when
first introducing it. It's literally {1,0}.xxxxxx where each x is also a 1 or
0, i.e. a binary floating point number in scientific notation. I've seen a lot
of explanations that kind of gloss over that part (not saying you don't
understand it, just that even the article doesn't make it clear enough IMO).
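
A decoding sketch (hypothetical helper name) that makes the implicit leading digit explicit:

    #include <math.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    
    /* Decode a finite float by hand. ex == 255 (inf/NaN) not handled. */
    static double decode(float f) {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);
        int sign      = bits >> 31;
        int ex        = (bits >> 23) & 0xFF;
        uint32_t frac = bits & 0x7FFFFF;
        int lead = (ex == 0) ? 0 : 1;          /* 0.xxx for subnormals */
        int e    = (ex == 0) ? -126 : ex - 127;
        double sig = lead + frac / 8388608.0;  /* frac / 2^23 */
        return (sign ? -1.0 : 1.0) * ldexp(sig, e);
    }
    
    int main(void) {
        printf("%g %g\n", decode(6.25f), decode(1e-44f)); /* 6.25 9.80909e-45 */
        return 0;
    }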

~~~
piadodjanho
In practice, subnormals are very rarely used, and a single subnormal operation
can take over a hundred cycles to complete. Most compilers disable subnormals
when compiling with anything other than -O0.

Demo:

    #include <stdio.h>
    
    int
    main ()
    {
        volatile float v;
        float acc = 0;
        float den = 1.40129846432e-45;
    
        for (size_t i; i < (1ul<<33); i++) {
            acc += den;
        }
    
        v = acc;
    
        return 0;
    }

With -O1:

    $ gcc float.c -o float -O1 && time ./float
    ./float  8.93s user 0.00s system 99% cpu 8.933 total

With -O0:

    $ gcc float.c -o float -O0 && time ./float
    ./float  20.60s user 0.00s system 99% cpu 20.610 total

~~~
piadodjanho
I looked at the asm generated from my original example: the two options
produce very different code, and gcc applies other optimizations when
compiling with -O1.

I've been fighting the compiler to generate a minimal working example of the
subnormals, but didn't have any success.

Some things need to be taken into account (from the top of my head):

- Rounding: you don't want to get stuck on the same number.

- The FPU has accumulator registers that are wider than the floating point
registers.

- Using more registers than the architecture has is not trivial because of
register renaming and code reordering; the CPU might optimize in a way that
the data never leaves those registers.

Trying to make an MWE, I found this code:

    
    
        #include <stdio.h>
        
        int
        main ()
        {
            double x = 5e-324;
            double acc = x;
        
            for (size_t i; i < (1ul<<46); i++) {
                    acc += x;
        
            }
        
            printf ("%e\n", acc);
        
            return 0;
        }
    
    

Runs in a fraction of a second with -O0:

    
    
        gcc double.c -o double -O0
    

But takes forever (killed after 5 minutes) with -O1:

    
    
        gcc double.c -o double -O1
    

I'm using gcc (Arch Linux 9.3.0-1) 9.3.0 on an i7-8700.

I also managed to create code that would sometimes run in 1s and other times
take 30s; it didn't matter if I recompiled.

Floating point is hard.

~~~
sigjuice
Shouldn't i be initialized?

~~~
piadodjanho
lol, how did the compiler not warn about the uninitialized variable?

~~~
sigjuice
Compiler warnings are not straightforward and depend on many things: compiler
version, optimization settings, warning settings.

    
    
      $ gcc-9 -Wuninitialized -O0 float.c  # NO WARNINGS!!!
    
      $ gcc-9 -O1 float.c  # NO WARNINGS!!!
      
      $ gcc-9 -Wuninitialized -O1 float.c
      float.c: In function 'main':
      float.c:9:5: warning: 'i' is used uninitialized in this function [-Wuninitialized]
          9 |     for (size_t i; i < (1ul << 33); i++) {
            |     ^~~

~~~
piadodjanho
Weirdly, these two pieces of code behave differently:

    
    
        $ gcc hn.c -O0 -o hn
    
        for (size_t i; i < (1ul<<46); i++) {
            printf ("%zd\n",i);
            acc += x;
            break;
        }
    

Without break:

    
    
        $ gcc hn.c -O0 -o hn
    
        for (size_t i; i < (1ul<<46); i++) {
            printf ("%zd\n",i);
            acc += x;
        }

------
zoomablemind
It'd be nice to also mention ulp [0], the unit of least precision. Floating
point is by design an approximation, and the ulp is one of the properties of a
concrete binary implementation of floating point representation.

0: [https://stackoverflow.com/questions/43965347/ulp-unit-of-least-precision](https://stackoverflow.com/questions/43965347/ulp-unit-of-least-precision)
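
For instance, a common way to measure the ulp around a value is nextafter (a C sketch):

    #include <math.h>
    #include <stdio.h>
    
    int main(void) {
        printf("ulp(1.0)  = %g\n", nextafter(1.0, INFINITY) - 1.0);    /* 2^-52 */
        printf("ulp(1e16) = %g\n", nextafter(1e16, INFINITY) - 1e16);  /* 2.0 */
        return 0;
    }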

------
splittingTimes
That's why, when I did numerical simulations of electron dynamics in
semiconductors during my PhD, we never used straight SI units (m, s, kg,
etc.), but instead expressed all physical constants in nm, fs, eV, etc. That
way all relevant constants had numerical values between 1 and 10, which
stabilized the simulations a lot.

~~~
swimfar
I had never heard of this idea before. Do you have any references for this
technique? Also, depending on what equations you're using, aren't you
constrained to a consistent system of units?

~~~
splittingTimes
Never anything official. It was just how it was done in the group and the
wider scientific community. Some googling pulled this up, if it helps:

[https://books.google.de/books?id=uzJbyD6_3DAC&pg=PA4](https://books.google.de/books?id=uzJbyD6_3DAC&pg=PA4)

And yes, you have to be consistent and stay in that unit system. For the
remaining two units, we used plain Kelvin for temperature, and charge was
measured in units of the electron charge. That's it.

~~~
Scaevolus
That shouldn't matter for the factors you're describing. The number of floats
between 1 and 2 is the same as the number of floats between 2^-31 and 2^-30.
Until you hit denormal numbers or start overflowing or underflowing your
exponent, the scale doesn't have any effect on precision.

If you have a process that converts floats into other formats with more
restricted exponents, like human-readable strings, it might matter.

------
bchociej
Is all of that really easier to understand than exponential notation? It's a
great tool for visualizing floating point precision, but it's a lot more
circuitous a route to understanding what a floating point number actually
means, IMO.

~~~
bkanber
The equation fleshes out the detail that the visualization misses. I do best
when I have both.

------
seanalltogether
I always found the Wikipedia examples for 16-bit floating point helpful, since
the numbers are smaller. You can really see how the exponent and fraction
affect each other in a very simple way.

[https://en.wikipedia.org/wiki/Half-precision_floating-point_format#Half_precision_examples](https://en.wikipedia.org/wiki/Half-precision_floating-point_format#Half_precision_examples)

~~~
enriquto
8-bit floating point formats are even easier to understand, since you can list
all of their values on a single page, and even visualize the entire addition
and multiplication tables.
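
A sketch that enumerates one such format; this assumes a hypothetical 1-3-4 layout (sign, 3-bit exponent with bias 3, 4-bit mantissa) with IEEE-style subnormals and specials, not any standardized 8-bit format:

    #include <math.h>
    #include <stdio.h>
    
    /* Decode one byte as a hypothetical 1-3-4 minifloat, bias 3. */
    static double mini(unsigned b) {
        int sign = (b >> 7) & 1, ex = (b >> 4) & 7, frac = b & 15;
        double s = sign ? -1.0 : 1.0;
        if (ex == 7) return frac ? NAN : s * INFINITY;    /* specials  */
        if (ex == 0) return s * (frac / 16.0) * exp2(-2); /* subnormal */
        return s * (1.0 + frac / 16.0) * exp2(ex - 3);
    }
    
    int main(void) {
        for (unsigned b = 0; b < 256; b++)
            printf("0x%02X = %g\n", b, mini(b));  /* the whole format */
        return 0;
    }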

~~~
piadodjanho
Also a great format for counting how many representations are wasted on
redundant encodings (zeros, NaNs, +inf, -inf).

~~~
enriquto
What is redundant about that? It "wastes" a completely negligible part of the
representation space, and the consistency gains are enormous.

~~~
cesarb
When talking about tiny 8-bit floats, it does waste a lot: if your exponent is
only 3 bits, you've "wasted" 1/8 of all 256 possible values, which is a lot.
With normal-sized floats, it's much less of an issue: 1/256 of the billions of
possible 32-bit values, and 1/2048 of all possible 64-bit values.

(Also, the real "waste" is only on the multiple NaN values, since the zeros
always "waste" only a single value for the "negative zero", and the infinities
always "waste" only two values; AFAIK, both negative zero and the infinities
are necessary for stability of some calculations.)

------
rwallace
Excellent article! Just one thing surprised me:

> While I was writing a book about Wolfenstein 3D[1], I wanted to vividly
> demonstrate how much of a handicap it was to work without floating point

I would've expected fixed point to work fine for games, because that's a
domain where you know the data, and in particular the dynamic range, in
advance, so the 'automatically adjust to whatever dynamic range happens to be
in the data' feature of floating point isn't needed. What am I missing? (If
the answer is 'it would take too long to explain here, but he does actually
explain it in the book', I'm prepared to accept that.)

------
dylan604
"(−1)S∗1.M∗2(E−127) How everybody hates floating point to be explained to
them."

Now I know I'm weird, as that formula makes sense to me.

~~~
dntbnmpls
The formula makes sense, but it doesn't give you the full picture of floating
point. The formula doesn't explain NaN, positive infinity, negative infinity,
or even the bias. I think seeing the layout of a floating point number and its
constituent parts in memory is more instructive than a formula.

Floating point, like two's complement, is a hardware derived/limited creation.
You have to understand the underlying hardware and binary limitations to
understand why these formats exist and why they were created the way they
were. The same thing applies to ASCII: '0' is 48, 'A' is 65 and 'a' is 97. The
mapping seems arbitrary until you look at the bit patterns and realize how
clever the design is and why some of the symbols got the mappings they did.

------
jonplackett
Whenever I read something about how computers _really_ work (as in not just a
nice easy to comprehend programming language) I realise just how much smarter
some people in the world are than me.

~~~
bigdict
With everything like that you read, you get closer to them!

------
adrianmonk
Here's how I like to think of it.

 _Floating point numbers are just fractions_ but with one extra condition: the
denominator is a power of two.

The normal rules of fractions apply. If you want to add them, you have to make
sure the denominators match, which would involve scaling the numerators too.

Just like fractions, there are multiple ways of writing the same value. 3/2 is
the same as 6/4.

You can't write 1/3 exactly because, hey look at your denominator, it's 3.
Which isn't a power of 2, is it? So that can't be a floating point value.

~~~
cambalache
> Just like fractions, there are multiple ways of writing the same value. 3/2
> is the same as 6/4.

This is not correct. In binary you only have two choices, {0, 1}, for each
digit, so 3/2 is represented as 1.1 and 6/4 can only be represented as 1.1
(feel free to add trailing 0s)

~~~
adrianmonk
The numerator is a binary integer, not a bit (and not the part of the number
before the "decimal" point).

3 in binary is 11. 6 in binary is 110.

3/2 (decimal) is 11/10 (binary) and 6/4 (decimal) is 110/100 (binary).

Of course the denominators always have trailing zeros, so they can be (and
are) represented more compactly.

------
zozbot234
I believe that this "window and offset" intuition, while indeed true and
useful in the radix-2 ("binary") case, does not cleanly extend to the general
case where no hidden bit is used even for non-"subnormal" numbers, and some
numbers may thus have multiple representations. This shows up perhaps most
clearly in the case of decimal floating point, but ISTR that a non-2 radix was
also used in some mainframes.

~~~
saagarjha
Subnormal numbers are a blemish on the otherwise beautiful IEEE 754
specification :(

------
biddlesby
I saw a very simple visualisation of floating point that simply plotted the
representable points along the number line and zoomed out. That gets the idea
across very quickly!

~~~
wenc
Yes, I came across something like that in my numerical computation course. It
was, for me, the most compelling visualization of what a floating point number
is.

The floating point number line is actually a discrete number line (unevenly
spaced over the reals, uniformly spaced in binary) used to approximate the
continuum of real numbers in finite precision, e.g.

Reals

0 __________________________________________________ 1

Floats

0 .. .... .. ... .. ... ... ... .. ... ... ... .. .... ... ... ... . ... ....
... .. . ... 1

~~~
saagarjha
A more accurate representation might be

    
    
      0.... . . . .  .  .  .  .    .    .    .    1
    

(Hacker News eats consecutive spaces, it seems like :/)

~~~
wenc
Correct about the spacing. A more accurate reproduction here:

[https://www.volkerschatz.com/science/float.html](https://www.volkerschatz.com/science/float.html)

------
cmrdporcupine
This is a bit odd:

"Trivia: People who really wanted a hardware floating point unit in 1991 could
buy one. The only people who could possibly want one back then would have been
scientists (as per Intel understanding of the market). They were marketed as
"Math CoProcessor". "

The 486DX (1989) was already common in '91/'92 and came with a floating point
unit. I had a 50 MHz 486DX, and I was not by any means wealthy. The FP unit
was certainly used by lots of software, things like Excel especially, and C
compilers certainly produced code for it if you had one.

Likewise, the 68040 (1990) had onboard FP. The Macintosh Quadra, Amiga 4000,
and various NeXT models had it.

Yes, if you bought a 386 you often had to get a floating point co-processor as
an upgrade, but it wasn't _that_ uncommon. Same with the 68k series; I knew
people with Atari MegaSTEs (68000) who bought an FP co-processor. They weren't
astronomers :-)

This feels like recent history, am I really that old?

------
jscheel
I remember wanting a math co-processor so I could use autocad at home as a
kid.

~~~
fabiensanglard
And now you could probably buy an expensive Tesla V workstation dedicated to
that but you don't have time anymore. Wait. No. That is me :P !

------
baby
Interesting, because I think about it the other way: the window tells you
between which powers of two you are:

[2^-126, 2^-125], ..., [1, 2], [2, 2^2], [2^2, 2^3], ..., [2^127, 2^128]

and the offset tells you where you are within that window. You have 23 bits to
figure out where you are in there, so it's obvious that you have much more
precision in the window [1, 2] than in the window [2^127, 2^128].
------
jiveturkey
[https://opencores.org/projects/fpuvhdl](https://opencores.org/projects/fpuvhdl)

------
boomlinde
I wonder why the significand is represented as a number greater than or equal
to 1. It complicates representing zero (making it a special case, not covered
by the article, where exponent=0, mantissa=0 is defined to mean zero). Is it
for the purpose of simplifying arithmetic operations, or is it to minimize the
number of redundant representations of zero?

~~~
rational_indian
The leading "1." prefix can be omitted from the binary representation, thus
saving a bit in the process.

~~~
boomlinde
That doesn't answer my question, though. I'm asking why the significand is in
the range 1-2 instead of 0-1.

------
dang
A thread from 2019:
[https://news.ycombinator.com/item?id=19084773](https://news.ycombinator.com/item?id=19084773)

Discussed at the time:
[https://news.ycombinator.com/item?id=15359574](https://news.ycombinator.com/item?id=15359574)

(Reposts are ok after a year or so.)

------
jimmaswell
I remember wondering at first how you know this covers every number without
having multiple representations: it's because the mantissa can't reach 2, and
if it were 2 it would be the same as adding one to the exponent, so you get
the full range between any two exponents.

~~~
piadodjanho
Ackchyually...

IEEE-754 has a lot of redundant representations, just not where you would
expect them.

Caveat: some of these features are invaluable for certain niche applications,
but not for the average joe.

To start, every IEEE-754 float has two zero representations: one for positive
zero and another for negative zero.

The special numbers are another source of redundancy. In the double format,
about 9,007,199,254,740,992 different combinations encode three states that
production-ready software shouldn't reach: NaN, +inf and -inf.

Beyond the redundancy, doubles have many rarely used combinations, for
instance the subnormal representations. Unless you are compiling your program
with -O0 or with some exotic compiler, they are disabled by default, and one
subnormal operation can take over a hundred cycles to complete. That's roughly
another 9,007,199,254,740,992 wasted combinations.

If that wasn't bad enough, since the magnitude of the numbers follows a normal
distribution (someone whose name I forgot's law), the most significant bits of
the exponent field are very rarely used. The IEEE-754 encoding is suboptimal.

The posit floating point format addresses all those issues. It uses a tapered
encoding for the exponent field.

~~~
thechao
I'm mixed on Gustafson's posit stuff. For me, the only things I'd change in fp
would be:

1. -0 now encodes NaN.

2. +inf/-inf are all Fs with sign: 0x7FFFFFFF, 0xFFFFFFFF.

3. 0 is the only denorm.

Which does four _good_ things:

1. Gets rid of the utter insanity which is -0.

2. Gets rid of all the redundant NaNs.

3. Makes INF "look like" INF.

4. Gets rid of "hard" mixed denorm/norm math.

And one seriously bad thing:

1. Loses a bunch of underflow values in the denorm range.

However, as to the latter: who the fuck cares! Getting down to that range
using anything other than divide-by-two completely trashes the error rate
_anyways_, so why bother?

The rest of Gustafson's stuff always sounds like crazy-people talk, to me.

~~~
piadodjanho
He also proposes the use of an opaque accumulator register, the quire, in
contrast to the transparent float registers (which are a mess; each compiler
does what it thinks is best).

When working with numbers that exceed the posit representation, you use the
quire to accumulate. At the end of the computation you convert back to posit
to store in memory, or store the quire itself in memory.

In C, it would look something like this (posit32_t and quire_t are
illustrative type names, not a standardized API):

    posit32_t a, b;
    quire_t q;
    
    q = a;      // load posit into quire
    
    q = q + b;  // accumulate in quire
    
    a = q;      // load quire into posit

> The rest of Gustafson's stuff always sounds like crazy-people talk, to me.

I've read all his papers on posit and I agree. But I do believe the idea of
encoding the exponent with a Golomb-Rice code is actually very good and suits
most users. The normalization hardware (used in the subtraction operation) can
easily be repurposed to decode and shift the exponent.

But the quire logic (fixed point arithmetic) might use more area than a larger
floating point unit. Maybe it pays off in power usage, though.

~~~
thechao
I come from GPU-land, and a quire always brings a chuckle from the fp HW folk.
They like the rest of Gustafson’s stuff, though.

~~~
piadodjanho
Yeah. Gustafson is like: memory is cheap, just use a few hundred kB to store
the quire. To be fair, he is not a HW guy.

------
jimbob45
This guy's Quake 1/2/3 source code deep dives are well worth checking out as
well.

------
barbs
Awesome. Maybe now I can finally understand the fast inverse square root hack.

[https://en.wikipedia.org/wiki/Fast_inverse_square_root](https://en.wikipedia.org/wiki/Fast_inverse_square_root)
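
For reference, here is the famous routine from that Wikipedia page, with the
original's long-and-pointer-cast type punning rewritten portably (the 1999
code assumed 32-bit longs); the magic constant works precisely because the
exponent sits in the top bits, as the article explains:

    #include <stdint.h>
    #include <string.h>
    
    float Q_rsqrt(float number)
    {
        float x2 = number * 0.5F, y = number;
        uint32_t i;
        memcpy(&i, &y, sizeof i);        /* the float's bits as an integer */
        i = 0x5f3759df - (i >> 1);       /* halve/negate the exponent, roughly */
        memcpy(&y, &i, sizeof y);
        y = y * (1.5F - (x2 * y * y));   /* one Newton-Raphson iteration */
        return y;
    }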

------
voz_
This is beautifully written. Bravo to the author.

