
Why floats suck - accurate galaxy-wide coordinates in 128 bits - peterhajas
http://www.facepunch.com/content/119-Why-floats-suck.
======
mturk
I am an astrophysicist, and I deal with this issue directly. In my
simulations, I do deeply nested Adaptive Mesh Refinement calculations. These
simulations run inside an Eulerian grid, where each cell contains a fluid
element. These cells are collected into grid patches. When a cell inside a
given grid exceeds some refinement criterion, we refine it by inserting 8
cells; the flagged cells at the next level are then collected, minimum
bounding boxes are created to cover them, and we now compute at an
additional 'level' of refinement.
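
To make the refine step concrete, here is a stripped-down Python sketch
(illustrative only; the real code interpolates the fluid state and builds
grid patches, both of which are omitted here):

    def refine_flagged(cells, threshold):
        """Split each flagged cell into 2x2x2 children at the next level.

        cells: list of dicts with 'origin' (x, y, z), 'width', 'density'.
        """
        children = []
        for cell in cells:
            if cell['density'] <= threshold:
                continue  # not flagged; stays at the current level
            half = cell['width'] / 2.0
            x0, y0, z0 = cell['origin']
            for dx in (0, 1):
                for dy in (0, 1):
                    for dz in (0, 1):
                        children.append({
                            'origin': (x0 + dx * half,
                                       y0 + dy * half,
                                       z0 + dz * half),
                            'width': half,
                            'density': cell['density'],  # placeholder value
                        })
        return children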

My research concerns the formation of the first stars in the universe. In
order to set up the initial conditions of these objects correctly, we have to
resolve fluctuations in the background density of the Universe on the scale of
kiloparsecs. However, we are also mostly concerned about how the fluid behaves
on much smaller scales, at the tens to hundreds to thousands of AU scale, all
the way down to the radius of the sun or even lower.

In one of my "stunt" runs, I ran up to 42 levels of refinement inside a box;
this means that the dynamic range between the coarsest cells and the finest
cells was 2^42. For the most part, however, we only use about 30-33 levels of
refinement. In order to attain this resolution, we utilize extended precision
positioning. (Future versions of the code will utilize integer positioning
with relative offsets for Lagrangian elements like particles.) Inserting this
functionality into the code took quite a bit of debugging and testing, in
particular because we had to account for differences in how Intel, PGI and GNU
passed extended precision values back and forth between Fortran and C.
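
A quick standalone illustration (mine, not the simulation code) of why deep
refinement outruns 32-bit and eventually 64-bit floats: a cell at level L of
a unit box is 2^-L wide, and an offset that small near a coordinate of 1.0
simply vanishes once it drops below the significand:

    import struct

    def as_f32(x):
        """Round a Python float (an IEEE double) to 32-bit precision."""
        return struct.unpack('f', struct.pack('f', x))[0]

    for level in (23, 30, 42, 52):
        cell = 2.0 ** -level
        in_f32 = as_f32(1.0 + cell) != as_f32(1.0)
        in_f64 = (1.0 + cell) != 1.0
        print(f"level {level:2d}: survives float32: {in_f32}, float64: {in_f64}")

    # Level 30 is already invisible in float32 (24-bit significand), and
    # level 52 is the last offset float64 can distinguish from 1.0.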

I have uploaded a very simple 2D zoomin movie of one of these simulations,
which only had 30 levels of refinement, to Vimeo. It's busy converting as I
write this, but it will be found here when it has completed:
<http://vimeo.com/23176123>

~~~
hartror
I chat with PhD students at the Swinburne University Centre for Astrophysics
and Supercomputing from time to time and they have many similar stories to
tell. These are pretty sexy problems for a programmer to deal with, enough
to have me considering abandoning industry and joining the ranks of the
professional student.

~~~
TheEzEzz
As someone in the opposite position to you (academic coder, considering
abandoning academia for industry), I offer up my take: the painful bits of
scientific coding are infinitely more painful and numerous than the painful
bits of non-scientific coding.

I'm kind of a masochist in that I thoroughly enjoy debugging non-scientific
code (I do indie game dev in my off time), but debugging scientific code
makes me want to spoon my eyeballs out one at a time. Debugging normal code
is like a puzzle, with interesting hints as to what the problem is.
Debugging scientific code is like trying to crack a hash by hand.

------
ChuckMcM
This is an interesting article and worth a read. Then head over to Mike
Cowlishaw's rant on decimal [1]. I met Mike at a conference and he's a
brilliant guy; prior to seeing his decimal stuff I was using REXX on an
Amiga (what Tcl might have been), a language he invented while at IBM.

He got interested in decimal math early on, not because 'floats' suck but
because _binary floats_ really suck: that is a number system where 0.1 can't
be accurately represented. How often do you use 0.1 in your calculations? He
has a couple of examples of 34-digit precision, which is representable in a
128-bit quantity (about 3.7 bits per digit). Straight BCD would of course be
32 digits, minus the digits which represent the exponent.
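
Python's decimal module, which follows the General Decimal Arithmetic
specification hosted on Mike's site, makes the contrast easy to demonstrate:

    from decimal import Decimal, getcontext

    # Binary floats cannot represent 0.1 exactly...
    print(0.1 + 0.2 == 0.3)      # False
    print(f"{0.1:.20f}")         # 0.10000000000000000555...

    # ...but decimal arithmetic can.
    print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))  # True

    # decimal128, the 34-digit format mentioned above, is mirrored by
    # setting the context precision:
    getcontext().prec = 34
    print(Decimal(1) / Decimal(3))  # 0.3333333333333333333333333333333333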

I wrote some CAD routines once that used decimal arithmetic rather than
binary and was amazed at how much better it worked in terms of edges lining
up; when you approximate non-square surfaces you get much better
tessellation.

And of course for history buffs, the IBM 1620 used decimal as its native
representation for numbers. Now that we've got lots of transistors to play
with, perhaps it's a good time to revisit the way we represent numbers.

[1] <http://speleotrove.com/decimal/>

~~~
palish
" _I wrote some CAD routines once that used decimal arithmetic rather than
binary and was amazed at how much better it worked in terms of edges lining up
and when you approximate non-square surfaces you get much better tesselation._
"

That's precisely what I'm working on. What did you use?

P.S. Use 0.125, not 0.100

~~~
ChuckMcM
I used the Java decNumber class (linked off Mike's web site) and the Java 2D
API. Once the 3D API came out, it seemed like the visualization code at
least should use that. Converting between decimals and floats was
unnecessarily painful at the time.

------
bhickey
Floats are great, you just need to understand when to use them!

If you want to use them for an application where you're doing frequent
multiplications or divisions, I strongly suggest log transforming your domain
and replacing your multiplies with adds and divisions with subtracts. This
goes a long way to prevent numeric underflow when performing Viterbi parses
without resorting to some godawful slow non-native numeric representation.

Haskell implements this as LogFloat
(<http://hackage.haskell.org/packages/archive/logfloat/0.8.2/doc/html/Data-Number-LogFloat.html>),
and I've cloned it for C++ as LogReal
(<http://www.bhickey.net/2011/03/logreal/> and
<http://github.com/bhickey/logreal>)
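
A self-contained sketch of the trick in plain Python (not the LogFloat or
LogReal API):

    import math

    # Multiplying many small probabilities underflows an IEEE double...
    probs = [0.01] * 200
    product = 1.0
    for p in probs:
        product *= p
    print(product)       # 0.0 -- underflowed; the true value is 1e-400

    # ...but in the log domain, products become sums and stay representable.
    log_product = sum(math.log(p) for p in probs)
    print(log_product)   # about -921.03

    # Viterbi-style comparisons work directly on the logs, since log is
    # monotonic; there is no need to leave log space until the very end.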

~~~
wladimir
Well, his point is that floats suck for coordinates because they have high
precision at the origin, then become less and less precise further out. If
you want to simulate a galaxy (his example) you want the same precision
everywhere, and a uniform, non-floating-point representation is better.
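
The non-uniformity is easy to quantify with Python's math.ulp (3.9+), which
gives the gap to the next representable double at a given magnitude:

    import math

    LY = 9.4607e15  # meters per light year (approximate)

    print(math.ulp(1.0))        # ~2.2e-16 m near the origin
    print(math.ulp(1.0 * LY))   # 2.0 m at one light year out
    print(math.ulp(1e5 * LY))   # 131072.0 m (~131 km) at the galaxy's far edge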

~~~
atakan_gurkan
If you want to simulate a galaxy, the smallest scale that you can resolve is
much much larger than size*2^(-128). Since you need floating point arithmetic
for many other computations, increasing the precision in your coordinates will
cost a lot of extra computing without any gain in accuracy.

When precision in coordinates becomes an issue, you make a transformation
and use relative coordinates anyway.

~~~
jerf
100,000 ly / 2^64 = 51 meters and change. And your simulation is probably
faster with appropriate int math than any float math will be.

Transforming to relative coordinates doesn't solve all the problems; you do
your local calculation with the improved precision, but presumably at some
point you throw that precision away again when you translate back to the
global coords.

~~~
palish
_"And your simulation is probably faster with appropriate int math than any
float math will be."_

This isn't true for modern CPUs, especially if you use SIMD vector operations.

------
jacquesgt
Understanding the precision of floats is a small pain, but fixed-point math
is much more painful.

With fixed point, you have to decide how many fraction bits you're going to
have. If you have more than zero fraction bits, multiplies and divides end
up looking like this (assuming 16 fraction bits):

    result = (a * b + (1 << 15)) >> 16;
    result = (a / b) << 16;        // rounds to zero if a is close to b
    result = ((a << 4) / b) << 12; // not much range or precision in the result

Of course, if your variables can't all be represented with the same number
of fraction bits, you need to keep track of that and adjust your shifts for
each calculation. You also need to do shifts when you add two numbers with
different numbers of fraction bits. This stuff is all taken care of for you
by floating point numbers.
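
For concreteness, here's a toy Q16.16 version of those operations in Python
(a sketch; Python's unbounded ints sidestep the wraparound you get with
fixed-width C types):

    FRAC = 16
    ONE = 1 << FRAC          # 1.0 in Q16.16

    def to_fix(x):
        return int(round(x * ONE))

    def to_float(q):
        return q / ONE

    def fix_mul(a, b):
        # The raw product has 32 fraction bits; round and shift back to 16.
        return (a * b + (1 << (FRAC - 1))) >> FRAC

    def fix_div(a, b):
        # Pre-shift the dividend so the quotient keeps 16 fraction bits.
        return (a << FRAC) // b

    a, b = to_fix(3.5), to_fix(0.25)
    print(to_float(fix_mul(a, b)))   # 0.875
    print(to_float(fix_div(a, b)))   # 14.0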

With fixed point, if you mess up a calculation and it overflows, you get
completely whacky results (wrap around) which will cause massive bugs. With
floating point, you lose a bit or two of accuracy.

There are definitely places where fixed point is needed. If you're on a
microcontroller without an FPU, you'll have to learn to get by without
floating point. If you absolutely can't deal with the loss of precision you
get with big floating-point numbers, you'll have to use fixed point, make
sure you understand the range of parameters that can go into each
calculation, and prevent overflow.

If floats work for you, just use them.

------
JabavuAdams
Better description of how this issue affects video games, here:

[http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BA%...](http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BA%20matter%20of%20precision%5D%5D)

It's funny, I was just thinking about this a couple of weeks ago for a
simulation project.

Google Code search reveals that Celestia uses a 128b (64.64) position class to
solve the problem.

------
beza1e1
As a programmer it's easier to insert a bug with floats than with integers,
because IEEE 754 has various subtle stumbling blocks.

For example, remember to think about NaN, and that x != x if x is NaN. And
did you know that there are actually two kinds of NaN? Or that
1.0/0.0 == +infinity?
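
A few of these, demonstrable from Python (with the caveat that a literal
1.0/0.0 raises ZeroDivisionError in Python, where bare IEEE arithmetic would
return +inf):

    import math

    nan = float('nan')
    print(nan == nan)    # False: NaN compares unequal to everything,
    print(nan != nan)    # True:  including itself

    inf = math.inf
    print(1.0 / inf)     # 0.0
    print(inf - inf)     # nan: not every operation on infinity is defined

    # The "two kinds" are quiet and signaling NaNs; they differ only at
    # the bit level, and Python surfaces just the quiet variety.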

~~~
palish
You say "1.0/0.0 == +infinity" like it should be surprising, but it makes
perfect intuitive sense to me.

You assert that it's "easier to insert a bug with floats", but your first
example (NaN) is one which programmers shouldn't be worrying about. The only
case you'd want to check for NaN is in debug code (an assert).

~~~
Deestan
> You say "1.0/0.0 == +infinity" like it should be surprising, but it makes
> perfect intuitive sense to me.

But it _isn't_ correct. 1/0 is undefined.

1/x approaches infinity as x approaches 0 from the positive side. 1/x
approaches negative infinity as x approaches 0 from the negative side. If we
_were_ to define 1/0 it has to be both positive and negative infinity at the
same time.

~~~
palish
The goal for floats isn't to solve equations. It's to perform arithmetic.
(1.0 / 0.0) has to be handled _somehow_.

From a mathematical point of view, handling it as "+infinity" has a 50/50
chance of being "correct". As floating-point math is inherently an
approximation, this is a reasonable tradeoff.

But from a pragmatic point of view, you don't care about the result of "x *
foo" if x is +infinity or -infinity. The result isn't usable.

~~~
andos
_1.0/0.0 has to be handled somehow._

Yes, by raising an error [1]. Or whistling innocently and pretending it's not
my code while leaving the room.

 _handling it as "+infinity" has a 50/50 chance of being "correct"_

It has _zero_ chance of being correct because infinity can never be the result
of any arithmetic computation. Infinity is not a number. Infinity only appears
as the result of evaluating limits, an operation which is not arithmetic.

[1] <http://en.wikipedia.org/wiki/SIGFPE>

~~~
marshray
<http://en.wikipedia.org/wiki/IEEE_754-1985>

The spec allows the programmer to decide whether the various degenerate
cases in arithmetic will raise a signal or result in a special value which
propagates through the rest of the calculations (INF, NaN, etc.).

Certainly there are times to prefer one over the other, but the facilities
that were standardized back in the 80's are generally more than adequate for
most uses.

Whether or not your platform C compiler lets you access these settings in a
convenient and portable way is another question, however... :-)

~~~
andos
As long as everybody decides _consciously_ whether to propagate garbage or
raise flags, the world is safe.

~~~
marshray
My policy is to prefer to signal or throw an exception. It's not something
that can be accidentally ignored by an unconscious programmer.

------
dustingetz
there's a better reason, per TomF's blog [1], linked by JabavuAdams:

"any time you do a subtract (or an add, implicitly), consider what happens
when the two numbers are very close together, e.g. 1.0000011 - 1.0. The result
of this is going to be roughly 0.0000011 of course, but the "roughly" is going
to be pretty rough. In general you only get about six-and-a-bit decimal digits
of precision from floats (2^23 is 8388608), so the problem is that 1.0000011
isn't very precise - it could be anywhere between 1.0000012 or 1.0000010. So
the result of the subtraction is anywhere between 1.2 _10^-6 and 1.0_ 10^-6.
That's not very impressive precision, having the second digit be wrong! So you
need to refactor your algorithms to fix this.

The most obvious place this happens in games is when you're storing world
coordinates in standard float32s, and two objects get a decent way from the
origin. The first thing you do in rendering is to subtract the camera's
position from each object's position, and then send that all the way down the
rendering pipeline. The rest all works fine, because everything is relative to
the camera, it's that first subtraction that is the problem. For example,
getting only six decimal digits of precision, if you're 10km from the origin
(London is easily over 20km across), you'll only get about 1cm accuracy. Which
doesn't sound that bad in a static screenshot, but as soon as things start
moving, you can easily see this horrible jerkiness and quantisation."

[1]
[http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BA%...](http://home.comcast.net/~tom_forsyth/blog.wiki.html#%5B%5BA%20matter%20of%20precision%5D%5D)
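
The effect is easy to reproduce by rounding through 32-bit floats (my own
numbers, echoing the quote's example):

    import struct

    def as_f32(x):
        """Round an IEEE double to the nearest 32-bit float."""
        return struct.unpack('f', struct.pack('f', x))[0]

    # Two nearby values stored as float32:
    print(as_f32(1.0000011) - as_f32(1.0))   # ~1.0729e-06, not 1.1e-06

    # The same cancellation 10 km from the origin, in meters:
    obj = as_f32(10000.011)   # object position
    cam = as_f32(10000.0)     # camera position
    print(obj - cam)          # 0.0107421875 -- off by ~0.3 mm from 0.011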

------
atakan_gurkan
For various computations, you generally need powers of a given quantity. It
seems to me that when you take large powers for intermediate results, the
precision in fixed-point arithmetic will rapidly drop, no? Furthermore, you
need to put together various quantities with different scaling factors.
IMHO, that would make things very error-prone (or you can again sacrifice
precision).

I am not aware of any physical problem that actually requires 128 bit floats.
I can make up some, but those would be either chaotic (in which case precision
does not help), or I would be telling you about a solution that looks like it
can be improved. There are of course some problems for which we have no
physical insight, then high precision provides a nice safety buffer.

~~~
eru
I agree in general.

Precision does help to some degree in chaotic circumstances. For example, when
forecasting the weather, you know that it's chaotic. But the chaos takes a
while to show itself.

------
smlacy
Floats are for math. Math includes computing things like 100! (100
factorial), the volume of the observable universe (~3.3 * 10^80 cubic
meters), or any other enormous but finite value, most of which are far
greater than 2^128.

Fixed point is a special purpose encoding that is useful in many places
(computer graphics and constrained simulations) when your range and precision
are known and fixed. "Math" doesn't follow these rules.

Floats are good for math.

~~~
JabavuAdams
You have it exactly backwards. 100 factorial cannot be represented exactly
by a float (or double, or long double):

100! = 9332621544394415268169923885626670049071596826438162146859296389521759\
9993229915608941463976156518286253697920827223758251185210916864000000\
000000000000000000

This is about 2^524.765: an exact representation needs 525 bits, far more
than the 24-, 53-, or 64-bit significands behind 32-, 64-, and 80-bit
floats.

Note that "floating point" is commonly used to refer to 32-bit or 64-bit or
80-bit representations.

Floats are for games (with caveats), and possibly engineering, while purer
math needs arbitrary-precision representations that are too slow for games and
many simulations.
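
Python's arbitrary-precision integers make the distinction concrete:

    import math

    exact = math.factorial(100)
    approx = float(exact)         # rounds to the nearest IEEE double

    print(exact.bit_length())     # 525: bits needed to represent it exactly
    print(approx)                 # ~9.33e+157: inside a double's *range*...
    print(int(approx) == exact)   # False: ...but the 53-bit significand
                                  # drops most of the digits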

~~~
brazzy
Floats were _primarily_ designed for engineering, and not just floats: read
the original von Neumann paper. Basically, our dominant computer
architecture was originally designed to optimally support floating point for
numerical calculations.

EDIT: I just looked at the paper again and realize that I was wrong. Von
Neumann actually wanted to keep things simple and thus assumed all numbers
to be normalized to lie between -1 and 1...

------
juiceandjuice
Author has probably never tried to optimize an algorithm on SSE2 or higher.

Doubles are awesome and often used in science. 128 bit numbers aren't quite so
awesome.

[http://download.oracle.com/docs/cd/E19963-01/html/821-1608/e...](http://download.oracle.com/docs/cd/E19963-01/html/821-1608/epmpv.html)

------
thisrod
In the 40s, floating point numbers were the hot new thing in programming.
John von Neumann argued against them, because anyone smart enough to use a
computer could track the exponents in their head. Or so I've heard.

This is the classic trade-off of computer efficiency against human effort.
The typical scientific program is run once, so it leaves as much work as
possible to the machine. The efficient trade-off might differ for games and
for applications like AutoCAD or SPICE.

------
nikcub
Great post. A question which doesn't have much to do with the main point:
don't his calculations result in a plane with 2D co-ordinates rather than
three dimensions?

i.e. with our galaxy, wouldn't you have to include the thickness to get any
co-ord:

(100 000 light years) * (12000 ly) / (2^128) = 3.336e-17

------
guard-of-terra
How can he fit three-dimensional coordinates into a single 128-bit number?
Seeing how he divides diameter, I assume he only took care of one axis.

I'm curious how many bits we need to address a point in the universe with
reasonable accuracy.

~~~
CapitalistCartr
We'd need three axes to define a coördinate in space, four to include time.
Given his estimate of the known Universe as 9.3e10 light years across,
giving each axis 2^96 values would yield an accuracy of about a centimeter,
plenty for space navigation. Multiply that by the number of axes.
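
A quick check of the arithmetic:

    LY = 9.4607e15                # meters per light year (approximate)
    universe = 9.3e10 * LY        # ~8.8e26 m across
    print(universe / 2**96)       # ~0.0111 m: about a centimeter per step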

~~~
jimktrains2
yay! someone else who uses diæresis marks :)

~~~
eru
You should get a subscription to the New Yorker.

------
lutorm
Interesting, but actually it's no longer the case that fixed point math is
faster than floating point. Certainly GPUs are _a lot_ faster at floats than
ints.

~~~
SomeCallMeTim
Tell that to Android developers who are writing simulation apps that need to
work on ARMv6 phones, like the LG Optimus series. No FPU there.

------
ck2
Sorry to be pedantic, but since the universe is constantly expanding, aren't
such coordinates already inaccurate only a short while after they are
created (unless they are highly localized, which likely defeats their
purpose in the first place)?

~~~
temptemptemp13
I see two possible solutions:

1. The coordinates expand with the universe.

2. The coordinates are static and the universe just expands. Note that this
isn't awful; planets and stars are always in motion. When planning
intergalactic missions you're always going to have to know where the
starting point is and calculate where the ending point will be at the
relevant time.

So to say where Earth is, you're going to write down a 4D vector specifying
at what time this was measured.

Btw, the article didn't mention 3D space, did it?

------
ubasu
This displays a fundamental lack of understanding of how floating point works.
In particular, we are concerned about the _accuracy_ of the floating point
representation, which is characterized by the machine precision. Typically,
single-precision has 8-digit accuracy, while double-precision has 16-digit
accuracy.

Furthermore, we don't use 128-bit representations because they are slow for
numerical computation: we need more trips through the smaller-sized bus.

See <http://en.wikipedia.org/wiki/Floating_point>

~~~
JabavuAdams
No, this is a real problem.

Lack of precision of floating point is something that comes up often in game
development, now that levels are larger.

If you have a level that is 1-2km across, and you store all your positions in
floats, then you _will_ notice that animations for instance are quantized.

Near the world origin, everything is fine, but far from the world origin
individual vertices in the character and props will become quantized to the
smallest representable delta. This results in atrocious geometric fizzing and
popping.

~~~
stcredzero
Yeah. I've also tried a galactic simulation game and ran into the limitations
of floats. 32 bit floats only give you a usable resolution of 1 in 100000 for
global coords. Even doubles are too small. This becomes an issue if you want
to zoom from local ship view out to the local planetary body. You end up
having to do goofy things like make a scaled billboard out of the nearby
planets and place them far enough away that nobody notices.

~~~
marshray
I think we can conclude that galactic simulation games are probably a special
case for which 128 bit integer coordinates make sense.

Once you've subtracted out the offset of your spaceship though, everything you
view from there is easier done with float or double.

~~~
stcredzero
Of course, particularly since in most cases one must communicate with the
GPU using 32-bit floats. (I'm going to guess that this is changing, though.)

~~~
marshray
Yes, the recent generation of Nvidia GPUs has added hardware support for
doubles. This was done to be able to sell hardware for use in financial
analysis rather than any use in graphics.

~~~
stcredzero
The Panda3D Python library uses double-precision floats, due to its being a
Python library. Currently, these are converted to 32-bit floats which are
then sent to the GPU.

