
The End of Numeric Error: An interview with John L. Gustafson - ozdave
http://ubiquity.acm.org/article.cfm?id=2913029
======
muizelaar
Here's a python port of Gustafson's Mathematica unum implementation that I
did:

[https://github.com/jrmuizel/pyunum](https://github.com/jrmuizel/pyunum)

~~~
jepler
Hi, thanks for sharing your implementation. Am I totally misusing (py)unum?

Using pyunum, repeatedly rotate the point (1,0) by 45 degrees. Print out the X
coordinate every 8 rotations. Mathematically, the value printed each time
should be exactly 1. Unfortunately, after just 79 iterations, the unum covers
the range (-27352064, 27352064)! This includes the correct result but is
useless.

A similar program using Python floats (i.e., platform doubles) prints
0.9999999999999937 at step 79 and runs in 1/540 of the time on my oldish core2
notebook. (or 0.9999999999371962 at step 799999 in less than 1/10 of the time
of the 79 unum steps)

    
    
        from unum import *

        a = divideu(x2u(1), sqrtu(x2u(2)))

        x, y = x2u(1), x2u(0)
        for i in range(80):
            x, y = timesu(minusu(x, y), a), timesu(plusu(x, y), a)
            if i % 8 == 7:
                print(i, view(x))

~~~
AlexCoventry
Yes, you're misusing the concept, and pyunum makes it hard to use in this
situation (probably because the author had a different purpose in mind). Try
re-running (in a fresh Python invocation, if you're in a shell) with this at
the top of your script:

    
    
      import unum_config
      unum_config.e = 3
      unum_config.f = 9
    

This gives 2^3=8 bits for the exponent, and 2^9=512 bits for the fraction. The
default values for e and f are 3 (8 bits) and 4 (16 bits) respectively.

The idea with unums is that your program sees the unacceptable bounds, and
recomputes with higher precision.
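That detect-and-retry loop doesn't have to be manual. Setting pyunum aside, here's a minimal sketch of the same pattern using only Python's standard decimal module, with directed rounding standing in for unum bounds (`bounds_of_third` and `refine` are names I made up for illustration):

```python
from decimal import Decimal, Context, ROUND_FLOOR, ROUND_CEILING

def bounds_of_third(prec):
    """Rigorous lower/upper bounds on 1/3 at `prec` significant digits."""
    lo = Context(prec=prec, rounding=ROUND_FLOOR).divide(Decimal(1), Decimal(3))
    hi = Context(prec=prec, rounding=ROUND_CEILING).divide(Decimal(1), Decimal(3))
    return lo, hi

def refine(target_width):
    """Recompute at doubled precision until the enclosure is tight enough."""
    prec = 4
    while True:
        lo, hi = bounds_of_third(prec)
        if hi - lo <= target_width:
            return prec, lo, hi
        prec *= 2  # bounds unacceptable: retry with more digits

prec, lo, hi = refine(Decimal("1e-20"))
```

Starting from 4 digits, this doubles the precision until the enclosure is at most 1e-20 wide — the escalation a unum environment would let the program drive automatically.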

Comparing speeds is very unfair: one implementation is hardware that has
received maybe $100B worth of optimization over decades; the other is one
guy's Python script.

In this implementation, 9 seems to be the maximum value for f: if I set it to
10, unum fails to import because it tries to divide a float by too large an
integer. And 4 seems to be the maximum practical value for e. At e=5,
initialization of unum takes over two minutes and 250 MB of memory, probably
because it's computing the largest possible value in that floating-point
scheme.

~~~
muizelaar
Yes. My intention with pyunum was to match the Mathematica code as closely as
was reasonable, so that it's possible to follow the code samples in the book
without requiring Mathematica. The result is unpythonic and very slow, but it
should be usable for experimenting with the semantics of unums.

It would certainly be nice to have a more Pythonic port.

------
aaibec
Did anyone find peer-reviewed papers on the topic (e.g. in an IEEE journal
given that it challenges the IEEE floating point standard)?

I think the book is nice, but popularizing "unum"s with a book targeting non-
experts feels like skipping an important step in due process, given the grand
claims that are being made.

~~~
jlgustafson
Plenty of experts refereed the book, believe me. William Kahan, David Bailey,
Gordon Bell, Horst Simon, Ulrich Kulisch, John Gunnels, and anonymous
reviewers as well. It was vetted for months before it was released. If you
prefer a more formal treatment, Ulrich Kulisch has written up the unum
concepts in a paper more palatable to mathematicians.

Incidentally, I submitted a short version to the ARITH23 conference detailing
how IEEE 754 could be repaired to give bit-identical results on different
systems. They had it reviewed by three reviewers: one liked it, while the
other two, who served on the IEEE 754 committee, trashed it for some pretty
silly reasons, like using the word "subnormal" instead of "denormal".

There is no easy way to boil the ocean, but I'm trying.

------
ori_b
From what I understand, this format uses a variable number of bits,
automatically growing and shrinking as needed.

How can this be implemented efficiently? If there's a cap on the maximum
number of bits (e.g., 64), why not just use that maximum number all the time,
and not have to worry about shuffling things around (bitwise!) in memory?
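For concreteness, here's the unum 1.0 layout as I understand it from the book — sign | exponent | fraction | ubit | esize-1 | fsize-1 — sketched in Python (a toy packer, not a reference implementation; treat the field order and widths as my reading):

```python
def pack_unum(sign, exp, frac, ubit, es, fs, ess=3, fss=4):
    """Pack unum 1.0 fields into one integer.
    Layout (MSB..LSB): sign | exponent (es bits) | fraction (fs bits)
                       | ubit | es-1 (ess bits) | fs-1 (fss bits).
    es and fs are the actual field widths, stored in the utag as es-1/fs-1."""
    word = sign
    word = (word << es) | exp
    word = (word << fs) | frac
    word = (word << 1) | ubit
    word = (word << ess) | (es - 1)
    word = (word << fss) | (fs - 1)
    return word

def unpack_unum(word, ess=3, fss=4):
    """Recover the fields; the utag at the low end says how wide the rest is."""
    fs = (word & ((1 << fss) - 1)) + 1; word >>= fss
    es = (word & ((1 << ess) - 1)) + 1; word >>= ess
    ubit = word & 1; word >>= 1
    frac = word & ((1 << fs) - 1); word >>= fs
    exp = word & ((1 << es) - 1); word >>= es
    sign = word & 1
    return sign, exp, frac, ubit, es, fs
```

The total width is 1 + es + fs + 1 + ess + fss bits, so it changes whenever es or fs does — which is exactly the "shuffling things around" concern: either hardware unpacks these into fixed-size registers internally, or the memory layout gets complicated.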

~~~
dnautics
If it's supported in the hardware, you could turn off those gates altogether
and save energy.

------
conistonwater
Isn't this yet another implementation of interval arithmetic? What am I
missing? Calling his book "The End of Error" seems a bit much.

~~~
Someone
No, it makes a distinction between exact and inexact numbers, and it uses
arbitrary precision.

I don't see it gaining wide acceptance. If your problem is exact-number
arithmetic, use a bignum library. If it is not, chances are you will meet a
number that isn't representable in this format very, very soon (1/3, 1/5,
sqrt(2), pi, etc.)

The moment you do, you will have to decide how many bits of precision you
want for that inexact number. The problem is that you cannot make that
choice unless you know exactly what you intend to do with the number. I bet
most users will choose a format that their hardware handles natively: IEEE.

There may be edge cases in numerical computing that benefit from this,
especially when problems are ill-conditioned, but even there, I expect people
to convert to conventional IEEE at the boundaries of their APIs. Also, I'm not
sure there is much room in this space below quad-width floats (but I am not an
expert).

~~~
barsonme
Wouldn't most of the issues be solved with the use of bcd/dcd? You get the
speed (compared to arbitrary precision), fixed precision, and exact answers up
to the precision limit?

~~~
bsder
Unfortunately, no.

BCD as "binary coded decimal" is, effectively, useless. Even arbitrary
precision libraries tend to be faster than BCD.

"Decimal floating point", which is what I think you meant, just changes what
isn't representable: instead of 1/5 causing a problem, 1/3 causes a problem.
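That trade is easy to see with Python's standard decimal module (used here purely to illustrate the representability point, not any particular hardware format):

```python
from decimal import Decimal

# Binary floats can't represent 1/5 (i.e., 0.1), hence the classic surprise:
assert 0.1 + 0.2 != 0.3

# Decimal floating point represents 1/5 exactly, so this one is fine...
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

# ...but 1/3 is still inexact at any fixed decimal precision:
third = Decimal(1) / Decimal(3)   # rounded to the context's 28 digits
assert third * 3 != Decimal(1)
```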

The "classic" problem here is matrix operations, specifically matrix
inversion. You can use interval arithmetic on matrix inversion, and you find
that the intervals blow up to almost infinity after just a couple of
operations, even though the center of the interval is actually pretty close to
the solution. Of course, the reverse also holds: while most matrices invert
relatively cleanly, there are always pathological cases that really do need
humongous precision to invert.
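Here's a toy version of that blow-up, mirroring the repeated-45°-rotation example from upthread (plain float endpoints, no outward rounding, so it's an illustration rather than rigorous interval arithmetic):

```python
import math

class Interval:
    """Toy interval: plain float endpoints, no directed rounding."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):
        return Interval(self.lo - o.hi, self.hi - o.lo)
    def __mul__(self, o):
        p = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(p), max(p))
    @property
    def width(self):
        return self.hi - self.lo

c = 1 / math.sqrt(2)
a = Interval(c, c * (1 + 2 ** -52))   # ~1 ulp of uncertainty in cos(45 deg)

x, y = Interval(1.0, 1.0), Interval(0.0, 0.0)
for i in range(200):                   # 200 rotations = 25 full turns
    x, y = (x - y) * a, (x + y) * a

# x should be exactly 1, and the interval does still contain it -- but
# its width has grown by roughly sqrt(2) per step, so the bound is useless.
```

The true value never leaves [-1, 1], but the dependency problem — x and y are treated as independent at every step — inflates the enclosure by about sqrt(2) per rotation.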

PS: I don't know why you were getting downvoted, as you asked a legitimate
question.

~~~
barsonme
Thank you for the answer! It was very insightful. Also, no big deal about the
downvotes. Can't please everybody.

------
ris
Call me when there's a hardware implementation.

~~~
trsohmers
<6 months for FPGA, ~12-18 months for silicon...

~~~
r0fls
Is that in the article? If not, how do you know? I would like to write an
implementation for fun in another language, but unums are getting mixed
reviews here, so it would be nice to see your source for it being incorporated
into hardware.

~~~
trsohmers
At the bottom of the article, John mentions REX Computing, of which I am the
founder/CEO. John has been a friend and advisor, and we have played around
with minimal unum implementations, but we haven't been able to commit the
resources to a full one until after we tape out our first chip in a little
over a month (which will be using IEEE floats). We also funded "dnautics" (an
HN user who also commented here) to do a soft implementation of unum 1.0 in
Julia.

~~~
oso2k
Has REX explored other alternative floating-point formats? Say, dec64
[0]?

[0] [http://www.dec64.com](http://www.dec64.com)

~~~
r0fls
How is dec64 different from IEEE 754 floating point? It seems IEEE 754
allows for using base 10. What else do they do differently, do you know?

~~~
oso2k
Besides using only a base-10 representation: 1) all numbers are exact, even if
there are multiple representations; 2) because of 1), 0.1 + 0.2 = 0.3, i.e.,
arithmetic follows the same rules you learned in school for four-function
arithmetic; 3) it supports integer math up to 56 bits without loss of
precision. Look around YouTube for Crockford's talks about it (usually they're
JavaScript-related), or check dec64.com
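Here's the layout as I understand it from dec64.com, sketched in Python — value = coefficient × 10^exponent, packed as a 56-bit signed coefficient over an 8-bit signed exponent (treat the details as my reading, not a reference implementation):

```python
from fractions import Fraction

def dec64_pack(coefficient, exponent):
    """Pack a DEC64 word: high 56 bits = signed coefficient,
    low 8 bits = signed exponent."""
    assert -2 ** 55 <= coefficient < 2 ** 55 and -127 <= exponent <= 127
    return ((coefficient & (2 ** 56 - 1)) << 8) | (exponent & 0xFF)

def dec64_value(word):
    """Decode a DEC64 word to an exact Fraction."""
    coefficient = word >> 8
    if coefficient >= 2 ** 55:        # undo two's complement
        coefficient -= 2 ** 56
    exponent = word & 0xFF
    if exponent >= 128:
        exponent -= 256
    return Fraction(coefficient) * Fraction(10) ** exponent

# 0.1 + 0.2 == 0.3 holds exactly, because with a shared exponent the sum
# is plain integer addition of the coefficients: (1 + 2) * 10**-1.
a, b = dec64_value(dec64_pack(1, -1)), dec64_value(dec64_pack(2, -1))
assert a + b == dec64_value(dec64_pack(3, -1)) == Fraction(3, 10)
```

(The all-ones exponent, -128, is reserved for NaN in the real spec and is skipped here.)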

------
robinhoodexe
Not that relevant, but

>Because the Julia language is particularly good at adding new data types
without sacrificing speed, several people jumped on making a port to Julia,
including the MIT founders of that language.

I'm really impressed with the Julia language so far and am considering
implementing some quantum chemistry models for many-body systems (Hartree-Fock
and probably Coupled Cluster if I ever get that far). Mostly because a) Julia
looks fun to write, b) FORTRAN is still the king of speed (which I'll need all
I can get of) but is a pain to write, and c) the alternative to FORTRAN is
C(++), where being OO isn't that much of a bonus.

~~~
santaclaus
> FORTRAN is still king of speed (which I'll need all I can get of) but is a
> pain to write

I'm surprised that, with the restrict keyword, one would see better code gen
from Fortran than C. Isn't the lack of aliasing Fortran's primary advantage?

~~~
neutronicus
Leveraging "restrict" is not a trivial thing to do.

If you're a researcher first and a software developer second (which I suspect
is true of gp), the last thing you want to do is get into the weeds of
inspecting gcc assembly to figure out where you didn't put a "restrict" that
you should've.

------
dang
Url changed from [https://motherboard.vice.com/read/a-new-number-format-for-
co...](https://motherboard.vice.com/read/a-new-number-format-for-computers-
could-nuke-approximation-errors-for-good), which points to this.

------
ambrop7
Why bother? The precision of double is sufficient for the vast majority of
engineering problems. Seldom do you have measurements that are more precise
than the calculations you do with them.

~~~
bsder
"vast majority"? Maybe.

But anything that uses matrices (a.k.a. systems of linear differential
equations) always runs into some pathological cases. Quite often, they aren't
even uncommon.

Systems whose components have vastly different time constants are _ALWAYS_ a
problem. Simulating RF circuits or phase-locked loops is always an issue.

Computational fluid dynamics always has boundary conditions that challenge
infinity/zero and drive the solver crazy.

Something as pedestrian as simulating _basic cotton cloth_ accurately is
problematic (the warp/weft redistributes energy on a much faster time scale
than gravity pulls the cloth down).

A lot of people did a lot of research and put in a lot of programming hours to
make solving these problems relatively easy without understanding the gory
details.

~~~
zevets
But will unums help? No matter how an ill conditioned matrix is represented,
it will still be ill conditioned, and we'll still be iterating, unless I've
missed something in this article.

I'm skeptical that changing how we represent the numbers will help - when the
physics are icky, they're icky.

~~~
tjl
It won't really help. You'll get the same issues with stiff DEs or ill-
conditioned matrices whether you use perfect precision or floating point.
There will be some cases where it helps, but not a lot.

I've implemented a few DE solvers and sparse matrix implementations before. I
think sparse matrices are one case where it would help.

