
The Surprising Usefulness of Sloppy Arithmetic - solipsist
http://web.mit.edu/newsoffice/2010/fuzzy-logic-0103.html
======
jerf
Despite what the article says, if you're using floating point numbers you're
_already_ using sloppy arithmetic. That's not just a sarcastic point, it's
actually important; given that you're already not being precise it isn't
necessarily surprising that you can trade away even more precision for speed,
rather than it being some sort of binary yea/nay proposition that cracks
numeric algorithms wide open.

"Off by 1% or so" leads me to guess it is implemented by using 8-bit numbers,
and not necessarily with any particular sloppiness added by the chips, just
the fact that the precision is small. Visual and audio processing could be set
up in such a way that you wouldn't overflow those numbers because you know
precisely what's coming in. You'd have to be careful about overflow and
underflow but, per my first paragraph, you _already_ have to be careful about
that. It also makes sense in such apps that silicon would be more profitably
used computing more significant bits more often, rather than dithering over
getting the bits down around 2^-50 exactly right: a good insight. I don't know
if that's what they're doing, because it's hard to reverse-engineer back
through a science journalist, but "8-bit math processors in quantity" ->
"science journalist" -> "this article" is at least plausible.

~~~
modeless
Science journalism is so frustrating! Here's the version for smart people:

<http://web.media.mit.edu/~bates/Summary_files/BatesTalk.pdf>

In summary: Low-precision high-dynamic range arithmetic (floating point with
small mantissa, ~1% error) uses ~100x less power and die area than IEEE
floating point. The errors are acceptable for a huge class of applications
(basically anything you'd consider running on a GPU today).
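The "small mantissa, ~1% error" trade-off is easy to sketch in software. Assuming a 7-bit mantissa (my choice, picked because it bounds relative error by about 2^-7 ≈ 0.8%, close to the quoted 1%):

```python
import math

def truncate_mantissa(x, bits=7):
    """Round x to a float with only `bits` bits of mantissa,
    emulating a low-precision, high-dynamic-range format."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)  # x = m * 2**e, with 0.5 <= |m| < 1
    scale = 2.0 ** bits
    return math.ldexp(round(m * scale) / scale, e)

x = math.pi
y = truncate_mantissa(x, bits=7)
print(y, abs(y - x) / x)  # relative error well under 1%
```

The exponent is kept intact, so the dynamic range is unchanged; only the mantissa (and hence the multiplier hardware) shrinks.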

~~~
alf
I don't quite understand how he gets a 10,000x speedup from a 100x transistor
count decrease. Does die area increase with the square of transistor count?

~~~
pjscott
What he's doing is representing numbers with their logarithms, with limited
precision. A floating-point multiplier/divider, then, turns into a fairly
small adder, which is much smaller and faster. Square roots and squaring turn
into bit shifting. They have some clever method for doing addition/subtraction
efficiently. And since they can fit all this in a small area with short
critical paths, they can clock it very, very fast, and include a lot of them
on a chip.
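A toy sketch of that representation (a logarithmic number system; positive reals only, with an assumed 8 fractional bits for the stored log): multiply and divide become add and subtract of logs, square root becomes a halving (a bit shift in fixed-point hardware), while addition is the awkward case needing a log(1 + 2^-d) evaluation.

```python
import math

class LNS:
    """Toy logarithmic-number-system value: stores log2(x) rounded to
    FRAC_BITS fractional bits. Positive reals only."""
    FRAC_BITS = 8

    def __init__(self, x):
        q = 1 << self.FRAC_BITS
        self.log2 = round(math.log2(x) * q) / q

    @classmethod
    def _from_log(cls, l):
        v = cls.__new__(cls)
        v.log2 = l
        return v

    def __mul__(self, other):
        return self._from_log(self.log2 + other.log2)   # multiply -> add

    def __truediv__(self, other):
        return self._from_log(self.log2 - other.log2)   # divide -> subtract

    def sqrt(self):
        return self._from_log(self.log2 / 2)            # sqrt -> halve/shift

    def __add__(self, other):
        # The hard case: log2(x + y) = hi + log2(1 + 2**(lo - hi)).
        # Hardware would use a small lookup table here.
        hi, lo = max(self.log2, other.log2), min(self.log2, other.log2)
        return self._from_log(hi + math.log2(1 + 2 ** (lo - hi)))

    def value(self):
        return 2 ** self.log2

a, b = LNS(6.0), LNS(3.0)
print((a * b).value(), (a / b).value())  # close to 18.0 and 2.0
print(LNS(9.0).sqrt().value())           # close to 3.0
```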

~~~
modeless
It would be rather interesting to program a machine where division was faster
than addition!

------
vilya
The chip architecture described in the article reminds me of the DEC MasPar
system [1] we had at uni back in the mid-90s: 2048 processors (IIRC), where
each processor could only communicate directly with its 8 neighbours. If you
wanted to get decent performance out of it, you had to think carefully about
how you were going to get your data onto each of the processors.

[1] <http://en.wikipedia.org/wiki/MasPar>
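A sketch of that programming model (my own toy version, assuming toroidal wrap-around, which is one common choice): each cell's update may read only its 8 immediate neighbours, so all data movement is nearest-neighbour.

```python
def step(grid):
    """One synchronous update in which every cell reads only its 8
    immediate neighbours (with toroidal wrap-around), mirroring the
    nearest-neighbour-only communication of a MasPar-style grid."""
    n, m = len(grid), len(grid[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di or dj:
                        total += grid[(i + di) % n][(j + dj) % m]
            out[i][j] = total / 8.0  # average of the 8 neighbours
    return out

# A uniform field is a fixed point of the averaging step.
grid = [[1.0] * 4 for _ in range(4)]
print(step(grid)[0][0])  # 1.0
```

Laying the initial data out so that neighbouring values land on neighbouring processors is exactly the "how do I get my data onto each processor" problem.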

~~~
rlpb
Interesting. This reminds me of cellular automata. Are we headed towards some
kind of hybrid?

------
pjscott
This would be beautiful for protein folding. That particular application is
extremely parallel, numerically heavy, and should tolerate the loss of
precision very well. It also eats up processing power like a black hole, so a
few orders of magnitude speed improvement would definitely be nice.

~~~
cowsandmilk
Based on my experience, different stages of folding should require differing
precision.

For ClusPro (protein docking, <http://dx.doi.org/10.1002/prot.22835>), the
first stage uses rough energy functions for global sampling of the protein
surface. For those functions we use floats, because it is rigid-body sampling
and very tolerant of clashes. However, when it comes to the
minimization/refinement stage, we have seen weird things happen with floats,
and instead use doubles.

Similarly, the functions used in early stages of protein folding can probably
deal with loss of precision, but the stages for producing high quality
structures would not.
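The float-vs-double effect can be illustrated with a toy accumulation (not ClusPro's actual minimizer), emulating single precision by round-tripping through the IEEE binary32 encoding: the single-precision running sum visibly drifts while the double-precision one stays essentially exact.

```python
import struct

def f32(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack('f', struct.pack('f', x))[0]

# Toy stand-in for an iterative refinement loop: accumulate many
# small, identical increments.
increment = 0.1
inc32 = f32(increment)
double_total = 0.0
single_total = f32(0.0)
for _ in range(100_000):
    double_total += increment
    single_total = f32(single_total + inc32)

print(double_total, single_total)  # the single-precision sum drifts
```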

------
sanxiyn
The idea has been there for a long time: take a look at
<http://en.wikipedia.org/wiki/Logarithmic_number_system>

------
jorgem
I just need a sloppy python math library, now.

~~~
tkaemming
That's easy — just delete `decimal.py` from your standard library.

