
Kahan Summation Algorithm - killjoywashere
https://en.wikipedia.org/wiki/Kahan_summation_algorithm
======
nestorD
Worth noting that Kahan summation works under the assumption that, at every
step of the summation, the current sum is larger in magnitude than the term
being added.

It is not too hard to modify the algorithm to remove this assumption, and
doing so improves the accuracy of the algorithm noticeably (though sadly a
test, as suggested on the Wikipedia page, is not the best way to do it).
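
For illustration, here is a minimal sketch (in Rust, since the crate below is
Rust) of the test-based Neumaier / Kahan-Babuška variant, which drops that
assumption by swapping roles whenever the incoming term dominates the running
sum; this is just the textbook formulation, not the crate's API:

    /// Neumaier's variant of Kahan summation: unlike plain Kahan, it also
    /// handles the case where the next term is larger in magnitude than the
    /// running sum. Illustrative sketch only.
    fn neumaier_sum(values: &[f64]) -> f64 {
        let mut sum = 0.0;
        let mut c = 0.0; // running compensation for lost low-order bits
        for &v in values {
            let t = sum + v;
            if sum.abs() >= v.abs() {
                // low-order digits of v are lost in sum + v; recover them
                c += (sum - t) + v;
            } else {
                // low-order digits of sum are lost instead; recover those
                c += (v - t) + sum;
            }
            sum = t;
        }
        sum + c // apply the accumulated compensation once at the end
    }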

Here is a Rust crate (not mine) that offers various implementations of
increasing accuracy, as well as an exact sum (slow, but it returns the
floating-point number closest to the true sum of your inputs):
[https://crates.io/crates/accurate](https://crates.io/crates/accurate)

------
snovv_crash
I've used it twice: once when inverting large matrices, where the built-in dot
product algorithm was inaccurate, and once when working with very small
process noise in Kalman filters, using 32-bit floats due to platform
restrictions, where the process noise could only increase the uncertainty up
to a certain point before underflow set in.

Both times it worked brilliantly, allowing a targeted increase in accuracy on
only the components that needed it and letting me choose the memory /
computational hit to accept, similar to using extra bits in certain
temporaries when working in fixed point.
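
As a rough illustration of the "targeted" part, a compensated dot product in
the spirit of the above might look like this (a sketch, not the actual code
used; only the accumulation is compensated, the per-element products are not):

    /// Kahan-compensated dot product in 32-bit floats (illustrative sketch).
    fn kahan_dot(a: &[f32], b: &[f32]) -> f32 {
        let mut sum = 0.0f32;
        let mut c = 0.0f32; // compensation term
        for (&x, &y) in a.iter().zip(b.iter()) {
            let term = x * y;
            let corrected = term - c;  // feed back what was lost last time
            let t = sum + corrected;   // new, rounded running sum
            c = (t - sum) - corrected; // what just got lost in that addition
            sum = t;
        }
        sum
    }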

------
radford-neal
There's not much point to this algorithm now. It is faster to produce the
correctly-rounded result of the exact sum (at least if you are summing more
than a few terms) than to use Kahan summation to produce a still-not-
entirely-accurate result. The Wikipedia page mentions Python's fsum function,
which does this, but my method at
[https://gitlab.com/radfordneal/xsum](https://gitlab.com/radfordneal/xsum) is
faster.
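
For context, here is a sketch of the Shewchuk-style exact summation that
Python's fsum is based on (not the xsum superaccumulator linked above): keep a
list of partial sums whose exact total always equals the exact sum of the
inputs seen so far, and round only once at the end.

    /// Exact summation via non-overlapping partials (illustrative sketch).
    fn shewchuk_sum(values: &[f64]) -> f64 {
        let mut partials: Vec<f64> = Vec::new();
        for &value in values {
            let mut x = value;
            let mut i = 0;
            for j in 0..partials.len() {
                let (hi, lo) = two_sum(x, partials[j]);
                if lo != 0.0 {
                    partials[i] = lo;
                    i += 1;
                }
                x = hi;
            }
            partials.truncate(i);
            partials.push(x);
        }
        // The only rounding happens here; fsum also corrects a rare
        // half-way case that this sketch omits.
        partials.iter().sum()
    }

    /// Error-free transformation (Knuth's TwoSum): hi + lo == a + b exactly.
    fn two_sum(a: f64, b: f64) -> (f64, f64) {
        let hi = a + b;
        let b_virtual = hi - a;
        let a_virtual = hi - b_virtual;
        let lo = (a - a_virtual) + (b - b_virtual);
        (hi, lo)
    }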

~~~
lidHanteyk
This is quite interesting, and I'll be examining your method more closely in
the future. A skim of the paper convinces me that there is merit.

The main upside to the Kahan method is that it can be incremental and online.
Imagine that one is writing a Prometheus-like metrics client, and keeping a
running tally. One cannot reorder the summation, and one cannot take advantage
of parallelism. In these cases, a small Kahan accumulator can perform
incredibly well.
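
A minimal sketch of what such a stateful accumulator might look like (the type
and method names are illustrative, not any particular metrics library):

    /// Online Kahan accumulator: feed it values one at a time, in arrival
    /// order, with O(1) extra state.
    struct KahanAccumulator {
        sum: f64,
        c: f64, // compensation for low-order bits lost so far
    }

    impl KahanAccumulator {
        fn new() -> Self {
            KahanAccumulator { sum: 0.0, c: 0.0 }
        }

        /// Add one observation as it arrives; no reordering, no buffering.
        fn add(&mut self, value: f64) {
            let y = value - self.c;
            let t = self.sum + y;
            self.c = (t - self.sum) - y;
            self.sum = t;
        }

        fn total(&self) -> f64 {
            self.sum
        }
    }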

~~~
pixelpoet
> The main upside to the Kahan method is that it can be incremental and
> online.

I got excited when I heard the claims of a method better than Kahan/Neumaier
summation, but the storage requirements are enormous (6700% for the "small"
version, versus 200% for Kahan), so I definitely wouldn't call Kahan summation
"pointless". It's in fact one of the most amazing and underutilised bits of CS
out there IMO.

------
ccapo
I remember reading about Kahan summation, and I naively thought it would help
improve the accuracy/consistency of an n-body code I was using for my PhD. I
was trying to resolve a discrepancy between summations performed serially
versus in parallel using OpenMP. In the end I abandoned the approach, since it
seemed the issue lay elsewhere.

~~~
kardos
I've employed Kahan summation in a different physics simulation that was
giving parallel vs. serial discrepancies, caused by global sums being
evaluated in different orders in parallel. It did help, but I ultimately
concluded that it isn't a complete solution to this issue... in short, I think
it just pushes the problem several orders of magnitude down.

For what it's worth, and in the event that your issue was indeed the differing
order of summation: there is a solution that /does/ work, which is to cast the
numbers to a large (e.g. 256- or 512-bit) fixed-point representation and
leverage the reproducibility of integer sums.
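
A minimal sketch of that idea, with i128 and an assumed value range standing
in for the 256/512-bit accumulator (the scale and range here are assumptions,
not part of the original suggestion):

    /// Reproducible sum via fixed point: the only rounding is the initial
    /// quantisation of each input to the fixed-point grid, and that does not
    /// depend on summation order.
    fn reproducible_sum(values: &[f64]) -> f64 {
        const SCALE_BITS: u32 = 60; // fractional bits kept
        let scale = (1u128 << SCALE_BITS) as f64;

        let mut acc: i128 = 0;
        for &x in values {
            // Assumes |x| stays well below ~2^40 and the count is modest,
            // so the i128 accumulator cannot overflow; a wide (256/512-bit)
            // accumulator removes that restriction.
            acc += (x * scale).round() as i128;
        }
        acc as f64 / scale
    }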

~~~
snovv_crash
The issue there is that unless you check that your values don't over- or
underflow at every stage of the calculation, you can never actually know
whether your answers are accurate, even if they are repeatable.

~~~
tgflynn
I would think that in most cases you should be able to prove an upper bound on
the sum and make sure the accumulator is large enough to accommodate it. My
guess is that 512 bits should be enough to span all physically/computationally
relevant scales.

~~~
pascal_cuoq
This article lists some real examples that require more than 512 bits:
[https://res.mdpi.com/d_attachment/mathematics/mathematics-03...](https://res.mdpi.com/d_attachment/mathematics/mathematics-03-00337/article_deploy/mathematics-03-00337.pdf)

Note that digits of precision are expressed in decimal unless explicitly given
as “bits”, so that the phrase “10,000-digit arithmetic” really means around
33,000 bits.

~~~
tgflynn
Certainly if you have errors that grow exponentially all bets are off. I
wouldn't think that would apply to simply summing a set of numbers though.

------
neonate
Implementations in a bunch of languages:

[https://rosettacode.org/wiki/Kahan_summation](https://rosettacode.org/wiki/Kahan_summation)

