
Things we learned about sums - bluestreak
https://questdb.io/blog/2020/05/12/interesting-things-we-learned-about-sums
======
ken
Pet peeve alert: not distinguishing between "IEEE-754 type" and "real number".

> System.out.println(5.1+9.2);

> We ask to add 5.1 to 9.2. The result should be 14.3, but we get the
> following instead: 14.299999999999999

That's misleading. You put the characters "5.1" in a Java source file, which
in Java (like many languages) means "the IEEE-754 64-bit binary floating point
value closest to the decimal number 5.1". It's not equal to the real number
5.1. 9.2 and 14.3 can't be represented exactly in binary floating point,
either.

The number 5.1 + the number 9.2 should be the number 14.3.

The Java double nearest 5.1 + the Java double nearest 9.2 need not equal the
Java double nearest 14.3.

The addition isn't really what's hurting you. The representation is. You're
confusing the matter by changing type systems in the middle of a sentence.

> It is a small difference (only 0.000000000000001), but it is still wrong.

The answer is "wrong" in large part because the question was wrong. The error
in IEEE-754 "14.3" is only slightly smaller than the error in "5.1+9.2".

The sum _looks_ more wrong than the literal "14.3" because Java's Double
toString() truncates its output according to some specific rules [1] designed
to guess how those 64 bits got there.
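The rule is easy to observe: Double.toString() emits just enough decimal digits to uniquely identify the 64-bit value, so the literal round-trips even though the stored value isn't exactly 5.1. A small sketch:

```java
public class ShortestDecimal {
    public static void main(String[] args) {
        // toString picks the shortest decimal string that still maps back
        // to the same 64 bits, so "5.1" survives a round trip...
        System.out.println(Double.toString(5.1));             // prints 5.1
        System.out.println(Double.parseDouble("5.1") == 5.1); // true
        // ...but the sum lands on a *different* double, whose shortest
        // uniquely-identifying decimal needs many more digits.
        System.out.println(Double.toString(5.1 + 9.2));       // 14.299999999999999
    }
}
```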

> CPUs are poor at dealing with floating-point values. Arithmetics are almost
> always wrong

Wouldn't it be more accurate to say they're "wrong" at addition 50% of the
time? You picked two numbers whose IEEE-754 representations are both just
below their actual values, and whose sum is just above its IEEE-754
representation.
Had you added "5.1+5.2" (which happen to be just below and just above their
actual values, respectively), the representation errors would have cancelled,
and you'd have gotten "10.3" as you expect.
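A quick sketch of both cases (standard Java, same literals as above):

```java
public class CancellingErrors {
    public static void main(String[] args) {
        // Both 5.1 and 9.2 are stored slightly below their decimal values,
        // so the representation errors accumulate and the sum misses 14.3:
        System.out.println(5.1 + 9.2 == 14.3); // false
        // 5.1 is stored slightly low and 5.2 slightly high, so the errors
        // cancel and the sum lands exactly on the double nearest 10.3:
        System.out.println(5.1 + 5.2 == 10.3); // true
    }
}
```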

[1]: https://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#toString(double)

~~~
bluestreak
Thank you for taking time to elaborate on the subject. This is both
informative and enjoyable to read! We knew that not all real numbers can be
represented using IEEE-754. It wasn't entirely clear, though, because
toString() appears to round-trip the double 5.1 perfectly, yet the "sum"
comes out inexact.

I guess it would be more accurate to say that arithmetic is wrong with some
probability. I just feel that in this instance it is appropriate to call
“wrong” what is not 100% right :)

------
bluestreak
Author here.

About a month ago, I posted about using SIMD instructions to make aggregation
calculations faster. I am very thankful for the feedback so far; this post is
the result of the comments we received last time.

Many comments suggested that we implement compensated summation (aka Kahan),
as the naive method can produce inaccurate and unreliable results. This is why
we spent some time integrating the Kahan and Neumaier summation algorithms.
This post summarises a few things we learned along this journey.

We thought Kahan would badly affect the performance since it uses 4x as many
operations as the naive approach. However, some comments also suggested we
could use prefetch and co-routines to pull the data from RAM to cache in
parallel with other CPU instructions. We got phenomenal results thanks to
these suggestions, with Kahan sums nearly as fast as the naive approach.
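For readers who haven't seen it, a scalar Neumaier-style compensated sum looks roughly like the sketch below. This is only an illustration of the algorithm; it is not QuestDB's vectorized implementation:

```java
public class CompensatedSum {
    // Neumaier's variant of Kahan summation: carry a running compensation
    // term that captures the low-order bits lost whenever a small addend
    // meets a large running sum (or vice versa).
    static double neumaierSum(double[] values) {
        double sum = 0.0;
        double c = 0.0; // running compensation for lost low-order bits
        for (double v : values) {
            double t = sum + v;
            if (Math.abs(sum) >= Math.abs(v)) {
                c += (sum - t) + v; // low-order bits of v were lost
            } else {
                c += (v - t) + sum; // low-order bits of sum were lost
            }
            sum = t;
        }
        return sum + c;
    }

    public static void main(String[] args) {
        // A classic case where naive summation loses the two 1.0 terms:
        double[] data = {1.0, 1e100, 1.0, -1e100};
        double naive = 0.0;
        for (double v : data) naive += v;
        System.out.println(naive);             // 0.0
        System.out.println(neumaierSum(data)); // 2.0
    }
}
```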

A lot of you also asked if we could compare this with Clickhouse. As they
implement Kahan summation, we ran a quick comparison. Here's what we got for
summing 1bn doubles (with nulls) using the Kahan algorithm. The details of how
this was done are in the post.

QuestDB: 68ms
Clickhouse: 139ms

Thanks for all the feedback so far and keep it going so we can continue to
improve. Vlad

~~~
zackb
Thanks for your work on this and for making it FOSS. This is the first I've
heard of QuestDB. Is it production ready? What's the scaling story? Thanks

~~~
bluestreak
Thank you! QuestDB is in production. Scaling is part of a commercial product
built using QuestDB.

------
radford-neal
While Kahan summation is more accurate than naive summation, it still does not
produce the precise result (the true sum rounded to the final precision).
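To make that concrete, here is a sketch (not from the paper) showing plain Kahan summation failing to produce the correctly rounded sum when an addend swamps the running total:

```java
public class KahanNotExact {
    // Plain Kahan summation: compensate for the low-order bits lost
    // when adding a small value to a large running sum.
    static double kahanSum(double[] values) {
        double sum = 0.0, c = 0.0;
        for (double v : values) {
            double y = v - c;
            double t = sum + y;
            c = (t - sum) - y; // recovered rounding error
            sum = t;
        }
        return sum;
    }

    public static void main(String[] args) {
        // The true sum is 2.0, but plain Kahan returns 0.0 here, because
        // 1e100 swamps both the running sum and the correction term:
        double[] data = {1.0, 1e100, 1.0, -1e100};
        System.out.println(kahanSum(data)); // 0.0
    }
}
```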

It is not too costly to do the summation exactly. Several algorithms for this
have been developed. You can read about mine at
https://arxiv.org/abs/1505.05571 and get the code for them at
https://gitlab.com/radfordneal/xsum

~~~
bluestreak
Thank you Radford! This is a long piece and it would take me a while to absorb
properly. Does your algorithm lend itself to vectorised implementation?

~~~
radford-neal
It doesn't vectorize in an obvious fashion.

But it has occurred to me that one could accumulate the sum in several
"superaccumulators", rather than just one as in the present code. (For
instance, by adding even terms to one superaccumulator and odd terms to
another superaccumulator.) One could then add the superaccumulators at the
end. This would allow for vectorization, or for other forms of
instruction-level parallelism.

I haven't tried this, however. My guess is that it would be advantageous for
summing a very large number of terms, but maybe not for less than at least
hundreds of terms.

One could also use multiple cores, again with multiple superaccumulators. One
advantage of computing the exact sum is that the answer then doesn't depend on
what order you add things in (any order gives the same exact result),
simplifying the use of multiple cores.
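As a rough illustration of the splitting idea only — using two plain double accumulators in place of real superaccumulators, so unlike the scheme described above the result here is not exact in general:

```java
public class SplitSum {
    // Even/odd split as described above, with two independent accumulators.
    // The two partial sums have no data dependency on each other, so the
    // CPU (or a vectorizer) can advance them in parallel. With true
    // superaccumulators, each partial sum would be exact and the final
    // combination would be order-independent.
    static double splitSum(double[] values) {
        double even = 0.0, odd = 0.0;
        for (int i = 0; i + 1 < values.length; i += 2) {
            even += values[i];
            odd += values[i + 1];
        }
        if (values.length % 2 == 1) {
            even += values[values.length - 1]; // leftover element
        }
        return even + odd; // combine the partial accumulators at the end
    }

    public static void main(String[] args) {
        double[] data = new double[1000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(splitSum(data)); // 500500.0 (exact for integers)
    }
}
```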

------
ianmchenry
super interesting. love it.

