

Benchmarking Integer Compression in Go - zhenjl
http://zhen.org/blog/benchmarking-integer-compression-in-go/

======
lobster_johnson
Please, please use bar charts. Line charts are suitable only for data series
where the X axis is a continuous scale, not separate categories.

Also, the article has absolutely nothing about testing methodology. I'm
guessing you're using Go's built-in benchmarking, but it still needs to be
clarified; are the timings the mean, 95th percentile, or what? What about
variance -- better CPU cache usage would lead to less variance, I would guess?
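To make the question concrete, here is a minimal sketch of the summary statistics being asked about, computed over a set of per-run timings. The `Summary` type and `summarize` function are hypothetical names for illustration and are not from the article. It assumes the population standard deviation and the nearest-rank definition of the 95th percentile; other definitions give slightly different numbers.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Summary holds the statistics a benchmark report would ideally state.
// (Hypothetical type, for illustration only.)
type Summary struct {
	Mean   float64
	StdDev float64 // population standard deviation
	P95    float64 // 95th percentile, nearest-rank method
}

// summarize computes mean, standard deviation and p95 of timing samples.
// It returns a zero Summary for an empty input.
func summarize(samples []float64) Summary {
	if len(samples) == 0 {
		return Summary{}
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)

	var sum float64
	for _, s := range sorted {
		sum += s
	}
	mean := sum / float64(len(sorted))

	var sq float64
	for _, s := range sorted {
		d := s - mean
		sq += d * d
	}
	stdDev := math.Sqrt(sq / float64(len(sorted)))

	// Nearest-rank percentile: smallest value with at least 95% of
	// the samples at or below it.
	rank := int(math.Ceil(0.95 * float64(len(sorted))))
	return Summary{Mean: mean, StdDev: stdDev, P95: sorted[rank-1]}
}

func main() {
	// Made-up timings in nanoseconds, purely to exercise the function.
	timings := []float64{2, 4, 4, 4, 5, 5, 7, 9}
	s := summarize(timings)
	fmt.Printf("mean=%.2f stddev=%.2f p95=%.2f\n", s.Mean, s.StdDev, s.P95)
}
```

Reporting the spread alongside the mean is what would show whether one algorithm's timings are more stable than another's.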

Interesting article, though. Thanks for writing it.

~~~
zhenjl
Thanks for commenting!

And you are absolutely correct about the bar charts. I'm not exactly sure why
I chose line charts, but they have been replaced now.

The tests didn't use Go's built-in benchmark mechanism (testing.B), but they
did run under the *testing.T mechanism. Each algorithm is run on the same set
of pre-loaded integers, and I recorded the time (ns) it took to run
compression and decompression (time.Since(now).Nanoseconds(), where now was
time.Now()).
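The timing approach described above can be sketched roughly as follows. This is not the article's actual test code: `compress` and `decompress` are a stand-in varint codec built on `encoding/binary`, and `timeIt` is a hypothetical helper. Only the `time.Now()` / `time.Since(now).Nanoseconds()` pattern comes from the description.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// compress is a stand-in codec (plain varint encoding), used here only
// so there is something to time. It is not one of the article's algorithms.
func compress(src []uint32) []byte {
	buf := make([]byte, binary.MaxVarintLen32)
	dst := make([]byte, 0, len(src))
	for _, v := range src {
		n := binary.PutUvarint(buf, uint64(v))
		dst = append(dst, buf[:n]...)
	}
	return dst
}

// decompress reverses compress; n is a capacity hint for the output.
func decompress(src []byte, n int) []uint32 {
	dst := make([]uint32, 0, n)
	for len(src) > 0 {
		v, k := binary.Uvarint(src)
		if k <= 0 {
			break // malformed or truncated input
		}
		dst = append(dst, uint32(v))
		src = src[k:]
	}
	return dst
}

// timeIt returns the wall-clock nanoseconds taken by a single call to fn,
// using the time.Now / time.Since pattern described in the comment.
func timeIt(fn func()) int64 {
	now := time.Now()
	fn()
	return time.Since(now).Nanoseconds()
}

func main() {
	// Pre-load one large slice that every codec under test would share.
	data := make([]uint32, 1<<20)
	for i := range data {
		data[i] = uint32(i)
	}

	var packed []byte
	var unpacked []uint32
	compNs := timeIt(func() { packed = compress(data) })
	decompNs := timeIt(func() { unpacked = decompress(packed, len(data)) })

	fmt.Printf("compress: %d ns, decompress: %d ns, %d ints, %d bytes\n",
		compNs, decompNs, len(unpacked), len(packed))
}
```

A single timed run like this includes allocation and any GC pauses, and gives no information about spread. Go's built-in `testing.B` benchmarks repeat the body `b.N` times and report an average per operation, which is one reason the two approaches can produce different numbers.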

The caching effect should be pretty similar for all the algorithms since they
work off the same LARGE slice of data, afaict.

I will update the article to reflect the testing methodology (it may be a few
days before I can get to it, though).

Thanks

~~~
lobster_johnson
That's so much better.

PS. I noticed you have an article about bitmap indexing. I hope you realize
that the algorithm you're using is based on WAH, which is patented by the DoE.

~~~
zhenjl
The algorithm is EWAH, not WAH. It's also by Daniel Lemire.

~~~
lobster_johnson
Last I checked (my Internet connection is wonky, so I can't verify right now),
EWAH was based on WAH, so it's affected by the patent.

~~~
zhenjl
Sorry for the late reply. I was traveling for the past week or so.

In any case, EWAH is designed to avoid the WAH patent. According to the
author, the Apache people checked things out when they adopted it for Apache
Hive and they were apparently satisfied.

If Apache is satisfied, I think we are safe. :)

