

TurboPFor – DIRECT ACCESS Integer Compression - powturbo
https://github.com/powturbo

======
nkurz
Thanks for putting this up! I'm trying to finish up a paper today on
vectorizing VByte decompression, so have been thinking about some of these
issues a lot recently.

It would help orient people (like me) if you added more description of the
data format you are using for each approach. It's difficult reading
templatized C at a glance, so I'm not certain what's happening in each format.
For example, since the average bit widths for your SimpleV are less than 8,
it's clearly doing something different than traditional variable byte integer
implementations. Nothing wrong with this, but it would be helpful to know the
details.
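
For reference, here is a minimal sketch of what "traditional variable byte" usually means: 7 payload bits per byte plus a continuation flag, so every integer costs at least 8 bits. That is why an average width below 8 implies a different format. The function names `vbyte_encode` and `vbyte_decode` are made up for illustration and are not taken from TurboPFor; the flag convention (high bit set means "more bytes follow") is one common choice, and the decoder assumes well-formed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one 32-bit value as 1..5 bytes, least significant group first.
   Each byte carries 7 payload bits; the high bit is set on every byte
   except the last. Returns the number of bytes written. */
size_t vbyte_encode(uint32_t value, uint8_t *out)
{
    assert(out != NULL);
    size_t n = 0;
    while (value >= 0x80) {
        out[n++] = (uint8_t)(value | 0x80); /* low 7 bits + continuation flag */
        value >>= 7;
    }
    out[n++] = (uint8_t)value;              /* final byte, flag clear */
    return n;
}

/* Decode one value written by vbyte_encode. Returns the number of bytes
   consumed. Assumes well-formed input of at most 5 bytes. */
size_t vbyte_decode(const uint8_t *in, uint32_t *value)
{
    assert(in != NULL && value != NULL);
    uint32_t result = 0;
    unsigned shift = 0;
    size_t n = 0;
    while (in[n] & 0x80) {
        result |= (uint32_t)(in[n] & 0x7F) << shift;
        shift += 7;
        n++;
    }
    result |= (uint32_t)in[n] << shift;     /* last byte has no flag */
    *value = result;
    return n + 1;
}
```

With this layout a value below 128 takes one byte and 300 takes two, so the per-integer cost can never drop under 8 bits.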

I'm mostly with you on the "Realistic and practical benchmark with large
integer arrays" part. There are some arguments for variations, though. In
particular, reading from memory but decoding to cache can be a more realistic
benchmark for search engine applications. Feel free to email me or
Daniel if you'd like to try to integrate your benchmarks with ours.

------
powturbo
TurboPFor - DIRECT ACCESS Integer Compression w/o decompressing entire blocks.
Instant access to individual compressed array elements X-Times faster.
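
To illustrate the general idea of direct access, here is a minimal sketch using plain fixed-width bit packing, where element i always starts at bit i * bits and can be read without touching its neighbours. This is an assumption made for illustration and not necessarily how TurboPFor lays out its blocks; `pack_bits` and `get_packed` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack n values into out using `bits` bits each (1..32), lowest bits first.
   out must be zero-initialized and hold at least (n * bits + 63) / 64 words. */
void pack_bits(const uint32_t *in, size_t n, unsigned bits, uint64_t *out)
{
    assert(bits >= 1 && bits <= 32);
    const uint64_t mask = ((uint64_t)1 << bits) - 1;
    for (size_t i = 0; i < n; i++) {
        size_t bitpos = i * bits;
        size_t word = bitpos / 64;
        unsigned off = (unsigned)(bitpos % 64);
        uint64_t v = in[i] & mask;
        out[word] |= v << off;
        if (off + bits > 64)                 /* value straddles two words */
            out[word + 1] |= v >> (64 - off);
    }
}

/* Read element i directly from the packed array, without decoding
   any other element. */
uint32_t get_packed(const uint64_t *packed, size_t i, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    const uint64_t mask = ((uint64_t)1 << bits) - 1;
    size_t bitpos = i * bits;
    size_t word = bitpos / 64;
    unsigned off = (unsigned)(bitpos % 64);
    uint64_t v = packed[word] >> off;
    if (off + bits > 64)                     /* pull the high part from the next word */
        v |= packed[word + 1] << (64 - off);
    return (uint32_t)(v & mask);
}
```

Each lookup costs one or two word reads plus a shift and a mask, independent of the block size.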

