
Fast integer compression: decoding billions of integers per second - ColinWright
http://lemire.me/blog/archives/2012/09/12/fast-integer-compression-decoding-billions-of-integers-per-second/
======
nkurz
This blog post
[<http://blog.mikemccandless.com/2012/08/lucenes-new-blockpostingsformat-thanks.html>]
and the associated Lucene Jira issue
[<https://issues.apache.org/jira/browse/LUCENE-3892>] may be of interest to
those who are excited by this paper.

If I'm reading it right, I think the main conclusion from the Lucene patch was
that using a straight "FOR" (Frame of Reference) approach was faster in the
real world than using "PFOR" (Patched Frame of Reference), and only slightly
worse in index size.

I haven't read this new paper yet, but it would be interesting to compare and
contrast the approaches.
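
For anyone unfamiliar with the two terms, here is a minimal sketch of the difference, assuming a simplified textbook form of each scheme. It is not the Lucene patch's code or the paper's; the function names and the sample block are made up for illustration, and the final step of bit-packing the offsets is left out.

```python
def for_encode(block):
    """Frame of Reference: one reference value plus fixed-width offsets.

    Assumes a non-empty block of non-negative integers.
    """
    ref = min(block)
    offsets = [v - ref for v in block]
    # Every offset is stored with the width needed by the largest one.
    width = max(offsets).bit_length()
    return ref, width, offsets


def for_decode(ref, width, offsets):
    return [ref + o for o in offsets]


def pfor_encode(block, width):
    """Patched FOR (simplified): keep the low `width` bits of each offset
    in place and record the overflowing high bits as exceptions."""
    ref = min(block)
    mask = (1 << width) - 1
    low, exceptions = [], []
    for i, v in enumerate(block):
        off = v - ref
        low.append(off & mask)
        if off > mask:
            exceptions.append((i, off >> width))
    return ref, width, low, exceptions


def pfor_decode(ref, width, low, exceptions):
    offsets = list(low)
    # The "patching" pass: restore the high bits of the outliers.
    for i, high in exceptions:
        offsets[i] |= high << width
    return [ref + o for o in offsets]


# Hypothetical block with a single outlier (5000).
block = [100, 101, 103, 102, 100, 5000, 104, 101]
ref, width, offsets = for_encode(block)        # outlier forces width 13
patched = pfor_encode(block, 3)                # 3-bit slots, 1 exception
assert for_decode(ref, width, offsets) == block
assert pfor_decode(*patched) == block
```

The trade-off is visible in the decoders: FOR decodes in one uniform pass, while PFOR needs the extra patching loop in exchange for narrower slots.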

------
binarymax
Nice summary. Slightly off topic, I first learned about the basics of integer
compression with some good context in "Data Intensive Text Processing with
MapReduce", which I recommend not necessarily for that section, but otherwise
as a great book. <http://lintool.github.com/MapReduceAlgorithms/>

------
kevingadd
The repeated patterns in the SIMD implementation (see
<https://github.com/lemire/FastPFor/blob/master/src/bitpacksimd.cpp#L649>)
seem like they're dying for some simplification via macros. I can't imagine
trying to spot a bug or typo in an implementation written out raw that way.

On the other hand, this seems like a pretty clever approach to improving
compression for this space. SIMD is often underutilized.
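
To make the point about repetition concrete, here is a rough scalar sketch of fixed-width bit packing written as one generic loop, assuming the unrolled routines are essentially one specialization of this per bit width. The names `pack` and `unpack` are hypothetical, and this is plain Python rather than SIMD, so it shows only the shape of the logic, not its speed.

```python
def pack(values, width):
    """Pack non-negative integers, each below 2**width, into one int."""
    acc = 0
    for i, v in enumerate(values):
        if v >> width:
            raise ValueError(f"{v} does not fit in {width} bits")
        # Slot i occupies bits [i * width, (i + 1) * width).
        acc |= v << (i * width)
    return acc


def unpack(acc, width, count):
    """Inverse of pack: read `count` slots of `width` bits each."""
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


packed = pack([1, 5, 3, 7], 3)
assert unpack(packed, 3, 4) == [1, 5, 3, 7]
```

A hand-unrolled version fixes `width` and the shift amounts as constants for each case, which is presumably where the speed comes from and also where the repetition comes from.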

~~~
sedachv
There are a couple of Common Lisp projects that generate SSE and CUDA code:

<http://common-lisp.net/project/sb-simd/>
<https://github.com/angavrilov/cl-gpu>

CMUCL comes with SSE2. There hasn't been any effort to revive StarLisp or
Paralations or NESL (<http://www.cs.cmu.edu/~scandal/nesl.html>) as libraries
on top of that though. Not to be snarky, but the state of the art in data-
parallel programming languages hasn't even caught up with NESL yet.

------
ecubed
Am I reading this correctly that you can do integer compression at the L1
cache level? Does this go even deeper into the actual processor registers, or
would that be going overboard?

------
jaytaylor
This sounds neat, but did I miss the link to the publications and/or source
code?

~~~
cube13
It's buried in the article, but here's the paper:

<http://arxiv.org/abs/1209.2137>

Here's the source:

<https://github.com/lemire/FastPFor>

