
EWAH – A compressed bitmap class in C++ - espeed
https://github.com/lemire/ewahboolarray
======
pelario
At first I found it intriguing to see no mention of SDSL[1], which is probably
one of the best libraries in the field of compressed data structures,
including compressed bitmaps.

That is probably because its index is built to answer a different kind of
query. There is a blog post discussing those "other" queries[2]. On the same
blog the author discusses some other compressed data structures included in
the library.

1: [https://github.com/simongog/sdsl-lite](https://github.com/simongog/sdsl-lite)
2: [http://alexbowe.com/rrr/](http://alexbowe.com/rrr/)

~~~
cbsmith
a) The EWAH library is a reference implementation provided to support the paper.
SDSL is a library that collects together a bunch of such papers. It isn't odd
that there is no mention of SDSL. It is a bit odd that SDSL makes no mention
of any of Daniel's work.

b) I haven't used SDSL or RRR bit vectors much, but IIRC they're designed more
for maximum compactness as well as quick popcount- and select-type operations.
You can do that with EWAH, but a lot of the power of EWAH and its ilk (BBC,
WAH, etc.) is with set operators like AND, OR & ANDNOT. Does SDSL have the
same kind of optimizations?
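
To illustrate why set operators suit run-length formats, here is a minimal sketch of AND over two compressed bitmaps. The `Segment` layout, `append`, `compress`, `decompress` and `logical_and` are all hypothetical names invented for this example; this is not EWAH's actual word layout or API, and it assumes both inputs cover the same number of 64-bit words. The point it shows is that a run of identical words is combined in one step rather than word by word.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified format (NOT EWAH's real layout): a segment is
// either a run of identical all-0 / all-1 words, or a single literal word.
struct Segment {
    bool is_fill;         // true: `words` copies of an all-0 or all-1 word
    std::uint64_t value;  // the repeated fill word, or the literal word
    std::size_t words;    // number of 64-bit words covered (1 for literals)
};
using Compressed = std::vector<Segment>;

const std::uint64_t ALL_ONES = ~std::uint64_t{0};

// Append `words` copies of `value`, merging with a preceding equal fill.
void append(Compressed& out, std::uint64_t value, std::size_t words) {
    bool fill = (value == 0 || value == ALL_ONES);
    if (fill && !out.empty() && out.back().is_fill && out.back().value == value) {
        out.back().words += words;
    } else if (fill) {
        out.push_back({true, value, words});
    } else {
        assert(words == 1);  // literals always cover exactly one word
        out.push_back({false, value, 1});
    }
}

Compressed compress(const std::vector<std::uint64_t>& raw) {
    Compressed out;
    for (std::uint64_t w : raw) append(out, w, 1);
    return out;
}

std::vector<std::uint64_t> decompress(const Compressed& c) {
    std::vector<std::uint64_t> raw;
    for (const Segment& s : c) raw.insert(raw.end(), s.words, s.value);
    return raw;
}

// AND without decompressing: overlapping runs are handled in one step.
// Assumes a and b cover the same number of words.
Compressed logical_and(const Compressed& a, const Compressed& b) {
    Compressed out;
    std::size_t i = 0, j = 0;
    std::size_t left_a = a.empty() ? 0 : a[0].words;
    std::size_t left_b = b.empty() ? 0 : b[0].words;
    while (i < a.size() && j < b.size()) {
        // n > 1 only when both segments are fills, so the result is a fill.
        std::size_t n = std::min(left_a, left_b);
        append(out, a[i].value & b[j].value, n);
        left_a -= n;
        left_b -= n;
        if (left_a == 0 && ++i < a.size()) left_a = a[i].words;
        if (left_b == 0 && ++j < b.size()) left_b = b[j].words;
    }
    return out;
}
```

OR and ANDNOT follow the same walk with `|` and `& ~` in place of `&`. The cost is proportional to the number of segments rather than the number of bits, which is where sparse or clustered bitmaps benefit.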

~~~
pelario
The bitmaps in SDSL are indexed to support rank and select queries: how many
1's there are from the beginning up to position i and, conversely, what the
position of the j-th 1 is. [It uses popcount but goes beyond it.]

So yes, the problems they are solving are different, even though the input
data is the same.
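
As a rough illustration of those two queries, here is a minimal sketch over an uncompressed bitmap. `RankSelectBitmap`, `rank1`, `select1` and `kNotFound` are hypothetical names for this example and are not SDSL's API; real libraries use far more compact indexes. The conventions are also assumptions: `rank1(i)` counts the 1's in positions [0, i), and `select1(j)` is 1-based.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

constexpr std::size_t kNotFound = static_cast<std::size_t>(-1);

// Portable popcount of one 64-bit word.
inline std::size_t popcount64(std::uint64_t w) {
    return std::bitset<64>(w).count();
}

// Hypothetical helper type for illustration; not SDSL's API.
struct RankSelectBitmap {
    std::vector<std::uint64_t> words;  // bit p lives in words[p / 64], bit p % 64
    std::vector<std::size_t> prefix;   // prefix[k] = number of 1's in words[0..k)

    explicit RankSelectBitmap(std::vector<std::uint64_t> w)
        : words(std::move(w)), prefix(words.size() + 1, 0) {
        for (std::size_t k = 0; k < words.size(); ++k)
            prefix[k + 1] = prefix[k] + popcount64(words[k]);
    }

    // Number of 1 bits in positions [0, i).
    std::size_t rank1(std::size_t i) const {
        assert(i <= words.size() * 64);
        std::size_t k = i / 64, r = i % 64;
        std::size_t count = prefix[k];
        if (r != 0)  // count the partial word with a mask of the low r bits
            count += popcount64(words[k] & ((std::uint64_t{1} << r) - 1));
        return count;
    }

    // Position of the j-th 1 bit (j starts at 1), or kNotFound.
    std::size_t select1(std::size_t j) const {
        if (j == 0 || j > prefix.back()) return kNotFound;
        // Binary search for the word k with prefix[k] < j <= prefix[k + 1].
        auto it = std::lower_bound(prefix.begin(), prefix.end(), j);
        std::size_t k = static_cast<std::size_t>(it - prefix.begin()) - 1;
        std::size_t need = j - prefix[k];
        for (std::size_t bit = 0; bit < 64; ++bit)
            if (((words[k] >> bit) & 1) && --need == 0) return k * 64 + bit;
        return kNotFound;  // not reached when prefix is consistent
    }
};
```

This is the sense in which rank uses popcount but goes beyond it: the precomputed prefix counts answer the bulk of the query, and popcount only handles the final partial word.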

------
t1m
We used this incredibly fast/compact library for Hustle DB indexes. Anyone
needing a good Python interface to it could pry it out of the dependencies:
[https://github.com/tspurway/hustle/tree/master/deps/libebset](https://github.com/tspurway/hustle/tree/master/deps/libebset)

------
tmaly
I have had good results using his Roaring bitmap index. There are Go and Java
versions in addition to C++.

~~~
cbsmith
Lemire FTW. ;-) The Roaring bitmap index is kind of the successor to EWAH.

------
bshimmin
EWAH is "Enhanced Word-Aligned Hybrid", for anyone wondering.

(It also reminded me of the joke "Why does Edward Woodward have so many 'd's
in his name?")

------
devty
I appreciate projects with a README like this: one that includes a gentle
survey of the project's domain. Nice writeup.

