
IndexR, a fast and realtime storage format for Apache Drill - FlowWei
https://github.com/shunfei/indexr
======
QLeelulu
> Realtime ingestion speed - maximum over 30K events / second / node / table.
> e.g. 10 nodes each serving 10 realtime tables can consume 3M events within
> one second. We believe this is the best score among all similar
> systems.
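
The 3M figure appears to be just the quoted per-table maximum multiplied out across nodes and tables. A quick sanity check, assuming throughput scales linearly (the function name and constants below are illustrative, not part of IndexR):

```python
# Quoted maximum from the README: events / second / node / table.
EVENTS_PER_SEC_PER_NODE_PER_TABLE = 30_000


def cluster_events_per_sec(nodes, tables_per_node,
                           rate=EVENTS_PER_SEC_PER_NODE_PER_TABLE):
    """Aggregate ingestion rate, assuming perfectly linear scaling."""
    return nodes * tables_per_node * rate


# 10 nodes, each serving 10 realtime tables, as in the quoted example.
total = cluster_events_per_sec(nodes=10, tables_per_node=10)
print(f"{total:,} events / second")
```

So the headline number is an upper bound that holds only if every table on every node hits its maximum at the same time.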

> Scan speed - You may find much better performance in a real production
> environment because IndexR can process multiple values at the same time with
> the help of modern CPUs and processing platforms (like Drill).

> * cold data: over 30 million values / second / node.

> * hot data: over 100 million values / second / node.

> * ~2.5 times as fast as Parquet.

> OLAP query - In our production, we see 95% < 3s query latency on tables with
> 100 billion+ rows with 20 nodes.

> * 3~8 times as fast as Parquet under the same Drill environment.

> Compression - In our production, we see an average 10:1 compression ratio
> compared to raw CSV format.

> * ~75% of the size of the ORC file format.

These look great

