
An LSM-Tree-based Ultra-Large Key-Value Store for Small Data [pdf] - jorangreef
http://www.ece.eng.wayne.edu/~sjiang/pubs/papers/wu15-lsm-trie.pdf
======
jsherer
This looks like an interesting finding. Unfortunately, the trade-off for this
type of efficient small data storage is real:

 _> Note that LSM-trie uses hash functions to organize its data and
accordingly does not support range search._

Range search, while not directly applicable to all data sets, is an important
feature of the LSM data stores compared (LevelDB & RocksDB).

The authors acknowledge this and say:

 _> There are techniques available to support the command by maintaining an
index above these hash-based stores_

So, don't plan on using an LSM-trie as a drop-in replacement for LevelDB or
other LSM-tree-based projects that rely on range searches without accounting
for the additional complexity of building and maintaining an index to perform
those range searches.

~~~
jasonwatkinspdx
For folks curious how you might build up a range index atop a hash store,
here's a general scheme:
[https://www.eecs.berkeley.edu/~sylvia/papers/pht.pdf](https://www.eecs.berkeley.edu/~sylvia/papers/pht.pdf)
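
To make the trade-off concrete, here is a minimal sketch of the general idea of keeping an ordered index above a hash-organized store. It is not the PHT scheme from the linked paper, and it is not how LSM-trie works internally. It swaps in a much simpler technique, a sorted in-memory key list, purely for illustration. The class names `HashStore` and `RangeIndexedStore` are hypothetical.

```python
import bisect
import hashlib


class HashStore:
    """Toy hash-organized store (hypothetical): keys are placed by hash,
    so neighbouring keys end up in unrelated buckets and key order is lost."""

    def __init__(self, num_buckets=8):
        self.buckets = [dict() for _ in range(num_buckets)]

    def _bucket(self, key):
        # Pick a bucket from the first byte of the key's SHA-1 digest.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self.buckets[digest[0] % len(self.buckets)]

    def put(self, key, value):
        self._bucket(key)[key] = value

    def get(self, key):
        return self._bucket(key).get(key)

    def __contains__(self, key):
        return key in self._bucket(key)


class RangeIndexedStore:
    """Hash store plus a separately maintained sorted key index."""

    def __init__(self):
        self.store = HashStore()
        self.sorted_keys = []  # the extra state the application must maintain

    def put(self, key, value):
        # Only new keys go into the index; overwrites leave it unchanged.
        if key not in self.store:
            bisect.insort(self.sorted_keys, key)
        self.store.put(key, value)

    def range(self, lo, hi):
        """Return (key, value) pairs with lo <= key < hi, in key order."""
        start = bisect.bisect_left(self.sorted_keys, lo)
        end = bisect.bisect_left(self.sorted_keys, hi)
        return [(k, self.store.get(k)) for k in self.sorted_keys[start:end]]
```

Calling `range("user:", "user;")` on such a store returns every key with the `user:` prefix in sorted order, but each result still costs a separate point lookup in the hash store, and the index has to be kept consistent on every write. That is the extra complexity being discussed above.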

------
parenthephobia
Of potential interest, the code itself:
[https://github.com/wuxb45/lsm-trie-release](https://github.com/wuxb45/lsm-trie-release)

------
eyan
Every time I come across a new KV store, I go looking for a userland
implementation from the Acunu guys:
[http://arxiv.org/abs/1103.4282](http://arxiv.org/abs/1103.4282). So far, I
haven't seen one.

I'd like to see Stratified B Trees kick LSMs' butts. Or the other way around.
I can't code this yet. But I can hope that somebody's already on to this.

Need to mention Percona's take on this space too:
[https://github.com/percona/PerconaFT](https://github.com/percona/PerconaFT).
The patent notice makes it scary tho.

------
kragen
I haven't read the whole paper, but "LSM" is "log-structured merge"; I guess
they merge runs over time to keep write amplification down? As eyan points
out, Twigg's (Acunu's) "stratified B-trees" may have obsoleted the whole
copy-on-write B-tree family of data structures.

~~~
jorangreef
This is a form of LSM tree that sacrifices range queries to cap write
amplification at 5x over 5 levels. That translates into a significant increase
in throughput compared to LevelDB and RocksDB.

------
jorangreef
This paper targets datasets of 1-10 billion keys, keeping write amplification
down to at most 1x per level (i.e. 5x over 5 levels), and in-memory indexes to
a minimum.
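
The arithmetic behind that figure can be sketched in a few lines. The helper names are hypothetical, and the per-level factor of 10 used for comparison is an illustrative assumption for a leveled design, not a number measured for LevelDB or RocksDB.

```python
def total_write_amplification(per_level, levels):
    """Total write amplification if every level rewrites each byte
    `per_level` times on its way down (a simplifying assumption)."""
    return per_level * levels


def bytes_written(user_bytes, per_level, levels):
    """Bytes that actually hit storage for `user_bytes` of user writes."""
    return user_bytes * total_write_amplification(per_level, levels)


# The case described above: at most 1x per level over 5 levels.
lsm_trie_wa = total_write_amplification(per_level=1, levels=5)

# Hypothetical leveled design with an assumed factor of 10 per level.
assumed_leveled_wa = total_write_amplification(per_level=10, levels=5)

print(lsm_trie_wa, assumed_leveled_wa)
```

Under these assumptions, 100 GB of user writes would turn into 500 GB of device writes in the first case and 5,000 GB in the second, which is why the per-level bound matters so much for throughput.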

Here are the slides:
[https://www.usenix.org/sites/default/files/conference/protec...](https://www.usenix.org/sites/default/files/conference/protected-files/atc15_slides_wu.pdf)

------
CyberDildonics
I tried LevelDB with high hopes, but it is ungodly slow. The claims did not
line up with reality in any way.

~~~
jorangreef
This is very different to LevelDB. But LevelDB's claims are actually valid,
provided your dataset is not massive, and you understand how your workload
interacts with its compaction and write amplification. This paper presents a
form of LSM-tree which solves both those issues (massive dataset, low write
amplification).

------
jkot
Interesting, will try to add that into my project.

------
elcapitan
Now that we know that "number of atoms in the universe" is not a good
extremely large number [1], maybe we can use "number of key-value-stores" as
the new reference?

[1]
[https://news.ycombinator.com/item?id=11588918](https://news.ycombinator.com/item?id=11588918)

