
Dynamically Adjustable Key-Value Store by Combining LSM and COW B+ Tree [pdf] - ngaut
https://greensky00.github.io/pdf/jungle_hotstorage19.pdf
======
otoburb
Looks like the team presented this at HotStorage 2019 [1] in July, and the
slides from their talk are available [2].

[1] https://www.usenix.org/conference/hotstorage19/workshop-program

[2] https://www.usenix.org/sites/default/files/conference/protected-files/hotstorage19_slides_ahn.pdf

~~~
hinkley
Did they benchmark against B+Trees and LSM? I don’t see that in the slides.

~~~
otoburb
Slides 13, 14 and 15 show Jungle's performance using a range of compaction
factors (C=[2, 3, 5, 10]) measured against an LSM-tree using leveled compaction
and an LSM variant using tiered (aka size) compaction.

Having said this, the paper itself (in the original link) under Section 4
"Evaluation" describes the differences in more detail, and Figure 6 probably
does the best job of showing Jungle's benefits in a compact, human-readable line
of charts.

If I'm reading the paper correctly, and as summarized on slide 16, using the
combination of CoW B+ and LSM means that instead of a 3-way tradeoff between
Read/Write/Space, Jungle can minimize the cost trade-off such that the only
remaining material trade-off is Write/Space.

Pretty cool stuff.

~~~
hinkley
I looked at those slides twice and only saw them as comparing different
settings of their algorithms. I dunno if that says more about me or how long
PhD students (haven’t) spent with Tufte.

Maybe three colors for the bars.

~~~
otoburb
>> _[...] how long PhD students (haven’t) spent with Tufte._

I think this :) I also had to look more closely a few times.

------
willvarfar
It's exciting that things are happening again in db-land. A few years ago it
was Fractal Trees, then leveldb arrived and got wider adoption, then Facebook
put it in MySQL with MyRocks etc.

Very recently timescaledb did some really really interesting stuff with
compressing tables, and I’m wondering if that is usable with non-time-series
data too?

I’m a heavy tokudb user because of the compression and I’m looking forward to
seeing if a b+ lsm with compression is going to turn up in MySQL or even
better Postgres.

~~~
StreamBright
Is there any reason Postgres or other projects could not adopt tokudb's
solution?

~~~
willvarfar
Architecturally, MySQL went with a “the storage engine is a plug-in” design and
Postgres went with a more traditional “the storage is part of the core” design.

Postgres has slowly grown some ... tolerance ... for alternative storage
engines, via federation and forks like greenplum and timescaledb that put
their own engines in, but they always feel like second-class citizens
and keep hitting integration limits.

------
continuations
Is source code available?

------
jules
How does the performance compare to B-epsilon trees, which are B trees with a
write buffer in each node to make writes more efficient?

------
ivanjaros
Looks like a great candidate for bolt (b+) and badger (lsm).

