Hacker News new | comments | show | ask | jobs | submit login

RocksDB has an LSM architecture, similar in nature to HBase, leveldb, etc. But the implementation is based on a Theorem that we will be publishing shortly. I am working on the Theorem with a colleague of mine.

Cache Oblivious B-trees is an interesting paper. Similarly fractal trees. Most of them optimize the case when index nodes are not in memory. However, in our use-cases, we typically configure the system in such a way that most index nodes are in memory.

For an LSM database, the key component is "compaction". You can ingest data only as fast as you can compact, otherwise u get a unstable system. .1 RocksDB replaced the Level-style compaction of leveldb with UniversalStyleCompaction that has reduced write amplification. This boosts performance. 2. RocksDB implemented multi-threaded write, which means that parallel compactions on different parts of the database can occur simultaneously. This boosts performance. 3. Bloom filter for range-scans: this boost read performance 4. MergeType records that allows higher level objects (counters, lists) use only-write instead of a read-modify-write. Improves performance. 5. And many more...




Can you share with us the statement of that theorem?

What is "UniversalStyleCompaction", and why is it capitalized and missing spaces?

How does a Bloom filter for range scans work? Standard Bloom filters (as you know) are for existence only.


I'm guessing this may have been an early draft of some of the statements of the theorem:

http://webcache.googleusercontent.com/search?q=cache:fTxlRmb...




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: