
Ask HN: Embedded Database Design Papers - gfs
In order to learn more about database design principles I thought it would be great to read papers on embedded databases. I have searched around for material on the popular open source databases like RocksDB, LMDB, or LevelDB but haven't found anything concrete. Aside from reading the source code, how can I teach myself about the data structures and abstractions that these provide? Eventually I would like to write a toy implementation after getting acquainted with a subset of this knowledge.
======
hyc_symas
Where have you been looking? LMDB's design is fully documented. Aside from the
papers

[https://symas.com/lmdb/technical/#pubs](https://symas.com/lmdb/technical/#pubs)

every aspect of the source code is documented with Doxygen.

[http://www.lmdb.tech/doc/](http://www.lmdb.tech/doc/)

------
SamReidHughes
RocksDB and LevelDB use a "log-structured merge tree" so you might want to use
that as a search term. There is a section "Papers" at the bottom of the web
page at [http://leveldb.org/](http://leveldb.org/) .

LMDB uses a B-tree variant, according to Wikipedia. Maybe it's a B+ tree as
Wikipedia says, but keep your thinking cap on -- it's probably not in exactly
the form you see described on Wikipedia.

If you want to see a _toy_ LSM-tree implementation, I have one at
[https://github.com/srh/nihdb](https://github.com/srh/nihdb) -- it was
written as a starter Rust project, not as a fast LSM-tree engine. It's almost
the simplest plausible LSM-tree. But you'll miss out on design considerations
relating to concurrency and performance.
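
For a feel of the core idea before reading any of those, here is a minimal in-memory sketch of the LSM pattern: writes go to a mutable memtable, full memtables are frozen into sorted immutable runs, reads check newest data first, and compaction merges runs. This is not how nihdb, LevelDB, or RocksDB are actually laid out; the names `ToyLSM`, `memtable_limit`, and `TOMBSTONE` are made up for illustration, runs are plain Python lists instead of on-disk files, and there is no write-ahead log, leveling, or concurrency.

```python
import bisect

TOMBSTONE = object()  # sentinel recording that a key was deleted


class ToyLSM:
    """Hypothetical toy LSM store; everything stays in memory."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}   # mutable buffer for recent writes
        self.runs = []       # immutable sorted runs, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # Deletes are writes too: the tombstone shadows older values.
        self.put(key, TOMBSTONE)

    def flush(self):
        """Freeze the memtable into a sorted run (stand-in for a disk file)."""
        if not self.memtable:
            return
        keys = sorted(self.memtable)
        values = [self.memtable[k] for k in keys]
        self.runs.insert(0, (keys, values))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            value = self.memtable[key]
        else:
            value = TOMBSTONE
            for keys, values in self.runs:  # newest run wins
                i = bisect.bisect_left(keys, key)
                if i < len(keys) and keys[i] == key:
                    value = values[i]
                    break
        return None if value is TOMBSTONE else value

    def compact(self):
        """Merge all runs into one; newer entries overwrite older ones."""
        merged = {}
        for keys, values in reversed(self.runs):  # oldest first
            merged.update(zip(keys, values))
        # Dropping tombstones is safe here only because every run is merged.
        keys = sorted(k for k, v in merged.items() if v is not TOMBSTONE)
        values = [merged[k] for k in keys]
        self.runs = [(keys, values)] if keys else []
```

Usage would look like `db = ToyLSM(); db.put("a", 1); db.get("a")`. A real engine differs mainly in what this sketch leaves out: durable files, partial compaction across levels, and readers running concurrently with flushes.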

------
philix001
Goetz Graefe [1] published interesting surveys on B-Trees, query
optimization...

The Red Book can give you many ideas [2].

Mark Callaghan has published a lot of material related to LSM trees recently [3].

[1]
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.219...](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.219.7269&rep=rep1&type=pdf)

[2] [http://www.redbook.io/](http://www.redbook.io/)

[3]
[https://smalldatum.blogspot.com/?m=1](https://smalldatum.blogspot.com/?m=1)

~~~
philix001
Check my own B+-Tree implementation that I did as an exercise. It's clean and
well-commented.
[https://gist.github.com/philix/236f82183bbb27bd01033f94fe42e...](https://gist.github.com/philix/236f82183bbb27bd01033f94fe42e69b)

Real database systems are more sophisticated, but the most basic B+-Tree is
still very interesting.
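
As a rough idea of what "the most basic B+-Tree" involves, here is a small in-memory sketch. It is not taken from the gist above; the names `Node`, `BPlusTree`, and `order` are assumptions for illustration, and it only covers lookup, insert with node splitting, and an ordered scan over linked leaves. Deletion, disk pages, and concurrency are left out.

```python
import bisect


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.children = []  # internal nodes only: len(keys) + 1 subtrees
        self.values = []    # leaves only: values[i] belongs to keys[i]
        self.next = None    # leaves only: right sibling, for range scans


class BPlusTree:
    """Hypothetical toy B+ tree; `order` is the max number of keys per node."""

    def __init__(self, order=4):
        self.order = order
        self.root = Node(leaf=True)

    def get(self, key):
        node = self.root
        while not node.leaf:
            # keys[i] is the smallest key reachable through children[i + 1]
            node = node.children[bisect.bisect_right(node.keys, key)]
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node.values[i]
        return None

    def insert(self, key, value):
        split = self._insert(self.root, key, value)
        if split is not None:  # the root itself split: grow the tree by a level
            sep, right = split
            new_root = Node(leaf=False)
            new_root.keys = [sep]
            new_root.children = [self.root, right]
            self.root = new_root

    def _insert(self, node, key, value):
        if node.leaf:
            i = bisect.bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                node.values[i] = value  # overwrite an existing key
                return None
            node.keys.insert(i, key)
            node.values.insert(i, value)
        else:
            i = bisect.bisect_right(node.keys, key)
            split = self._insert(node.children[i], key, value)
            if split is None:
                return None
            sep, right = split
            node.keys.insert(i, sep)
            node.children.insert(i + 1, right)
        return self._split(node) if len(node.keys) > self.order else None

    def _split(self, node):
        mid = len(node.keys) // 2
        right = Node(node.leaf)
        if node.leaf:
            # Leaf split: the separator is copied up and stays in the leaf.
            right.keys, node.keys = node.keys[mid:], node.keys[:mid]
            right.values, node.values = node.values[mid:], node.values[:mid]
            right.next, node.next = node.next, right
            return right.keys[0], right
        # Internal split: the separator moves up and leaves this level.
        sep = node.keys[mid]
        right.keys, node.keys = node.keys[mid + 1:], node.keys[:mid]
        right.children, node.children = node.children[mid + 1:], node.children[:mid + 1]
        return sep, right

    def items(self):
        """Yield (key, value) pairs in key order by walking the leaf chain."""
        node = self.root
        while not node.leaf:
            node = node.children[0]
        while node is not None:
            yield from zip(node.keys, node.values)
            node = node.next
```

The part worth staring at is the difference between the two split cases: leaves keep every key (so scans never need to visit internal nodes), while internal nodes give up the separator to their parent.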

