If you're interested in working on projects like this, we are hiring - contact me tom at shopify.com.
Bolt was originally just a side project but we found that it fit our needs for our analytics database so we eventually swapped out LMDB for Bolt.
The way this is worded, it sounds like an improvement, but in reality bolt is no safer to use than LMDB was, it just lacks some configurable tradeoffs that were present in the original engine, and were safer to use than those that remain in bolt.
That said, I'm happy to see APPEND go ;)
I try to point out in the docs that DB.NoSync should only be used for offline bulk loading where corruption is typically less of an issue (since you can simply rebuild).
The wording was intended to make it sound like an improvement for people wanting smaller, more concise code bases. Bolt is 2KLOC vs LMDB's 8KLOC. Smaller surface area makes it easier to test thoroughly. It's definitely a tradeoff for people who want that really low level control though.
Note that APPEND mode won't allow you to corrupt the DB, despite what the doc says - you'll get a KEYEXIST error if you try to load out of order. We just left the doc unchanged to emphasize the importance of using sorted input.
I'd honestly suggest reading through the source code of some database projects. That's the best way to learn the structures and understand the engineering tradeoffs. Some are more approachable than others though. Bolt and LMDB are 2KLOC and 8KLOC, respectively. LevelDB is 20KLOC and RocksDB is 100KLOC, IIRC.
If you want to go further then I'd suggest porting one of those over to a language you know well. It doesn't have to be code you release to the world or use in production but just porting the code really helps to cement the concepts in my head.
To understand why the LMDB implementation does what it does will require much more practical knowledge, not theory. The references in my MDB papers (and the paper itself of course) will give you more of that. http://symas.com/mdb/#pubs
B+trees are trivially simple, in theory. Writing one that is actually efficient, safe, and supports high concurrency requires more than theory. It requires understanding of computer system architecture: CPU architecture/cache hierarchies, filesystem and I/O architecture, memory, etc.
It does not require 100KLOC though. RocksDB is just a pile of gratuitous complexity.