What kinds of use cases do you all use LMDB or other key-value stores for? I can somehow never really find a good one. Almost all databases, no matter how trivial, will sooner rather than later need more features so I always reach for SQLite instead.
Still, I'd really like to make use of it. But for what? Caching?
It was originally designed for LDAP, so there is one example. Using a relational system almost certainly means you're using key-value structures indirectly. That's how table indexes are often implemented: a map from some key to a row coordinate (or what have you.) LMDB has been used as a backend for SQLite.
Key-value stores are primitives. Sometimes your use case needs nothing more elaborate than this primitive. Sometimes you have no choice but to limit yourself to that primitive because there are no resources for anything more complex. Often you're using it as a component of a larger storage scheme.
Mail systems are an example where you're forever looking up senders and/or recipients for some policy. Recently I've dealt with GeoIP stuff, where you're constantly resolving network addresses against a huge map. Bloom filters are a great use of a key-value store for large data sets.
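For the GeoIP case, a minimal sketch of the lookup might look like the following. It is only an illustration: a sorted Python list stands in for an ordered key-value store, the ranges are assumed not to overlap, and the names `build_table` and `resolve` are made up here. In an ordered store like LMDB the same "greatest start address not above the query" step would be done with a cursor seek rather than `bisect`.

```python
import bisect
import ipaddress


def build_table(networks):
    """Turn (CIDR, label) pairs into rows sorted by range start."""
    rows = []
    for cidr, label in networks:
        net = ipaddress.ip_network(cidr)
        rows.append((int(net.network_address), int(net.broadcast_address), label))
    rows.sort()
    return rows


def resolve(rows, address):
    """Return the label of the range containing address, or None."""
    ip = int(ipaddress.ip_address(address))
    starts = [row[0] for row in rows]  # a real store would keep this ordering itself
    i = bisect.bisect_right(starts, ip) - 1  # last range starting at or before ip
    if i < 0:
        return None
    start, end, label = rows[i]
    return label if ip <= end else None


# Hypothetical data using documentation-only address blocks.
table = build_table([
    ("198.51.100.0/24", "doc-net-2"),
    ("192.0.2.0/24", "doc-net-1"),
])
print(resolve(table, "192.0.2.77"))
```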
Understand that often there is a lot of custom code around queries and other operations that use key-value stores. You might, for example, have some tuple that can map to multiple values in a sparse map. In that case you can iterate over variants of the tuple (selectively remove, normalize or otherwise alter parts of it,) and make multiple queries according to some precedence rule until you get a hit (or not.) There are many such tricks that will greatly extend the utility of a map.
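A rough sketch of that tuple trick, assuming a plain dict as a stand-in for the store, a made-up `|` key encoding, and `*` as the "any value" marker. The precedence rule here (fewest wildcards first, later fields wildcarded before earlier ones) is just one possible choice, and `candidate_keys` and `lookup` are hypothetical names.

```python
from itertools import product


def candidate_keys(key, wildcard="*"):
    """Yield encoded variants of key, from most specific to most general."""
    # True in a mask means "replace this field with the wildcard".
    masks = sorted(product((False, True), repeat=len(key)),
                   key=lambda m: (sum(m), m))
    for mask in masks:
        parts = [wildcard if hide else part for part, hide in zip(key, mask)]
        yield "|".join(parts).encode()


def lookup(store, key):
    """Query each variant in precedence order and return the first hit."""
    for candidate in candidate_keys(key):
        value = store.get(candidate)
        if value is not None:
            return value
    return None


# Hypothetical mail policy keyed on (sender domain, recipient).
policy = {
    b"example.com|alice@corp.test": b"allow",
    b"example.com|*": b"quarantine",
    b"*|*": b"default",
}
print(lookup(policy, ("example.com", "bob@corp.test")))
```

Each miss costs one extra point query, so this works best when the tuple is short and the store is fast at single-key reads.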
LMDB in particular is worth understanding, even if you never use it. It has an extremely simple design that essentially turns the OS page cache into an ACID database with concurrent readers that are neither blocked by writes nor in need of coordination, and it has no "transaction log." Pretty astonishing given that it boils down to about 10K LOC.
I've never used LMDB directly myself, but it's used as the backing store for stuff like some LDAP servers. You get a high-performance low-level layer, and anything extra you build on top of it yourself.
not sure why all the existing crates have gone so far out of date. i feel like generating bindings isn't that hard if you've done it before but can be a blocker for people who haven't so maybe this will help.
some of the existing crates had the -sys suffix but it didn't seem like they were actually checking for it on the system (i'm not sure that LMDB is installed by default on most systems anyway?) so opted to not use it.
chose to keep the version in step with LMDB (current looks like 0.9.70 in the lmdb.h).
will regenerate anytime i become aware of a version bump for LMDB.
LMDB Python maintainer here. 0.9.70 is not really the latest release. It isn't binary compatible with official "releases" though it has sneaked out in a few bindings.
Only slightly off topic, but do any of these support fast concurrent writes? LMDB really slows down for me when I'm trying to write a lot to it, and locking means I can't use concurrency to speed it up.
Batching helps, and if you can avoid syncing after every write that helps too. In practice I get great throughput, but then I’m OK with losing a small amount of data if my server dies during writes, as long as that doesn’t lead to corruption, which it doesn’t with MDB_NOSYNC and kernel-guided fsyncing.
> Symas LMDB is an extraordinarily fast, memory-efficient database we developed for the OpenLDAP Project. With memory-mapped files, LMDB has the read performance of a pure in-memory database while retaining the persistence of standard disk-based databases.
> Bottom line, with only 32KB of object code, LMDB may seem tiny. But it’s the right 32KB. Compact and efficient are two sides of a coin; that’s part of what makes LMDB so powerful.