Up to date Rust bindings for LMDB

Pesthuf · 2024-10-02T00:03:21.000000Z

What kinds of use cases do you all use LMDB or other key-value stores for? I can somehow never really find a good one. Almost all databases, no matter how trivial, will sooner rather than later need more features so I always reach for SQLite instead.

Still, I'd really like to make use of it. But for what? Caching?

topspin · 2024-10-02T02:16:10.000000Z

It was originally designed for LDAP, so there is one example. Using a relational system almost certainly means you're using key-value structures indirectly. That's how table indexes are often implemented: a map from some key to a row coordinate (or what have you.) LMDB has been used as a backend for SQLite.

Key-value stores are primitives. Sometimes your use case needs nothing more elaborate than this primitive. Sometimes you have no choice but to limit yourself to that primitive because there are no resources for anything more complex. Often you're using it as a component of a larger storage scheme.

Mail systems are an example where you're forever looking up senders and/or recipients for some policy. Recently I've dealt with GeoIP stuff, where you're constantly resolving network addresses against a huge map. Bloom filters are a great use of a key-value store for large data sets.
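A minimal sketch of the GeoIP case, assuming the data is a set of non-overlapping IPv4 ranges keyed by their start address. A `BTreeMap` from the Rust standard library stands in for an ordered key-value store; `GeoTable`, `lookup`, `sample_table` and the labels are hypothetical names made up for illustration. LMDB keeps keys sorted, so the same "greatest key not above the address" seek should map onto a cursor, though that is not shown here.

```rust
use std::collections::BTreeMap;
use std::net::Ipv4Addr;

/// Hypothetical range table: start address -> (end address, label).
/// Assumes the stored ranges never overlap.
struct GeoTable {
    ranges: BTreeMap<u32, (u32, &'static str)>,
}

impl GeoTable {
    fn new() -> Self {
        GeoTable { ranges: BTreeMap::new() }
    }

    fn insert(&mut self, start: Ipv4Addr, end: Ipv4Addr, label: &'static str) {
        self.ranges.insert(u32::from(start), (u32::from(end), label));
    }

    /// Seek to the range with the greatest start <= addr, then check its end.
    fn lookup(&self, addr: Ipv4Addr) -> Option<&'static str> {
        let key = u32::from(addr);
        let (_, &(end, label)) = self.ranges.range(..=key).next_back()?;
        if key <= end {
            Some(label)
        } else {
            None
        }
    }
}

/// Illustrative data only: two well-known private ranges.
fn sample_table() -> GeoTable {
    let mut t = GeoTable::new();
    t.insert(Ipv4Addr::new(10, 0, 0, 0), Ipv4Addr::new(10, 255, 255, 255), "private-10");
    t.insert(Ipv4Addr::new(192, 168, 0, 0), Ipv4Addr::new(192, 168, 255, 255), "private-192");
    t
}

fn main() {
    let table = sample_table();
    println!("{:?}", table.lookup(Ipv4Addr::new(10, 1, 2, 3)));
    println!("{:?}", table.lookup(Ipv4Addr::new(11, 0, 0, 1)));
}
```

Keying on the start address means one ordered seek per lookup, regardless of how many ranges the table holds.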

Understand that often there is a lot of custom code around queries and other operations that use key-value stores. You might, for example, have some tuple that can map to multiple values in a sparse map. In that case you can iterate over variants of the tuple (selectively remove, normalize or otherwise alter parts of it) and make multiple queries according to some precedence rule until you get a hit (or not). There are many such tricks that will greatly extend the utility of a map.
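The tuple-variant trick can be sketched roughly as follows, assuming a mail policy keyed by (sender, recipient). The `"*"` wildcard, the `"@domain"` key form, the precedence order and all function names are conventions invented for this sketch, and a `HashMap` stands in for the key-value store.

```rust
use std::collections::HashMap;

/// Hypothetical policy map keyed by (sender, recipient).
type PolicyMap = HashMap<(String, String), String>;

/// Reduce an address to "@domain", or to the "*" wildcard if it has no '@'.
fn domain_of(addr: &str) -> String {
    match addr.rsplit_once('@') {
        Some((_, domain)) => format!("@{}", domain),
        None => "*".to_string(),
    }
}

/// Query progressively less specific variants of the key until one hits.
fn lookup_policy<'a>(map: &'a PolicyMap, sender: &str, recipient: &str) -> Option<&'a str> {
    // Normalize first so that case differences do not cause misses.
    let s = sender.to_lowercase();
    let r = recipient.to_lowercase();
    let candidates = [
        (s.clone(), r.clone()),             // exact pair
        (domain_of(&s), r.clone()),         // sender domain, exact recipient
        (s.clone(), "*".to_string()),       // exact sender, any recipient
        (domain_of(&s), "*".to_string()),   // sender domain, any recipient
        ("*".to_string(), "*".to_string()), // default
    ];
    candidates.iter().find_map(|k| map.get(k).map(String::as_str))
}

/// Illustrative data only.
fn sample_policies() -> PolicyMap {
    let mut m = PolicyMap::new();
    m.insert(("boss@example.com".into(), "me@corp.test".into()), "allow".into());
    m.insert(("@example.com".into(), "*".into()), "greylist".into());
    m.insert(("*".into(), "*".into()), "accept".into());
    m
}

fn main() {
    let m = sample_policies();
    println!("{:?}", lookup_policy(&m, "Boss@Example.com", "me@corp.test"));
    println!("{:?}", lookup_policy(&m, "spam@example.com", "me@corp.test"));
    println!("{:?}", lookup_policy(&m, "x@other.org", "y@z.test"));
}
```

Each miss costs one extra point lookup, which is usually cheap next to maintaining a denser map with every combination spelled out.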

LMDB in particular is worth understanding, even if you never use it. It has an extremely simple design that essentially turns the OS page cache into an ACID database with concurrent readers that are neither blocked by writes nor require coordination, and it has no "transaction log." Pretty astonishing given that it boils down to about 10K LOC.

bvrmn · 2024-10-02T10:38:56.000000Z

LMDB is kinda unique in the KV segment. It's fast, gives access to zero-copy values, and is fully ACID. I use it for an inverted index, where SQLite quite sucks.
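For readers unfamiliar with the idea, here is one way an inverted index can be laid out on an ordered key-value store. This is a sketch under assumptions, not the scheme described in the comment above: each key is `term`, a zero byte, then a big-endian document id, with an empty value, so a term's posting list becomes a prefix scan. A `BTreeMap` stands in for the store, all names are hypothetical, and terms are assumed to contain no zero bytes.

```rust
use std::collections::BTreeMap;

/// Stand-in for an ordered byte-keyed store.
/// Key layout (an assumption of this sketch): term bytes, 0x00, doc id as 4 big-endian bytes.
struct InvertedIndex {
    kv: BTreeMap<Vec<u8>, ()>,
}

impl InvertedIndex {
    fn new() -> Self {
        InvertedIndex { kv: BTreeMap::new() }
    }

    fn make_key(term: &str, doc_id: u32) -> Vec<u8> {
        let mut key = term.as_bytes().to_vec();
        key.push(0);
        key.extend_from_slice(&doc_id.to_be_bytes());
        key
    }

    /// Index every whitespace-separated, lowercased term of `text`.
    fn add_document(&mut self, doc_id: u32, text: &str) {
        for term in text.split_whitespace() {
            self.kv.insert(Self::make_key(&term.to_lowercase(), doc_id), ());
        }
    }

    /// Prefix scan over `term 0x00`; ids come back in ascending order
    /// because big-endian bytes sort the same way as the integers.
    fn postings(&self, term: &str) -> Vec<u32> {
        let mut prefix = term.to_lowercase().into_bytes();
        prefix.push(0);
        self.kv
            .range(prefix.clone()..)
            .take_while(|(k, _)| k.starts_with(&prefix))
            .map(|(k, _)| {
                let t = &k[prefix.len()..];
                u32::from_be_bytes([t[0], t[1], t[2], t[3]])
            })
            .collect()
    }
}

/// Illustrative data only.
fn sample_index() -> InvertedIndex {
    let mut idx = InvertedIndex::new();
    idx.add_document(1, "fast zero copy reads");
    idx.add_document(2, "Fast writes");
    idx.add_document(300, "fast");
    idx
}

fn main() {
    let idx = sample_index();
    println!("{:?}", idx.postings("fast"));
    println!("{:?}", idx.postings("writes"));
}
```

Because every posting is its own small key, adding a document is a handful of inserts and reading a posting list is a sequential scan over neighbouring keys.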

The only issue with LMDB is a size reservation for a db file.

nightfly · 2024-10-02T01:51:52.000000Z

I've never used LMDB directly myself, but it's used as the backing store for stuff like some LDAP servers. You get a high-performance low-level layer, and anything extra you build on top of it yourself.

seanwatters · 2024-09-28T06:01:25.000000Z

not sure why all the existing crates have gone so far out of date. i feel like generating bindings isn't that hard if you've done it before but can be a blocker for people who haven't so maybe this will help.

some of the existing crates had the -sys suffix but it didn't seem like they were actually checking for it on the system (i'm not sure that LMDB is installed by default on most systems anyway?) so opted to not use it.

chose to keep the version in step with LMDB (current looks like 0.9.70 in the lmdb.h).

will regenerate anytime i become aware of a version bump for LMDB.

jnwatson · 2024-10-01T23:14:26.000000Z

LMDB Python maintainer here. 0.9.70 is not really the latest release. It isn't binary compatible with official "releases" though it has sneaked out in a few bindings.

Latest is 0.9.31.

bvrmn · 2024-10-02T10:43:36.000000Z

Thank you for your work! I'm a user of py-lmdb for 8 years.

montymintypie · 2024-10-02T04:56:02.000000Z

I always saw the "-sys" suffix as "raw, unsafe bindings", not pulling it from the system. And non-sys is your safe Rust abstraction.

Shawnecy · 2024-09-28T06:15:23.000000Z

> not sure why all the existing crates have gone so far out of date.

Pure speculation on my part: maybe people are choosing Rust-based embedded key value stores like sled or redb?

sbt567 · 2024-09-28T07:35:35.000000Z

Or https://lib.rs/crates/fjall

jszymborski · 2024-10-01T22:47:35.000000Z

Only slightly off topic, but do any of these support fast concurrent writes? LMDB really slows down for me when I'm trying to write a lot to it, and locking means I can't use concurrency to speed it up.

nahnahno · 2024-10-02T01:30:32.000000Z

Batching helps, and if you can avoid syncing after every write that helps too. In practice I get great throughput, but then I'm OK with losing a small amount of data if my server dies during writes, as long as that doesn't lead to corruption, which it doesn't with MDB_NOSYNC and kernel-guided fsyncing.
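A rough sketch of the batching idea, assuming that what you want to minimize is the number of write transactions committed. `FakeStore` is a stand-in that only counts commits; it is not an LMDB API, and `BatchWriter`, `write_all` and `max_batch` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Stand-in for a transactional store (not an LMDB API).
/// `commits` counts how many write transactions were committed.
#[derive(Default)]
struct FakeStore {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
    commits: usize,
}

impl FakeStore {
    /// Apply a whole batch as one "transaction".
    fn commit_batch(&mut self, batch: Vec<(Vec<u8>, Vec<u8>)>) {
        for (k, v) in batch {
            self.data.insert(k, v);
        }
        self.commits += 1;
    }
}

/// Buffers writes and commits them once `max_batch` have accumulated.
struct BatchWriter {
    pending: Vec<(Vec<u8>, Vec<u8>)>,
    max_batch: usize,
}

impl BatchWriter {
    fn new(max_batch: usize) -> Self {
        BatchWriter { pending: Vec::new(), max_batch }
    }

    fn put(&mut self, store: &mut FakeStore, key: &[u8], value: &[u8]) {
        self.pending.push((key.to_vec(), value.to_vec()));
        if self.pending.len() >= self.max_batch {
            self.flush(store);
        }
    }

    /// Commit whatever is buffered; call this before shutdown as well.
    fn flush(&mut self, store: &mut FakeStore) {
        if self.pending.is_empty() {
            return;
        }
        store.commit_batch(std::mem::take(&mut self.pending));
    }
}

/// Write `n` small records with the given batch size and return the store.
fn write_all(n: u32, max_batch: usize) -> FakeStore {
    let mut store = FakeStore::default();
    let mut writer = BatchWriter::new(max_batch);
    for i in 0..n {
        writer.put(&mut store, &i.to_be_bytes(), b"v");
    }
    writer.flush(&mut store);
    store
}

fn main() {
    println!("commits, batch of 100: {}", write_all(1000, 100).commits);
    println!("commits, batch of 1:   {}", write_all(1000, 1).commits);
}
```

The trade-off is the one described above: anything still sitting in `pending` is lost if the process dies before the next flush.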

mmastrac · 2024-10-01T22:53:21.000000Z

For those of us who don't know, LMDB is a DB, somewhat akin to sqlite or Berkeley DB. It took a bit of time to find the info by chasing links.

Please add READMEs to your projects with appropriate links, folks.

https://crates.io/crates/liblmdb <-- says it is bindings for LMDB

... which links to the repo

https://github.com/ordinarylabs/liblmdb

Which has a link to the LMDB repo here (just a submodule) ...

https://github.com/LMDB/lmdb/tree/9c9d34558cc438f99aebd1ab58...

Which has no info about the project, but links to

https://www.openldap.org/software/repo.html

Which also has no info about the project, but has an LMDB link which leads to

https://www.symas.com/lmdb

Which finally explains:

> Symas LMDB is an extraordinarily fast, memory-efficient database we developed for the OpenLDAP Project. With memory-mapped files, LMDB has the read performance of a pure in-memory database while retaining the persistence of standard disk-based databases.

> Bottom line, with only 32KB of object code, LMDB may seem tiny. But it’s the right 32KB. Compact and efficient are two sides of a coin; that’s part of what makes LMDB so powerful.

There's also a wikipage:

https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Databa...

diggan · 2024-10-01T23:12:00.000000Z

Funny how people use the web differently :)

I also saw "LMDB", didn't know what it was, searched LMDB on Kagi, got https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Databa... as the first hit. Bing and Google also have it near the top, so it seems to be relatively well known.