Aren't they interested in persistence of the key-value data? In my experience, once data is persisted to disk or SSD, LMDB is way slower than alternatives because it needs to operate in synchronous mode to avoid corruption (effectively flushing to disk after every committed transaction). If operated in the non-default MDB_NOSYNC mode (which is the mode chosen in the above benchmarks), there is a high probability of being left with an unreadable database file after a crash, thus losing all your data.
It is not fair to compare against other databases in sync mode, since they might operate safely but faster in async mode. For example, SQLite with PRAGMA journal_mode=WAL and PRAGMA synchronous=NORMAL can operate in semi-asynchronous mode (fsync()ing sporadically) without fear of corruption in case of a crash, because it keeps a WAL journal and is able to properly roll back after a crash. This should be much faster than LMDB's default-and-safe synchronous mode, which msync()s on every value written.
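For concreteness, here is a minimal sketch of that configuration via the standard sqlite3 C API (the function name is my own, and most error handling is elided):

  #include <sqlite3.h>

  int open_safe_fast(const char *path, sqlite3 **db) {
      int rc = sqlite3_open(path, db);
      if (rc != SQLITE_OK) return rc;
      /* WAL: commits append to a log; after a crash the main DB file is
         always consistent, at worst the last few transactions are lost */
      sqlite3_exec(*db, "PRAGMA journal_mode=WAL;", 0, 0, 0);
      /* NORMAL: fsync() only at WAL checkpoints, not on every commit */
      sqlite3_exec(*db, "PRAGMA synchronous=NORMAL;", 0, 0, 0);
      return SQLITE_OK;
  }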
I would remove MDB_NOSYNC then, and see how it goes...
And LMDB beats the crap out of SQLite, in any mode.
Replacing SQLite's B-tree engine with LMDB makes SQLite's footprint smaller, and makes it faster and more reliable too.
Alright, I went through this page to see why my experience is different. While it mentions that they enabled SQLite's WAL journal, it also mentions that the synchronous writes were performed with PRAGMA synchronous=FULL.
I believe that if they set PRAGMA synchronous=NORMAL, they will get an ACI-reliable database (trading away only some durability: a crash may lose the last few transactions, but never corrupts the file) that is way faster than LMDB with write-to-disk workloads.
Most of the measurements on that page are not really useful to me; either they do not persist the data (DB on tmpfs) or they do not care about reliability and use dangerous settings. Only a few measurements both persist the data and write safely, but for those the SQLite configuration is sub-optimal.
For my purposes as an application developer, I care about comparing databases operating in safe mode, i.e. a system crash should never cause total data loss. In my experience, SQLite's safe mode is many times faster than LMDB's safe mode with a write workload, while LMDB thrashes the disk and achieves only a handful of write transactions per second.
LMDB is not.
"using a 512GB Samsung 830 SSD and an ext4 partition.
The actual drive characteristics should not matter because the test datasets still fit entirely in RAM and are all using asynchronous writes"
Here is the comparison back in 2017: https://blog.dgraph.io/post/badger-lmdb-boltdb/
It supports concurrent ACID transactions with serializable snapshot isolation (SSI) guarantees.
I assumed this meant an API callable from webasm/js. Did I miss something?
"Not-yet or never goals for this proposal are:
Standardization via a standards body as a web API."
So this is "internal" stuff I guess.
Edit: sadly, it looks like you've been posting quite a few uncivil comments to HN. We ban accounts that do that, so would you please review https://news.ycombinator.com/newsguidelines.html and follow the rules from now on?
The idea here is: if you have a substantive point to make, make it respectfully and thoughtfully; if you don't, please don't comment until you do.
Regarding NFS: I have recently started testing LMDB on NFS v4 and had no issues so far, but with a single process using the database. AFAIK the warning at http://www.lmdb.tech/doc/ is only for multiple processes using the DB concurrently. I am still not entirely sure there won't be any mmap-related issues, but so far so good.
Regarding "being careful": this is a very important point. The LMDB API does not hold your hand, it lets you do dangerous things which will result in corruption of your database, which you will discover too late. I suggest writing a wrapper around the API to ensure you are using it correctly. (I wish there was a compile flag like LUA_USE_APICHECK  for LMDB, which could help detect problems like this, but there isn't.)
Not sure what you're talking about re: the API letting you corrupt your database.
LMDB automatically detects read-only filesystems, and turns off its locking in that case, so it should perform as well as anyone could expect NFS to perform.
C doesn't have destructors. If you free stuff in the wrong order, you lose. That's nothing to do with LMDB's API. Even if LMDB cleaned everything up internally, you'd still have dangling pointers in your app space. No language or API can prevent that.
The only thing you can do in the API that might possibly corrupt the DB is to use MDB_RESERVE and then store a value bigger than the space you reserved. That generally causes a SEGV, and you'll discover very quickly that you have a bug in your code. LMDB will fail fast, every time, and every failure will be a bug in your own code. Makes debugging very quick.
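For reference, the safe MDB_RESERVE pattern looks like this (a fragment; txn, dbi, id, buf, and len are assumed to be in scope):

  MDB_val key = { sizeof(id), &id };
  MDB_val val;
  val.mv_size = len;                      /* how much space to reserve */
  int rc = mdb_put(txn, dbi, &key, &val, MDB_RESERVE);
  if (rc == MDB_SUCCESS)
      memcpy(val.mv_data, buf, len);      /* writing more than len bytes
                                             is the corruption case above */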
To be honest, the only real corruption issues I had with LMDB in practice were due to my using MDB_NOSYNC, and broken mmap behavior on some Android devices (on external storage).
1) what kind of problems do databases actually face.
2) what kind of scenarios create those problems.
3) how does a programmer go about testing them?
So, let's say you have a client/server application... the client is telling the server (database) to write some records to the database. In the middle of the write, you pull the plug. Some questions you'd want answered: what does the database look like when it restarts? Can we read it? What is the current state? Did any of the new data get written? What does the client think was written? If there was an uncommitted database transaction, was the database left unaltered?
It's just as important to test the client in these scenarios. While the server may have crashed, what does the client think happened? Was it waiting for an ACK or "OK" message? Did it get the message? If the update failed, what does the client do in that situation?
Things can get even more complicated if you're thinking of replication across different servers. If one of the servers fails, how does the replication work? Do sessions fail over to other servers? How many servers are required? If there was a corrupted record, did it propagate or was it scrubbed?
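As a rough illustration, here is a minimal sketch of the pull-the-plug test described above, with hypothetical open_db/write_one_record/verify_db helpers standing in for whatever database is under test. Note that SIGKILL only kills the process; the OS page cache survives, so testing real power loss still requires a VM reset or an actual power cut:

  #include <signal.h>
  #include <stdlib.h>
  #include <sys/wait.h>
  #include <unistd.h>

  /* Hypothetical helpers for the database under test. */
  extern void *open_db(const char *path);
  extern void  write_one_record(void *db, long i);  /* one committed txn */
  extern int   verify_db(const char *path);         /* 0 = readable, consistent */

  int main(void) {
      pid_t pid = fork();
      if (pid == 0) {                      /* child: commit records forever */
          void *db = open_db("test.db");
          for (long i = 0; ; i++)
              write_one_record(db, i);
      }
      usleep(rand() % 1000000);            /* let an arbitrary amount happen */
      kill(pid, SIGKILL);                  /* "pull the plug" on the writer */
      waitpid(pid, NULL, 0);
      /* Can we reopen the DB? Which records survived? Anything half-written? */
      return verify_db("test.db");
  }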
To you and others: are there any other scenarios that happen in production?
The disk becomes inoperative during a write; this can be silent, or writes may start returning errors. Again, what does the database look like after the problem is resolved?
A large operation exceeds the capacity of the server to deal with intermediate state. It runs out of memory or disk, or, in some less robust DBs, it loses control of some locks and gets deadlocked. Can it recover with only the partial log data?
Disks lie about data being written; what happens if one of these problems happens between the disk saying the data was written and it actually getting written?
And, of course, when you move beyond a single server things get way more complex.
There is more than one implementation of LevelDB. This one (https://github.com/syndtr/goleveldb), in Go, is used in major projects such as https://github.com/ethereum/go-ethereum.
Batch writes provide a tiny subset of the full possibilities of transactions. While sufficient in many cases, that cannot be generalized to "LevelDB supports transactions".
Schizophrenic people don’t accept facts like “this wall exists”, yet we still reach agreement as a society that a wall exists and don’t try to walk through walls because they exist.
You are of course welcome to hold any belief you wish, but believing something that puts you in direct contradiction to an entire industry significantly raises the bar of proof you must provide in order to convince others to listen to you.
Your reply does not provide that proof, and thus your argument is not persuasive.
But that's just like, the industry's opinion, man.
Because if I read this, I could conclude LMDB could have trouble in those areas...
"One appealing aspect of LMDB is its relative ease of use from multiple processes, above and beyond its basic capabilities as yet-another-fast-key-value-store."
simdb is only lock-free and thread-safe. While LMDB is benchmarked at around 10k writes per second, simdb should be able to do millions of mixed reads and writes per second with 4 modern cores. LMDB seems to use a separate lock file to sync multiple threads/processes. The only catch here is that simdb's keys aren't sorted, which doesn't seem to be a requirement of theirs.
Disclaimer: I'm one of the last people to make functional changes to mork.
"We propose ‘buying’, not building, the core of such a solution, and wrapping it in idiomatic libraries that we can use on all platforms.
We propose that LMDB is a suitable core (see Appendix A for options considered): it is compact (32KB of object code), well-tested, professionally maintained, reliable, portable, scales well, and is exceptionally fast for our kinds of load. We have engineers at Mozilla with prior experience with LMDB, and their feedback is entirely positive."
Who knows how it should work on 32-bit systems?
And isn't endianness also a problem? And doesn't SQLite solve both problems by default? And isn't it also possible to configure SQLite to be very fast, if one knows what one is doing?
Btw, I'd have expected the "notes here" https://docs.google.com/document/d/1bwbpqPb58a0GcEyB4W424pyf... to contain conclusions, but the document does not seem to be publicly accessible.
And what about the limitations on 32-bit systems? Doesn't LMDB also need to map address space for the complete size of the database? That is how it works, if I understood correctly, which makes LMDB effectively unsuitable for 32-bit systems:
"your user will need to either enable PAE on their system, or upgrade to 64-bit CPU. If neither of these is an option in your application, then you cannot use a memory mapped file larger than your available address space"
In short, it still looks like LMDB is effectively designed for a single endianness and only for 64-bit systems, which is still limiting for many use cases.
Again, it seems that LMDB is "solving" the problem by ignoring it. Which is OK if it fits your use case... But shouldn't Firefox work properly on 32-bit systems? Or did they completely decide that they don't want to target any 32-bit platform anymore?
Moreover, there are use-case scenarios where the memory-mapped-file approach can suffer from unnecessarily reading a page that will be completely overwritten anyway. My conclusion: if LMDB "works for you", fine, but do your research properly first. I still believe SQLite covers many more use cases and is a safer starting point for most of them, including Firefox's, unless I read that they have really decided to reduce those.
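For context, the address-space reservation in question is fixed when the environment is opened; a minimal sketch (the 1 GiB size is illustrative), which on a classic 32-bit build, without the MDB_VL32 option mentioned below, has to fit inside the process's usable virtual address space:

  #include <lmdb.h>

  MDB_env *env;
  mdb_env_create(&env);
  /* On 32-bit this must fit in the ~2-3 GiB of usable address space:
     1 GiB is plausible, 10 GiB would simply fail to map. */
  mdb_env_set_mapsize(env, 1UL << 30);
  mdb_env_open(env, "./db", 0, 0664);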
There are no limitations on 32-bit systems using MDB_VL32 in the 1.0 branch; you're spewing crap without any idea what you're talking about. Whoever's writing on Stack Overflow doesn't know what they're talking about either.
> Moreover, there are use-case scenarios where the memory-mapped-file approach can suffer from unnecessarily reading a page that will be completely overwritten anyway.
That's only true if you use a writable mmap, and the DB is larger than RAM. The default for LMDB is not to use a writable mmap, so this isn't an issue that affects most LMDB users. It is also already documented, so you choose it at your own risk.
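For reference, the writable mmap is the opt-in MDB_WRITEMAP flag at environment open time; a sketch of the two modes:

  /* Default: read-only mmap; dirty pages are written out with write(),
     so a page about to be fully overwritten is never faulted in from disk. */
  mdb_env_open(env, "./db", 0, 0664);

  /* Opt-in writable mmap: faster, but with the read-before-overwrite
     caveat above once the DB no longer fits in RAM. */
  mdb_env_open(env, "./db", MDB_WRITEMAP, 0664);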
Yes and no, respectively. Yes, Firefox should work properly on 32-bit systems; and no, we didn't completely decide that we don't want to target 32-bit platforms anymore.
Rather, Firefox plans to use LMDB where it fits one of its many use cases for persistent storage (and not use it where it doesn't).
The StackOverflow thread you referenced describes a blockchain program that expects to use "a few GB of lmdb diskspace." Whereas Firefox's use cases for LMDB are sized in the range of a few KiB to a few MiB. Firefox has no plans to start storing a blockchain.
Claimer: I'm the engineer integrating LMDB into Firefox.
I’ve successfully used LMDB in 32-bit land, though it takes some effort. I had to scan available virtual memory for the largest contiguous chunk and use that.
Bigger issues are growing the database size and performance in low memory. No transactions can be running to increase the map size, which is usually hard to coordinate.
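For illustration, one way to coordinate this (the txn_gate rwlock is my own construct, not LMDB API): have every transaction hold a process-wide lock in shared mode for its lifetime, and take it exclusively before growing the map, since mdb_env_set_mapsize may only be called while no transactions are active in the process:

  #include <pthread.h>
  #include <lmdb.h>

  /* Every txn holds this in shared mode from begin to commit/abort. */
  pthread_rwlock_t txn_gate = PTHREAD_RWLOCK_INITIALIZER;

  int grow_map(MDB_env *env, size_t new_size) {
      pthread_rwlock_wrlock(&txn_gate);  /* blocks until no txns are live */
      int rc = mdb_env_set_mapsize(env, new_size);
      pthread_rwlock_unlock(&txn_gate);
      return rc;
  }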
Also, in low-memory situations LMDB's performance suffers tremendously. It can be 100x slower, and commits can take seconds. You won't run into it unless you are really hammering it, and developers usually won't notice because they tend to have lots of RAM.
Even when the DB is 5x or 50x larger than RAM...
If any single commit takes multiple seconds, you've probably written too much data in a single transaction. Above about 512MB it starts preemptively flushing intermediate pages out, to keep the in-RAM footprint limited. Some of those intermediate pages will need to be read back in if your transactions are hitting all over the DB. This is where the main slowdown comes.