Hello HN! I'm building mvsqlite, a distributed variant of SQLite with MVCC transactions, that runs on FoundationDB. It is a drop-in replacement that just needs an `LD_PRELOAD` for existing applications using SQLite.
I made this because Blueboat (https://github.com/losfair/blueboat) needs a native SQL interface to persistent data. Apparently, just providing a transactional key-value store isn’t enough - it is more easy and efficient to build complex business logic on an SQL database, and it seems necessary to bring a self-hostable distributed SQL DB onto the platform. Since FoundationDB is Blueboat’s only stateful external dependency, I decided to build the SQL capabilities on top of it.
At its core, mvsqlite’s storage engine, mvstore, is a multi-version page store built on FoundationDB. It addresses the duration and size limits (5 secs, 10 MB) of FDB transactions, by handling multi-versioning itself. Pages are fully versioned, so they are always snapshot-readable in the future. An SQLite transaction fetches the read version during `BEGIN TRANSACTION`, and this version is used as the per-page range scan upper bound in future page read requests.
For writes, pages are first written to a content-addressed store keyed by the page's hash. At commit, hashes of each written page in the SQLite transaction is written to the page index in a single FDB transaction to preserve atomicity. With 8K pages and ~60B per key-value entry in the page index, each SQLite transaction can be as large as 1.3 GB (compared to FDB's native txn size limit of 10 MB).
mvsqlite is not yet "production-ready", since it hasn’t received enough testing, and I may still have a few changes to make to the on-disk format. But please ask here if you have any questions!
I’m guessing virtual table extensions work with this since you’re just replacing the storage engine? So we could in theory use FTS5 and even OSQuery and other extensions right?
However since this is using FoundationDB I’m also guessing we can’t use this as a serverless embedded DB since since you’ll probably need a foundation db cluster to use this. Is that right?
So if I understand correctly this is a SQLite query engine on top of FoundationDB with distributed transactions and we can theoretically use SQLite ecosystem stuff like FTS5 and datasette on top of it.