Regarding the LSM engine, you can find all the relevant implementation details here: https://sqlite.org/src4/doc/trunk/www/lsm.wiki#summary
> The in-memory tree is an append-only red-black tree structure used to stage user data that has not yet flushed into the database file by the system. Under normal circumstances, the in-memory tree is not allowed to grow very large.
SQLite: The Database at the Edge of the Network with Dr. Richard Hipp
> The in-memory tree is an append-only red-black tree structure used to stage user data that has not yet flushed into the database file by the system.
Hmm, ok, so this contradicts my assumption. Actually, now that I think about it, other LSMs like rocksdb / leveldb work like this too (library-like model with some in-memory component when you "open" the database).
Anyway, without diving into the details of the code, there's other technical decisions that would affect this stuff.
One thing is how big your in-memory structure is (in relation to available memory and insertion workload) and how often you flush to disk is a key thing. Another thing is what your LSM tree looks like - aside from the data structure, how many tiers/levels you have is a big thing. I assume some of these are configurable parameters. E.g. rocksdb has an enormous set of parameters that handles this stuff. It's also annoying to tune.
I found this benchmark here that is illustrative: https://sqlite.org/src4/doc/trunk/www/lsmperf.wiki
The first graph is underwhelming, but when you adjust the buffers (look at the last graph) ~250k writes / second constant regardless of database size (this is why you want an LSM tree) is darned good! And this is on a spinny drive, not an SSD. Their "large" buffer sizes aren't even that large IMHO.
So maybe his mention that the LSM storage was underwhelming was overblown :-) I don't know.
Another difference is with other LSM-based systems that aren't just key-value, it's usually in the context of column stores: you keep a separate LSM for each column family (could be 1-n columns). But I can't think off the top of my head how this would cause a difference. Perhaps in how reads happen - the query engines work quite differently.
Anyway, my talk is cheap, I'm just guessing here, actually doing the analysis is the hard work :-) Also, I'm something of an amateur currently, so take my words with a grain of salt. Anyone else have any ideas re: this?
One big difference is that fossil includes wiki and ticketing.
Philosophy differs: Fossil intentionally limits "distributedness". For example, fossil push/pull always transfers all branches with their name. Private branches are considered an anti-feature.
Minor differences are the licence (GPL vs BSD) and the data store (files vs sqlite). Under some circumstances these details matter, but not for the majority of developers.
The rest is not significant, imho. For example, "Lots of little tools vs stand-alone executable". Who cares? In both cases you type "$VCS $SUBCOMMAND $PARAMETERS" commands.
Stand-alone executable is pretty significant. Git is available on most servers -- fossil is not. If it's packaged in your OS, it's often outdated. Stand-alone kinda makes up for this as you can easily get the latest version with a wget & chmod on any computer, on all 3 platforms.
As for sqlite, it is an astoundingly solid rdbms that is well battle-tested. I consider that a big difference.
He refers to himself as D. Richard Hipp.
The weird error may have been mine.