
Replicating SQLite using Raft Consensus - xena
http://www.philipotoole.com/replicating-sqlite-using-raft-consensus/
======
otoolep
Hello -- rqlite author here.

It was definitely interesting to code this. It was 50% to meet a need, 50% to
better understand Raft consensus in general, and Ben Johnson's go-raft
implementation in particular
([https://github.com/goraft/raft](https://github.com/goraft/raft)). As it
happens Ben and I now work together, on InfluxDB
([http://influxdb.com/](http://influxdb.com/)), which has a new Raft
implementation at its core.

Definitely fun.

------
teraflop
This exact problem was the final challenge of last year's Stripe CTF contest.
They put together a really cool test harness that would systematically probe
for any possible way that an unreliable network could cause the system to
fail. In order to complete the challenge, you had to achieve a desired
throughput without ever allowing the replicas to become inconsistent.

Unfortunately Stripe doesn't seem to have kept the main site[1] online, but
the test harness is still available on Github[2].

[1]: [https://stripe-ctf.com/](https://stripe-ctf.com/) [2]:
[https://github.com/stripe-ctf/octopus](https://github.com/stripe-ctf/octopus)

------
kijin
> _While it is possible to continually copy the SQLite file to a backup server
> every time it is changed, this file-copy must not take place while the
> database is being accessed._

What about the `.dump` command?

I've been using it to backup SQLite databases since forever. It produces an
SQL script (similar to `mysqldump`) that you can import back into another
SQLite database, or even into another RDBMS. It's much more elegant than
copying the database file itself.

But come to think of it, I never really checked whether the `.dump` command
produces a consistent backup, and the official documentation doesn't seem to
answer that, either. If you're familiar with the internals of SQLite, please
let me know whether I should keep trusting `.dump`.

~~~
jakobegger
I'm not sure a backup that was never tested qualifies as backup.

~~~
kijin
The backups do restore properly all the time, I made sure of that.

I simply don't have enough transaction volume to test all the edge cases that
might produce an inconsistent dump (and if I did, I wouldn't be using SQLite).

------
StavrosK
SQLite is amazing. I wonder if it's usable as a replacement for Postgres for
web apps, I'm not sure what happens when multiple processes try to write to
it.

Regardless, this is a very interesting project, good job.

~~~
pavpanchekha
Multithreaded use is not supported. SQLite is not a replacement for
Postgres—Postgres supports many advanced data structures (lists!), SQL
commands, and is faster. But in certain cases, it is enough, and of course
setting up SQLite is much easier.

~~~
teraflop
> Multithreaded use is not supported

To be more precise, it's perfectly legal to use SQLite from multiple threads
or processes at the same time. It will use its own mutexes and OS file-locking
operations to ensure that at any given time, a database is being accessed by
either a single writer, or an arbitrary number of readers.

In my opinion, the bigger issue is that clients can only safely access a
SQLite database if it exists on a local filesystem. As soon as you decide you
want to run multiple web servers, you're in trouble.

~~~
StavrosK
Sorry, yes, that's what I meant. I use it for a mostly-read-only Web app, but
wasn't sure if it's a disaster waiting to happen. Good to know it's a viable
alternative to postgres in multithreaded or multiprocess scenarios.

~~~
reitanqild
I ran it as a replacement for MySQL for 3 years, light usage php web app,
around 100 - 200 users I guess.

It saved me quite some hassle and reduced the attack surface compared to
MySQL.

FWIW I never bumped into speed issues. My biggest issue against it is that it
would let me alter columns directly which effectively sabotaged my use of
alembic when I started using Python/flask.

~~~
StavrosK
Huh, that sounds like it worked very well. Plus, the reduced overhead of
backups etc is definitely worth it for small apps.

------
tveita
This looks like statement based replication, which can come out of sync if
you're not careful to make your statements deterministic.

Any use of random(), as a trivial example.

------
lsb
This is pretty cool!

Interestingly, once SQLite 4 comes out you'd probably implement this the
opposite way, using Raft consensus (and something like etcd) for block
storage, and using the SQL engine over arbitrary pluggable block storage.

~~~
andrewstuart
sqlite4 seems not to be moving so fast - has it stalled?

------
biokoda
ActorDB takes this concept much further. It replicates the WAL file (not sql
statements).

[http://www.actordb.com](http://www.actordb.com)

[https://github.com/biokoda/actordb](https://github.com/biokoda/actordb)

------
shazeline
I skimmed the source and didn't see anything regarding leader election. Am I
missing something or is this implementation encapsulating some of the
underlying terminology/analogies traditionally used by Raft?

~~~
masklinn
Wouldn't that be done by the underlying Raft implementation (goraft)?

~~~
otoolep
Yes, that's right. 'go-raft' is imported by rqlite, and uses that package for
consensus and leader election.

