
Distributed SQLite for Go applications - ofrzeta
https://github.com/CanonicalLtd/dqlite
======
lobster_johnson
It's fascinating how Raft has democratized distributed computing -- writing a
consistent, distributed state machine can now be done in a few lines of code,
assuming you have a Raft implementation lying around, like Hashicorp's
excellent Go library.

What this project (and similar projects such as Rqlite) doesn't address is the
hard problem -- sharding. Raft makes it quite trivial to write a master/slave
system where every node is guaranteed to be identical, but it won't scale
writes and won't distribute storage across your cluster.

You can build sharding on top of Raft, but in an SQL setting this would
require distributed transactions, something that's difficult enough that
relatively few projects have tackled it so far, the notable ones being
CockroachDB and TiDB. (Google Spanner, being proprietary, is not relevant in
this context.)

~~~
jxub
An excellent interactive explanation of Raft can be found here:
[http://thesecretlivesofdata.com/raft/](http://thesecretlivesofdata.com/raft/).
It would be interesting as hell to build a small implementation in the
language of choice (elixir?, pony?).

~~~
usermac
That was good to see. Thank you.

------
hardwaresofton
Damn it, I was going to work on exactly this -- my goals were slightly
different but basically the same paxos/raft/gossip + sqlite and see how far I
could take it.

Note for those who are interested in cool-stuff-with-SQLite, there's a similar
project called RQLite (mentioned in the README):

[https://github.com/rqlite/rqlite](https://github.com/rqlite/rqlite)

The main differences are laid out in the README, but IMO it all boils down to
the fact that since it's single-writer and quorum'd, there's only one node
actually doing writes, and they've chosen to replicate right before (right
after?) the WAL commit (I think this is what they're calling frames --
basically one chunk of the WAL content), rather than at the point of receiving
a query. This is more like having a read secondary than anything else (I also
assume writes are redirected to the master).

It says in the README that it should be expected to be slower than RQlite; I'd
love to see some numbers.

I do really like this though -- can't wait to pore over the code and see if
there's anything I can learn.

~~~
free-ekanayaka
Hello, author of dqlite here.

Things are still in flux, but yes, I plan to publish benchmarks before making
a 1.0 release, as well as to improve the documentation and introduce some more
abstractions to make it easier to use.

The reason it might be slower in some cases is that RQlite replicates
statements while dqlite replicates WAL frames (where "frames" roughly means
"disk pages"), which are typically bigger.

I didn't run benchmarks against RQlite yet, but I'd expect performance
differences to be negligible for most use cases.

~~~
hardwaresofton
Hey thanks for putting this out there, it's a pretty cool project.

I see what you meant by it being slower, but that's only replication speed,
right? Maybe that should be pointed out in the documentation when you find
time to update it.

Does it seem like the replication patch is going to land in SQLite? I sure
would like it to...

~~~
free-ekanayaka
"slower" means this:

when you perform a SQLite write, SQLite writes a new page on disk to the its
write-ahead log. dqlite needs to replicate that page write across all nodes. A
page is typically 4kb, so when you do a write on the leader node you need to
transfer 4kb across the wire to all other follower nodes and wait for a quorum
of them to come back and say "I got it" (which includes writing the page to
disk). On the contrary rqlite just needs to transfer and store the SQL text,
which is typically less than 4kb.

In practice it's still pretty performant, but I'll publish benchmarks later
on.

I plan to submit the patch to SQLite upstream too, yes.

------
bbayer
SQLite is a fantastic piece of software. I use it in every single side
project of mine. I can't describe how satisfied I am with the results, even
under relatively high load (300K pages/day on a $5 Linode VPS). I've found
that most of the time a full-featured RDBMS isn't necessary, depending on the
use case. (I'm doing mostly reads.)

~~~
majewsky
> 300K pages/day

Does that qualify as "high load"? That's just below 4 pages/second. Given how
beefy server CPUs and SSDs are, I would consider something like 100+
requests/second "high load".

~~~
bbayer
You are right. That is why I said "relatively". It is relatively high for my
setup ($5 Linode with 2gb RAM running a 4-instance Gunicorn server).

------
nudpiedo
I wonder how this kind of project can be tested to ensure correctness across
the intended use cases and error cases.

Does anyone know a resource for learning more about testing in this kind of
situation? I hope there is an alternative to just trying all the cases on
different real hardware setups (which is the mantra at my company).

~~~
lifty
One very good resource are the writings of Aphyr where he tests various
distributed databases:
[https://aphyr.com/tags/jepsen](https://aphyr.com/tags/jepsen) . Jepsen is the
framework he wrote to test them. Next would be formal verification but you get
very far by understanding the distributed db semantics and applying practical
tests.

------
notamy
After looking through the repo, I'm still not sure exactly _why_ you would
want / need this. What exactly is the use-case here?

~~~
free-ekanayaka
If your application is:

1) Distributed (e.g. n equal processes running on n machines)

2) Needs some shared state across nodes

3) Would like that state to have SQL semantics (relations, transactions, etc.)

4) Not too heavy on writes to this shared state (I might provide some ballpark
numbers at some point)

5) Wants to avoid the operational overhead of an external storage system
(mysql, postgresql)

6) Wants to be fault-tolerant and have transparent failovers

then dqlite might be a good choice.

~~~
notamy
Interesting. Thank you for the explanation!

------
PirateBay
Projects like this (sparsely documented, tiny) always seem to be internal
tools that the company threw out into the world just because they can. Not
that I'm ungrateful, but without any sort of explanation as to why I'd use
this over a clustered MySQL or Postgres, it's a little hard to make heads or
tails of whether I should care.

~~~
eloff
It's work someone else did and shared with you for free. It strikes me as lazy
to complain that you don't know what to do with it and want everything
spoon-fed to you. It's Go + SQLite + Raft. That means it's an embedded
database for Go that can be used across a cluster of Go processes while
maintaining ACID goodness. I got that out of the README in 2 minutes of
looking at it; I think you could manage the same in the time it took you here
to make a new account and comment.

~~~
PirateBay
I don't remember ever complaining, just saying I have no idea what it's even
useful for. This is usually where a good, comprehensive, thoughtful README
helps people figure out why they should care about a project over others that
do the same thing, and what sets it apart. But I guess expecting explanations
for libraries is spoon-feeding. Oh well!

------
throwaway5752
How does it compare to RocksDB?

edit: I'm sorry, I was thinking of BedrockDB.

~~~
free-ekanayaka
BedrockDB requires you to operate a separate process, whereas you can embed
dqlite in your Go application (very much in the SQLite philosophy). Also,
AFAIK BedrockDB patches upstream SQLite with more intrusive changes than
dqlite does (e.g. to support concurrent writes). The SQLite patch that dqlite
requires is pretty minimal and just adds hooks to internal WAL events.

~~~
anentropic
so the use case is: you're building a distributed Go application that needs
some shared state in the form of an SQL db?

~~~
free-ekanayaka
Yes, dqlite is currently used for clustering support in LXD 3.0:
[https://github.com/lxc/lxd](https://github.com/lxc/lxd)

------
c17r
Reminds me of the last Stripe CTF. I miss those.

