
Dqlite – High-Availability SQLite - stubish
https://dqlite.io/
======
bb88
Interesting nugget about golang from their FAQ:

[https://github.com/canonical/dqlite/blob/master/doc/faq.md](https://github.com/canonical/dqlite/blob/master/doc/faq.md)

Why C?

The first prototype implementation of dqlite was in Go, leveraging the
hashicorp/raft implementation of the Raft algorithm. The project was later
rewritten entirely in C because of performance problems due to the way Go
interoperates with C: Go considers a function call into C that lasts more than
~20 microseconds as a blocking system call, in that case it will put the
goroutine running that C call in waiting queue and resuming it will
effectively cause a context switch, degrading performance (since there were a
lot of them happening). See also this issue in the Go bug tracker.

The added benefit of the rewrite in C is that it's now easy to embed dqlite
into project written in effectively any language, since all major languages
have provisions to create C bindings.

~~~
MuffinFlavored
> since all major languages have provisions to create C bindings.

Does WebAssembly or any of its runtimes provide a way to do this?

~~~
szczepano
you can probably compile c to webassembly using clang/llvm

~~~
derefr
That's not what the parent was asking. They were asking if WASM has an FFI
interface for runtimes that want to _not_ execute the WASM code entirely
sandboxed from the host, but rather want to allow you to dlload(2) code into
the OS process hosting the WASM interpreter and call it through WASM ops.

Presumably,
[https://github.com/CraneStation/wasmtime](https://github.com/CraneStation/wasmtime)
would benefit from such an FFI API being specified.

~~~
zbentley
That sounds like ActiveX 2.0 at best, and an utter security nightmare at
worst.

------
emmelaich
I was going to ask what is the difference between this and rqlite, which also
uses Raft.

Found the answer on Reddit:

> rqlite is a full RDBMS application, but dqlite is a library you must link
> with other code. It's like the difference between MySQL and libsqlite3.so.

* [https://www.reddit.com/r/golang/comments/8a8h8y/dqlite_distr...](https://www.reddit.com/r/golang/comments/8a8h8y/dqlite_distributed_sqlite_for_go_applications/dxg4why/)

~~~
hardwaresofton
There's one more big distinction, rqlite's replication is command based [0]
where as dqlite is/was WAL frame-based -- so basically one ships the command
and the other ships WAL frames. This distinction means that non-deterministic
commands (ex. `RANDOM()`) will work differently.

It looks like dqlite's documentation has changed -- for some reason frames are
no longer mentioned anywhere[2]. So _maybe_ this isn't the case any more, but
this was once the biggest differentiator for me.

[0]:
[https://github.com/rqlite/rqlite/#limitations](https://github.com/rqlite/rqlite/#limitations)

[1]:
[https://webcache.googleusercontent.com/search?q=cache:p1XBgh...](https://webcache.googleusercontent.com/search?q=cache:p1XBghg1YZEJ:https://github.com/CanonicalLtd/go-
dqlite/blob/master/README.md+&cd=1&hl=en&ct=clnk&gl=us)

[2]:
[https://github.com/canonical/dqlite/commit/35ea7cd56e93a36c5...](https://github.com/canonical/dqlite/commit/35ea7cd56e93a36c5e640f1d33dbe024c4c4fa29#diff-04c6e90faac2675aa89e2176d2eec7d8)

~~~
rlpb
> dqlite is/was WAL frame-based...maybe this isn't the case any more

According to
[https://github.com/canonical/dqlite/blob/master/doc/faq.md](https://github.com/canonical/dqlite/blob/master/doc/faq.md)
this is still the case.

~~~
hardwaresofton
Ahhhh thank you -- that information just got pushed into the FAQ -- I was
thinking "surely they didn't _just_ remove this information" but didn't look
hard enough at all. Direct link:

[https://github.com/canonical/dqlite/blob/master/doc/faq.md#h...](https://github.com/canonical/dqlite/blob/master/doc/faq.md#how-
does-dqlite-compare-to-rqlite)

------
dragonsh
This is developed by LXD team for it's cluster. It's used by us in production
as a part of LXD cluster. Initially there were some issues but now it can
support thousands of nodes in a cluster easily in our regression tests.

It's good they made it as a separate project can be used independent of LXD
containers.

------
Scorpiion
I was reading through the docs and this FAQ is worth checking out:
[https://github.com/canonical/dqlite/blob/master/doc/faq.md](https://github.com/canonical/dqlite/blob/master/doc/faq.md)

In includes an answer about the difference with rqlite.

To me reading the docs it seems like dqlite has been developed by the team who
develops LXD at Canonical as LXD is listed as the biggest user of the project
and it says on the authors github that he works at Canonical at/with LXD/LXC.

Interesting project, good luck to the author/authors if you read this!

~~~
otoolep
rqlite creator here.

Yes, good luck to the creators of this project, it looks very interesting and
I've been watching it for a few years now.

------
FreeHugs
The one annoying thing about SQLite is that there is no easy way to change the
table structure. Adding/Removing/Renaming columns is super complicated and
afaik there is no good command line tool that does it for you.

That is the primary reason why I do not consider it for new projects. It's
just to slow to iterate on.

~~~
Dowwie
That's a hell of a reason not to use sqlite. Staging data in a temporary table
while a table is dropped, recreated, and then data is reinserted is not much
of an inconvenience.

~~~
FreeHugs
It is. It is a bunch of complex commands. Just look at the proposed solutions
on Stackoverflow:

[https://stackoverflow.com/questions/8442147/how-to-delete-
or...](https://stackoverflow.com/questions/8442147/how-to-delete-or-add-
column-in-sqlite)

~~~
Dowwie
So I'll keep my opinion about this out of this response. To those who are
interested, the approved answer on that SO thread recommends:

    
    
        1. create new table as the one you are trying to change
        2. copy all data,
        3. drop old table,
        4. rename the new one.
    

It's up to others to decide whether that is complex enough to warrant avoiding
sqlite

~~~
Someone
That gets more cumbersome if the table has indexes (you will have to create
them on the new table), and even more cumbersome if foreign keys point to it
(you will have to drop them before step 3 and recreate them after step 4)

“Copy all data” also can be difficult if the table has data that the database
created that must stay the same because you use it elsewhere. That shouldn’t
be a problem with SQLite, as it doesn’t allow rowid as foreign key, but if you
use it as a foreign key outside the database, or use the hash of a full row to
detect changes, it may still bite you.

It also may mean being offline for a significant amount of time, but that also
often is (effectively) the case for databases that support deleting columns

~~~
Dowwie
good points

------
crusader1099
I really like the design of this website. It's simple, information-rich, fast,
and doesn't contain a ridiculous number of images or dynamic components. It's
a shame that I can only say this for a select few websites these days.

~~~
ken
Yet it's still loaded with hundreds of KB of custom fonts, because the
designers would rather I look at a blank page for a couple seconds than gaze
upon their design with a typeface that isn't exactly the same one they have on
their computer.

My resolution, window size, color settings, text zoom, font rendering, etc,
are almost certainly different, too, but at least they've made the page more
than twice as slow by forcing the correct font.

~~~
zamadatix
It's not so much the size as the way the font was defined and the trash
latency/TTFB of whatever they are hosting it on.

------
JacKTrocinskI
What can this be used for (example use cases)? Is there 24/7 support
available? How long has it been around and is there a commitment to long term
releases?

~~~
free-ekanayaka
It has been around for 2 years and half, but we released v1.0.0 just
yesterday.

I added answers to your other questions to the FAQ:

[https://github.com/canonical/dqlite/blob/master/doc/faq.md#w...](https://github.com/canonical/dqlite/blob/master/doc/faq.md#what-
are-the-use-cases)

[https://github.com/canonical/dqlite/blob/master/doc/faq.md#w...](https://github.com/canonical/dqlite/blob/master/doc/faq.md#what-
are-the-use-cases)

[https://github.com/canonical/dqlite/blob/master/doc/faq.md#i...](https://github.com/canonical/dqlite/blob/master/doc/faq.md#is-
there-a-commitment-to-long-term-releases)

~~~
JacKTrocinskI
Thanks, those are important items for me especially when recommending a new
technology to a client and/or boss, hopefully others will find it useful as
well.

------
nilsjuenemann
Hm. Just enterprise-grade. I need military-grade und planet-scale.

~~~
free-ekanayaka
Give us time, we'll get there :)

------
hamandcheese
What is the difference between this and rqlite?

[https://github.com/rqlite/rqlite](https://github.com/rqlite/rqlite)

~~~
duiker101
From the FAQ[1]

The main differences from rqlite are:

\- Embeddable in any language that can interoperate with C

\- Full support for transactions

\- No need for statements to be deterministic (e.g. you can use time())

\- Frame-based replication instead of statement-based replication

[1]
[https://github.com/canonical/dqlite/blob/master/doc/faq.md](https://github.com/canonical/dqlite/blob/master/doc/faq.md)

~~~
otoolep
rqlite creator here.

More fundamentally, as mentioned above, Dqlite is a library, whereas rqlite is
a RDBMS (albeit a pretty simple and lightweight one).

------
stickupkid
I used Dqlite for a side project[1], which replicates some of the features of
LXD. Was relatively easy to use, but Dqlite moves at some pace and trying to
keep up is quite "interesting". Anyway once I do end up getting time, I'm sure
it'll be advantageous to what I'm doing.

1\.
[https://github.com/spoke-d/thermionic](https://github.com/spoke-d/thermionic)

~~~
free-ekanayaka
Oh I had no idea somebody was using it in the wild! It has been unstable until
now, we just released v1.0.0 yesterday. So no more public API breakage from
now on.

------
zzzcpan
Since the author is in the comments, what are you planning to do about
operations: keeping consistent performance while adding/removing/resyncing
nodes, rebalancing, dealing with bitrot, disk errors, disk performance issues,
filesystem issues, dealing with unstable network performance, etc.? It doesn't
look like there is anything to address operations in the code at the moment.

------
nightfly
Can you query the DB on disk using the regular sqlite3 cli tool?

~~~
stickupkid
Yes you can, or you could when I was using < 1.0.0. It made developing with it
really easy.

------
bryanlarsen
Has anybody run the Jepsen distributed database tests against dqlite?

~~~
free-ekanayaka
Not that I know, but it's on my todo list (dqlite author here).

~~~
lima
Does sqlite have consistent reads (i.e. are reads guaranteed to be served by
the current leader)?

------
continuations
> fully async disk I/O

I thought Linux didn't support real async disk IO. Is that not the case?

If Linux has no real async disk IO, how does Dqlite achieve fully async disk
IO?

~~~
free-ekanayaka
The support of async disk I/O in Linux differs depending on kernel version and
file system type. But it is possible to get 100% async I/O with the
is_submit(), and dqlite will leverage that if detected.

There is now a new async I/O API available in Linux (I'm not remembering the
name right now, but it was developed by folks at Facebook). It looks promising
so I'll check it at some point. (dqlite author here)

~~~
brauner
Jens Axboe, io_uring :)

------
rockwotj
I hope the claim to being fully async I/O is just a buzz term, as it's no
longer supported in SQLite.

[https://www.sqlite.org/asyncvfs.html](https://www.sqlite.org/asyncvfs.html)

~~~
free-ekanayaka
It's not a buzz term. It's really fully async disk I/O. Dqlite does not use
SQLite's stock vfs implementation for writing to disk, as it's an entirely
different model (based on raft).

~~~
liability
> Q: _How does dqlite behave during conflict situations? Does Raft select a
> winning WAL write and any others in flight writes are aborted?_

> A: _There can’t be a conflict situation. Raft’s model is that only the
> leader can append new log entries, which translated to dqlite means that
> only the leader can write new WAL frames. So this means that any attempt to
> perform a write transaction on a non-leader node will fail with a
> ErrNotLeader error (and in this case clients are supposed to retry against
> whoever is the new leader)._

Correct me if I'm wrong, but isn't that essentially the same limitation WAL
mode in normal sqlite has? With WAL you can have as many reads going on as you
like, in parallel to a single write. That seems directly comparable to what
the dqlite FAQ says, unless I'm missing something.

------
zellyn
Hey, free-ekanayaka, a few more questions for your FAQ if you're still paying
attention:

Does this store the entire log for all time? When you bring up a new node,
does it replay the entire history? If not, how do you bring up a new node
without data?

How does backup/restore work?

How do upgrades work? Is the shared WAL low-level enough that it's 100%
stable/compatible between sqlite/dqlite versions? If not, what happens if half
your cluster is on the old version while you're upgrading, and sees things it
doesn't understand yet?

Is it possible to encrypt node/node traffic? Or can you easily send the node-
node traffic over a proxy, like Envoy? How about over a unix domain socket or
"@named" unix domain socket (which we use for Envoy here at Square)

Looks awesome, by the way!

------
weitzj
I did not know this is from Canonical. Interesting

------
yeasayer
What is the typical use case for such a thing?

~~~
free-ekanayaka
[https://github.com/canonical/dqlite/blob/master/doc/faq.md#w...](https://github.com/canonical/dqlite/blob/master/doc/faq.md#what-
are-the-use-cases)

~~~
goatinaboat
This is great - good work to you and/or your team

------
bbmario
[https://github.com/canonical/raft](https://github.com/canonical/raft)

That's interesting. Didn't know about it.

------
jchanimal
Using Raft for the WAL sounds a bit like Calvin. How does Dqlite compare to eg
FaunaDB in terms of distributed ACID guarantees?

------
usgroup
I'm not brimming with ideas of where this would be a good fit tbh. Is anyone
using it or considering it for anything?

~~~
nightfly
Depending on some details, it would be perfect for storing state for a libvirt
cluster management tool I'm working on. Concerns are I can read the data on
disk using the regular sqlite3 cli tool, and lack of rust bindings.

~~~
regularfry
8 years ago or so I was involved in building a cloud platform. The very first
version of the design for keeping the VM and storage allocation metadata
synced across the cluster involved syncing sqlite databases (which we moved
off once we realised we'd pretty much have to invent something a lot like raft
to make it work). If this had existed then, we'd just have picked it off the
shelf.

------
maurodelazeri
how is the performance running 3 nodes? how many inserts/reads can be done in
average ?

~~~
free-ekanayaka
Depends on how fast is your disk, what file system and kernel you use, and how
low is your network latency. Difficult to predict. But it's basically as fast
as it can get given 1) hardware constraints 2) raft consensus.

If you want light-speed insert/delete, you could probably don't use the disk
at all: as long as a majority of your nodes don't die, you won't lose any
data. You can also go somewhere in between and save to disk only at specific
intervals.

~~~
zellyn
Do all nodes participate as full RAFT nodes? Or can you have read-only nodes?

~~~
setr
Aren't non-leader RAFT nodes already read-only?

~~~
zellyn
I guess I meant observer nodes, with read-only DB access.

~~~
setr
I just mean that RAFT nodes are already read-only for most of their lives —
it’s not clear to me why you’d want nodes to be dedicated as read-only; it
would only reduce consistency guarantees (and I believe you wouldn’t be
avoiding the heartbeat and data consistency network chatter).. I _think_ the
leadership functionality comes cheap

~~~
zellyn
That makes sense. I need to read up on Raft again.

I know Zookeeper, for example, supports observer nodes: essentially a cheaper
read-only cache. Chubby at Google had the same thing.

------
peterwwillis
Some thoughts: A consensus protocol is like 1/50th of what you need for a
stable, reliable distributed database, and it's developed by a company, so
expect it to be abandoned once they stop developing it. I wouldn't use it at
work (yet) but could be fun for personal projects.

~~~
linsomniac
I wouldn't paint Canonical with that generalization. It's not unheard of that
they have dropped projects, but I wouldn't say it's common. But looks like
their primary use is LXD, which doesn't seem to be going anywhere...

~~~
heavenlyblue
But the main point made by the parent is entirely correct: the biggest issue
isn’t that of implementing a consensus protocol: the biggest issue is the
reconfiguration of the cluster, management of the dead/live nodes, addition of
extra nodes for replacement, copying of the data before reconfiguration.

All of that needs tooling.

~~~
linsomniac
Indeed. I was reading the dqlite page thinking "where is the monitoring
endpoint to tell if the cluster is healthy or degraded?" Too often that seems
like an afterthought if it's thought of at all.

I have a teeny, tiny cluster using MySQL+galera as a multi-master cluster, but
it took a while to iterate to monitoring that tells me when one node is
unhealthy and getting the correct repair and restart procedures.

~~~
otoolep
Totally.

FWIW, I built all the functionality into rqlite from the very start, for
exactly those reasons. In the real world a database must be _operated_.

[https://github.com/rqlite/rqlite/blob/master/DOC/DIAGNOSTICS...](https://github.com/rqlite/rqlite/blob/master/DOC/DIAGNOSTICS.md)

[https://github.com/rqlite/rqlite/blob/master/DOC/CLUSTER_MGM...](https://github.com/rqlite/rqlite/blob/master/DOC/CLUSTER_MGMT.md)

------
hootbootscoot
what's the comparison with rqlite?

[https://github.com/rqlite/rqlite](https://github.com/rqlite/rqlite)

~~~
hootbootscoot
never mind. c. already in ubuntu reps as a one-step, ok... this is cool. I'm
gonna check it out.

------
gwbas1c
> Runs on ARM, X86, POWER and IBM Z architectures

Odd, I'd think x64 would be the most commonly used architecture. Is this a
mistake?

~~~
secretvrdev
Therefore its called x86-64. It is the "same"

------
mlevental
does anyone know a good on disk nosql db like sqlite with python bindings? I
know sqllite has a json extension but I don't think you can query

