rqlite does support a form of transactions -- you can send a set of SQLite statements, and either all will be successful, or none will be. You can even send a BEGIN statement, and a SQLite transaction will be opened (and you can COMMIT later). However, due to the nature of Raft, restarting the cluster in the middle of an open transaction will not clear the transaction -- and most folks probably don't expect this. That's why I need to think more before advocating use of traditional transactions with rqlite -- distributed transaction functionality is easy to get wrong, so it requires careful thought. You can use SQLite transactions, but you must be careful. That's why drivers don't tend to support them yet.
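For anyone wondering what the all-or-nothing form looks like in practice, here is a minimal sketch in Python. It assumes a node listening at http://localhost:4001 and a table called foo; both are placeholders, and the helper names are made up for illustration. It builds a POST to the /db/execute endpoint with the transaction query parameter, which asks rqlite to apply the whole batch atomically.

```python
import json
import urllib.request


def build_transaction_request(statements, base_url="http://localhost:4001"):
    """Build (but do not send) a request that asks rqlite to run all
    statements as a single all-or-nothing batch.

    base_url is an assumption: point it at one of your own nodes.
    """
    body = json.dumps(statements).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/db/execute?transaction",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response.

    Requires a reachable rqlite node; inspect the response for
    per-statement errors.
    """
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Usage would be something like send(build_transaction_request(["INSERT INTO foo(name) VALUES('a')", "INSERT INTO foo(name) VALUES('b')"])). This avoids the open-transaction problem described above, because nothing is left open between requests.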
> the behavior of a cluster if it fails while such a manually-controlled transaction is not yet defined
Do you envision manually-controlled transactions working seamlessly one day? If not, have you thought about what guarantees are feasible/you hope to provide?
Yes, I have a design in mind, but it's a fair amount of work. It will require a new API too -- an API that allows clients to explicitly create a new connection, create a transaction on it, and close that transaction when finished.
MySQL didn’t have transactions for years! A transaction layer can be implemented by adding a transaction table, a transaction column to each other table, and joining the two in your queries.
where table.transaction_id = transaction.id and transaction.committed = true
You would begin a transaction by inserting a new row and then updating it to committed when done.
With some metaprogramming you may be able to do this all transparently.
Very cool project putting together Raft and SQLite. Definitely seems like its biggest benefit would be super simple operation when you also really need the distributed behaviors. Is that accurate? What do you guys use it for and where do you think the sweet spot is for its application?
Yes, simplicity of operation is a key goal of rqlite. To quote from the FAQ:
rqlite is very simple to deploy, run, and manage -- in fact, simplicity-of-operation is a key design goal. It's also lightweight and easy to query. It's a single binary you can drop anywhere on a machine, and just start it, which makes it very convenient. It takes literally seconds to configure and form a cluster, which provides you with fault-tolerance and high-availability. With rqlite you have complete control over your database infrastructure, and the data it stores.
Another commenter mentioned automatic sharding to which you replied "no". But short of that what do you think about use cases where the database is "naturally" very sharded, say where every user has a separate database? With such a design the operator might want to scale rqlite down to zero instances when a db is unused for a while and start it back up quickly on demand.
Have you thought about how rqlite could fit into this design space?
I have considered it, but it would introduce a very large amount of complexity -- and make rqlite much more complicated to operate. Adding this type of functionality would mean rqlite would no longer be a worthwhile system to use. It would do something (sharding) probably no better than other systems, but no longer be trivial to operate. rqlite does something somewhat narrow, but does that very well (though I would say that :-) )
Can you also auto-shard with rqlite, or is it only designed to distribute a single set of data among many SQLite instances?
EDIT: Looks like no: "rqlite is about replicating a set of data, which has been written to it using SQL. The data is replicated for fault tolerance because your data is so important that you want multiple copies distributed in different places, you want to be able to query your data even if some machines fail, or both."
Understandable. Sharding is a huge can of worms in itself. Balancing, changing shard keys, changing table attributes before and after sharding... it's a mess.
Yeah, it would distract from the key goal of rqlite -- which is a trivial to deploy, simple to operate, reliable distributed store for critical relational data.
dqlite is a library, written in C, that you need to integrate with your own software. That requires programming. rqlite is a standalone application -- it's a full RDBMS (albeit a relatively simple one). rqlite has everything you need to read and write data, and backup, maintain, and monitor the database itself. rqlite and dqlite are completely separate projects, and rqlite does not use dqlite. In fact, rqlite was created before dqlite.
EDIT: Title has been slightly changed, now. It was originally "Jepsen testing of rqlite, the distributed DB built on Raft and SQLite"
Hmm, perhaps a bit of confusion from the title. It sounds like they ran the Jepsen suite of tests against rqlite, which is great, but the testing was not done _by_ Jepsen / Kyle (https://jepsen.io). Others have done this themselves too and that's fine, but half of the problem is correctly implementing the tests, which others have gotten wrong in the past.
I don't see any need to rewrite the title here, so I've reverted it from "Jepsen-style testing of rqlite, the distributed DB built on Raft and SQLite".
OK, how do you suggest I change the title? I'm open to a better one. I'm used to the casual use of "Jepsen" for this type of testing. But yes, there is also the case where Kyle does the testing himself -- and this is not that.
Their implementation is up on GitHub, I'm sure they would be interested in any feedback on it.
I'd probably call it "Jepsen-testing". English is ambiguous, only so much you can do, so don't worry about it. People rag on titles too much on this site anyway.
My "ragging" is appropriate here because Jepsen is both the name of a reputable database testing group and the test suite. A lot of people do drive-by HN, without reading the specific submission too closely, and having an accurate title is important. The new title edited in is much better than the original was.
I meant to point it out in the sense of the organization, and called him out by name mostly just because he’s kind of the guy out front. I always forget his exact username.
https://github.com/rqlite/rqlite