
ActorDB – How and why we run SQLite on top of LMDB - biokoda
http://blog.biokoda.com/post/133121776825/actordb-how-and-why-we-run-sqlite-on-top-of-lmdb
======
anilgulecha
This would work great for many classes of SaaS applications, where each user's
data is largely separated from others'. The fact that you've delegated the SQL
bits to SQLite, and worked on the clustering aspect, means I can trust the
ActorDB architecture more easily. Congrats on the release.

Do you have any prominent users running this on their production deployments?

~~~
biokoda
> Do you have any prominent users running this on their production
> deployments?

Not that we know of unfortunately. We use it for basically every project that
we do. It's definitely not some side project. It is the core of everything at
the company.

We would be very happy to get any bug reports or suggestions.

~~~
lobster_johnson
Does that mean you recommend it for production use right now?

~~~
biokoda
Yes it has been very stable for us.

~~~
lobster_johnson
That's awesome. It looks very cool and pragmatically designed. So as far as I
can see:

\- There is a database schema per actor type.

\- Each actor has a separate physical (SQLite but with the backend replaced by
LMDB) database.

Is this the case? Won't this be using a lot of file handles if there are
thousands or tens of thousands of actors? I'm assuming you already optimize
this by closing "idle" actors, but what if they're all "live"?

How do you coordinate safe schema changes across all actors? You mention that
schema changes are performed "just in time", that is, when an actor is next
used rather than while it sits idle. I can definitely see the benefit of doing
that. But for many types of
schema changes you _do_ want a controlled migration where you ensure that all
tables are fully updated. Can you force the schema upgrade by touching all
agents with a dummy query, perhaps?

It looks like ActorDB is designed for schemas where data is strictly
partitioned, and where data wouldn't cross these boundaries, because joins
would be impractical/not performant. For example, in a Dropbox-style system,
every user would be an agent, and every user would really only talk to a
single agent at a time. So that's a perfect fit.

You also show a discussion thread example, where there is one actor per
thread. This would mean that if you wanted to do something like "get all
threads where user X has commented", you would need to either query all agents
of type "thread", or you would need to maintain an agent that keeps track of
[user, thread] mappings, and then either query each manually, or use the
looping construct (section 4.1.4 in the documentation) to perform the query
across the relevant agents. Am I getting this right?

How much failover testing have you done on your Raft implementation?

~~~
biokoda
\- Correct.

\- Incorrect. They are all inside a single LMDB file. So there is not an open
file handle for every actor.

> How do you coordinate safe schema changes across all actors?

Before committing a schema change, we test it against the existing schema. If
SQLite runs the SQL statements successfully (on a dummy memory-only database)
it is committed. We do not guard against schema updates which might succeed in
some actors and not in others. It is the developer's/administrator's
responsibility to create schema updates that are valid.

We assume schema changes are reasonable (adding tables, columns, indexes), not
the rewrite-the-entire-schema kind. Of course you can do that, but it is your
responsibility to make sure it works.

If there is any doubt about a schema update, one can always create a dump of
the live database, start it up in a staging environment and perform the
planned update there first.
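The validation step described above can be sketched with plain SQLite (the storage layer under ActorDB): replay the current schema into a throwaway in-memory database, then attempt the proposed statements and accept the update only if they all run. This is a minimal illustration of the idea, not ActorDB's actual code; the function name and example schema are made up.

```python
import sqlite3

def schema_update_is_valid(existing_schema, update_statements):
    """Dry-run a schema update against a throwaway in-memory database.

    Mirrors the check described above: the update is accepted only if
    SQLite can execute every statement on top of the current schema.
    """
    con = sqlite3.connect(":memory:")
    try:
        con.executescript(existing_schema)   # recreate the current schema
        for stmt in update_statements:       # attempt the proposed change
            con.execute(stmt)
        return True
    except sqlite3.Error:
        return False
    finally:
        con.close()

# A reasonable change (adding a column) passes; a broken one does not.
schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);"
print(schema_update_is_valid(schema, ["ALTER TABLE users ADD COLUMN email TEXT;"]))  # True
print(schema_update_is_valid(schema, ["ALTER TABLE nosuch ADD COLUMN email TEXT;"]))  # False
```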

> But for many types of schema changes you do want a controlled migration
> where you ensure that all tables are fully updated. Can you force the schema
> upgrade by touching all agents with a dummy query, perhaps?

I don't really see a benefit in doing that. But you can very easily write such
a query: "actor mytype(*); select id from sometable limit 1;"

> Thread example...

The user actor would be the right place to store the list of threads where it
has commented. So every post would be a transaction to thread (where the
message and actor id would go) and user actor (that would just store a list of
thread ids).
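Based on the `actor type(id);` prefix syntax shown elsewhere in this thread, a single post might look something like the following sketch. The table and column names are hypothetical, and the exact multi-actor transaction syntax should be checked against the ActorDB documentation:

```sql
-- Hypothetical sketch of one post as a transaction touching two actors:
-- the thread actor stores the message, the user actor stores the thread id.
actor thread(42);
INSERT INTO messages (uid, txt) VALUES (7, 'hello world');
actor user(7);
INSERT INTO threads (thread_id) VALUES (42);
```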

There is also another way of modelling a forum that I think would be better.
Have two KV types, users and threads.

The schema:

\------------

actor user kv

CREATE TABLE actors (id INTEGER PRIMARY KEY, hash INTEGER, name TEXT);

CREATE TABLE messages (user INTEGER, thread INTEGER, FOREIGN KEY (user)
REFERENCES actors (id) ON DELETE CASCADE, PRIMARY KEY (user, thread)) WITHOUT
ROWID;

actor thread kv

CREATE TABLE actors (id INTEGER PRIMARY KEY, hash INTEGER, title TEXT, date
INTEGER);

CREATE TABLE messages (id INTEGER PRIMARY KEY, thread INTEGER, uid INTEGER,
txt TEXT, FOREIGN KEY (thread) REFERENCES actors (id) ON DELETE CASCADE);

\------------

KV types with sub-tables are quite powerful. You can have a rich data model
and the data will not be too fragmented over a large number of actors.
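The per-key cascade in that schema can be exercised with plain SQLite, which is the engine underneath ActorDB. This just runs the `thread` schema from above locally; the inserted rows are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# The "thread" KV schema from above: a key row in actors,
# with messages as a sub-table tied to that key.
con.execute("CREATE TABLE actors (id INTEGER PRIMARY KEY, hash INTEGER, "
            "title TEXT, date INTEGER)")
con.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, thread INTEGER, "
            "uid INTEGER, txt TEXT, FOREIGN KEY (thread) REFERENCES actors (id) "
            "ON DELETE CASCADE)")

con.execute("INSERT INTO actors VALUES (1, 0, 'First thread', 0)")
con.execute("INSERT INTO messages VALUES (1, 1, 7, 'hello')")
con.execute("INSERT INTO messages VALUES (2, 1, 8, 'hi back')")

# Deleting the key deletes its sub-table rows with it.
con.execute("DELETE FROM actors WHERE id = 1")
remaining = con.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(remaining)  # 0
```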

> How much failover testing have you done on your Raft implementation?

Replication receives the bulk of testing.

------
zeckalpha
Your website claims linear scalability, but since you're using Raft for
consensus, don't you see a slowdown due to synchronization overhead the more
nodes you add?

(I've been blogging about eventual consistency for SQL databases:
[http://kyle.marek-spartz.org/posts/2015-09-08-eventual-consistency-for-sql-databases.html](http://kyle.marek-spartz.org/posts/2015-09-08-eventual-consistency-for-sql-databases.html))

~~~
zeckalpha
I see, it looks like you're partitioning the data so that each piece stays on
a small number of nodes, even as the total node count grows.

~~~
biokoda
Yes. An ActorDB setup is made out of any number of clusters. Raft only runs
within a given cluster.

------
rakoo
> LMDB writes will keep data exactly where it was placed the first time it was
> written.

Wait a minute. I was under the impression that the reason LMDB works so well
under concurrent access is that it _never_ modifies anything in place, but
when it has to modify a value it will copy the existing content of the page
along with the new value to some unused place... so data _is_ moved between
updates.

Or am I missing something?

~~~
biokoda
You understand correctly. My point was that LMDB does not have
checkpoint/compaction steps where it "silently" copies data around in the
background.

When writes are large, this is an important point. LevelDB/RocksDB type stores
work great for small values, but perform very poorly for large writes because
they copy data around in compaction steps.

------
ngrilly
A few questions to the author:

\- Is it possible to create an index without blocking concurrent operations?
(The lack of background indexing is a well known limitation of SQLite)

\- Is it possible to drop a column?

\- Is there a backup/restore functionality?

~~~
biokoda
\- No, but the key point is that the index is created in small steps. Adding
an index is a schema change, and schema updates are always executed "per
actor". This means actors that are currently not doing anything will not
perform the schema change. Once an actor is required to do something, it will
perform the schema update first. That can be right after the schema change, or
it can be 10 days later.

\- We are limited to what SQLite supports. If you absolutely must drop a
column, you can only do it by re-creating the table. Again, the above point
stands: you will not be doing this for the entire database at once.
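The recreate-the-table workaround mentioned above looks like this in plain SQLite (table and column names are made up for the example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, obsolete TEXT);
INSERT INTO users VALUES (1, 'ana', 'x'), (2, 'bob', 'y');
""")

# SQLite of this era has no DROP COLUMN, so: build the new table,
# copy the surviving columns, then swap the names.
con.executescript("""
BEGIN;
CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO users_new SELECT id, name FROM users;
DROP TABLE users;
ALTER TABLE users_new RENAME TO users;
COMMIT;
""")

cols = [row[1] for row in con.execute("PRAGMA table_info(users)")]
print(cols)  # ['id', 'name']
```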

\- There is a backup tool. It basically uses LMDB's backup functionality. It
will perform a database copy very quickly if you are copying to a second disk
(at the performance limit of sequential writes).

------
progx
If I understand it right, the developer has to know exactly which actor (type)
has the needed data?

~~~
biokoda
ActorDB is not a distributed clone of MySQL. So yes you must know which actor
you are querying.

This is why we say it is mainly a database for apps. If your queries are
always hitting a specific user, then you make that user into an actor. Fully
relational, within a user.

There is another data type, which is a KV store. But there you are always
operating on keys. The added functionality compared to most other KV stores is
that the value can be structured. You can have multiple columns and tables all
tied to a specific key, and you can incrementally update the value (most other
KV stores only support complete rewrites of the value).
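As a sketch of what an incremental update of a structured value could look like, again using the `actor type(id);` prefix syntax seen earlier in the thread (table and column names are hypothetical):

```sql
-- Update one field of the structured value under key 7,
-- instead of rewriting the whole value as a plain KV store would.
actor user(7);
UPDATE settings SET last_seen = 1446681600 WHERE id = 7;
```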

~~~
progx
Thanks. "ActorDB is not a distributed clone of MySQL": that was exactly the
point I was wondering about.

