
SQLite – The “server-process-edition” branch - yread
https://sqlite.org/src/artifact/0c6bc6f55191b690
======
quinthar
It scales so perfectly we are deploying it across 3 datacenters, each with
four 384-core machines with 3 terabytes of RAM. I can't speak highly enough
about SQLite and the team behind it.

~~~
da_chicken
I don't understand why you'd use SQLite for a deployment this heavy over
PostgreSQL or MySQL. Yes, SQLite is fantastic, but why would you _choose_ to
do it this way?

~~~
andrewguenther
Because SQLite is often the right tool for the job, more so than Postgres and
MySQL. It scales insanely well and requires minimal configuration and near
zero administration. I try to use SQLite for as long as is feasible in my
projects purely because it "just works".

Don't let the "Lite" fool you. Depending on your needs, you can scale to tens
of thousands of users using just SQLite. Or not! It is all about knowing your
system and properly evaluating your options. I choose to use SQLite because it
often fits the use case of small to medium projects the best.

~~~
eismcc
The "lite" doesn't refer to "light"; it refers to a mineral.

[https://changelog.com/podcast/201](https://changelog.com/podcast/201)

~~~
oblio
I was confused by this, so I read the transcript. In case anyone else reads
this, the name is supposed to be read as:

SQL (S Q L or sequel, your call) - ite

So it does sound like the name of a mineral: bauxite, boehmite, hematite, etc.
Heck, kryptonite :D

------
coleifer
It may be a bit out-of-date, but BerkeleyDB provides a SQLite front-end that
supports page-level locking -- and thus has better TPS when multiple threads
are writing. It's basically SQLite with the btree ripped out and replaced with
BerkeleyDB. More information here [0].

[0] [http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/sql-160887.html](http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/sql-160887.html)

------
xenadu02
Mods: there is a much more up-to-date version of the readme at this link:
[https://sqlite.org/src/artifact/b98409c486d6f028](https://sqlite.org/src/artifact/b98409c486d6f028)

------
charleslmunger
If this requires turning off pragma synchronous, doesn't that disqualify it
for anything but read-only tasks? Power loss etc. would corrupt the
database...
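
For context, synchronous is a per-connection pragma; a minimal Python sketch
(standard-library sqlite3, in-memory database just for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# synchronous = OFF leaves fsync timing to the OS: much faster writes,
# but a power loss mid-write can corrupt the database file.
# FULL (the durable default) syncs at every critical moment.
con.execute("PRAGMA synchronous = OFF")
mode = con.execute("PRAGMA synchronous").fetchone()[0]
print(mode)  # 0 == OFF, 1 == NORMAL, 2 == FULL
```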

~~~
bl4ckm0r3
You could have a master-slave architecture where you disable it for the
slaves only. It really depends on the use cases.

~~~
est
wait, sqlite can do master-slave in some way?

~~~
siscia
No, vanilla SQLite cannot.

But think about it as a library, you can build master-slave on top of it.

Which is basically what I did with
[https://redisql.com/](https://redisql.com/) in the pro version, exploiting
the Redis AOF.

------
quinthar
Expensify is powered entirely by SQLite (inside bedrockdb.com) and it is
freaking amazing.

~~~
ris
Good thing you don't need 64-bit integers.

Seriously, as impressive as SQLite is for what it is, people shouldn't be
given the impression that it doesn't come with some _massive_ and often
surprising caveats when viewed as a general-purpose database.

~~~
ris
Looks like I misremembered this (re 64-bit integers) and it was one of the
other slightly odd limitations SQLite has around rowids and foreign keys that
caused me to abandon the last system I was designing with it...

What most concerned me when using SQLite was the ease with which it would
happily let me write broken queries without raising a fuss, e.g.
referencing non-GROUP BYed terms in an aggregation.

~~~
ak39
Hi ris,

Can you provide example(s) of the broken query with non-GROUP BY columns in an
aggregation? (Or a reference link if this is well known - I've searched
briefly but can't find anything.)

~~~
lmz
Not the parent poster but it's queries like these:

    SELECT user.id, user.name, SUM(txn.amount)
    FROM user INNER JOIN txn ON (user.id = txn.uid)
    GROUP BY user.id

user.name is not in the GROUP BY list, so it is picked arbitrarily from the
rows in the group. It's harmless here but not in all cases.
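
A quick way to see this behavior is with Python's standard sqlite3 module
(hypothetical user/txn tables just for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE txn (uid INTEGER, amount INTEGER);
    INSERT INTO user VALUES (1, 'alice');
    INSERT INTO txn VALUES (1, 10), (1, 25);
""")

# user.name is not in the GROUP BY list; SQLite accepts the query
# anyway and picks one name per group instead of raising an error.
row = con.execute("""
    SELECT user.id, user.name, SUM(txn.amount)
    FROM user INNER JOIN txn ON (user.id = txn.uid)
    GROUP BY user.id
""").fetchone()
print(row)  # (1, 'alice', 35)
```

Here the result happens to be deterministic because user.id is the table's
primary key, which is why it's harmless in this particular case.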

~~~
Groxx
TBH I like the ability to do this. Sometimes you truly _don't_ care which
value you get, you just want one. Sometimes you have application-level
constraints which mean the one it picks is guaranteed to be correct, but you
can avoid the cost of enforcing that (at insertion time _and_ at lookup time).
It's a handy tool when it's appropriate.

But hell yes, tools should warn you about this loudly, as it's not generally
correct.

------
siscia
What I wasn't able to understand is whether this follows the main SQLite
development effort or if it is just a frozen branch.

Is there any reason why it is not incorporated into the main branch and
activated by compile-time flags? Just complexity?

~~~
std_throwaway
The branch

[https://sqlite.org/src/timeline?r=server-process-edition](https://sqlite.org/src/timeline?r=server-process-edition)

looks active.

I would hesitate to integrate such a feature into mainline prematurely. If you
really need it you can use the branch.

------
quinthar
We have been using this page-locking technology for quite a while now and it
works incredibly well. Having both read and write concurrency is super
powerful and scales fantastically.

------
mbrumlow
This is good news! We're already storing over 7PB of data in SQLite, and have
already needed to make many workarounds for readers/writers.

~~~
rbranson
This sounds really interesting. Can you expand on this installation?

~~~
mbrumlow
Thanks, and Sure!

First off, it is not a single instance, but many SQLite databases, many of
which hold 20+ TiB of data.

For the last 10+ years I have worked at a backup company and we have gone
through some iterations of storage backends, including an in-house system
that fell on its face. So far the only thing that has been able to keep up
with our demands has been SQLite, although there have been some hurdles.

Because we want isolation between users, having any sort of shared database
system really is not an option. And in the early days our users liked the
idea of simply copying a user's dataset and sending it some place else.

In any case we tested -- I should say a guy named Brain on our team tested --
many different systems and ended up picking SQLite many years ago. It has
stuck and served us very, very well. It was not until we wanted to access the
data in different ways that we ran into concurrency issues and made a few
weird decisions on how to access the data to avoid some of these problems.

These databases store the volume data for each volume we back up. And because
it is a database we can track the blocks in such a way that allows us to
present a consistent view of each backup in time. And it has been good enough
to be able to export volumes as iSCSI LUNs (via a custom iSCSI server I wrote
to interface with the backend), and in turn allow us to virtualize people's
systems we backed up on the fly without moving data from one format to
another.

Now that I think about it, ~7PB is only half of it; that is just the data
managed on our hardware. We have many other people storing just as much if
not more on their systems.

Looking at my original comment it may seem like I implied that we have a
single 7PB database, but no, it is 1000s of smaller ones, between 8 and 30TiB
in most cases.

Have we had data loss? Yes, but it was mostly due to improperly using SQLite.
Except for one occasion where we were sure it was SQLite itself, which I am
sure Mr. Hipp will refute :p, but I forget the details and a fix was applied
quickly.

SQLite is nice, and I would recommend it to anybody for just about any use,
even uses you might not have considered for an embedded database.

~~~
philips
Are you putting binary blobs in the sqlite databases directly?

~~~
mbrumlow
Yes, the block data is directly inserted into the database.

This has worked fine, but I don't think it is optimal. I think we might get
some performance increases if we just used SQLite for keeping block offsets
into a flat file. But then there are a bunch of other problems to solve.

Data is compressed, so we would have to handle free-hole allocation for
variable-sized data. Or simply waste free space. Not that SQLite does any
better with those things, just things we would have to solve in general.
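
The blob-in-database approach described above might look roughly like this
(hypothetical schema; Python with the standard sqlite3 and zlib modules):

```python
import sqlite3
import zlib

# Hypothetical layout: compressed backup blocks stored as BLOBs,
# keyed by their offset within the backed-up volume.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE blocks (offset INTEGER PRIMARY KEY, data BLOB)")

raw = bytes(4096)  # one 4 KiB block of volume data
con.execute("INSERT INTO blocks VALUES (?, ?)", (0, zlib.compress(raw)))

stored = con.execute(
    "SELECT data FROM blocks WHERE offset = 0").fetchone()[0]
assert zlib.decompress(stored) == raw
```

Because the compressed blobs vary in size, SQLite's own page allocator takes
care of the free-space management that a flat file would force you to
implement yourself.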

------
rb808
I don't understand why SQLite would be used instead of writing a data
structure. To me going through SQL (& ORMs) is a hindrance.

Why use a database if you only have one process?

~~~
api
That's fine when you don't need complex queries or your data is small. As
complexity increases, you will eventually end up hand-rolling a buggy, slow
relational database.

~~~
Drdrdrq
And even then it's better to use a standard solution that other people will
know too, if nothing else, to make maintenance easier.

------
le-mark
_The collision rate (percentage of attempted transactions that failed due to a
page-level locking conflict) in all tests was between 1 and 2%. Failed
transactions are not included in the TPS counts below._

I read this as: retry logic required. That's a bit onerous IMO. Wrapping a
single writer in a mutex gives near rw=1 performance in the cases I tested.
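
The retry logic could amount to no more than a small wrapper; a hypothetical
sketch in Python (vanilla sqlite3 reports lock conflicts as OperationalError):

```python
import sqlite3
import time

def execute_with_retry(con, sql, params=(), retries=5, backoff=0.05):
    """Retry a write that fails due to a transient lock conflict."""
    for attempt in range(retries):
        try:
            with con:  # commit on success, roll back on exception
                return con.execute(sql, params)
        except sqlite3.OperationalError as exc:
            # "database is locked" is the usual transient conflict;
            # anything else (or the final attempt) is re-raised.
            if "locked" not in str(exc) or attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))  # linear backoff
```

With a 1-2% collision rate, a loop like this hides the failures from callers
at the cost of slightly higher tail latency.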

------
deepakprakash
Can someone explain a bit more?

(a) Does this mean that there will be an sqlite version that can be deployed
as a process accessible over network in servers?

(b) Does this mean that SQLite will now have the capability to do concurrent
reads/writes, but that it is up to users to implement a process that can take
advantage of this and create something like (a)?

~~~
X-Istence
> (a) Does this mean that there will be an sqlite version that can be deployed
> as a process accessible over network in servers?

No.

> (b) Does this mean that SQLite will now have the capability to do concurrent
> reads/writes, but that it is up to users to implement a process that can take
> advantage of this and create something like (a)?

Yes... this allows a multi-threaded application, for example, to have
multiple readers/writers without blocking each other.

~~~
nbevans
... as long as those threads don't touch the same page as each other. Page
locks <> row locks.

~~~
blattimwind
Depends on the width of your rows _and_ the relation of access patterns to
physical row allocation.

------
boksiora
Amazing! Can't wait for this to be in the main branch.

------
mingodad
How does this compare with
[https://github.com/bloomberg/comdb2](https://github.com/bloomberg/comdb2),
which adds stored procedures using the Lua scripting language, and decimal
types?

~~~
tenken
In what ways should they be compared?

This appears to be a means to run SQLite in a server environment with
multiple processes hitting it.

What you linked is just some Lua-based extension that has little to do with
SQLite performance on certain types of infrastructure.

~~~
toosleepy
Comdb2 is a database system that uses SQLite as a query planner. It also
allows Lua stored procedures. It has a different backend from SQLite with
different capabilities.

