
Appropriate Uses for SQLite - tzhenghao
https://sqlite.org/whentouse.html
======
allyjweir
This is one of my favourite technical sites. It is very clear and to the point
about SQLite's good use cases, while being equally frank about where an
alternative would be a better choice.

In a similar vein, VueJS has a comprehensive exploration/comparison of itself
compared to other frameworks[0].

I really appreciate open source maintainers that have the humility to compare
their work fairly with others in the domain while still taking pride in
their own achievements.

[0]:
[https://vuejs.org/v2/guide/comparison.html](https://vuejs.org/v2/guide/comparison.html)

~~~
majewsky
I'm trying to make a habit of adding a "Do NOT use if..." section near the top
of my projects' READMEs whenever there are simple exclusion criteria.

For example: [https://github.com/sapcc/swift-http-import#do-not-use-if](https://github.com/sapcc/swift-http-import#do-not-use-if)
-- This particular instance was born out of watching users bend over backwards
to use the software even though standard software does the job just fine or
better in their concrete case.

------
JepZ
Well, actually it is very simple once you understand why SQLite performs
differently than other database systems. The major difference is that 'normal'
database systems have a server process which receives and processes SQL
queries and has exclusive access to the database file.

SQLite, on the other hand, doesn't have such a process, so every client has to
open the database file itself, process the SQL, and write/close the file again.

So the biggest difference is that if you want to use SQLite, you will have to
limit your database operations to one (or fewer ;-)) processes, to avoid a
situation where one process has to wait until another has finished writing the
database file to the filesystem just to read it again a moment later.
Otherwise, the constant reading and writing of the database file through the
filesystem will make the whole thing very slow.

But as long as you have only one process using the SQLite file, and you keep
it open for a while, you will see performance comparable to other database
systems. Actually, SQLite can be very fast if you use it in a sane way.

~~~
beagle3
SQLite is fine with multiple processes as long as they are mostly readers. And
if you switch the journal to WAL mode, it is as good as any multiprocess DB
without MVCC (which, depending on access patterns, might be as good as it
gets, although most patterns do benefit immensely from MVCC).
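For reference, switching journal modes is a one-line pragma. A minimal sketch using Python's built-in sqlite3 module (the file name is made up):

```python
import sqlite3

conn = sqlite3.connect("app.db")  # hypothetical database file
# Write-ahead logging lets readers proceed while a writer is active.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # "wal" on filesystems that support it
conn.close()
```

The setting is persistent: it is stored in the database file itself, so every later connection gets WAL automatically.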

~~~
kuschku
Not really.

In a usual quassel (an IRC bouncer that can, optionally, store logs in SQLite)
setup, you’ll have between 1 and 10 writers constantly writing hundreds of
IRC messages a second to SQLite, while you’ll also have readers that need to
respond with low latency to queries loading hundreds of thousands of messages,
including those that were just written.

The result is that SQLite gets so slow that the writers queue up so much work
that the IRC connection times out.

SQLite is absolutely useless for multiple writers.

~~~
rpcope1
> SQLite is absolutely useless for multiple writers.

I don't really quite agree with this, but I would say that with SQLite it's
more important to be aware of how concurrency is being used with regard to the
database than it is with something like MySQL or PostgreSQL. As far as IRC
logs, I think the important thing (that would almost certainly hamstring any
SQL database that's actually being ACID) is to not use autocommit on inserts,
and to not use a transaction for every single insert (which is possibly what's
happening). I wrote a plugin for ZNC to do exactly what you're doing above,
and by batching my writes (either waiting 15 seconds or until n (where n is
maybe 1000 or more) messages are queued), I've gotten superb performance out
of SQLite3 running with a WAL journal, with the trade off being a little bit
of data might be lost if your bouncer happens to fall down in an unhandled
way. I was also able to get really good read performance without hurting raw
write performance by leveraging partial indices, which something like MySQL
doesn't give you.
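A rough sketch of that batching approach in Python's sqlite3; the schema, batch contents, and index are all invented for the example:

```python
import sqlite3

conn = sqlite3.connect("logs.db")  # hypothetical log database
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("""CREATE TABLE IF NOT EXISTS messages (
    ts REAL, channel TEXT, nick TEXT, body TEXT, highlight INTEGER)""")

# Buffer messages in memory, then flush them in one transaction instead of
# paying a commit (and an fsync) per INSERT.
pending = [(float(i), "#chan", "nick", "hello", int(i % 50 == 0))
           for i in range(1000)]
with conn:  # a single transaction for the whole batch
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?, ?)", pending)

# A partial index covers only the rows that reads actually filter on.
conn.execute("""CREATE INDEX IF NOT EXISTS idx_highlights
    ON messages(channel, ts) WHERE highlight""")
conn.close()
```

The one transaction per batch is where the write throughput comes from; the partial index is the read-side trick mentioned above.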

~~~
kuschku
Well, with PostgreSQL everything works just fine. Read and write performance
is amazing, and full text search returns instantaneous results.

With SQLite, it’s basically useless.

~~~
tracker1
It really depends... I've worked on systems that PostgreSQL (or any SQL rdbms)
wouldn't be able to keep up with. I've seen many others that would do fine
with SQLite. It depends.

In the end, it's a matter of testing. Also, with faster drive options (Optane
and whatever is next), it could very well be a decent option. It all depends.

Frankly, for logging, I'd lean towards Elasticsearch (ELK) these days, which
is about the most write-heavy situation you're likely to see. Depending on the
scaling needs, Mongo, RethinkDB, Elasticsearch, and Cassandra can all be
better solutions than an RDBMS. I'd still reach for SQLite or PostgreSQL first
depending on the situation, though.

~~~
kuschku
Sure, with Optane it may work, but the usual use case for SQLite is on the
slowest SD cards or spinning HDDs you can imagine, where the user didn't want
to spend time configuring Postgres.

------
Cyberdog
One of my favorite points about SQLite which hasn't been addressed yet in the
comment thread is that its databases are just single flat files. Need to copy
data from a server to a local dev machine? No need to log into the server,
make a dump, copy the dump file to the local machine, and import the dump.
Just copy the database file as you would any other file. That's it.

Making a backup before trying something risky (and then restoring the backup
when something screws up) is similarly trivial.

Maybe to some of you it's not worth that much, but it's a legitimate timesaver
in my opinion.

~~~
PricelessValue
That's why it is sql "lite". It's great for certain things because it's
generally lightweight and manageable.

But you wouldn't want a single file format on enterprise level systems where
enterprise level performance and stability is required. At the very least,
you'd want your data and transaction log files to be different files on
different disks. Especially when IO throughput is important.

Also, since DBs tend to have important data on them, being able to copy a file
(db) and move it anywhere you want isn't much of a selling point on serious
systems.

Having said that, SQLite is great for application/client-side storage.

------
virtualwhys
Sadly, in the browser SQLite has been kicked to the curb, replaced by
IndexedDB.

It's deeply disappointing to not be able to use SQL on both client and server.
Yes, there's Lovefield, which employs its own DSL that has nothing to do with
SQL, so you can scrap everything on the server and redo it on the client.

Or just give up and go with NoSQL on client and server, that seems to be what
the browser vendors are implicitly pushing developers toward.

~~~
dahart
> Sadly, in the browser SQLite has been kicked to the curb, replaced by
> IndexedDB

Are you thinking of WebSQL? I don't remember a SQLite interface client-side in
any browser. I might be wrong, but I thought IndexedDB was originally
implemented on top of SQLite in both Chrome and Firefox. I think Chrome moved
to a different backing store, but that has no bearing on the API.

~~~
tejasmanohar
SQLite has been ported to the browser via Emscripten. I’ve seen it used in a
couple niche projects (generally those that revolve around the SQL language),
but I don’t remember it being a standard or even a primary use of SQLite.

[0]: [https://github.com/kripken/sql.js/](https://github.com/kripken/sql.js/)

~~~
virtualwhys
I'm aware of it; nice project, but at 2.6MB with in-memory-only storage, no thanks.

~~~
hajile
It shouldn't be super difficult to backup/restore your db from localstorage
using a dataURL.

[https://hacks.mozilla.org/2012/02/saving-images-and-files-
in...](https://hacks.mozilla.org/2012/02/saving-images-and-files-in-
localstorage/)

~~~
kbenson
At that point, why wouldn't you just look for some SQL to LocalStorage
API/shim, since it's bound to be faster than (and likely more stable with
regard to page resets than) putting a full SQL engine in place that just
serializes to a LocalStorage object anyway?

------
mwexler
I find this page to be somewhat vague. For example, any data that revolves
around dates or times may require some additional hoops to jump through since, as
[https://www.sqlite.org/datatype3.html](https://www.sqlite.org/datatype3.html)
points out, "SQLite does not have a storage class set aside for storing dates
and/or times."

As much as I love SQLite, I find this little issue always adding an extra bit
of work to deal with. Not a blocker, but useful to know. I've found that in
some (rare) cases, H2 became a better solution (horrendous start time, slow
data load, but ability to use real date and time types reduced errors and
improved analyses).

Finally, knowing that SQLite is not right for a situation is great, but it
wouldn't hurt for them to mention a potential alternative, if the devs or
community know of one.

~~~
oneweekwonder
It's not as lightweight as SQLite, but I always wonder why Firebird's
embeddable server[0] does not get more love. The 32-bit .so library is just
under 7MB, and then you have a full RDBMS available to your application.

[0] [https://www.firebirdsql.org/pdfmanual/html/fbmetasecur-embedded.html](https://www.firebirdsql.org/pdfmanual/html/fbmetasecur-embedded.html)

~~~
lwf
Well, for one, SQLite's 32-bit dll is only 858KiB :)

~~~
oneweekwonder
Feature richness comparable to Oracle, MySQL, and Postgres comes at a price.

Another perspective: an embeddable Python is ~10MB, an Electron app ~50MB[0],
a Free Pascal exe ~1.5MB.

So 7MB for an RDBMS seems reasonable, though not comparable to SQLite, as you
pointed out.

[0]:
[https://github.com/electron/electron/issues/2003#issuecommen...](https://github.com/electron/electron/issues/2003#issuecomment-135592382)

------
dgudkov
I wish SQLite were used for data exports from websites instead of CSV files.
It would work better, especially when the exported data is some kind of star
schema. I wrote about it here:
[http://bi-review.blogspot.com/2017/04/websites-should-offer-micromodels.html](http://bi-review.blogspot.com/2017/04/websites-should-offer-micromodels.html)

~~~
copperx
That will happen as soon as Excel implements an SQLite import feature.

~~~
discreditable
Maybe when it gets Python support! :)

[https://excel.uservoice.com/forums/304921/suggestions/105490...](https://excel.uservoice.com/forums/304921/suggestions/10549005)

------
petepete
> Dynamic content uses about 200 SQL statements per webpage.

Genuinely curious, what kind of page would require 200 queries? I've written
plenty of moderately-complex applications and don't think I've come close to
10% of that on a single (non-crazy dashboard) page.

~~~
qlm
Worst I've seen is a CMS site which, with caching disabled, used upwards of
40,000 queries to load the homepage.

~~~
hinkley
When the app is being dumb, people put caching in to hide the data flow
problems. When the data flow problems are hidden, they multiply, and then
turning it off again is highly improbable.

Caching has sucked all the fun out of too many projects I’ve been associated
with.

Do cache headers and cache busting first; you can always put a transparent
cache in front of your REST endpoints in prod after you've debugged the
browser caching. Don't write your own caching implementation. Please.

------
gaius
Bulk loading a CSV or similar into SQLite for analysis with SQL statements is
an incredibly productive technique. With a few lines of code you can load any
log file, load and merge multiple logs by timestamp, whatever you like.
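A minimal sketch of the technique with Python's csv and sqlite3 modules; the file name, columns, and query are all invented:

```python
import csv
import sqlite3

# A tiny sample log; in practice this would be the CSV you exported.
with open("access.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["ts", "path", "status"])
    w.writerows([["2018-01-01T00:00:00", "/", 200],
                 ["2018-01-01T00:00:01", "/api", 500],
                 ["2018-01-01T00:00:02", "/api", 502]])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (ts TEXT, path TEXT, status INTEGER)")
with open("access.csv", newline="") as f:
    rows = ((r["ts"], r["path"], int(r["status"])) for r in csv.DictReader(f))
    with conn:
        conn.executemany("INSERT INTO log VALUES (?, ?, ?)", rows)

# Now ordinary SQL works on the log.
for path, hits in conn.execute(
        "SELECT path, COUNT(*) AS hits FROM log "
        "WHERE status >= 500 GROUP BY path"):
    print(path, hits)  # /api 2
```

Merging multiple logs is then just a second table plus a JOIN on the timestamp column.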

~~~
chrisweekly
The amazing but criminally under-appreciated 'lnav'
([https://lnav.org](https://lnav.org)) facilitates exactly this approach.
SQLite under the hood.

~~~
flamtap
Oh, I hope there is a Windows-compatible way to use this (i.e.: in a .NET dev
environment), if not now, then in the future.

~~~
tracker1
Would probably work under the Windows Subsystem for Linux (WSL), in Win10, or
via Docker for Windows.

I've done similar with node.js streams pretty easily.

------
joshvm
I use SQLite for logging data from a bunch of distributed sensors. The server
(RPi) sends a measurement request over MQTT, the sensors reply and the server
logs each response grouped by the request timestamp. Seems to work very well
for 25 sensors on the same WLAN and for continuous monitoring (about 100
measurements per minute).

This can then be displayed to the user via a web server which has read-only
access to the measurement db.
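The read-only split described above can be enforced at the connection level. A sketch with Python's sqlite3 (file and schema names invented):

```python
import sqlite3

# Writer side (the logging process); schema invented for the example.
w = sqlite3.connect("measurements.db")
w.execute("CREATE TABLE IF NOT EXISTS readings"
          " (req_ts REAL, sensor TEXT, value REAL)")
with w:
    w.execute("INSERT INTO readings VALUES (1.0, 'temp-01', 21.5)")
w.close()

# Reader side (the web server): open the same file read-only via a URI,
# so the display process can never write to the logger's database.
r = sqlite3.connect("file:measurements.db?mode=ro", uri=True)
print(r.execute("SELECT COUNT(*) FROM readings").fetchone()[0])
try:
    r.execute("INSERT INTO readings VALUES (2.0, 'temp-02', 19.0)")
except sqlite3.OperationalError as exc:
    print(exc)  # attempt to write a readonly database
r.close()
```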

------
michaelmcmillan
I run a complex CRM that handles 1.2 million USD in transactions every year.
It is running sqlite with a simple in-memory cache (just a dict) that gets
purged when a mutating query (INSERT, UPDATE or DELETE) is executed. It is
simple _and_ fast enough.
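A minimal sketch of that cache-and-purge pattern, assuming Python's sqlite3 (the class and method names are made up):

```python
import sqlite3

class CachedDB:
    """Cache SELECT results in a plain dict and purge everything on any
    mutating statement, as described above."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.cache = {}

    def query(self, sql, params=()):
        key = (sql, params)
        if key not in self.cache:
            self.cache[key] = self.conn.execute(sql, params).fetchall()
        return self.cache[key]

    def execute(self, sql, params=()):
        verb = sql.lstrip().split(None, 1)[0].upper()
        with self.conn:
            self.conn.execute(sql, params)
        if verb in ("INSERT", "UPDATE", "DELETE"):
            self.cache.clear()  # crude but safe: drop the whole cache
```

Purging the entire cache on every write is wasteful in theory, but for a read-heavy workload it keeps the invalidation logic trivially correct.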

~~~
simplify
Tell us more! How many transactions per second do you process? Do you use
multiple databases?

~~~
pvg
If you assume each of these transactions is a dollar, you get a bit under 0.04
transactions/sec on average. If peak load is 25x average, that would come out
to 1/sec.

------
drej
I may have gone a bit beyond what SQLite is useful for, but I just wanted to
try this. I built an experimental application-level file system (so not FUSE-
like, you can't mount this, it's just within an application).

[https://github.com/kokes/sqlfs](https://github.com/kokes/sqlfs)

Haven't had much time to finish it, but it's been fun to play with.

------
zbentley
> SQLite works great as the database engine for most low to medium traffic
> websites (which is to say, most websites).

I appreciate that there's just a hint of snark there. YAGNI, after all, is one
of the most often-forgotten important principles in this industry.

------
pedrocr
I've recently used SQLite to store some metadata for a FUSE filesystem that
pretends all your data is available but that just keeps a local cache and
fetches it from the server on demand[1]. Initially performance was atrocious
because INSERT is really slow if you don't batch several into transactions.
I've worked around it with some application caching but there does seem to be
a large gap between what the postgres engine can do and what sqlite can do. I
definitely don't want to depend on a postgres setup for this application but
it would be nice to have an SQL database for app use that had a bit more write
performance. It seems that when you want that you end up going to LevelDB or
LMDB and losing the niceness of an SQL interface.

[1] [https://github.com/pedrocr/syncer](https://github.com/pedrocr/syncer)
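To illustrate the gap, a rough sketch comparing autocommit inserts against one batched transaction, using Python's sqlite3 (file name and schema invented):

```python
import sqlite3
import time

conn = sqlite3.connect("bench.db")  # on disk, so every commit pays an fsync
conn.execute("CREATE TABLE IF NOT EXISTS meta (path TEXT, size INTEGER)")
rows = [("/file%d" % i, i) for i in range(1000)]

conn.isolation_level = None  # autocommit: one implicit transaction per INSERT
t0 = time.time()
for r in rows:
    conn.execute("INSERT INTO meta VALUES (?, ?)", r)
per_insert = time.time() - t0

conn.isolation_level = ""    # restore Python's implicit transaction handling
t0 = time.time()
with conn:                   # all 1000 rows in a single transaction
    conn.executemany("INSERT INTO meta VALUES (?, ?)", rows)
batched = time.time() - t0

print(per_insert, batched)   # batched is usually far faster on real disks
conn.close()
```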

~~~
phaedrus
A lesson I've learned the hard way, from using Sqlite in many apps I've
written, is that if you need any kind of insert performance you can't abstract
away the transaction model, and retrofitting transactions onto an application
which previously ignored them is not easy either. You really have to design
for it up front.

Mainly this is because there is at most one global transaction per SQLite
database (in other words, you can't nest transactions), but also because the
optimum window to gather up inserts for a bulk transaction might not match up
well with the logical function call structure of your program. I recently
learned about the SQLite SAVEPOINT feature; this might help implement nested
transactions, but it cannot help with the second problem.

For example, what if you made a "logging.db" file for a GUI application, and
you're grouping writes to a log table in batches of 50 for each INSERT
transaction. What happens if your GUI app gets to 48 inserts but then stops
receiving user interaction events? You'd probably still want those 48 inserts
to show up (eventually) in the logging.db rather than just being "lost" if the
app subsequently crashes. So what do you do? Register a timer event to create
a watchdog to flush the transaction if enough time goes by? OTOH what if a
subroutine of your app is actually at that moment performing a lengthy "real"
(logical) transaction and wouldn't want it flushed? Or maybe the application
should use different database files for logging versus data where consistency
is vital (so that a database can be opened either in "accumulate INSERTs" mode
or "real logical transactions" mode)?

My point is not the detail of how to answer those questions; it is just to
highlight that these are questions which have to be thought about at a whole-
application architecture level.
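One possible answer to the watchdog question, sketched in Python's sqlite3: buffer inserts and flush either when the batch is full or when the oldest entry has waited too long (table name and thresholds are invented):

```python
import sqlite3
import time

class BatchLogger:
    """Buffer log INSERTs; flush when the batch fills up or when the
    oldest pending entry has waited longer than max_age seconds."""

    def __init__(self, path, batch_size=50, max_age=5.0):
        self.conn = sqlite3.connect(path)
        self.conn.execute("CREATE TABLE IF NOT EXISTS log (ts REAL, msg TEXT)")
        self.batch_size, self.max_age = batch_size, max_age
        self.pending, self.oldest = [], None

    def log(self, msg):
        if not self.pending:
            self.oldest = time.time()
        self.pending.append((time.time(), msg))
        if (len(self.pending) >= self.batch_size
                or time.time() - self.oldest >= self.max_age):
            self.flush()

    def flush(self):
        if self.pending:
            with self.conn:  # one transaction per batch
                self.conn.executemany("INSERT INTO log VALUES (?, ?)",
                                      self.pending)
            self.pending = []
```

Note the age check only fires when log() is next called; a real GUI app would also arm a timer (or flush on idle) so those last 48 inserts eventually land, which is exactly the architecture-level decision the comment is pointing at.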

------
fauigerzigerk
Sadly the lack of a decimal type complicates monetary calculations
sufficiently for me to stay with postgresql even for small projects that could
easily be done with SQLite.

~~~
dragonwriter
You could just store decimal strings and use application-language decimal
functions to operate on them, even in queries, since it's fairly trivial to
use application-language functions as SQLite (scalar or aggregate) database
functions, at least in most decent SQLite drivers I've seen. It's one of the
advantages of an in-process database.
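A minimal sketch of that approach with Python's sqlite3 and decimal modules; the function name, table, and 20% rate are invented:

```python
import sqlite3
from decimal import Decimal

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (amount TEXT)")  # decimal strings
conn.executemany("INSERT INTO invoices VALUES (?)",
                 [("9.99",), ("111.00",), ("0.01",)])

# Register an application-language function so queries can do exact
# decimal arithmetic on the stored strings.
conn.create_function("dec_add_vat", 1,
                     lambda s: str(Decimal(s) * Decimal("1.20")))

print(conn.execute("SELECT dec_add_vat(amount) FROM invoices").fetchall())
```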

~~~
BareNakedCoder
Strings don't work well with numeric aggregate SQL functions like SUM or MAX.
They also won't work well with comparisons or sorting: "111.00" is less than
"9.99" string-wise (unless you store all numerics with a bunch of leading
"0"s, but then that takes a lot more space).
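For what it's worth, the aggregate and sorting objections can also be handled in-process, by registering a custom aggregate and collation. A sketch in Python's sqlite3 (the names dec_sum and decimal are invented):

```python
import sqlite3
from decimal import Decimal

class DecSum:
    """Aggregate that sums decimal strings exactly."""
    def __init__(self):
        self.total = Decimal("0")
    def step(self, value):
        self.total += Decimal(value)
    def finalize(self):
        return str(self.total)

conn = sqlite3.connect(":memory:")
conn.create_aggregate("dec_sum", 1, DecSum)
# A collation that compares decimal strings by numeric value.
conn.create_collation("decimal", lambda a, b:
                      (Decimal(a) > Decimal(b)) - (Decimal(a) < Decimal(b)))

conn.execute("CREATE TABLE t (amount TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("111.00",), ("9.99",)])

print(conn.execute("SELECT dec_sum(amount) FROM t").fetchone()[0])  # 120.99
print(conn.execute(
    "SELECT amount FROM t ORDER BY amount COLLATE decimal").fetchall())
```

So "111.00" sorts after "9.99" once the collation is applied, without leading-zero padding, at the cost of calling back into the application for every comparison.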

------
andyidsinga
I've used sqlite as an index for S3 objects ..and stored the DB file in S3
itself. It was very practical to use, and a handy pattern to apply to object
storage where the number of objects to be indexed for a given time frame was
reasonable for sqlite (I had in 100k range).

Also - easy to dump the sqlite db into a SQL text file then import into
something like postgres :)
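In Python, that dump is one call away via Connection.iterdump(); a sketch with invented file and table names:

```python
import sqlite3

conn = sqlite3.connect("index.db")  # hypothetical S3 object index
conn.execute("CREATE TABLE IF NOT EXISTS objects (key TEXT, size INTEGER)")
with conn:
    conn.execute("INSERT INTO objects VALUES ('a/b.txt', 42)")

# iterdump() yields the same SQL text as the shell's .dump command,
# ready to feed into another database (minor dialect tweaks for postgres).
with open("index.sql", "w") as f:
    f.writelines(line + "\n" for line in conn.iterdump())
conn.close()
```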

------
Decabytes
I love SQLite and am currently working on a project that makes extensive use
of it. The book "The Definitive Guide to SQLite" says that SQLite can scale
to ~2TB of data. Looking at the website, it says:

    An SQLite database is limited in size to 140 terabytes (2^47 bytes, 128 tebibytes).

That's ludicrous. I wonder if anyone has come close to that in day-to-day use.

~~~
fnord123
Almost certainly not, seeing as the largest single hardware device is
currently 12TB. Someone would have to pool together 10+ of them (and more for
redundancy) just to store the single file.

~~~
krylon
I have heard of people running ZFS pools that were hundreds of TB in size. Not
sure if they were using those for huge SQLite databases. ;-)

------
neals
What would be a bottleneck / limit for SQLite? If you have too many SELECT
queries because your site is too busy, what would happen?

~~~
shoover
While not strictly related to computation, sqlite is a non-starter for
deploying to PaaS without a persistent file system.

"SQLite on Heroku" [1], essentially: don't do it, switch to postgres.

I suppose it would work with AWS Beanstalk and EBS.

[1]
[https://devcenter.heroku.com/articles/sqlite3](https://devcenter.heroku.com/articles/sqlite3)

~~~
tracker1
If you're already in that environment RDS is probably a safer bet though.

------
hawski
Could anyone using SQLite FTS share their opinion/story? I'm especially
interested in indexing hundreds of gigabytes.

------
karmicthreat
Every once in a while I have sqlite eat it, and end up with a corrupt db file.
About as often as mysql really. I wish I had some more serious automatic
recovery in sqlite. I have to have a boot script check the db, export
everything to an sql file if the db is corrupt then create a new file.

~~~
dijit
That does not mirror my experience. I hate to be the guy suggesting this, but
have you considered the fact that you might be using it wrong?

What I mean by that is that SQLite really is a replacement for open() and, as
such, is not multi-thread compatible. If you're doing a lot of work that's
causing corruption, you can also turn up the level of journaling until
recovery becomes easy enough.

I'm using SQLite effectively on a few thousand multi-threaded Windows
machines, and I've encountered corruption exactly once, which was when a
hardware RAID had degraded non-gracefully.

~~~
jpeeler
My understanding is that using SQLite is completely fine in a multithreaded
architecture, but you have to ensure that the simultaneous database operations
match the correct threading mode:

[https://sqlite.org/threadsafe.html](https://sqlite.org/threadsafe.html)

I also believe that the common way of gaining performance while protecting
safety is to use multi-thread mode with connection pooling. The idea being
that your application can use different database connections to achieve
multiple writes without relying on the overhead of additional locking required
within SQLite itself for serialized mode.

Would be interesting if anybody has performance metrics that backs up what I
said with data converging on an optimum number of database connections.

~~~
dijit
That's exactly the case: make sure only one thread writes; how you feed
data to that thread is up to you.
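A minimal sketch of that single-writer pattern in Python: a dedicated thread owns the connection and drains a queue (file and table names invented):

```python
import queue
import sqlite3
import threading

q = queue.Queue()

def writer(path):
    # The connection lives entirely inside the one thread that writes.
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS events (msg TEXT)")
    while True:
        item = q.get()
        if item is None:  # sentinel: shut down
            break
        with conn:
            conn.execute("INSERT INTO events VALUES (?)", (item,))
    conn.close()

t = threading.Thread(target=writer, args=("events.db",))
t.start()
for i in range(100):          # any thread may enqueue work
    q.put("event %d" % i)
q.put(None)
t.join()
```

This also sidesteps Python's default check_same_thread restriction, since the connection never leaves the writer thread; batching the queued items into larger transactions is a natural next step.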

