A Minimalist Guide to SQLite (marksblogg.com)
472 points by craigkerstiens on Nov 1, 2017 | 120 comments

I love SQLite (and use it in a large number of places in our application stack!), but I feel that it suffers from a case of bad defaults. To name a few:

- Foreign key checking (and cascading deletion for that matter) are turned off by default. You need to enable them using `PRAGMA foreign_keys = ON;`.

- There are practically no downsides (and a number of upsides) to using the WAL journalling mode, at least for "use a local database and store it on the disk" use cases. The main upside is that without it, reads will conflict with writes, which is a problem in a multi-threaded environment. Unfortunately, that's another feature you must remember to enable: `PRAGMA journal_mode = WAL;` (for obvious reasons, this one "stays enabled" after you turn it on).

- Full auto-vacuum cannot be enabled after you start writing to the database unless incremental auto-vacuum was already enabled. If you're unsure, it's a good idea to enable incremental auto-vacuum to keep that option open. But here again, that's not the default: you need `PRAGMA auto_vacuum = INCREMENTAL;`.

This tends to be especially problematic if you have to perform one or more ad-hoc queries using the SQLite command line on an app's database and forget to apply the relevant PRAGMAs! I wish there were a way to set "default" PRAGMAs on an SQLite database file to avoid this.

(note: some of these defaults are configurable when compiling SQLite from source, but if you're dynamically linking against an OS-provided copy of the library, you can't really do that).
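
In practice, enabling all three looks something like this (a sketch using Python's sqlite3; the database path is illustrative):

```python
import sqlite3

conn = sqlite3.connect("app.db")  # the path is illustrative

# Foreign key enforcement is per-connection and off by default.
conn.execute("PRAGMA foreign_keys = ON;")

# journal_mode is persistent: once set to WAL, it sticks with the database file.
conn.execute("PRAGMA journal_mode = WAL;")

# auto_vacuum only takes effect on a fresh database (or after a VACUUM),
# so set it before creating any tables.
conn.execute("PRAGMA auto_vacuum = INCREMENTAL;")
```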

The disadvantage of enabling WAL is that your database just turned into multiple files (the main file plus the -wal and -shm files), and you have to recover the database to read it. Recovery is fast, but it requires write access.

SQLite is one of the best pieces of software I have used in my career as a developer. It is performant, reliable, simple and consistent. There is a reason sqlite3 is deployed in so many places.

I think it's not used widely enough yet. For example, 99% of websites could benefit from using SQLite instead of MySQL or (god forbid) PostgreSQL.

I mean Postgres is a fine piece of software but if your website gets 500 visits a day, you don't need Postgres; just use SQLite.

For me, using PostgreSQL is like using static typing: a bit more work, but it catches a lot of bugs.

SQLite favours a very permissive approach, while PostgreSQL favours locking everything down with strict types and strong integrity checks, and it's very easy to verify lots of details of your data before it's accepted into the database. That takes work to set up properly and maintain, but it catches bugs early and reduces the number of headaches you get when trying to use your data in new code.

Scalability is a very different problem that happens to favour PostgreSQL, but most projects indeed never get far enough that this matters.

Strictly speaking, SQLite is dynamically typed in a sense, because the database does not enforce that the values in the rows match what the columns claim about their type.
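
A quick way to see this flexible typing in action (an illustrative sketch using Python's sqlite3; the table and values are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")

# SQLite happily stores a string in an INTEGER column:
conn.execute("INSERT INTO t (n) VALUES (?)", ("not a number",))

row = conn.execute("SELECT n, typeof(n) FROM t").fetchone()
print(row)  # ('not a number', 'text')
```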

But from a developer experience point of view, the main difference between dynamic and static typing is the following:

Dynamic typing requires a lot of extra checks to ensure everything is what you expect it to be, and a lot of annotations to document what everything is supposed to be.

In other words, dynamic typing has a high cognitive tax.

With this in mind, I find that PostgreSQL imposes a cognitive tax in terms of maintaining a separate process (potentially on a different machine) that a lot of times is outside your direct control. Not to mention all the strange ways in which it can break.

SQLite just works. There's nothing to configure. So there's a lot less cognitive tax.

There is still of course the tax of having to ensure that all your sql statements are valid and that they return what you expect them to return, but this is also the case with PostgreSQL.

I agree with your point, but disagree with the example.

RE static vs dynamic typing, I'd say that it's dynamic typing that has the higher cognitive tax: you as a programmer are fully responsible for ensuring types agree everywhere, whereas with static typing all of that job is done by the compiler. It's easier to change stuff in the code when you know the compiler will catch your dumb mistakes. And I say that as someone who loves writing in Common Lisp.

But the point about SQLite vs a typical RDBMS is spot-on, IMO. SQLite is a library. It operates on files. That's it. It's entirely local. You can use it without having to become a sysadmin, without having to set up a whole service on the OS on which you might not even have root privileges anyway. You don't have to make global changes to the system just for your program. Moreover, if your product is of the distributable kind (desktop apps, self-hosting web stuff) you don't need to make your users become sysadmins, manage a system-wide service, acquire root rights they might not have, etc.

SQLite is local. Its data is local. That's, IMO, its strongest benefit.

> I'd say that it's dynamic typing that has higher cognitive tax

Yes, that's exactly what I said.

In your experience, what are the strange ways that PostgreSQL breaks?

Initial setup is confusing. Users and rules are confusing. Taking backups and restoring them is confusing.

Sqlite is great but it is a poor substitute for a "real" database where you need any serious constraint enforcement at the database level. Yes, you can enable referential integrity, but it's off by default. There is no field length constraint on text values, and there are no native datetime, boolean, or GUID types. Beyond adding or renaming columns, you can't structurally alter tables without building a new one and copying the old data over.

If you actually need any of these things you can enforce them in code, by convention, but you're better off installing Postgres.

Also writes are serialized. That one really can't be worked around.

> Sqlite is great but it is a poor substitute for a "real" database

This is true, and explicitly acknowledged by the author:

> SQLite does not compete with client/server databases. SQLite competes with fopen().


If you're deciding between Postgres and SQLite, in most cases you're doing it wrong (there are a few exceptions: a low-traffic, almost-entirely-read dynamic website, for instance, can be served easily with either). Usually you want to choose between SQLite and "anything else that involves writing directly to the local filesystem".

I would disagree. The proper question is: "do I need a structured RDBMS or not?" Then, if yes, which should I use? Just saying "SQLite!" starts to sound like the regex quote from 1997: https://blog.codinghorror.com/regular-expressions-now-you-ha... A SQL system can do many wondrous things, but there's a wide range of pain and benefit across the different systems.

Think about requirements first, then tech/implementation approach.

But yeah, like everyone else, I often just start with SQLite anyway!

> There is no field length constraint on text values

SQLite has supported CHECK() for a while. So, for example, you can do:
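
A sketch of such a length constraint (using Python's sqlite3; the table name and limit are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        name TEXT CHECK (length(name) <= 50)  -- enforce a max length
    )
""")

conn.execute("INSERT INTO users VALUES ('Alice')")  # fine
try:
    conn.execute("INSERT INTO users VALUES (?)", ("x" * 51,))
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # the CHECK constraint fires
```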

> Also writes are serialized.

Across multiple processes, but there is a shared cache and read uncommitted mode available otherwise.

So I'm actually doing this right now; I wrote about it on my blog.

None of the apps I've written have hit any kind of crazy traffic peak, so I started wondering why the hell I was using Postgres or other database types after reading https://www.sqlite.org/whentouse.html.

Seeing how simple SQLite has been for me to use has inspired me to write a bunch of dead-simple, good-enough approximations of other tools. High up on my list is a Graylog competitor that just uses SQLite FTS (https://sqlite.org/fts3.html/https://sqlite.org/fts5.html), is super easy to start, and handles the micro-to-mid-size case before you really need something like Graylog or ELK.
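
A minimal sketch of that idea (Python's sqlite3 with the FTS5 extension, which must be compiled in; the schema and log lines are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# An FTS5 virtual table indexes the 'message' column for full-text search.
conn.execute("CREATE VIRTUAL TABLE logs USING fts5(message)")
conn.executemany(
    "INSERT INTO logs (message) VALUES (?)",
    [("disk full on node-3",), ("user login failed",), ("backup completed",)],
)

# MATCH runs a full-text query against the indexed column.
hits = conn.execute(
    "SELECT message FROM logs WHERE logs MATCH 'login'"
).fetchall()
print(hits)  # [('user login failed',)]
```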

Blog post if anyone's interested: https://vadosware.io/post/adding-sqlite-powered-fts-search-t...

Unfortunately I can't edit the comment anymore, but I meant to put both links there, and made the silly mistake of putting a "/" between them...

How does SQLite FTS compare to things like Sphinx and Elastic Search?

Probably not well, in the long run? FTS is just a plugin to SQLite, whereas Sphinx and ES are purpose-built software. It all comes down to what your actual need is, though.

I think for most small projects SQLite + FTS offers a pretty simple, easy-to-set-up solution that doesn't require much compute power either. The simplicity is also a bonus.

SQLite is absolutely fantastic, but I've twice used it on sites where I would have used a real DB, and twice needed to switch back to a real DB after a few months.

The concurrency model just doesn’t match up well with multiple web processes doing work.

If you are writing on every request, Murphy’s law says that too many of those 500 visitors will be loading pages at exactly the same time...

It's really weird to have to write on every request.

If you're logging user visits for example, you can queue those up in a list (in the code) and flush it to the database every few seconds instead of flushing it on every visit.
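
That batching could be sketched roughly like this (Python; the schema, flush interval, and helper names are all made up, and a real app would flush from a background timer and add locking for threads):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (path TEXT, ts REAL)")

pending = []       # in-process queue of (path, timestamp) rows
FLUSH_EVERY = 5.0  # seconds between flushes (arbitrary)
last_flush = time.monotonic()

def record_visit(path):
    """Queue a visit in memory instead of writing it immediately."""
    pending.append((path, time.time()))
    if time.monotonic() - last_flush >= FLUSH_EVERY:
        flush()

def flush():
    """Write the whole batch in one transaction instead of one per visit."""
    global last_flush
    with conn:
        conn.executemany("INSERT INTO visits VALUES (?, ?)", pending)
    pending.clear()
    last_flush = time.monotonic()

record_visit("/")
record_visit("/about")
flush()  # in a real app a background timer would call this periodically
```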

> queue those up in a list (in the code)

That won't work in the language that most server-side website code is written in. PHP processes are created and destroyed per HTTP request, and even the FastCGI implementation won't let one share data across requests. PHP does have a sessions feature, but that data is written to either the filesystem (by default) or a database on every request!

Just add Kafka in between

Exactly. When my 10 visitors a day start writing too many comments, I usually jump to containerizing everything, deploying Kubernetes, and installing a queuing system with workers to handle the load. A ZooKeeper cluster with Kafka could definitely work too, and has the added benefit that you can replay history if your SQLite comment database ever got corrupted.

I just don't understand why people can't see the logic behind using something operationally simple like SQLite.

This comment and parent are pretty master level HN trolling.

So you avoid just using a "real" db by adding half a dozen different pieces of software to paper over your "not db" replacement?

It's pretty common to write session data in memory eg. memcache/redis.

Redis is yet another dependency that you can certainly do without.

Yes, but if you use SQLite because you value minimalism, you would also choose a language that compiles the server to a statically linked binary, e.g. Go, in which case this is a trivial "optimization" to do (if you would even call it that).

I've run sites in the ~500 users per day category where MySQL and Postgres use nearly nothing in resources while user interaction causes complex transactions and competing row-level locks. Sounds pretty minimal to me. Odd, though, that someone who values Go would want to use a DB with a poor reputation for concurrency.

> I mean Postgres is a fine piece of software but if your website gets 500 visits a day, you don't need Postgres; just use SQLite.

On the flip side, setting up a PostgreSQL instance is so easy now with docker that you could go ahead and use it with a fraction of the admin overhead that it used to require:


Docker is its own nightmare :/

I just find it weird and confusing.

Also, how do you set up a database in Docker? Docker does not support persisting data as far as I know. So if you need to restart the thing, all the data will be wiped. At least that's how it seems to be.

Other HNers are making it more complicated than it strictly needs to be. Kubernetes and storage drivers are great and all, but we're talking hello world here.

So, if you want to keep it simple:

  docker run --detach --volume /save/my/data/in/this/path:/var/lib/postgresql/data postgres
That's it. (The --detach is to start it in background. Otherwise it would run in your terminal session and stop when you CTRL-C.)

Even if you don't specify a path for the data, the container will save its data in a Docker volume, which doesn't get deleted until you explicitly ask for it to be. If you stop the container and restart it, it'll reattach to the same volume and find the data. And if you delete the container without explicitly deleting its volumes as well, you'll be able to create another postgres container and attach the old volume to it.

> Docker does not support persisting data as far as I know. So if you need to restart the thing, all the data will be wiped. At least that's how it seems to be.

In some systems, such as Kubernetes, data is not saved to a persistent disk by default, but Docker containers can retain data. Either way, this is generally solved by using volumes, which you can read about here: https://docs.docker.com/engine/admin/volumes/volumes/

The instructions on the link I gave are clear about using volumes to persist data. It is a very quick read and definitely worth your time. Ease of postgres deploy is what made me a docker fan.

Do you understand what it does? Do you understand the common ways it can fail? When it does eventually fail, can you figure out how to fix it?

From the Postgres perspective, yes: I administered pg databases for 10+ years. From the Docker perspective, I have been learning over the past couple of years, but mostly I have been able to identify failures.

Why do you ask?

You are meant to use a storage driver such as aufs or overlay with an attached volume to persist information between container lifetimes.

There are a variety of opinions on the goodness of using docker for running a database.

Which translates to: I basically don't want to use it. At least not for development. Not as a means to abstract away the setup of something like Postgres; Docker has its own complications when it comes to setting things up, which I must learn _in addition_ to learning everything related to setting up Postgres.

Running Postgres in a container for development environments is one of the best features of Docker in my opinion. It's really not hard to set up at all.

Install docker and docker-compose, write a simple docker-compose.yml file, and then run `docker-compose up -d` in the same directory as that file. You now have a full Postgres instance running on your machine at localhost:5432. Stop it with `docker-compose down`.

DB files are stored in the `pg` folder in the same directory as docker-compose.yml. You can specify this on the `volumes` property.


    version: '2'
    services:
      db:
        image: postgres:9.4.10
        ports:
          - "5432:5432"
        volumes:
          - $PWD/pg:/var/lib/postgresql/data/pgdata
        environment:
          - PGDATA=/var/lib/postgresql/data/pgdata

Instead of doing all of that, you could also do it without docker. Simply install postgres via your Linux distro and run e.g.:

    /usr/lib/postgresql/9.5/bin/postgres -D .mydatadir

Sure you can do that, but if your app needs to use other services (Redis, for example) you can easily throw that in the compose file as well, and then it's easy to maintain versions and have every developer be on the same page.

This is especially helpful if some developers use OS X or Windows.

I don't understand the "all of that" phrasing, as if what I just described was a large amount of work.

How many people run web apps with changing version requirements for their _database_?

Docker is useful for running components of your own code or for isolation, not for managing versions of core applications like a database or a mail server for most people deploying ready-made apps.

Running a database in Docker is useful if you think more broadly of containers and orchestration, and think less about nodes and more about the cluster as a pool of resources.

All our "system software" — Postgres, Redis, Memcached, RabbitMQ, Elasticsearch, etc. — run as containers on Kubernetes. Doing so allows us to start new versions without considering the host OS. For example, we can upgrade Postgres to a new release the moment it's out, without having to wait for Debian or Ubuntu or whatever.

It also allows us to run different versions (e.g. a new one for testing in our "staging" environment) concurrently without any concerns about which node is running what. In fact, almost nothing needs to be configured with information about "nodes"; everything is routed to the right place, and the client only needs a name to connect to.

Sometimes we start temporary instances of software, e.g. to experiment with. Bring the container up, do the testing, bring container down; nothing ever got installed on the host OS.

Different web apps may use differing DB versions, it happens. I'm merely suggesting docker compose is a fairly simple way to handle this without managing global installations on a dev machine. Keep in mind this is for local development only, I wouldn't use a docker hosted DB in production though you certainly could if you wanted to.

I've tried several different dev environment setups and this approach yields the fewest frustrations and inconsistencies.

Production is a different story but that's not what I'm discussing here.

I'm not trying to beat a dead horse here, but what kind of app runs on a database version 5 but not version 6 (for example)?

I can see an app relying on features not available in older releases, but databases - of all software - have rock-solid backwards compatibility.

I mean yeah, changing DB minor versions probably won't affect anything, perhaps even changing major versions is alright in most cases.

To me, the value is in reducing the number of potential issues a developer can have in their environment and making the build and runtime as consistent as possible. And when the solution is so easy to use, I don't see much of a downside to the docker approach.

My team switched from Vagrant to docker compose for our development environments and since then we've had much more stability and have wasted much less time tending to them.

Why would I want to use redis?

My whole point is that I want to simplify my setup, so that I don't need tons of third party servers.

Do you reimplement caching and data storage on every new project? Redis is just one example. Surely in any major project you've made use of a third party server? Maybe not, but tons of projects do and I'm simply describing an easy way to manage those services in a development environment. What's so hard to understand about this?

The running Docker image will still need many more resources than opening up a SQLite DB and running a few queries.

True, it's not as low-resource as a single file, but with the official Alpine-based images it's not too bad. For self-contained app storage I would definitely go with SQLite, but for any database that needs multiple users (like a service website, even if it doesn't have many users yet) I would go with Postgres.

Despite a lot of habitual nay-sayers, this is absolutely correct. The vast majority of sites simply do not need the black-boxery of handing stuff off to external storage. SQLite will carry the load just fine, and not only be as good as, but an actual improvement.

I have written fairly large web-apps taking fairly decent traffic, and relying solely on SQLite, unless really pressing circumstances required otherwise. Healthy exercise too: Sharpens your focus, "Do I absolutely need to do a write here?".

Why a (god forbid) for PostgreSQL but not MySQL? I've used both and compared to SQLite they have around the same setup complexity.

Last time I used MySQL was more than 10 years ago, but my impression from back then is that it's more beginner-friendly and easier to set up and use than Postgres.

More than 10 years ago, that was true.

Not so much for quite a while, though.

I'm trying to follow current news and knowledge on SQL databases, but 99% of my career was on the client-side, so I lack practical knowledge.

How easy it is, in practice, to switch from one SQL database to another?

Of course you have sites built with bloated software, like Wordpress...

If only it were just bloated... Global variables to this day.

That too, and no code/"view" separation.

Eh. Definitely not reliable in my experience. I don't use SQLite directly, but I use apps that save stuff in it, and I've had issues with losing data because the SQLite DB became corrupt.

Although to be fair, I have no idea if this is any less common with other DB's.

If you don't think SQLite is reliable, don't ride in an Airbus A350 ;)


I have a certain amount of difficulty believing this, and the only reference is that page.

Anyone know more details?

Edit: ok, I believe it now: https://www.sqlite.org/th3.html

Here's Dr. Richard Hipp mentioning that Airbus requested sqlite support for the lifetime of the airframe: https://www.youtube.com/watch?time_continue=3134&v=Jib2AmRb_...

SQLite is great as a data store for JSON. We receive realtime data from different sources in JSON format. I dump the raw data into MongoDB, but it's still not very easy to query. This is where SQLite comes in handy. I import the data straight from MongoDB into SQLite, which then enables me to run SQL queries using the JSON1 extension. Currently I'm using it to store data that drives feed simulators. We can extract the data using queries like:

  SELECT json_extract(RawData, '$.sourceId') SourceId
  , json_extract(RawData, '$.messageId') MessageId
  , json_extract(RawData, '$.message') Message
  , json_extract(RawData, '$.message.parties[0].name') Party
  , json_extract(RawData, '$.timestamp') Timestamp
  FROM RawFeed
  JOIN Source
    ON SourceId = Source.Id
  WHERE Source.Name = 'Foo Company'
  ORDER BY Timestamp

Don't get me wrong, I was hating Mongo before it was cool to do so, but your example is one case where it makes zero sense: just learn to write a Mongo query already, or use SQL straight up.

You could also just structure your data in MongoDB in a way that creates an index on source id and name? This just makes it seem like you're trying to fit a square peg in a round hole.

I feel this article misses out on the "why". I tend to store stuff as CSV or JSON, then slurp it into Python to operate on it. It's not clear from this article what benefit putting your CSV into SQLite gets you.

"SQLite As An Application File Format" [0] and "What If OpenDocument Used SQLite?" [1], both by the author of SQLite, may help to answer why you'd use it over other file serialization formats, such as CSV, JSON, Python's pickle, etc.

0: https://sqlite.org/appfileformat.html

1: https://sqlite.org/affcase1.html

For one, memory usage. With flat files you either have to load the entire CSV file into memory or read it line by line to get the result you're looking for. SQLite intelligently uses disk, memory, and indexes for that purpose.

Plus, you know, the whole SQL thing.

One advantage that may seem too obvious to mention: using SQLite allows you to work with data using SQL, which is far, far easier to learn (and be powerful with) than trying to learn Python or even R if you are new to languages.

Even as an experienced programmer, I find SQL almost always more elegant for expressing logic than Pandas or R. My workflow is usually to have a script execute a SQL statement that does the heavy data work, and then import the results into another programming environment (e.g. R/ggplot2 for visualizations).

SQLite is when the data is a little too big for your approach, but not worth the hassle of installing/running/managing a database process.

The flat files method may lead to partial or empty files in certain failure cases (crash during writes) that may or may not be more gracefully handled by SQLite.

It's much easier to work on data with SQL. I can cleanse, deduplicate, fill in defaults, update, merge, join, and generate reports with a few SQL statements.

I had an interest rate app where the backend data collection process consisted of a shell script that ran curl to download rates from various places as CSV, imported the CSV into SQLite, and ran a few SQL statements to clean up/merge/fix defaults/dedup the data. The process is idempotent and can add new data to existing data. At the end I had a long history of clean, up-to-date interest rate data in a SQLite table.

So you can insert/delete/update records without rewriting the whole file. And you get many other features, like transactions, cache page spilling, and fflush & fsync, handled behind the scenes.

An explanation of the considerable benefits that relational databases bring to a wide class of problems is beyond the scope of an article on one particular implementation, and also beyond the scope of an HN comment.

An sqlite db file is probably as light as your json or csv AND better formed/typed

It’s easier to compare it to CSV than JSON, because the former is tabular while the latter can have deep imbrications that aren’t well represented in SQL(ite).

I'd argue that the more large/complicated/nested the JSON structure, the more it would benefit from using a "real" database instead of a JSON file sitting on disk. If not SQLite due to document-relational mismatch, then LevelDB, RocksDB, or one of the other embedded key-value stores.

Of course, SQLite can also emulate a key-value store quite well, with a table with 'key' and 'value' columns.
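
A sketch of that emulation (Python's sqlite3; the helper names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")

def put(key, value):
    # INSERT OR REPLACE makes this an upsert on the primary key.
    conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

def get(key, default=None):
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else default

put("greeting", "hello")
put("greeting", "hi")        # overwrites the previous value
print(get("greeting"))       # hi
print(get("missing", "n/a")) # n/a
```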

Also the JSON1 extension.


Most SQL databases are much more like JSON than CSV: deeply nested and interconnected. It's not unusual to join three different tables together to get something you want, which is effectively the same as nested JSON (though a bit more flexible, because it allows for circular dependencies and the like).

`imbrications`: overlapping, as in roof tiles

(thought it was a typo, now I’m not sure)

Can but probably won't.

You don’t need to go deep: `{"name": "Bob", "hobbies": ["soccer", "cinema", "music"]}`. That’s 3 tables in SQL(ite): one for the people; one for the hobbies; and one to join both.

Two tables: Person: {id, name} and Hobbies: {person_id, name}.

Then you can "SELECT person.name, array_agg(hobbies.name) FROM person JOIN hobbies ON (person.id = hobbies.person_id)" to get your json representation back (at least with postgresql, "array_agg" isn't in standard SQL)
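
SQLite itself lacks array_agg, but its group_concat aggregate gives a similar, string-valued result. A sketch of the two-table layout above (Python's sqlite3; note that element order in group_concat is unspecified):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE hobbies (person_id INTEGER REFERENCES person(id), name TEXT);
    INSERT INTO person VALUES (1, 'Bob');
    INSERT INTO hobbies VALUES (1, 'soccer'), (1, 'cinema'), (1, 'music');
""")

# group_concat collapses the joined hobby rows back into one string per person.
row = conn.execute("""
    SELECT person.name, group_concat(hobbies.name, ', ')
    FROM person JOIN hobbies ON person.id = hobbies.person_id
    GROUP BY person.id
""").fetchone()
print(row)  # typically ('Bob', 'soccer, cinema, music'); order is unspecified
```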

That's a pretty simple join, any real database layout I've seen goes much more complicated and much deeper than that.

Your version is denormalized. It is faster, but also more likely to have misspellings and duplicates. It really depends on what is more important.

The JSON example that it was compared to was also denormalized.

Exactly. You could put it in one table if you really wanted to (not that it would be ideal by any means).

Not an SQL expert, but I'm pretty sure two tables joined together is extremely normal and well represented in an RDBMS.

Yep, I can write an INNER JOIN faster than I can remember jq's arcane syntax.

If you needed it all in one flat table for some reason, SQLite supports views.

Speaking of which, I've not grokked jq's functional approach to composing queries. Can anyone suggest a good approach to understanding jq please?

Sibling comments are on target but also the "relational" advantages of storing your data in a database.

I think this comment misses out on "why". I tend to store stuff in xlsx, then use Excel to operate on it. It's not clear what benefit Python or JSON gets you, from this post.

I never claimed to show advantages over excel. Excel is fine, enjoy it!

sounds wasteful and slow for anything larger than a few MB compared to the awesome (really) sqlite

And why SQLite and not something newer like Datomic?

One of my favorite sqlite features is the vtable mechanism: https://sqlite.org/vtab.html

It makes it possible to expose a custom, application specific database format as a sqlite table, and suddenly you can run SQL queries on your data!

Postgres also has that concept, though calls it Foreign Data Wrappers: https://wiki.postgresql.org/wiki/Foreign_data_wrappers

And it goes beyond single tables; e.g. PG-Strom uses FDW to implement GPGPU scans, joins, aggregations & projections (and provides PL/CUDA while at it).

Yes, this! Vtables are how [lnav](http://lnav.org) works its magic. Embedded SQLite makes so much sense for such use cases.


Once you're ready get yourself a cup of the SQLite koolaid

Very nice! I'm in the middle of writing a book based on SQLite (the overall topic is data analysis with SQL, but SQLite is the medium) while I'm teaching it to students. I used to teach MySQL, but it was just so goddamned hard to get it configured correctly on people's computers (I always hated when work had to be done in a pre-configured computer lab rather than on my own laptop), never mind the server/client/daemon aspects of MySQL and other variants. While SQLite lacks polished (and free) GUIs, it's easier to explain on a sysops level (with sqlite files being similar to XLS files) without burying students in irrelevant tech details on the way to learning the power of SQL queries.

Have you tried sqlitebrowser? It's fantastic, I use it a lot.


Yep, that's what I use. Because it is open-source and cross-platform, not to mention pretty high quality, it has significantly increased the attractiveness of teaching SQLite. Previously, other SQLite teachers I knew and I would point students to the SQLite Manager plugin for Firefox, which is also free and open source, but a bit clunkier: https://addons.mozilla.org/en-US/firefox/addon/sqlite-manage...

The DB Browser project (they changed the name from sqlitebrowser at the request of SQLite, IIRC) also has, in my experience, a very helpful and responsive dev team: https://github.com/sqlitebrowser/sqlitebrowser/issues

> ... also has in my experience, a very helpful and responsive dev team.

Thanks. We definitely try. :D

Thank you. :)

(I'm one of the devs on that project)

I suppose you are using SQLite as a tool to teach analysis so this is probably not a big issue.

But I really disliked that my database teacher decided to use MySQL to teach about databases (that was about 10 years ago).

Several times he would talk about a specific concept, then mention that MySQL doesn't have it (like CHECK constraints, enforcing foreign key constraints, aborting statements through triggers (emulated by throwing an exception), etc.). If we are going to learn universal database concepts, why not use a database that supports them? I personally learn things better when I can try them out and see how they work.

Sorry for my rant, it just reminded me about that.

I'm of course not suggesting you switch back to MySQL, but distributing preconfigured MySQL docker images perhaps might've helped with your configuration troubles.

This kind of "use docker" comment is being left on most threads that deal with setup issues.

So you are advocating that people who have issues with a simple `apt-get` or `yum install` of mysql or <app> should instead install Docker and deal with its complexity before they can use the app?

It really depends on the use-case, though I'll concede that docker would've been overkill for the parent based on their comment below.

Maybe now, but it didn't seem feasible in 2014, or at least I just wasn't familiar enough with it at that point. In any case, this was a humanities class, and I wanted to keep the tech as straightforward as possible.

>It's used in systems as important as the Airbus A350 so it comes as no surprise the tests for SQLite 3 are aviation-grade

I stumbled upon SQLite around 2001 and have been using and admiring this gem of a software since that time, but hearing that it is aviation-grade is definitely a surprise to me...

From what I've heard it has one of the most complete and comprehensive test suites of any database. There's tests that simulate sudden power loss as well which is pretty neat.

> Data locality can be greatly improved by storing a SQLite 3 database in memory instead of on disk

My understanding (although I can no longer find the page in the sqlite3 docs) was that because of caching, using :memory: is unlikely to make much difference in practice.

For reading, for a small database, after the cache is warm, that's likely true. In all other cases, there is no avoiding disk access - which, on spinning rust, is slow.

Depends where your disk is. Many virtual machines, e.g. Amazon EC2, keep their persistent storage (EBS on AWS) on an external medium, so disk reads have a bit of latency and writes take forever. Keeping the database in memory is a huge benefit on these systems.
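For anyone curious, a minimal sketch of the trade-off using Python's built-in `sqlite3` module (the table and values here are made up for illustration):

```python
import sqlite3

# An in-memory database lives entirely in the process's address space;
# no disk I/O ever happens, but the data vanishes when the connection
# (the last one, for shared-cache setups) is closed.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE t (x INTEGER)")
mem.execute("INSERT INTO t VALUES (1), (2), (3)")
total = mem.execute("SELECT SUM(x) FROM t").fetchone()[0]
print(total)  # 6

# An on-disk database (e.g. sqlite3.connect("app.db")) persists across
# runs; warm reads are usually served from SQLite's page cache and the
# OS page cache anyway, which is why the gap narrows for small, read-
# heavy workloads once the cache is populated.
mem.close()
```

Writes are where the difference stays large: an on-disk database must fsync on commit (or at checkpoint time in WAL mode), and that cost depends heavily on where the disk actually is.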

The only thing I miss in database software is some sort of spatial indexing implementation. I wish SQLite had one by default.

The SpatiaLite extension (https://www.gaia-gis.it/fossil/libspatialite/index) handles spatial data and includes indexing. It's solid and we've been using it in production for years. The only real downside I've experienced is that the SpatiaLite-gui browser is a bit flaky.
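Worth noting that SQLite itself ships an R*Tree module for basic bounding-box indexing (available when the library is compiled with SQLITE_ENABLE_RTREE, which most distributions, including the SQLite bundled with Python, typically enable). It's far less capable than SpatiaLite, but covers simple range queries. A sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# An rtree virtual table stores an integer id plus min/max bounds per
# dimension; this one is 2-D (x and y ranges).
conn.execute(
    "CREATE VIRTUAL TABLE boxes USING rtree(id, minx, maxx, miny, maxy)"
)
conn.execute("INSERT INTO boxes VALUES (1, 0, 10, 0, 10)")
conn.execute("INSERT INTO boxes VALUES (2, 50, 60, 50, 60)")

# Which boxes intersect the query window (5..20, 5..20)?  The rtree
# module answers these overlap constraints via the index rather than
# a full scan.
hits = conn.execute(
    "SELECT id FROM boxes"
    " WHERE maxx >= 5 AND minx <= 20 AND maxy >= 5 AND miny <= 20"
).fetchall()
print(hits)  # [(1,)]
conn.close()
```

For real GIS work (geometry types, projections, spatial SQL functions) SpatiaLite is still the right tool; the built-in R*Tree only indexes rectangles.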

What does "aviation-grade" mean in a software context?

It means that the software is robust enough to be safely deployed on airplanes or vehicles without compromising safety. Imagine if there were bugs in the software that could endanger the safety of a plane full of people.

I wonder why the Firebird database is not mentioned more often in such threads. It is a somewhat lightweight RDBMS that still has many advanced features that SQLite lacks (by design, of course). It may be positioned somewhere between SQLite and heavyweight databases like MySQL, PostgreSQL, Oracle, DB2, etc. I've used Firebird, though only lightly, with Python. It also has different (lighter/heavier) versions for different needs.



A few excerpts from this page:

Get to know Firebird in 2 minutes:



Firebird is derived from Borland InterBase 6.0 source code. It is open source and has no dual license. Whether you need it for commercial or open source applications, it is totally FREE!

Firebird technology has been in use for 20 years, which makes it a very mature and stable product.

Don’t be fooled by the installer size! Firebird is a fully featured and powerful RDBMS. It can handle databases from just a few KB to many Gigabytes with good performance and almost free of maintenance!

Below is a list of some of Firebird's major features:

- Full support of Stored Procedures and Triggers
- Full ACID compliant transactions
- Referential Integrity
- Multi Generational Architecture
- Very small footprint
- Fully featured internal language for Stored Procedures and Triggers (PSQL)
- Support for External Functions (UDFs)
- Little or no need for specialized DBAs
- Almost no configuration needed - just install and start using!
- Big community and lots of places where you can get free and good support
- Optional single file embedded version - great to create CDROM catalogs, single user or evaluation versions of applications
- Dozens of third party tools, including GUI administrative tools, replication tools, etc.
- Careful writes - fast recovery, no need for transaction logs!
- Many ways to access your database: native/API, dbExpress drivers, ODBC, OLEDB, .Net provider, JDBC native type 4 driver, Python module, PHP, Perl, etc.
- Native support for all major operating systems, including Windows, Linux, Solaris, MacOS, HP-UX and FreeBSD
- Incremental Backups
- 64-bit builds available
- Full cursor implementation in PSQL
- Monitoring tables
- Connection and Transaction Triggers
- Temporary Tables
- TraceAPI - know what happens in your server


Disclaimer: I'm not associated with them in any way. Just have used the product a bit, and think it is somewhat good, based on that.

Also, from:



Firebird is free for commercial and educational usage: no license fees, installation or activation restrictions. No double licensing - Firebird license is based on Mozilla Public License.

The mix of high performance, small footprint, supreme scalability, silent and simple installation and 100% royalty-free deployment make Firebird a highly attractive choice for all types of software developers and vendors.

It is used by approximately 1 million software developers worldwide.

