Hacker News new | past | comments | ask | show | jobs | submit login
Why SQLite succeeded as a database (2016) (changelog.com)
128 points by Tomte on Jan 5, 2023 | hide | past | favorite | 85 comments



One reason it succeeded is because of Android. Once it was one a kazillion android devices a lot of the bugs got shaken out.

> Richard: We were going around boasting to everybody naively that SQLite didn’t have any bugs in it, or no serious bugs, but Android definitely proved us wrong. Look, I used to think that I could write software with no bugs in it. It’s amazing how many bugs will crop up when your software suddenly gets shipped on millions of devices.

https://corecursive.com/066-sqlite-with-richard-hipp/#enter-...

All credit to Richard, but I think having customers that demand a lot of stability and performance leads projects to have a lot of stability and performance. (The original use case was running on a battleship.)


Actually, it was Rockwell-Collins in Cedar Rapids, IA that advocated DO-178B certification.

Once that certification was obtained, Android immediately benefited.

"...some avionics manufacturers were expressing interest in SQLite, which prompted the SQLite developers to design TH3 to support the rigorous testing standards of DO-178B."

https://sqlite.org/th3.html#history


(I interned at Rockwell-Collins.)


...or it leads to projects that fail, or remain bug-riddled. Increased usage on Android was an input. The bugs had to be fixed, to get to the increased stability, and performance.


Yea, the real answer is that the SQLite team works hard and does a good job of prioritizing things and so people trust that software to the most important jobs.


I agree. I wasn't trying to minimize the amazing work that was done I was just trying to make the anti-fragile argument that by overcoming all the inputs the product got very strong.


I gotcha. Agreed.


probably one of the dark horse reasons for success is its "open source but not open contributions" model and strong personal leadership from Richard. that clarity of vision and very personal stewardship of the project (listen to him talk with pride about SQLite's aircraft certifications.. not a joke) is something very few open source projects ever have


I didn't know that, and I appreciate SQLite much more now.

I love open source, but I am a huge proponent of strong leadership and vision as the best model to create outstanding software. You need a BDFL, a Torvalds saying "no," a Jobs saying "this is what I want and I will not compromise." Too many open sources project adopt the anything goes model and remain mediocre. The Linux desktop is the most visible example of this.

Sadly 99% of open source projects tend to suffer from this problem, and add features just because someone took the time to write a PR. This is the only reason, IMO, why open source software is often a second rate alternative to commercial offerings: if the source is closed, it is harder for a product's vision to get diluted and lost over time. SQLite shows it is possible to create incredible open source software true to its original vision.


How does HN feel about "open source but not open contributions"? I have several applications with source on github, but I'm not really interested in contributions. I feel kind of bad about it, but they are "my" applications. The source is not secret and I don't mind people using it. That is why I publish it. Not to get help.


It's more than fine, it should be the default. The default absolutely should not be a social expectation to support and maintain something just because you made it available.

The commitment to run an open source project is huge, it's time consuming and draining.

Just putting the source code of something you built out there and saying "I built this, do what ever you want with it" is an incredibly generous thing to do and should always be encouraged.


> Just putting the source code of something you built out there and saying "I built this, do what ever you want with it" is an incredibly generous thing to do and should always be encouraged.

Indeed, and originally that is what open source was all about!

One of my pet peeves is the conflating of open source with some sort of communal development model. Open source certainly enables that model, but open source is much bigger than that. Furthermore, you need other things to make the communal model work well - specifically funding (like a foundation or other patronage funding model) to support development and community management.


SQLite is used by flight systems on the Airbus a350.

It is the only major database that has obtained DO-178B certification, allowing it to operate legally in this avionics environment and role.

Relaxing the strict controls upon it, and allowing contributions that do not pass the tests (in th3) will remove newer versions from avionics applications.

There are some people who are actively trying to fork, which means precisely the above.

I read at some point that Airbus has a software support agreement in place for SQLite that expires in 2050.

https://sqlite.org/th3.html#history


Note well, “Flight systems” and “avionics” are different things. Standards like DO-178 require all airborne software to be assigned a DAL (Design Assurance Level), i.e. simply being part of a “DO-178C certified system” doesn’t imply being safety critical.


It's fine. if it's not only source available but also open source, you are giving your users the four free software freedoms. That's already huge.

They can study the thing. If someone wants something you don't provide, they have the right to make their own derivative. Possibly with open contributions. You can take from these forks if they are public and open source too (but you control what you take and how).

If closed to contributions is the way you prefer writing your software, go for it! And you can always change your mind if needed.

Taking contributions can be very fulfilling, satisfying, nice and everything, but it can also be tiring since it can require to be pedagogical, polite, possibly make compromises, argue, etc. Your call and thanks for releasing your software in any case :-)


There's no problem with it: As you say, it removes one barrier people have to making their software Open Source, which leads to more Open Source software in the world, which is good. It also allows strongly curated projects like SQLite to be Open Source, which is even better. Anyone who feels strongly about contributing to such a project is free to fork it.


sqlite's test suite is substantial and not open source, I doubt you could really fork it in a meaningful way


That raises a practical barrier to forking, but someone who is willing to get a testing suite up and running can fork the project and benefit from upstream changes on an ongoing basis.


It's perfectly fine.

It's only bad if it seems open to contributing but actually isn't. Having the source available and even a copyleft license does NOT imply open for contributions.


I personally think that’s fine. There are still many benefits to an open source but no contributions model (such as others can see how you did things, it can be forked).

Taking contributions requires effort on the part of the maintainer and I think it’s completely fair to want to open the code but not deal with that aspect of open source.


For smaller projects that don’t obviously have a lot of contributors, I think it is good etiquette to ask the maintainers ahead of time if they’re open to PR’s before submitting one, because like you said, it takes effort to accept patches.


Well, contributors should check if there are stated policies, but I think it's the author's duty to document that they don't take contributions and to disable the relevant features in the tools they use.

A contribution is a nice thing from the contributor.

As a contributor, I won't ask and then open a MR for a trivial contribution. It's too much trouble. I will ask for a bigger contribution to know if the author had related plans or their ways though.

As an author, I do state that I'd rather not have medium/big contributions before we discussed about it in the project's CONTRIBUTING file.


> but I think it's the author's duty to document that they don't take contributions

It is polite to do so, and I'd make the effort, but I certainly wouldn't call it a duty at all.

> and to disable the relevant features in the tools they use.

That isn't always possible, and even when it is this does not stop contributions perhaps coming in by other means (email if you have an address publicly known, issue reports, …).


open but not open to contributions - should be the default. look at the clusterfuck known as the javascript ecosystem or linux desktop system as another poster alluded to earlier. quality is automatically improved by having few people contributing - that have a singular goal.


I have no principled objection to it, but it takes a very specific set of circumstances for it to make sense.

You need enough funding to be able to work without outside help, an opinionated enough development philosophy that the cost of integrating external contributions exceeds the value they provide, and a reason to want to give away the source code.

Plenty of companies have the first two, but consider the program they are developing a competitive advantage.


> but it takes a very specific set of circumstances for it to make sense.

> funding

Not everyone is coding for a commercial entity or otherwise monetising their projects, they might just be doing it for fun and want to share, but want to keep their version properly theirs.

A key reason for this can be licensing, if the licence is not a very permissive one. If you have a project that is AGPL and you later want to change to MIT, or if you do decide to go commercial, then if all the code is yours that is easy: just do it. You can't revoke the earlier licence of course, so the last GPL release is effectively dual-licensed. If you have numerous other contributors then you may have issues with relicensing their contributions. They might not care if you are going from AGPL to something more permissive (though that might*, depending on ideology) but they are more likely to if there is money involved.

Moving to a more restrictive license is often easier, but might still not be 100% plain sailing.


There's no need to ask how others feel about your code. You don't need to pass some purity test to share your code in a way you're comfortable with. You don't need to live up to some zeitgeist idea of what OSS is or follow a template. Blaze your trail as far and as wide as you want.


I think contributions can definitely be good if they are going into the project’s intended direction. Issues containing, especially containing "why don’t you just", on the other hand turn me off from open source development.


I can't find the source but I remember reading about BSD how they hated taking on other people's code, in the rare cases they did they "pissed on it until it smelt like ours" which stuck in my mind as one of the greatest quotes ever.


> open source but not open contributions

I never understood this. Does it mean other open source projects are "open contributions" and anyone can just go ahead and commit to master? No. Literally every project requires you to have a discussion with the maintainer(s) before your contribution can be merged in. In larger projects, it requires going through a change process.


> Literally every project requires you to have a discussion with the maintainer(s) before your contribution can be merged in.

But they can be merged in.

SQLite clearly says "not interested in your code". At all.

It's not like they are unfriendly. They will happily discuss things in detail with you, but you can publish your own extension. SQLite proper is developed by Mr. Hipps and his friends/employees.

A very sensible model, if you ask me.


What amazes me is how people are insensitive about other choices.


With almost any other project, I can take the source code, make some change, and then propose that the upstream project incorporates my change. If the upstream project thinks my code is good, they will incorporate it as-is.

With SQLite, that's just not a process. There is no process whereby code written by anyone other than the SQLite maintainers enters the git repository.


How do the people who work on/maintain this full time "afford" to do so? Is it their day job?


I believe so. See https://sqlite.org/consortium.html for how they're funded. Plus of course you can pay for SQLite itself with various levels of support, etc. See: https://sqlite.org/prosupport.html


My company pays the sqlite team a significant amount annually for a support contract. (Without going into too much detail, the primary database for our application is an approximately 2TB sqlite database - no, that's not a typo).


[flagged]



I'd say SQLite entered into a new dimension of applicability once local disks got very fast. You can insert a SQLite row in ~40uS on local NVMe storage today, assuming you can be bothered to set 2 pragmas.

For certain applications, a millisecond is too long to wait for a trip to the database. I've got some control loops written around SQLite that operate at 1kHz and beyond.


I tried and failed to read up on what pragma queries are. Would you like to elaborate? This wasn't able to inform me https://www.sqlite.org/pragma.html


These are the specific pragmas I am referring to:

https://www.sqlite.org/pragma.html#pragma_synchronous

https://www.sqlite.org/pragma.html#pragma_journal_mode

I set them to "NORMAL" and "WAL" respectively.


> A transaction committed in WAL mode with synchronous=NORMAL might roll back following a power loss or system crash

That's a pretty big failure to preserve durability.


I mean if you read the full quote:

> When synchronous is NORMAL (1), the SQLite database engine will still sync at the most critical moments, but less often than in FULL mode. There is a very small (though non-zero) chance that a power failure at just the wrong time could corrupt the database in journal_mode=DELETE on an older filesystem. WAL mode is safe from corruption with synchronous=NORMAL, and probably DELETE mode is safe too on modern filesystems. WAL mode is always consistent with synchronous=NORMAL, but WAL mode does lose durability. A transaction committed in WAL mode with synchronous=NORMAL might roll back following a power loss or system crash. Transactions are durable across application crashes regardless of the synchronous setting or journal mode. The synchronous=NORMAL setting is a good choice for most applications running in WAL mode.

It seems extremely reasonable to use WAL with synchronous=NORMAL


You might roll back committed transactions in case of a system crash. It might be acceptable for some applications, but I wouldn't certainly call it extremely reasonable.

If you do not care about system crashes, why sync at all?


I've written plenty systems that would be fine with a valid, consistent but maybe slightly older state (for example when the db describes both work to be done and work results, or the db otherwise implicitly records where to resume work). On the other hand there are obvious examples where this is unacceptable, say recording the transactions of a cash register.


I have done as well! My concern is the assertion that this behavior is reasonable for the vast majority of applications.


    If you do not care about system crashes, why sync at all? 
Well, you have to sync sometime if you care about persist your data at all, right?

And there are scads of use cases where you want to:

- persist data

- enforce some sort of schema (even though SQLite is lax here)

- query/join it in SQL-y ways

- are willing to trade ACID-compliant journaled durability in exchange for performance

Cache stores are maybe the most obvious use case. This is roughly how Redis is configured out of the box after all (sans the SQL) IIRC. Not ideal if your cache is trashed but also, not a huge deal.

There are also a lot of logging/timeseries type applications where it's just not a big deal if you lose, say, a few minutes of data per year.


If you only need best effort and preserve data only in the case of an application crash, you do not need sync(2) at all, just call write.


WAL mode also cannot preserve transactions across multiple ATTACHed database files.

Transactions on WAL mode databases must use only a single database file to maintain ACID.


Yes and the speedup is not that great, I hear piping changes to /dev/null is even faster.


My computer rolling back a couple seconds on crashes is, actually, perfectly fine! Furthermore, I think most personal computer owners would agree.

The near impossibility of achieving ACI semantics on modern filesystems (other than ZFS) is one of my biggest pet peeves with how data is being handled. Servers may need ACID, but practically no home computers do.



I believe he means build flag cpp pragmas


No there are some PRAGMA calls that people typically don't set or bother to adjust from the defaults that really improve performance. Some are setting the `journal_mode`, `cache_spill`, and `cache_size`.


Control loops with SQLite no just in the loop but written around it, could you please elaborate? Sounds interesting.


Application defined functions are exposed to SQL. At the beginning of each tick an entry point query is executed. Everything flows from there.

Makes integrating computation and data very easy for domain experts.


I didn't quite get past the nostalgia stories, but here's why SQLite succeeded as a database in my projects: I can run a single lightweight binary and I have a database. It's simple and it "just works". For 99% of projects it's enough. For those that it's not, replacing it with something more heavy-duty is easy.


> ...here's why SQLite succeeded as a database in my projects: I can run a single lightweight binary and I have a database. It's simple and it "just works". For 99% of projects it's enough. For those that it's not, replacing it with something more heavy-duty is easy...

Yep, that right there is exactly why i love me and use sqlite!


>Jerod Santo: ...if I want this to be a varchar 40 and you let me put anything in there, then why did I declare it to be a varchar 40 in the first place? You know what I’m saying?

>Richard Hipp: Yeah, exactly. If you say it’s a varchar 40 and you an integer there, it will change it into text. Or if you have a comment that’s declared integer and you try to put text in it, it looks like an integer and it can be converted without loss. It will convert and store it as an integer. But if you try and put a blob into a short int or something, there’s no way to convert that, so it just stores the original and it gives flexibility there.

Am I the only one annoyed by this? Having a var datatype that acts this way is useful, but not being able to trust other datatypes to enforce type constraints is not ideal. Or if you don't actually have a varchar 40 type, and you just have something that's text-like without other more specific constraints, then call it 'text'.

Or allow strongly typed types along with var-type types. Or have a config to choose how types behave. Like, mssql differs from ANSI for NULL handling, but you can specify USE ANSI NULLS.


> not being able to trust other datatypes to enforce type constraints is not ideal

You can have more reasonable behaviour now by creating tables as "strict"[1], in which SQLite will fail if it can't perform a valid conversion. For storing values of arbitrary type it makes available a new "any" type instead.

> if you don't actually have a varchar 40 type, and you just have something that's text-like without other more specific constraints, then call it 'text'

It is in fact called text, it just supports aliases for common ways of specifying text fields in other DBs. However, "strict" tables to the rescue again; they also disable type names other than the native ones.

[1]: https://www.sqlite.org/stricttables.html


You can opt into this now with STRICT tables: https://www.sqlite.org/stricttables.html


SQLite is a Tcl library. All of its type flexibility is meant to work with limited friction in that ecosystem. In Tcl every object type has a binary and string representation and can be freely converted.


Also, I believe one of SQLite's original design goals was Postgres compatibility. So a lot of the games SQLite plays with type notations and coercions are intended to make it possible for SQL code written for Postgres to run with SQLite, to the greatest extent possible given Sqlite's underlying architecture.


Like a lot of people, and especially coming from postgres, I recoiled from it at first.

In practice it has never caused me a significant issue that I can recall and saves a lot of time casting things into compatible columns for exploratory queries. I'm still not sure I like it. But shifting into sqlite mindset is important for making the most of sqlite's strengths, and learning to think of these as "affinities" rather than types or constraints has been part of that.


SQLite succeeded because Berkeley DB files sooner or later always become corrupted, and it doesn’t hurt that SQL is a more useful interface than a simple key-value store.


This was my path to SQLite. I used Berkeley DB which worked fine for me until it didn't. I also had to massage everything into a KV model which meant I had to implement search logic.

SQLite was a nice local DB but gave me all the capability of SQL. I reduced the boilerplate I had to write in my code and I had a SQL front end I could use on my data stores.


It is not difficult to corrupt a SQLite database if it is not treated with proper care.

https://sqlite.org/howtocorrupt.html


I really wish that modern browsers add support for it. It’s supported virtually everywhere except for browsers


It is actually supported in browsers.

It is not a formal standard, because two implementations are required for that, and there can be only one SQLite.

"Mozilla's argument against it becoming a standard was because it would codify the quirks of SQLite."

https://en.wikipedia.org/wiki/Web_SQL_Database


However it is in the process of being deprecated and removed: https://developer.chrome.com/blog/deprecating-web-sql/


In any case, whether WebSQL is supported, every major browser is using it for internal functions.


Why do you need any special support from browsers? Just compile it to JS or Wasm and use it like any other library.


Well, I mean, with enough work you can make a key/value store (what browsers have) sort of act like a relational store, but the results aren't necessarily going to be awesome.

You could brute force it and just compile SQLite to WASM, maybe, like you say? need to persist that data into IndexedDB's key/data store somehow which is probably not going to be totally awesome.

There are polyfill solutions like JSStore that let you treat IndexedDB like a SQL database which would probably make more sense.

That's why it would be nice (for some use cases) if SQLite was baked into browsers themselves and exposed to Javascript. We had that at one point with WebSQL, and I think Firefox ships with a non-exposed dependency on SQLite for bookmark storage anyway, so it's not too unreasonable.


It’s powerful, compact, fast, solid, and reliable … and unlike the majority of other database engines, it doesn’t include an opinionated client/server front end welded onto it


I for one thoroughly appreciate how fresh it is for a software project to adopt the Rule of St. Benedict as their code of ethics.


I've been having a ton of success using sqlite to analyze structured textfiles. CSV data, etc.

Beforehand, I would have either used some hacked up unix tools (grep, cut, awk, uniq, sort), or if things are super complex loaded them into something like postgres - which requires a lot more overhead. Sqlite seems to be in the sweet spot where seeing tabular data within a db takes almost no work at all.


Can anybody give a summary? On a similar note Cloudflare's D1 database is build on SQLite.


Is almost alone on the hill: It don't have any major competitor.

Firebase is the other only RDBMS that could ship embedded but not on iOS/android (and maybe other exotic setups). After this, good luck finding a RDBMS you can use without a server!.


Well, I'm not so sure about that. It's rare among embedded databases in that it supports a rich and standardized query language, but it's SQLness alone isn't sufficient for SQLite to avoid competition. There's plenty of healthy tension and switches between SQLite and e.g. RocksDB, cabinet, and LMDB, and there are plenty of other contenders as well.


Because you can just copy/paste the database file?


Public domain license is a big factor IMO.


Whenever I read SQLite I think of missiles


For those downvoting, from wikipedia: D. Richard Hipp designed SQLite in the spring of 2000 while working for General Dynamics on contract with the United States Navy. Hipp was designing software used for a damage-control system aboard guided-missile destroyers.


Wait until you learn who funded and designed what we now call the Internet!


Thank goodness my weapons of mass destruction won't lose data.


"The transaction knows where it is because it knows where it isn't..."




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: