Hacker News | levkk's comments

I don't understand this obsession with SQLite for real, production apps. SQLite is an embedded database, completely unsuitable for managing concurrency. This is what database _servers_ are for, e.g., Postgres, MySQL, etc. Their entire job is to allow you to modify data from multiple processes, on different machines, at the same time.

This is a foundational principle of computer science. It seems to me that the "SQLite for everything" crowd is a little bit inexperienced.


You seem to have a rather limited understanding of what kinds of concurrency exist and how those needs are best met. Whether something is a server or not is not very relevant to this discussion.

SQLite is an excellent production db for many real world workloads, as has been widely documented. It is very different to Postgres, so requires learning a whole new skill set.

One way to think about it is that SQLite can work well for the parts of your system where there is naturally strong partitioning.


> SQLite can work well for the parts of your system where there is naturally strong partitioning.

Or the parts of your system that don't have big data and don't need massively concurrent writes. And that's the vast majority of systems!


You can do big data in SQLite. Concurrent writes, sure, I'd recommend something else.

If you think the majority of systems require massively concurrent writes, I think you need to look a bit harder. SQLite is, after all, the most widely deployed database system, ever.


We recently just partitioned the data into many SQLite databases and got away with it. It's telemetry data from IoT devices: one device, one database. Backups are an easy rsync job now instead of streaming a multi-gigabyte database with compression that takes hours. Reporting will just open each database and aggregate multi-device data into another database (DuckDB, SQLite or something else, we'll see). DuckDB is not readable when locked so it's probably also going to be SQLite. Even if it's going to spit out JSON it will go into SQLite rows instead of many files.
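A minimal sketch of that layout, using Python's standard `sqlite3` module. The table names (`readings`, `summary`), the function names and the file naming scheme are made up for illustration and are not the poster's actual setup; the report database is assumed to live outside the device directory.

```python
import sqlite3
from pathlib import Path


def record_reading(db_dir: Path, device_id: str, ts: int, value: float) -> None:
    """Append one telemetry reading to the device's own database file."""
    conn = sqlite3.connect(db_dir / f"{device_id}.db")
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS readings (ts INTEGER, value REAL)"
            )
            conn.execute("INSERT INTO readings VALUES (?, ?)", (ts, value))
    finally:
        conn.close()


def build_report(db_dir: Path, report_path: Path) -> None:
    """Aggregate every per-device database into one summary table."""
    report = sqlite3.connect(report_path)
    try:
        report.execute(
            "CREATE TABLE IF NOT EXISTS summary "
            "(device_id TEXT PRIMARY KEY, n INTEGER, avg_value REAL)"
        )
        report.commit()
        for db_file in sorted(db_dir.glob("*.db")):
            # ATTACH and DETACH must run outside a transaction,
            # so we commit before detaching.
            report.execute("ATTACH DATABASE ? AS dev", (str(db_file),))
            report.execute(
                "INSERT OR REPLACE INTO summary "
                "SELECT ?, COUNT(*), AVG(value) FROM dev.readings",
                (db_file.stem,),
            )
            report.commit()
            report.execute("DETACH DATABASE dev")
    finally:
        report.close()
```

Each device file can then be copied or rsynced on its own, and the report can be rebuilt from scratch at any time.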

Check out Quack for DuckDB.

I saw that but it's a bit too much friction. We'd rather attach a SQLite database and copy the results into that. Also no worries about version changes and compatibility, the types are not too restrictive. But we'll see.

Internet Explorer 6 was the most widely deployed awesome piece of software. Those that hated it need to look a bit harder.

It was not really “deployed” by a lot of people in the same sense.

It was forced upon most of us (not me, I used BeOS then Debian then FreeBSD).

I deployed phoenix.


The reason SQLite is the most deployed is that it's used by Android.

…and iOS, and Windows, and Mac OS, and Boeing, and Sony, and Firefox and Chrome and Safari…

Yes, which goes in line with the argument that claiming that it's "the most deployed" as proof of superiority or suitability for any use case is equivalent to claiming the same for Internet Explorer. It's the most deployed because it's bundled in a lot of systems, not because people are purposefully using it as a DBMS.

But it doesn't, because none of those systems are presenting SQLite to the user as something they should be using; they don't even make SQLite available to the user at all. Those systems all use SQLite internally to manage data.

It’s widely deployed as a local DB for local apps like phones, desktops, and web browsers. But it’s not the most used for distributed, concurrent web apps, which db servers were designed for. Maybe people are talking past each other, but that’s the debate I see.

Not sure why there’s a debate at all. The discussion is on using SQLite instead of jq and markdown files. People got lost on a tangent! :)

No, it's not. The context of other threads on this post (which mention jq) does not apply here. How poorly coded are you?

Like how poorly did Jesus code my DNA? Ask him. By him I mean ChatGPT Jesus mode.

What's a massive amount of concurrent writes? What's big data?

Such an exercise is left for the reader.

What additional skill set do you need to "learn" for SQLite? Copying files around?

Yeah these are deeply unserious people.

For me, I have a use case that needs to support a few thousand users, probably a few hundred concurrently.

The combination of SQLite (libsql, a concurrent implementation of sqlite) and Rust means I can do so from a $2/m VPS and a single server instance.

Backups are done via a cron job that uploads to S3.
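For the snapshot step of such a cron job, a minimal sketch with Python's standard `sqlite3` module could look like this. The function name and paths are hypothetical, and the upload to S3 is deliberately left out since it depends on whatever tool the job uses.

```python
import sqlite3


def snapshot(live_path: str, snapshot_path: str) -> None:
    """Write a consistent copy of a live database to snapshot_path."""
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(snapshot_path)
    try:
        # Connection.backup() uses SQLite's online backup API, so the copy
        # is consistent even if other connections write while it runs.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

The snapshot file is a normal SQLite database, so it can be uploaded as-is and opened directly when restoring.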

Does it pass the "Netflix scale" test? No

But it doesn't need to. I'm not profiting from the service and SQLite offers a path to scale if/when ready because... well it's just SQL and I can literally just swap `libsql::Connection` with `psql::Connection` in my repositories.
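To illustrate the "swap the connection in the repositories" idea in Python rather than Rust: a sketch of a repository written against the generic DB-API, so the driver can be changed in one place. `NoteRepository` and the `notes` table are hypothetical names, and the placeholder style is a parameter because drivers differ (`sqlite3` uses `?`, psycopg uses `%s`). Real migrations usually also need some SQL dialect adjustments.

```python
import sqlite3


class NoteRepository:
    """Data access written against the DB-API, not a specific driver."""

    def __init__(self, conn, placeholder: str = "?"):
        self.conn = conn
        self.ph = placeholder  # "?" for sqlite3, "%s" for psycopg

    def create_schema(self) -> None:
        cur = self.conn.cursor()
        cur.execute(
            "CREATE TABLE IF NOT EXISTS notes "
            "(id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
        )
        self.conn.commit()

    def add(self, note_id: int, body: str) -> None:
        cur = self.conn.cursor()
        cur.execute(
            f"INSERT INTO notes (id, body) VALUES ({self.ph}, {self.ph})",
            (note_id, body),
        )
        self.conn.commit()

    def get(self, note_id: int):
        """Return the note body, or None if the id does not exist."""
        cur = self.conn.cursor()
        cur.execute(f"SELECT body FROM notes WHERE id = {self.ph}", (note_id,))
        row = cur.fetchone()
        return row[0] if row else None
```

With SQLite this is constructed as `NoteRepository(sqlite3.connect("app.db"))`; the rest of the application only sees `add` and `get`.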

Plus upgrading from a $2/m VPS to a $10/month VPS quadruples the number of concurrent users I can support.

IMO, you can vertically scale extraordinarily far with SQLite and an efficient server implementation.

I'd wager that 90% of forum websites, wordpress sites and online shops would be fine with SQLite.


> The combination of SQLite (libsql, a concurrent implementation of sqlite) and Rust means I can do so from a $2/m VPS and a single server instance.

You can probably do it with regular SQLite, too. Being limited to a single writer isn't as devastating as it sounds when they get processed very quickly. Probably don't need Rust either but it'll be more efficient than the usual choices.

(Also, it looks like libsql is the same as SQLite? Only Turso has concurrent writes)


Why do you need libsql? Single writer tends to scale better than concurrent writes.

There are many cases where SQLite + concurrent front end (like a go net/http server) can handle all the load that a service might ever conceivably have to handle, especially if allowed to scale up hardware over time. You can trivially scale up SQLite to, what, hundreds of thousands of tps?

The only thing you really give up is HA/failover and DR. But there are solutions to deal with those. And single-server systems are generally surprisingly robust (since, in the absence of very complex control planes, uptime goes down with more systems).


Why go through the trouble of shoehorning SQLite into a cloud database by getting solutions for HA/failover and DR, when you can just use Postgres off the shelf?

So you can post about it on HN, obviously

You also have the issue of normal maintenance (patching, OS upgrades, etc). You can’t do those without downtime if you are using SQLite.

I was thinking of using SQLite on top of k3s/Longhorn to replicate it. Anyone do something similar? Folks mention Litestream and AWS but Jeff Bezos’s biceps are too much for me to handle.

A Longhorn volume can only be attached to one node at a time. It can share it with other nodes over NFS. I don't think this is going to scale well.

Just use Postgres with ro replicas.


I'll echo the other response.

I've had pretty terrible experiences with SQLite and Longhorn/NFS.

It's just not the right database for pretty much ANY network based filesystem, where the locking primitives aren't as robust, and you might get two processes trying to hit it at the same time.

Frankly - they say this themselves: https://sqlite.org/howtocorrupt.html

As someone who runs a fairly big personal cluster backed by a mix of giant NFS storage for media, and relatively large longhorn SSD drives for configs/temp data...

I avoid sqlite backing like the plague. It will get corrupted. Period. It's not the db for this use-case, and I'll take postgres/maria/mysql/mongo/ANYTHING else over it.

If you do it - back it up ALL THE TIME, because it's going to get corrupted.


Yeah sqlite on anything but a directly attached nvme is a bad time if you're using it for a web server.

That’s why there are billions of SQLite databases right?

SQLite is likely used more than all other database engines combined. Billions and billions of copies of SQLite exist in the wild. SQLite is found in:

    Every Android device
    Every iPhone and iOS device
    Every Mac
    Every Windows 10/11 installation
    Every Firefox, Chrome, and Safari web browser
    Every instance of Skype
    Every instance of iTunes
    Every Dropbox client
    Every TurboTax and QuickBooks
    PHP and Python
    Most television sets and set-top cable boxes
    Most automotive multimedia systems
    Countless millions of other applications

https://sqlite.org/mostdeployed.html


That’s a comprehensive list of single user devices.

'production' doesn't equal 'multi-user concurrent access'. There are production uses where sqlite is a valid choice even if it may not be the best choice for multi-user production use cases.

Strawman? I have seen dozens of these debates and never once have I seen someone question the validity of it for embedded use cases.

Single-user, a single natural person, doesn't strictly mean single-accessor though. I don't think anyone here is suggesting that sqlite is a viable replacement for any networked client/server postgresql system, but it is certainly capable of handling more than the most basic 1:1 tasks. Beyond that, the point is that you only need a file, so when you have natural data boundaries, a lot of problems decompose to that single user/single concern paradigm.

levkk is talking about concurrency. The list you gave doesn't explain high concurrency requirements for usage.

My read is that levkk is conflating concurrency with "real production apps" and this whole thread is starting to surface that "real production apps" and "high concurrency" are not measuring the same thing at all.

Sqlite is used in real production apps more than any other database.

Sqlite is also weak at any sort of write concurrency.

Both can be true.


Why doesn’t each of your users have a SQLite database writing up to a main?

You can have as many as you want - and one is often plenty.


GP calls out concurrency as a weakness of SQLite. Most of the examples here don't experience the same load even a moderately sized web service experiences day to day.

And no, being a part of the Python standard library doesn't mean it is being used by the average Python user. These days I'd say at least half of them are just there for machine learning.


SQLite is good for read-concurrency, not great for write-concurrency.
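A small sketch of what that looks like in practice, assuming WAL journal mode and Python's standard `sqlite3` module (the table `t` and the function names are made up). The busy timeout is set to zero so the second writer fails immediately instead of waiting its turn, which makes the single-writer rule visible.

```python
import sqlite3


def open_wal(path: str) -> sqlite3.Connection:
    # isolation_level=None: no implicit transactions, we issue BEGIN ourselves.
    # timeout=0: fail immediately instead of waiting for a lock.
    conn = sqlite3.connect(path, timeout=0, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


def demo(path: str):
    """Return (rows seen during the write, rows seen after, writer blocked?)."""
    writer, reader, second_writer = open_wal(path), open_wal(path), open_wal(path)
    writer.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    writer.execute("BEGIN IMMEDIATE")            # take the write lock
    writer.execute("INSERT INTO t VALUES (1)")   # not committed yet

    # Readers are not blocked; they see the last committed state.
    seen_during = reader.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]

    # A second writer cannot start while the first holds the lock.
    try:
        second_writer.execute("BEGIN IMMEDIATE")
        second_writer.execute("ROLLBACK")
        blocked = False
    except sqlite3.OperationalError:  # "database is locked"
        blocked = True

    writer.execute("COMMIT")
    seen_after = reader.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]
    for conn in (writer, reader, second_writer):
        conn.close()
    return seen_during, seen_after, blocked
```

With a non-zero timeout (the `sqlite3` default is 5 seconds) the second writer would simply wait for the first to commit rather than raise.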

SQLite requires writes run sequentially. Most SQLite write operations take single digit milliseconds or even microseconds. If your writes are inexpensive (inserting or updating single small rows) you'll probably never even notice the queue.
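If you want to check this on your own hardware, a rough measurement sketch with the standard `sqlite3` module follows. The `events` table and the function name are hypothetical, WAL mode is an assumption, and the numbers will vary a lot with disk and sync settings, so no expected timings are given.

```python
import sqlite3
import time


def seconds_per_insert(path: str, n: int, batch_size: int) -> float:
    """Insert n small rows, committing every batch_size rows.

    Returns the mean wall-clock seconds per row (n must be > 0).
    """
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)"
    )
    conn.commit()
    start = time.perf_counter()
    for i in range(n):
        conn.execute("INSERT INTO events (payload) VALUES (?)", (f"event-{i}",))
        if (i + 1) % batch_size == 0:
            conn.commit()
    conn.commit()  # flush any partial final batch
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed / n
```

Comparing `batch_size=1` (one transaction per row) with a larger batch gives a feel for how much of the cost is the commit itself rather than the insert.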

Exactly, people confuse "doesn't scale" with "is a bottleneck". There are many applications where hitting the limits of SQLite is either a physical impossibility, or implies that the application has achieved success such that replacing SQLite is the least of anyone's problems.

I visited a piano store once that was running everything off MS Access. If only they had switched to HA technologies, they would be able to sell millions of pianos a day!


I mean... if you count flat files as "databases", there are a heck of a lot more

sqlite is great for the contacts app on your phone, but that's it.

Hipp even said that it is not a replacement for a real multi-user, concurrent RDBMS. Its primary competitor is "fsync".


SQLite is able to handle tens of thousands of write transactions per second on modern hardware. That is probably similar to or more than your real, multi-user, concurrent RDBMS.

Hundreds of thousands*

And I don't understand the obsession with server-based databases for single apps. Especially in containerised setups, every "app" gets its own database anyways, and if the app is further broken down into services, they usually communicate between each other and not with a shared database. So in those cases, what do you gain by pulling the database out of the "process" and onto the other end of a socket? In most cases, absolutely nothing. So why bother?

Don't get me wrong, I've worked with plenty of server-based databases, including proper dedicated database servers. It's great tech and often the best tool for the job. But not always and I'd argue not in the majority of uses.


“Especially in containerised setups, every "app" gets its own database anyways, and if the app is further broken down into services, they usually communicate between each other and not with a shared database. “

You seem to be talking about a vastly different use case.

Containerized apps having their own database? What? Aren’t these types of containers stateless? I always very much try to keep state out of app containers.

What kind of data storage are we talking about?


If an app needs a database, it gets a database server container, instead of getting a user and database on a shared database server as things used to be done. Every little django app has its own postgres container. Every wordpress site gets its own mysql container. That is the modern way.

Those database containers get a PVC/volume/mount for their data dirs. The only thing ever connecting to them is their "owner" application container. So at that point, why not drop the postgres container and PVC mount a sqlite directory in the app container? The result is the same.


And when you need to scale to thousands of instances of your microservice?

Yeah this is the part I don’t get. It seems like people are talking about 1 distinct app = 1 container and this is the new normal? We’re back to managing cows instead of cattle again?

I just think a lot of people here haven't ever worked on large scale systems. They don't know what they don't know.

I think a lot of businesses build large distributed systems prematurely. They don't know what they don't know.

What on earth needs thousands of instances? Are you building a CDN? Most apps can be a single server or sharded by business/region.

Honestly, this whole "let's run loads of nodes" thing seems to have sprung up from languages that are slow or don't have decent concurrency.


That's the whole thesis; YAGNI.

Yes if you run a database server like an embedded application database, then it won’t be very different from an embedded application database.

How do you do server maintenance or handle hardware failure if your database is SQLite? You are going to have to take downtime, even in the best scenario.

1. Proxmox live migration or HA, Ceph storage

2. K8S DaemonSet, PVC backed by probably Ceph

3. Just..don't care? Do maintenance outside of working hours, fix issues quickly and explain things nicely to your customers. Not everything is google-scale. Most people can deal with some downtime.

And it's not like you won't have downtime in let's say a postgres-backed app. But now you have two "servers" to deal with.


Those first two options add more complexity than just using an external database.

I guess I am just not used to downtime being acceptable. Spent most of my career working for a CDN, any sort of downtime was simply unacceptable. I can't stop myself from that sort of thinking now.


It's complexity you already have. You need some sort of HA for your app server and some sort of resilient storage for your database server. Using sqlite just means the storage is used by the app server directly, nothing more.

Every container gets its own database?

Yes? Well, every "app", as I quite explicitly wrote. Look up the docker compose file or helm chart for basically any app. I'm running dozens of apps, each with their own postgres, redis and nginx containers alongside the main application server. That's what the stack is designed for.

The Compose file is written like that so you can quickly try the app without setting up extra dependencies. Usually not for production use.

Especially since in production you might want to scale the parts separately. I like to have a Postgres cluster to connect where backup is already handled, and the app then doesn’t have any persistent data, doesn’t need any network volume mounts.


Sqlite is good for lots of stuff, but you're probably focusing your days on high-scale webapps that want sharding with networked DBs. That's one domain, and an interesting one, but there are lots of others.

I'm a big fan of re-evaluating prior "best practices" in light of technology changes, especially in ways that improve simplicity. Running my family's social media site off a single sqlite DB on a VPS is great. ~15 users, almost zero maintenance. I run my FreshRSS instance off of sqlite, as well as my "now" page. At work, I used sqlite for all kinds of things over the past decades: as an ad hoc job queue, as a quick way to ingest and query lots of logs locally, and present/filter in realtime with simonw's excellent https://github.com/simonw/datasette.

I don't think it's ever "sqlite for everything" as much as it is "sqlite in lots of places you probably didn't think to apply it."

kentonv/Cloudflare's work on sqlite at the edge might have made this thinking a bit more popular, but it was always around. https://blog.cloudflare.com/sqlite-in-durable-objects/

I suspect being aware of all those little neat cases and wanting to leverage sqlite for them may be an indicator of experience, rather than the opposite.


> Running my family's social media site off a single sqlite DB on a VPS is great. ~15 users, almost zero maintenance.

Details, please!


Nothing public at the moment, unfortunately. I was kind of surprised at the lack of Very Simple scripts to just host a site for a handful of users easily. So I wrote one that focuses on:

    Unlimited-length posts in a chronological feed. You don't have to subscribe to everyone - having something appear in your feed is opt-in.
    Circles. Subscribing to someone into a hobby can get noisy. Circles give the hobby a place, for those that want to check.
    RSS everywhere: Anyone can add an RSS feed to the server, and anyone can view all the subscribed feeds and choose which to follow. They are not part of the home feed, but a separate section. Every feed (circle, user) has a public-to-the-internet feed and a private-to-folks-on-the-server feed.
    Mentions: you can @-mention anyone on the server to get a post into their home feed.
    Public posts: by default, every post is private. If a user checks a box, the post will be made public, so folks that are not logged in can see it.
    Posts in the feeds section, circles section, or home section can be replied to, with the user choosing where to share (a circle, or their feed)
    It's all a single-file Python script that can run in either CGI mode or server mode. I compile it with nuitka and run it as a CGI behind Apache. Very old school, works fine. Non-attachment data stored in sqlite, so if you have the DB, you can fire up a copy of the site, sans attachments.
    Attachments: gallery posts supported, with lightbox viewing. PDFs supported. Nothing else right now.
    Works on mobile.
    No email or other form of notifications. If users visit, they see stuff. If they don't, they don't.
    Super-opinionated: admin controls everything, and password resets go through the admin, who simply asks the server to regenerate a new password, that the admin then passes along to the user.
    There are no direct messages, or private posts, in the sense that if you log in, you'll be able to see everything going on if you click through to it.
    Replies, comments, and reactions are supported. Conversation view (tree of posts replying to each other) is supported.
    
Those are the features off the top of my head. It's a social network for small groups that are high-trust. If I open source it (after I feel it is more airtight) I'll probably ask AI to provide a landing page using this as a prompt and provide this verbatim at the top of that page for folks that want the zero-bullshit version. =)

There is something appealing about "it's just a file" (it really isn't; it has locks and a WAL), but I agree with you.

I think people are afraid to read the documentation for postgres. You can start it up in milliseconds. Fast enough and light enough to run one copy for every test case in your test suite, or whatever you're using it for. (mkdir /tmp/whatever; initdb -D /tmp/whatever --no-instructions -A reject -c listen_addresses= --auth-local=trust --no-sync -c fsync=off -c unix_socket_directories=/tmp/whatever -U postgres --no-locale; postgres -D /tmp/whatever) Now you have a test database that behaves exactly like production because it's exactly like production. (OK, turning fsync off makes it a lot faster than production, so be careful.)


> I think people are afraid to read the documentation for postgres.

Postgres may introduce a single-file embedded filesystem because what the hell, but the irony is all these guys won't even notice it. The same people that say Postgres backups are too hard.


Thing is SQLite scales better than both those network databases [1] if you're prepared to stick with one big machine (+ a standby).

This is even more obvious when you start doing transaction processing and row locks across the network limit you to 1-3k TPS that you cannot scale out of (the Pareto distribution is merciless).

[1] - https://andersmurphy.com/2025/12/02/100000-tps-over-a-billio...


Seeing as I can get about 200K TPS from a networked DB in my environment, I have to question your setup here.

In the real world we are looking at things like RPO (recovery point objective) and RTO (recovery time objective). You need to consider HA and DR. It’s in these areas where SQLite does not scale.

That’s why I struggle to see the fit for SQLite in any sort of multi-user server environment. If you need the data to be durable, then the bigger DBs have the tools. If you don’t need the data to be durable, just keep it in memory. I’m sure there are niches I am missing.


In this demo each T in TPS is two updates over a billion rows and most importantly skewing high on row lock contention. On a 5 year old macbook, using a dynamic language. Isolation level serializable and synchronous full (so max durability).

You can definitely go faster over less data doing single inserts on a better stack, with weaker guarantees.

RPO: Litestream, even in its default settings, gives you point-in-time streaming backups to the second, which is considerably better than RDS's five minutes. So the funny thing is the durability guarantees are worse with the "bigger DBs".

RTO: again, you can have a standby that's warm with a copy of the data through Litestream. Regional sharding also becomes trivial.

It's a solid set up for a lot of products/apps. Postgres is still fine if you want things like roles and permissions etc. Or if you don't have experience getting the most out of sqlite.


Wow, what an apples and aliens comparison. You add a bunch of transaction delays to your postgresql case because you can access a database over a network, but you use transaction batching for sqlite? Maybe just compare a local postgresql with/without batching to a local sqlite with/without batching to be much less misleading.

Because local postgres is a bad time unless it's the only thing running on the server. Even then sqlite will smoke postgres (even with unix sockets).

The point is to survive the Pareto row locking problem you need to move away from a network database (if you want to still have interactive transactions). The network part is the main point of a network database; once you drop that there's not much point sticking with the added complexity unless there's another feature you really need.


You know you can host a database like Postgres on the same machine, right?

Yes, it's still slower on the same machine, even with unix domain sockets.

It doesn't play nice with other things running with it in practice. JVM and postgres on the same box is a textbook bad time.


So teach them. If you want to bring up computer science fundamentals, the question is where does SQLite sit with regards to the CAP theorem. Consistency, Availability, and Partition tolerance. SQLite isn't a distributed system, so there are no partitions to tolerate, so it's a CA system. Other databases make different tradeoffs. For systems that don't need concurrent writes, SQLite is pretty great! There are no users to manage, no permissions, no daemon to run, no server and port to mix up. Just open a file on disk using a library.

Strawman, no? "run an Obelisk server with a SQLite database", now we're distributed.

SQLite is a nice local store. It's this server stuff that I don’t grok, well, yet. :)


In the beginning apps and SQL were co-mingled. Oracle eventually came along and noticed that people wanted SQL on the network so that many different apps, running on different computers, could all access the same data. But then people realized that clients really want rich, 'tree'-like data, not simple rows and columns, so people started sticking networked databases in front of networked databases to serve as a transformation system. And now people are realizing that the second networked database layer is redundant and never used beyond what is required for the client-facing network database, so they are moving the storage back into the first network database layer, just like Oracle did all those years ago. What is old is new again.

What changed is SSDs. SSDs means that local access is faster than hitting the network. An expensive SAN stopped making sense because of this in specific cases. So for read heavy, or even read only database loads, you copy the SQLite file to the node that's processing the file, and just update that file whenever the data does get changed.
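A sketch of that copy-then-read pattern with Python's standard library. The function names are hypothetical, and it assumes the published file is a finished snapshot (for example one produced by SQLite's backup API), not a database that is being written to while it is copied.

```python
import os
import shutil
import sqlite3
from pathlib import Path


def refresh_local_copy(published_path: str, local_path: str) -> None:
    """Replace the node-local copy with the latest published snapshot."""
    tmp_path = local_path + ".tmp"
    shutil.copyfile(published_path, tmp_path)
    # os.replace is an atomic rename when both paths are on one filesystem,
    # so new readers never open a half-copied file.
    os.replace(tmp_path, local_path)


def open_read_only(local_path: str) -> sqlite3.Connection:
    """Open the local copy so that this connection cannot write to it."""
    uri = Path(local_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Connections opened before a refresh keep reading the old file until they are closed, so a long-lived service would reopen its connection after each refresh.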

> SQLite for everything

is just wrong, and I don't think that the SQLite fans are that crowd. Taking a database server for everything is probably possible, but often unnecessary. With experience, one can properly judge when SQLite is sufficient and when it is not.

So arguing that the SQLite crowd is inexperienced feels weird, because inexperienced people have a much harder time judging when to use what and can just use the database server all the time (even when it is overkill).


I had very good results giving one SQLite DB per goroutine, so the accesses were serialized up front, on a very high volume (130K requests/second) service. Exact transactionality was not a product goal, and the SQLite was just to back up the in-memory state. If we lost a little due to an abend or something, that was OK. For normal maintenance it caught SIGTERM, stopped the listen, waited for in-flight calls, and then flushed the remaining changes to SQLite; then on startup it would read the SQLite into memory to populate before taking the listen. That gave persistent storage across container runs, and never both reads and writes to the same file at the same time. It also just closed the DB and opened a new one when it hit some limit of rows, so as not to fill the disk; the max size of the SQLite corresponded to the max size of the LRU map being served from memory, and it just flipped A/B between "a full memory worth of data stored" and "the currently updating state." A lot easier than having to write out protobufs to disk or whatever I would have done for transient (during restarts/maintenance) persistence.
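The load-on-startup and flush-on-SIGTERM part of that design can be sketched roughly as below, in Python rather than Go. The `kv` table, the dict-based state and all function names are assumptions for illustration; the row-limit rotation, the A/B flip and the draining of in-flight calls are left out.

```python
import signal
import sqlite3


def load_state(path: str) -> dict:
    """Read the persisted key/value pairs back into memory at startup."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
        return dict(conn.execute("SELECT k, v FROM kv"))
    finally:
        conn.close()


def flush_state(path: str, state: dict) -> None:
    """Write the in-memory state to SQLite in a single transaction."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
            conn.executemany(
                "INSERT OR REPLACE INTO kv VALUES (?, ?)", list(state.items())
            )
    finally:
        conn.close()


def flush_on_sigterm(path: str, state: dict) -> None:
    """Install a SIGTERM handler that persists state, then exits.

    Must be called from the main thread (a restriction of signal.signal).
    """
    def handler(signum, frame):
        flush_state(path, state)
        raise SystemExit(0)

    signal.signal(signal.SIGTERM, handler)
```

Because the file is only read at startup and only written at flush time, reads and writes never hit the same file concurrently, which matches the constraint described above.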

Woof. That sounds very complicated. If you need that kind of write concurrency, use an unlogged table in postgres [0]. Then you don't have to invent a whole sharded thing yourself.

[0] https://www.postgresql.org/docs/current/sql-createtable.html...


There are so many unfortunate footguns with unlogged tables, that I'd argue that the goroutine route is preferable.

What are the "footguns" with unlogged tables in Postgres?

1. If postgres shuts down uncleanly, your entire table is truncated; you lose everything.

2. You should check if your backup method backs up unlogged tables. For example, RDS Snapshots on AWS do not back up unlogged tables.

These 2 are a double whammy where if you aren't aware of these tradeoffs you can find that a bad restart has deleted all your data, plus your unlogged tables were never backed up.


Such as?

Running postgresql is an order of magnitude more complicated than sqlite.

130k tps even with unlogged is not always super easy especially if getting hit concurrently. Postgresql connection overhead alone can be pretty brutal if you are setting up and tearing down connections or have 1,000 writers etc.

Postgresql generally requires good network connectivity. Folks doing sqlite distributed tend to have everything independent, you literally don't need to worry about connection / security / firewall / permissioning / internode escape or data leaking etc, can even have problems in local side networking and services can still serve.


even with wal, postgresql can easily reach 130k tps in pipeline mode.

That was per container, with 16 containers per data center, so it would be a lot of DBA tickets to get something that large; SQLite scaled with the horizontal scaling of the app; and we did have a flaky network: something like one in 100,000 TCP connections would fail. And occasionally the whole network would just go away for a number of seconds. And the persistent container storage was managed by the same storage team that managed storage for the DB team, so base scalability and availability were high.

Computer science no more gets its hands dirty with concrete software than physics is primarily about building bridges.

It is not «a foundational principle of computer science».


> This is a foundational principle of computer science

How exactly is this a foundational principle of computer science?


I worked on an app that had sqlite databases per user... it was fine.

I think you'd be surprised to learn how many real production apps are actually running on top of SQLite (by way of Cloudflare D1).

Many DB servers are built upon embedded DB primitives (like RocksDB), that doesn’t mean the primitives are sufficient on their own.

I'm not sure what this has to do with my comment? D1 is pretty much sufficient on its own...

My point is D1 is not sqlite, it’s a serverized architecture of it, including building things like replication, etc.

Plus, D1 has a 10gb limit which is wild to call “sufficient”.


It's touted by the people who use the word "just" a lot.

"Just use postgres" "Just use sqlite" "Just use a monolith" "Just use sftp" "Just use an ec2 instance"

Usually these people have flunked out of the school of (distributed system) hard knocks. They couldn't hack it and are retreating to the familiar.

The funny part is when one of those people fluke themselves into senior management when their saas takes off.

Inevitably they have to suck it up and hire experts in the same technologies that "no one needs".


That may exist but the opposite type of irrationality is much more common.

Scalability = success. We need to be "scalable" because that means we're successful right? Scalability = real engineering. I'm a real engineer so I need to design everything to be "scalable" because I'm so smart

>The funny part is when one of those people fluke themselves into senior management when their saas takes off.

>Inevitably they have to suck it up and hire experts in the same technologies that "no one needs".

Sounds like they were the wise ones to build something simple that achieved a high level of success.


Well if you run a tiny single-threaded app then SQLite is a nice simplification over spinning up a separate machine for Postgres.

I use postgres for very simple apps. I have a Dockerfile I use in my boilerplate repo. It takes a single make cmd for me to build, start and run migrations. It's as simple as using sqlite.

But now you have another process to babysit. How do you keep it healthy? And you have to ensure the client-server communication won't break.

For me the main benefit of sqlite is that it's a library rather than an app.


> But now you have another process to babysit. How do you keep it healthy?

I've been assured by many HN users that running apps/sites on a single VPS requires near-zero maintenance or monitoring to achieve acceptable uptime 24/7/365 for years on end, sooooo...just pretend it will never fail like your main server process?


I've been assured by many HN users that you must have 24/7/365 uptime for everything in case one of your 10 bi-monthly users decides to log on.

Call me old-fashioned and quaint, but I don't like to build software that doesn't work all the time if I can help it, whether it's for 10 users or 10 million.

24/7/365 is needed (or achieved) just about never. our big tech is proving 90% will soon be utopia as well. being down has always been fine for 99.999975% of all projects on the planet.

Ok, now tell me the stat by percentage of overall market revenue rather than project count

I have boilerplate for client-server communication that makes it pretty trivial to build on top of.

I'm not saying that sqlite isn't useful, I'm mostly saying that using postgres doesn't have to be complicated.


It's 2x the infra. You have to manage an additional process, auth, backups, logging, etc.

Or you can run postgres on the same machine as the application, which lets you much more easily migrate if the time comes when you need to scale to multiple application servers.

There's a world between "local file" and "network DB server", running a DB server locally has lots of benefits from being able to easily query from outside if needed to forcing you to consider concurrency without the latency overhead of a network hop.


This decision tree doesn't make much sense to me. Why would someone forgo performance today in favor of adding a completely unnecessary network layer to every DB query in order to "satisfy" future imaginary "scaling concerns"?

Because you don't add a network layer by running a database locally.

That's still orders of magnitude more complexity for no real benefit. A migration from sqlite to postgres, if really required, is not that hard.

Yes, postgres should support a superset of SQLite functionality.

Now you've added a substantial dependency, and annoying setup requirements. Good luck doing this for a native app on mobile or desktop.

If someone is talking about "spinning up a separate machine" for Postgres, they're not talking about a desktop or mobile app...

Obviously SQLite is the best choice for a mobile or desktop app, that's not what's being discussed here.

If your data is naturally sharded (users) with writes happening within a single shard, parallelism becomes easy. The request is routed to the shard hosting the user's data and reads/writes locally.

This makes scalability _much_ easier to reason about. It's cut-paste, cut-paste. Every N users needs another shard.

It does buy you a _different_ set of problems, like cross-shard querying (analytics) and how to do load leveling as users age out.

But it avoids the whole shared index scaling problems from inserts/updates with large user counts.

It becomes a hierarchical instead of a relational database.
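A minimal sketch of the routing idea described above, assuming one SQLite file per shard and a simple modulo on the user id. `NUM_SHARDS`, `shard_for`, the `notes` table and the file naming are all hypothetical, not anything from the comment.

```python
import os
import sqlite3
import tempfile

NUM_SHARDS = 4  # hypothetical: "every N users needs another shard"


def shard_for(user_id: int) -> int:
    """Pick the shard that owns this user's data (simple modulo routing)."""
    return user_id % NUM_SHARDS


def open_shard(base_dir: str, user_id: int) -> sqlite3.Connection:
    """Open (and lazily create) the SQLite file for the user's shard."""
    path = os.path.join(base_dir, f"shard_{shard_for(user_id)}.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS notes (user_id INTEGER, body TEXT)")
    return conn


def add_note(base_dir: str, user_id: int, body: str) -> None:
    """Write locally to the one shard that hosts this user."""
    conn = open_shard(base_dir, user_id)
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO notes VALUES (?, ?)", (user_id, body))
    conn.close()


def notes_for(base_dir: str, user_id: int) -> list:
    """Read locally from the same shard; no cross-shard query is needed."""
    conn = open_shard(base_dir, user_id)
    rows = conn.execute(
        "SELECT body FROM notes WHERE user_id = ? ORDER BY rowid", (user_id,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Anything that has to look across users (the analytics case mentioned above) would have to open every shard file, which is where the different set of problems starts.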


there is a difference between concurrency in a distributed environment and concurrency on a single machine across processes. SQLite is incredibly useful for the latter.

you seem like the inexperienced one to me..


SQLite does not support concurrent writes at all (on a single machine), a single writer process locks the entire database.

It doesn't block reads. Single-writer systems are often faster than concurrent writers: there's no coordination overhead and you can batch.

not really true, SQLite supports WAL mode which allows concurrent writes (technically write _attempts_, but these writes are exceptionally fast and are serialized to the file-system anyway, so functionally equivalent to concurrent writes for p50 use case).

also, use-case for massively concurrent writes is pretty narrow, and SQLite is not optimizing for that anyway.
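For anyone who wants to see the behaviour being argued about, here is a small sketch using Python's standard `sqlite3` module. It assumes a throwaway database in a temp directory and a hypothetical `events` table. It shows that in WAL mode a reader keeps working while a write transaction is open, and that a second writer gets a busy error rather than writing concurrently.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "app.db")


def count_rows(conn: sqlite3.Connection) -> int:
    cur = conn.execute("SELECT COUNT(*) FROM events")
    try:
        return cur.fetchone()[0]
    finally:
        cur.close()  # finish the statement so no read snapshot is held open


# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/COMMIT statements below are the only transaction boundaries.
writer = sqlite3.connect(path, isolation_level=None)
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, body TEXT)")
writer.execute("INSERT INTO events (body) VALUES ('first')")

reader = sqlite3.connect(path, isolation_level=None)

# Open a write transaction and leave it uncommitted for a moment.
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO events (body) VALUES ('second')")

# The reader is not blocked; it sees the last committed state.
seen_during_write = count_rows(reader)

# A second writer cannot take the write lock while the first holds it.
# timeout=0 makes it fail immediately instead of waiting.
other = sqlite3.connect(path, isolation_level=None, timeout=0)
try:
    other.execute("BEGIN IMMEDIATE")
    second_writer_blocked = False
except sqlite3.OperationalError:
    second_writer_blocked = True

writer.execute("COMMIT")
seen_after_commit = count_rows(reader)

for conn in (writer, reader, other):
    conn.close()
```

With a non-zero `timeout` (or `PRAGMA busy_timeout`), the second writer would wait its turn instead of failing, which is the "write attempts" queueing described above.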


> you seem like the inexperienced one to me

There is irony here


Personally I like Postgres for this reason too. It's extremely easy to run with Docker, I can dump data from all kinds of apps in there and I know it's not going to take any rearchitecting as soon as I need multiple concurrent writers.

I think docker is still super underappreciated so setting up any kind of server is seen as a chore. In my eyes it makes running tons of services like this very easy, so I'll take the extra functionality, extensibility etc of postgres.


Sure, SQLite doesn't solve every problem -- but in many cases it solves the need at hand with the reward of one less piece of infra required to support it.

I see obsessions with tooling/solutions constantly from experienced devs who fall in love with the original solution and think it's the only way to do things -- so the experience part cuts both ways.


> SQLite is an embedded database

Yes, but that's not its main selling point. An SQLite database is also a single file, which makes it incredibly easy to replicate, backup, transfer, restore, etc.


SQLite in WAL mode which you want for server apps is multiple files.

Files which you cannot just copy while your application is running if you want a correct backup.


Vacuum into or .backup work perfectly with a running, WAL enabled db.
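A rough sketch of both approaches from Python, assuming the standard `sqlite3` module (`Connection.backup` wraps the same online backup API the CLI's `.backup` uses) and a SQLite build recent enough for `VACUUM INTO` (3.27+). The file names and the `readings` table are made up for illustration.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
live_path = os.path.join(workdir, "live.db")
backup_path = os.path.join(workdir, "backup.db")
vacuum_path = os.path.join(workdir, "vacuumed.db")

# A "running" WAL-mode database with an open connection.
live = sqlite3.connect(live_path, isolation_level=None)
live.execute("PRAGMA journal_mode=WAL")
live.execute("CREATE TABLE readings (device TEXT, value REAL)")
live.executemany(
    "INSERT INTO readings VALUES (?, ?)", [("a", 1.0), ("b", 2.0), ("c", 3.0)]
)

# Option 1: the online backup API, copying into a second connection.
dest = sqlite3.connect(backup_path)
live.backup(dest)
dest.close()

# Option 2: VACUUM INTO writes a compacted copy to a new file.
# The target file must not already exist.
live.execute("VACUUM INTO ?", (vacuum_path,))


def row_count(path: str) -> int:
    """Open a copy independently and count its rows."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM readings").fetchall()[0][0]
    finally:
        conn.close()
```

Either output is a single self-contained file, so it can be copied or rsynced afterwards without worrying about the `-wal` and `-shm` files of the live database.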

At which point there’s little difference from any other database’s backup commands.

I don't think you can install "any other database" by pasting one file in a directory somewhere? Even if you can produce such a backup with the same command.

Pretty much every embedded database since about 1988 has worked like that.

You say that being an embedded database isn't the main selling point, being contained within a single file is. But that's a completely normal feature of an embedded db, to the point that the one implies the other.


Isn't concurrency also limited by your machine's disk speed for writes, what difference does it make if you write sequentially vs concurrently? Why does concurrency even matter for databases?

> Isn't concurrency also limited by your machines disk speed for writes

Yes, in theory: given a large enough database, and a disk that can only do one operation at a time, and a large enough operation that touches enough of the database. In practice, in a SQLite single tenant scenario? No, not at all.

> what difference does it make if you write sequentially vs concurrently. Why does concurrency even matter for databases?

As soon as your codebase involves reacting to events independently of a user taking action it becomes a practical concern. Generally, this is a broad question and has 1,000,000 answers.

EDIT: Originally I had "I think you understand generally, no?" appended but realized that's not helpful at all, if you did, you wouldn't be asking.

Something that may help is imagining what'd happen if a DB wasn't thread safe / didn't allow multiple writers. Ex. in SQLite's case, it allows multiple write operations to take place but there's a one-at-a-time queue. If we didn't have databases that were able to execute multiple writes simultaneously, you'd need a separate database for each concurrent writer you expect, and you'd effectively have a global lock. Orderly scaling would be ~impossible unless you did something crazy like have a single server per user


I guess I need to dive deeper into this as I do not understand the implications you gave me, but I appreciate the attempt. Generally I understand why concurrency is good in many cases, I just don't get why it's important for database stuff too.

Edit: thanks for clarifying in the edit, makes a lot more sense.


Imagine if every tweet had to go through a one-at-a-time queue before being persisted. There's about 6000 tweets per second, so you would have to be able to save them at <0.17ms per tweet or else you would become backlogged. If you are getting backlogged, you have to buffer those incoming tweets somewhere until they can be written, and eventually that buffer gets full and you start losing tweets.
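The arithmetic behind that figure, as a quick sketch. The 6000/s rate is the one quoted above; the 0.25 ms write time is only an illustrative assumption.

```python
TWEETS_PER_SECOND = 6000  # figure quoted in the comment above

# Time budget per write if every write goes through a one-at-a-time queue.
budget_ms = 1000 / TWEETS_PER_SECOND


def backlog_after(seconds: float, write_ms: float) -> int:
    """Tweets still waiting after `seconds` if each write takes `write_ms`."""
    served_per_second = 1000 / write_ms
    return max(0, round((TWEETS_PER_SECOND - served_per_second) * seconds))


print(f"budget per write: {budget_ms:.3f} ms")  # 0.167 ms
print("backlog after 60 s at 0.25 ms/write:", backlog_after(60, 0.25))  # 120000
```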

Maybe that too is a naive question, but there's a large scale between single user and 6000 tweets per second - most of our apps will never reach anything approaching even one save a second. So where to draw the line? I have so far gone the sqlite route for my hobby apps as it's so easy to handle and doesn't require setting up two docker containers for a single app. Am I painting myself into a corner in case my apps ever do become relevant?

Excellent question, and I spent so many years asking myself this over and over. You asking it made me realize I just...don't anymore. So allow me to blather a bit / free associate because I won't be sure why myself until I've written it out.

TL;DR: whatever works for you is the right decision. (which isn't helpful, I heard this so many times and as the recipient, I thought "That's nice. Now how do I choose what works for me?")

I finally had to use Postgres a couple years ago after a career of only SQLite - startup founder & iOS app developer using SQLite, turned Googler on Android, turned doing-my-own-thing.

In retrospect, I have made only one bad decision:

I went way out of my way to make SQLite work at my 2009-iOS-startup. It was a restaurant point of sale system, and to allow a networked system, one of the iOS devices would act as a server. This was a really cool trick, even an advantage in marketing that was appreciated by users. It meant the restaurant could continue to operate if the internet went down. But it eventually became clear owners loved having internet-based access too, ex. to do reporting/financial analysis over the data. And I kept contorting: instead of moving past my fear of getting into things I didn’t know, I did some rudimentary thing over port forwarding. The bad decision here was riding one horse for so long and letting it affect the product. Having a real server database would have allowed for a lot more features: think first-party gift cards, and a hundred others.

After leaving Google I needed server-side storage and fought and fought to avoid it. Then it turned out Postgres is easy and, just like SQLite, 99.999% of the time I don’t even know I’m using it.

In retrospect, there’s ~0 switching cost to these, particularly in age of LLMs. If you do need something more one day, it’ll be easy to do, and if you have to do it in a rush because you’re successful, you’re in Good Problem territory.

Hope that helped, after writing it out, dunno how convincing it is. Feel free to follow up, I appreciate the curiosity/framing because I had the same thought for so long.


Thank you for sharing a detailed anecdote from production; there's not many of those around here.

If we imagine 1 tweet = 1 transaction, that's only 6k tps. 6k tps is completely achievable, dare I say even pedestrian for an optimized database. And most systems are operating far below the scale of Twitter/X.

Sqlite can quite easily do 5000+ insert+commits per second on typical NVMe drives.

Speed is rarely the constraint that makes it unsuitable for an application.
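If you want to check a number like that on your own hardware, a sketch along these lines would do. It assumes WAL mode with `synchronous=NORMAL`, which the comment does not specify, and a hypothetical one-column table; the result depends heavily on the disk and on those two settings, so no expected figure is given.

```python
import os
import sqlite3
import tempfile
import time


def measure_commit_rate(n: int = 1000) -> tuple:
    """Insert n rows, one transaction each, and return (rows, commits/sec)."""
    path = os.path.join(tempfile.mkdtemp(), "bench.db")
    conn = sqlite3.connect(path, isolation_level=None)  # explicit BEGIN/COMMIT
    conn.execute("PRAGMA journal_mode=WAL")      # assumption, not from the comment
    conn.execute("PRAGMA synchronous=NORMAL")    # assumption, not from the comment
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, body TEXT)")

    start = time.perf_counter()
    for i in range(n):
        conn.execute("BEGIN")
        conn.execute("INSERT INTO t (body) VALUES (?)", (f"row {i}",))
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start

    rows = conn.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]
    conn.close()
    return rows, n / elapsed


if __name__ == "__main__":
    rows, rate = measure_commit_rate()
    print(f"{rows} rows, about {rate:.0f} insert+commits per second")
```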


Round trip time is actually much faster than Postgres, since there’s no need to touch the network. You can get massive single threaded throughput. In order to achieve comparable throughput in Postgres you need a large amount of concurrent connections, since each conn spends most of its time passing messages, deserializing etc (with a much larger total amount of overhead). There are a surprising amount of bottlenecks and misconfiguration that can tank performance of networked systems, particularly DBs.

Like you suggest, the reason for not picking SQLite is not reliability, speed, etc. Networked DBs allow decoupling between app and db servers, which have operationally different characteristics. But most importantly, you can have multiple apps access the same DB at the same time. Eg analytics, one off queries, any 3p app that interacts with your data directly.


You mean 100k+ right?

While I understand your point and like the explanation, I gotta make the joke that some Tweets should be lost

> Isn't concurrency also limited by your machines disk speed for writes, what difference does it make if you write sequentially vs concurrently? Why does concurrency even matter for databases?

For a simplified example, having three processes reading blocks X, Y, Z in parallel is much faster than having a single process read block X, wait for the read to finish, read block Y, wait for the read to finish, read block Z and wait for the read to finish.
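A toy version of that example, using threads rather than separate processes to keep it short. The file, the 4096-byte block size and the function names are all made up; on a small cached file you will not see a speed difference, the point is only the shape of issuing the three reads together instead of one after another.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4096  # hypothetical block size


def read_block(path: str, index: int) -> bytes:
    """Read one block; each call opens its own file handle."""
    with open(path, "rb") as f:
        f.seek(index * BLOCK_SIZE)
        return f.read(BLOCK_SIZE)


def read_sequential(path: str, indexes: list) -> list:
    """Read X, wait, read Y, wait, read Z, wait."""
    return [read_block(path, i) for i in indexes]


def read_parallel(path: str, indexes: list) -> list:
    """Issue all reads at once and collect the results in order."""
    with ThreadPoolExecutor(max_workers=len(indexes)) as pool:
        return list(pool.map(lambda i: read_block(path, i), indexes))


# Build a small file holding three blocks: X, Y and Z.
path = os.path.join(tempfile.mkdtemp(), "blocks.bin")
with open(path, "wb") as f:
    for label in (b"X", b"Y", b"Z"):
        f.write(label * BLOCK_SIZE)
```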


How many production apps do you think have enough users to justify these huge DB servers?

Huge?

Everything is huge compared to sqlite.

sqlite is more like a file format than a database. it competes with .xlsx.

> "SQLite for everything" crowd is a little bit inexperienced.

every time i see it in a real application, it becomes a huge focus of issues (for example: jellyfin, hermes, openwebui, comfyui)


What kind of issues commonly arise?

anything that requires more than 1 user or not being down all the time

Someone with experience would know that concurrency isn't a universal requirement.

It's almost as if Postgres isn't perfect, and one size shoe doesn't fit all.

Some people want some of the benefits you get from SQLite.

SQLite is obviously not perfect, but it's an incredible piece of software, and people regularly find good ways to make use of excellent pieces of software.


I absolutely 100% do not understand it either. At all. Every time I've tried over the last year or two, I've come away with the conclusion that it's something that sounds cool (to me too!) but is guaranteed to cause more problems than more obvious solutions.

That being said I'd kill for someone who used it and benefited to explain it to me in a practical sense. (specifically where syncing is involved, and syncing a subset of the SQLite is necessary. If it's "just" a document store that's treated like a blob for syncing/backup, that's familiar. If it's all in one storage but only local, that's familiar.)

Re: TFA, I guess it would have helped if I knew what Obelisk was, which is on me, and a more in-depth explanation of how this ties into AI/agents, which is on the industry/writer.


It's very likely that you have multiple SQLite databases in your pocket right now. It's one of the most widely deployed pieces of software on the planet. If your conclusion is that it's guaranteed to cause more problems than other solutions, then that's on you.

Correct! I'm not "worried" about it, I've been putting SQLites in your and my pocket for the last 17 years.

I don't want to be glib and leave it there, even though I'm slightly annoyed you missed several sigils in my post that I was well past that.

The point is, for the not in your pocket case, for the not a singular document store case, I'm curious what the use case is.


I use it to keep infra spend low for some systems I built/maintain for a handful of volunteer orgs. These systems have multiple users, dozens to a couple hundred. I just serialize writes in app code. Backup the db files to blob storage every so often and don't think about it much more.
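One way "serialize writes in app code" could look, as a sketch only: a single shared connection guarded by a lock. The `SerializedWriter` class and its method names are hypothetical, and the comment does not say how its author actually does it.

```python
import sqlite3
import threading


class SerializedWriter:
    """Funnel every statement through one connection, one at a time."""

    def __init__(self, path: str) -> None:
        # check_same_thread=False lets several threads share the connection;
        # the lock below is what keeps that safe.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()

    def write(self, sql: str, params: tuple = ()) -> None:
        with self._lock:      # only one writer at a time
            with self._conn:  # commit on success, roll back on error
                self._conn.execute(sql, params)

    def query(self, sql: str, params: tuple = ()) -> list:
        with self._lock:
            return self._conn.execute(sql, params).fetchall()

    def close(self) -> None:
        self._conn.close()


def demo(threads: int = 8, writes_per_thread: int = 50) -> int:
    """Hammer the writer from several threads and return the final row count."""
    db = SerializedWriter(":memory:")
    db.write("CREATE TABLE hits (worker INTEGER, n INTEGER)")

    def work(worker: int) -> None:
        for n in range(writes_per_thread):
            db.write("INSERT INTO hits VALUES (?, ?)", (worker, n))

    pool = [threading.Thread(target=work, args=(w,)) for w in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()

    total = db.query("SELECT COUNT(*) FROM hits")[0][0]
    db.close()
    return total
```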

And of course there are now several responses proving your point.

I think the SQLite website itself says it best:

> SQLite does not compete with client/server databases. SQLite competes with fopen().


Most apps do not actually need the concurrency capacity that Postgres or MySQL are designed for.

you also seem to underestimate the performance you can get on one machine.

I mean - I agree for the typical multi-user, SaaS webapp. But I don't think that's what these folks are proposing. If they are - yeesh, count me out.

If on the other hand they're talking about single-user, software in the small - hell yeah. In fact, I'd also promote DuckDB in this regard (mostly for analytics) - with the power of a single machine these days, you can do a surprising amount and never have to worry about distribution. Unless you know you'll have to, in which case you're probably just digging yourself a hole?


The reason the parent post is complaining that it doesn't make sense, is because people have indeed pushed the idea of using SQLite as an alternative for web apps like that.

The typical multi-user SaaS webapp doesn't have anywhere near enough users to overwhelm a single SQLite instance. Of the few that do succeed to the point where that's no longer true, a significant fraction can use techniques like sharding to stretch SQLite further.

Yeah no - sharding SQLite? How about if you know you're going to scale like that, build with something appropriate in the first place.

First, you're very likely underestimating how much load SQLite can handle. SQLite is usually write limited, but for smallish writes it can easily handle thousands a second with very trivial optimization, and with more thought can scale to tens or hundreds of thousands of write transactions per second. In some cases, it can actually outperform traditional server based RDBMSs because of reduced overhead and because holding locks on network timescales (which will likely happen even for databases with multiple writers, because eventually you have to deal with two transactions needing to write to the same place) is very inefficient.

Second, I think you're overestimating how hard sharding is here. There are plenty of use cases for which sharding isn't just easy to set up, but the natural thing you'd be likely to do even without scale. Things like e.g. a helpdesk SaaS, where each customer/organization has its own independent data.

Third, a large part of the point is that you are unlikely to know ahead of time you're going to "scale like that". As I already pointed out, most SaaS apps do not end up having many users. For some that's intentional, but for others the reason is that they simply never caught on. For those cases (and they're the vast majority), cosplaying as a much larger app is a complete waste. It's much better to wait until you're successful enough to need to switch and then use the revenue you now have to solve the scaling problem you ran into.

Fourth, as an aside, ironically the other (slightly less) easy way to shard superficially resembles a common way people cosplay as netflix: splitting your data by domain as "microservices" (although there's a good chance they don't need to be independent processes/on independent machines).


I don't care that much about performance - I imagine any modern machine can handle that load pretty well, for this context that's not a huge concern.

The concerns are more about distributed systems issues, maintenance, etc. Typical NIH issues. Which you're embracing when you start building your own systems (ie: with sharding).

This is a solved problem, why would I want to inject these burdens onto myself, instead of solving ... actual problems that I already have enough of! I don't want to be building another Temporal/Cadence/DBOS/etc.


Scale to zero is very useful.

SQLite also gets really slow at around 50 million rows.

Seems fine at billions of rows in my experience.

what are your pragmas?

:cache_size 1562

:page_size 4096

:journal_mode "WAL"

:synchronous "NORMAL"

:temp_store "MEMORY"

:busy_timeout 5000

(Synchronous FULL in context where it matters)
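For reference, a sketch of applying the settings listed above from Python's `sqlite3` module and reading them back. The helper names are hypothetical. Note that `journal_mode` and `page_size` are stored in the database file, while the others apply per connection and have to be set every time you connect.

```python
import os
import sqlite3
import tempfile

# Values copied from the list above. page_size goes first because it only
# takes effect before the database file is initialised.
PRAGMAS = {
    "page_size": 4096,
    "journal_mode": "WAL",
    "synchronous": "NORMAL",
    "cache_size": 1562,
    "temp_store": "MEMORY",
    "busy_timeout": 5000,
}


def open_tuned(path: str) -> sqlite3.Connection:
    """Open a connection and apply each pragma in order."""
    conn = sqlite3.connect(path, isolation_level=None)
    for name, value in PRAGMAS.items():
        conn.execute(f"PRAGMA {name} = {value}")
    return conn


def current_settings(conn: sqlite3.Connection) -> dict:
    """Read the pragmas back; enum values come back as text or integers."""
    return {
        name: conn.execute(f"PRAGMA {name}").fetchall()[0][0] for name in PRAGMAS
    }


conn = open_tuned(os.path.join(tempfile.mkdtemp(), "tuned.db"))
settings = current_settings(conn)
conn.close()
```

SQLite reports `synchronous` and `temp_store` as integers when queried (1 for NORMAL, 2 for MEMORY).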

But, it depends on the shape of your data, your indexes, how much of the data you care about is filling up a page. If your distribution is more even you sometimes need more cache than a more Pareto distributed data set etc. Things like not caching prepared statements costs you more (you should almost always be caching prepared statements per connection with sqlite).

You have to give things more thought at a billion sure. Partial indexes are your friend. You'll also want more cache to prevent thrashing etc.

- https://andersmurphy.com/2025/12/02/100000-tps-over-a-billio...
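As an illustration of the partial index suggestion, a small sketch with a made-up `orders` table: the index only covers rows where `status = 'open'`, so it stays small even if the table is mostly closed orders. The table, column and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)"
)

# Partial index: only rows matching the WHERE clause are indexed.
conn.execute(
    "CREATE INDEX idx_open_orders ON orders (customer_id) WHERE status = 'open'"
)

conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(1, "open"), (1, "closed"), (2, "open")],
)
conn.commit()

# The query repeats the index's WHERE term, which lets the planner use it.
QUERY = "SELECT id FROM orders WHERE customer_id = ? AND status = 'open'"

open_ids = [row[0] for row in conn.execute(QUERY, (1,))]

# The last column of EXPLAIN QUERY PLAN output is the human-readable detail.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,)))

conn.close()
```

Printing `plan` is a cheap way to confirm a query is actually hitting the index you expect before worrying about cache size.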


Are you one of my enterprise customers? What if your workload does not require write concurrency?

It wouldn't, at least not directly. That's why it wasn't done pre-AI.

I'm getting to a point in my life (age) where I want to put my money into things I care about. Netflix? 25$/mo, 3 hours a day. Search 10$/mo? 8 hours a day. Easy.

DigitalOcean. Seriously. They have been around a long long time and built a lot of the core infrastructure you rely on every day (e.g. Ceph).


I've had my share of VPS & Managed DB outages at DO, so they are also not faultless.


Not if, but when. No one is faultless. Chasing after 100% is a fool's errand.


I've been with DO since checks mailbox 2014. Honestly never experienced an unannounced outage.


Yeah overall they are ok. I think 3 times managed db and once or twice a vps just dead. No issues in a year or so.

They were always hardware failures, took about 45-120min. Not the end of the world, but also not fun getting a lot of client complaints.


I have read plenty of snark about them on HN, but I found their product incredibly useful, well-designed, and easy to work with. If I was building a new startup from scratch, I'd definitely be giving them a look.

I'm sure there are plenty of the like 1,000 AWS products that DO has no viable competitor for, but for what they do offer, they're great.


I've used DigitalOcean for personal projects for over a decade, no major issues so I definitely recommend!


I've had nothing but good experiences with them and their docs and tutorials are excellent.


Yes, I use DO with Hatchbox. It is a perfect combo. Been using for more and more projects.


I'm using GitHub for my business and so do millions more. Might be time to prioritize paying customers, historically popular open source repos and PRs created by known human actors. Agents can wait, humans have much less patience.


Last time I checked, Barman didn't support backups to S3. That's why (for us) pgBackRest was such a big deal: it could offload full and incremental backups to a basically limitless and reliable medium.

I think (and I'm probably wrong now) that Barman only could push backups to another Linux machine (e.g., EC2 box), so you had to worry about your backup system _on top_ of the main DB.

So I'm really hoping someone will pick up maintaining pgBackRest.


Huh... Opposite experience. Barman cloud (s3 backups) is the only way I've ever used it. I didn't realize it wasn't the only way. Makes sense it could just use a filesystem.

https://docs.pgbarman.org/release/3.14.1/user_guide/barman_c...



Something like Rclone and a cron job, or else s3 mounted via FUSE, could possibly bridge that. Of course then you have to worry about reliability of the bridge...


There’s support via Barman cloud - we use it for azure at work but s3 and others are supported iirc


Mounting S3 with Fuse is not stable or performant enough at scale for backup storage


I can understand why it'd be preferable to avoid such a bridge layer, and indeed I too would rather just have a transparent view of what's going on at the protocol level.

Stability and performance at scale sound like implementation specific properties though. If you've tried this, I'd be curious to know about the specific issues you encountered.


https://github.com/Barre/ZeroFS Should do a great job at this.


There are other ways to mount S3, but you may want to check out Amazon's new product, S3 files: https://aws.amazon.com/about-aws/whats-new/2026/04/amazon-s3...

https://aws.amazon.com/s3/features/files/


Shameless plug[0].

[0] https://pgdog.dev


This is just a language server problem. I'm sure you can configure whatever language server PHP is using to disable specific warnings, etc.


> I'm sure you can configure whatever language server PHP is using to disable specific warnings, etc

You may be able to do this by editing a language server-specific config file in whatever arcane syntax they decided to offer. But there isn't any editor support for configuring language servers, so it's a bit of a lift for a newcomer who just wants to turn off some warnings


I think the application should own its dependencies and its default config. In this case, it felt to me like no one had really looked at them.


Editors should take full responsibility of the user experience. The user should not have to care about the existence of language servers.


Amazing project that spawned entire companies. We used it to build postgresml[0] and most Postgres extensions are built on top of it these days.

[0] https://github.com/postgresml/postgresml


The domain for the org has expired and now it's a parked page. I don't know how maintained it is.


Postgresml closed up shop, the guy you're replying to was one of the key players there. Pgrx is trucking right along, and gaining speed at the moment.


Amazing, I didn't know this existed.


On top of it, intel chips are not competitive with apple silicon. Why buy a laptop that's 30% slower and uses more energy for the same price?


30% slower than an M5 is an M3/M4. I will take that, thanks, and not concern myself with MacOS or the thousand cuts of leaving x86.


To be able to run any OS you want.


Can it run macOS?



For now, yes. But probably not for the next macos release.


To avoid having to use Mac OS, and suffering the whims of Apple.

