Use a single SqliteConnection instance for all access - most builds of the SQLite provider serialize threads internally. Opening multiple connections incurs unnecessary filesystem operations.
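The shared-connection idea can be sketched with Python's stdlib `sqlite3` (the comment above is about a .NET-style `SqliteConnection`, but the pattern is the same; an in-memory DB and the `execute` helper are illustrative):

```python
import sqlite3
import threading

# One connection shared by the whole process; an explicit lock serializes
# access so we don't depend on the driver's internal thread serialization.
_conn = sqlite3.connect(":memory:", check_same_thread=False)
_lock = threading.Lock()

def execute(sql, params=()):
    with _lock:
        cur = _conn.execute(sql, params)
        _conn.commit()
        return cur.fetchall()

execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", ("greeting", "hello"))
print(execute("SELECT v FROM kv WHERE k = ?", ("greeting",)))  # [('hello',)]
```

Every thread in the process goes through `execute`, so there is exactly one file handle and one writer at a time.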
> How do they manage durable data store? Since sqlite doesn't have a replication mechanism, how do people handle losing transactions?
I've never run one of my SQLite-embedded services on a machine that wasn't virtualized and also known as "production". The simplest recovery strategy is to snapshot the VM. There are other libraries that augment SQLite and provide [a]sync replication/clustering/etc.
#1. They are pretty much my default settings now. But how do you serialize concurrent writes across multiple processes? Say you run your API service in a separate stack from your webapp service. You can't really do that. I know some folks just make sure the transactions are so fast that collisions will be rare. But why take the chance?
#2. A snapshot is a backup strategy. AFAIK, from reading some Reddit comments by Ben (the author of Litestream), it's not a replication strategy.
I've used SQLite in ETL pipelines very successfully, as well as for read-only caching. I just can't figure out how people use it as a server without a whole bunch of hacks to deal with its limitations.
I would never try to have two different processes (such as a web server and something else) talking to the same SQLite database. That is a recipe for a performance disaster. All of the projects I build this way are single-process designs. It is not really a good fit for legacy solutions.
If you absolutely require transactional integrity against the same datastore from multiple processes, you could consider electing one process as the DB owner and then IPC/RPC the SQL commands from the other processes to it. Named pipes and shared memory are nearly free in many ecosystems. A proxied approach has trade-offs, such as all application-defined functions needing to be defined in the DB-owning process.
> that it's not a replication strategy
Correct, but many scenarios can survive 15-60 minutes (whatever the snapshot interval is) of data loss. The complexity trade-off is often worth it if the business can tolerate the loss. If you do require replication, then a third-party library or something application-level is what you need.
I think my point is still valid. How are people using sqlite3 in a server context? A single process in a modern webapp and 15-60 minutes of data loss might be acceptable, but why? To save yourself a PostgreSQL install, which could be on the same server?
See:
https://www.sqlite.org/pragma.html#pragma_synchronous
https://www.sqlite.org/pragma.html#pragma_journal_mode
https://www.sqlite.org/threadsafe.html
https://rqlite.io/docs/faq/
https://litestream.io
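The pragmas in those first two links are typically set once per connection; a minimal sketch of the common WAL configuration (the file path and the 5-second timeout are illustrative values, not recommendations):

```python
import os
import sqlite3
import tempfile

# WAL requires a real file; an in-memory DB would report journal_mode=memory.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)

conn.execute("PRAGMA journal_mode=WAL")    # readers proceed alongside one writer
conn.execute("PRAGMA synchronous=NORMAL")  # fewer fsyncs; considered safe with WAL
conn.execute("PRAGMA busy_timeout=5000")   # wait up to 5s on a locked DB

mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
print(mode)  # wal
```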