Hacker News new | past | comments | ask | show | jobs | submit login
SQLedge: Replicate Postgres to SQLite on the Edge (github.com/zknill)
359 points by clessg on Aug 9, 2023 | hide | past | favorite | 79 comments



Following the PostgreSQL logical replication stream to update a local SQLite database copy is definitely a neat trick, and feels very safe to me (especially since you track the Log Sequence Number in a postgres_pos table).

The bit that surprised me was that this thing supports writes as well!

It does it by acting as a PostgreSQL proxy. You connect to that proxy with a regular PostgreSQL client, then any read queries you issue run against the local SQLite copy and any writes are forwarded on to "real" PostgreSQL.

The downside is that now your SELECT statements all need to be in the subset of SQL that is supported by both SQLite and PostgreSQL. This can be pretty limiting, mainly because PostgreSQL SQL is a much, much richer dialect than SQLite.

Should work fine for basic SELECT queries though.

I'd find this project useful even without the PostgreSQL connection/write support though.

I worked with a very high-scale feature flag system a while ago - thousands of flag checks a second. This scaled using a local memcached cache of checks on each machine, despite the check logic itself consulting a MySQL database.

I had an idea to improve that system by running a local SQLite cache of the full flag logic on every frontend machine instead. That way flag checks could use full SQL logic, but would still run incredibly fast.

The challenge would be keeping that local SQLite database copy synced with the centralized source-of-truth database. A system like SQLedge could make short work of that problem.


> I worked with a very high-scale feature flag system a while ago - thousands of flag checks a second.

May I ask why the flags are checked that frequently? Couldn't they be cached for at least a minute?

> It does it by acting as a PostgreSQL proxy. [...] and any writes are forwarded on to "real" PostgreSQL.

What happens if there's a multi-statement transaction with a bunch of writes sent-off to the mothership - which then get returned to the client via logical replication, but then there's a ROLLBACK - how would that situation be handled such that both the SQLite edge DBs and the mothership DB are able to rollback okay - would this impact other clients?


Feature flag systems are usually based on a set of rules that could be serialized and evaluated locally (this is how pretty much every open source feature flag system and feature flag SaaS works). Usually it's based on some kind of UUID being hashed with a per-flag seed and bucketed after some set of targeting rules are applied to other properties passed in for that user. There are added features where you can stores large cohorts to do specific targeting and usually there's some kind of local cache added to make that look-up faster for recent users.

I'm not sure what the original commenter was doing but it sounds like they had some kind of targeting that was almost entirely based on cohorts or maybe they needed to have stability over time which would require a database. We did something similar recently except we just store a "session ID" with a blob for look-up and the evaluation only happens on the first request for a given session ID.


> May I ask why the flags are checked that frequently? Couldn't they be cached for at least a minute?

Not in that project but feature flags don't have to be all or nothing. You can apply flags to specific cohorts of your users for example, so if you have a large user base, even if you cache them per-user, it still translates into many checks a second for large systems.


I guess the cost of doing it precisely isn't terribly high, but if the majority of the flags were "off" (eg, subsets of users being opted into a beta or something), I wonder if you cut a bunch of the queries by sending down a bloom filter.

So basically you check the filter first, and if that says the feature is enabled, only then do you actually ask the real DB.


Checking a feature flag for a given cohort/user/demographic/etc. should always be a memory read, not a syscall. (You can't have syscalls in a hot loop!) That means feature flags will always be cached in each application instance from their source of truth. And that means you would never query the source of truth directly from the application, regardless of any property of the flag or its users.


> May I ask why the flags are checked that frequently? Couldn't they be cached for at least a minute?

Not the previous poster, but it appears in the scenario, the SQLite database is the cache.


They were being cached for at least a minute (maybe even more than that, I can't remember the details) - that's what the local memcached instance was for.

This was problematic though because changing a feature flag and then waiting for a minute plus to see if the change actually worked can be frustrating, especially if it relates to an outage of some sort.


If the cache is out-of-process, access is going to be subject to faults and latencies that are categorically different than anything that would be in-process.

You'd never want to include a request to an out-of-process service in any kind of hot loop, even if that service is on the same machine.


The logical replication protocol sends a series of messages that essentially follow the flow that a database transaction would.

i.e. a stream of messages like: "BEGIN", "[the data]", ["COMMIT" or "ROLLBACK"].

So any application that listens to the Postgres replication protocol can handle the transaction in the same way that Postgres does. Concretely you might choose to open a SQLite transaction on BEGIN, apply the statements, and then COMMIT or ROLLBACK based on the next messages received on the stream replication protocol.

The data sent on the replication protocol includes the state of the row after the write query has completed. This means you don't need to worry about getting out of sync on queries like "UPDATE field = field + 1" because you have access to the exact resulting value as stored by Postgres.

TL;DR - you can follow the same begin/change/commit flow that the original transaction did on the upstream Postgres server, and you have access to the exact underlying data after the write was committed.

It's also true (as other commenters have pointed out) that for not-huge transactions (i.e. not streaming transactions, new feature in Postgres 15) the BEGIN message will only be sent if the transaction was committed. It's pretty unlikely that you will ever process a ROLLBACK message from the protocol (although possible).


Logical replication never includes uncommitted data, unless you have written an custom output plugin.

EDIT: https://stackoverflow.com/questions/52202534/postgresql-does...


It's true for protocol version 1 that only the committed data is sent on the replication connection.

In protocol version 2 (introduced in postgres 14) large in-progress transactions can appear in the new "Stream" messages sent on the replication connection: StreamStart, StreamStop, StreamAbort, StreamCommit, etc. In the case of large in-progress transactions uncommitted data might end up in the replication connection after a StreamStart message. But you would also receive a StreamCommit or StreamAbort message to tell you what happened to that transaction.

I've not worked out what qualifies as a "large" transaction though. But it is _possible_ to get uncommitted data in the replication connection, although unlikely.

https://www.postgresql.org/docs/15/protocol-logicalrep-messa...


My limited understanding of logical replication is that writes only happen at COMMIT. Ie. nothing is replicated until it's committed.


> Couldn't they be cached for at least a minute?

Only per feature+per user. (Though 1000s per second does seem high unless your scale is gigantic.)

> What happens if there's a multi-statement transaction with a bunch of writes sent-off to the mothership - which then get returned to the client via logical replication, but then there's a ROLLBACK

Nothing makes it into the replication stream until it is committed.


Honest question: why is SQLLite needed for local? Why would you not have PG at edge that replicates data with central PG? That way the SQL dialect problem you mentioned wouldn't exist.


That is a much safer way to go for most use cases. Well actually, most use cases don't need edge compute at all, but for those that do, this setup is indeed common, and fine for most apps:

- Say we do edge compute in San Francisco, Montreal, London and Singapore

- Set up a PG master in one place (like San Francisco), and read replicas in every place (San Francisco, Montreal, London and Singapore)

- Have your app query the read replica when possible, only going to the master for writes

In rare cases, maybe any network latency is not OK, you really need an embedded DB for ultimate read performance - then this is pretty interesting. But a backend server truly needing an embedded DB is certainly a rare case. I would imagine this approach would come with some very major downsides, like having to replicate the entire DB to each app instance, as well as the inherent complexity/sketchiness of this setup, when you generally want your DB layer to be rock solid.

This is probably upvoted so high on HN because it's pretty cool/wild, and HN loves SQLite, vs. it being something many ppl should use.


SQLite is much smaller and self-contained than postgres. It's written in ANSI-C and by including one file you have access to a database (which is stored in another single file). It's popular in embedded systems like, I imagine, edge devices


You know what's faster than a local connection to Postgres? Having the database engine directly embedded in your application. No context switch.


There are still context switches with sqlite, even in-memory sqlite (it has to mess with files, which means syscalls). Just not network ops.


In-memory SQLite doesn't read or write files, and therefore doesn't generate syscalls, by definition.


Fair enough; I didn't realize that ":memory:" SQLite accesses did zero syscalls overall, I had assumed that they required shared memory writes that entailed syscalls for barriers and the like.

I'm happy you helped me learn that's not the case! That'll make several things I regularly need to do much, much easier.


Sure! It's a huge and important benefit of that access mode.


The overhead of IPC isn't significant here, unless there's some special use case I'm not thinking of. SQLite might still be faster for small queries for other reasons.


A simple version of this is to do a very cheap SELECT * [where tenant = ...] of your feature flag table(s) to a dictionary structure in memory on every single edge/application server, and do this atomically every few seconds or minutes.

Statsig [0] and Transifex [1] both use this pattern to great effect, transmitting not only data but logic on permissions and liveness, and you can roll your own versions of all this for your own domain models.

I'm of the opinion that every agile project should start with a system like this; it opens up entirely new avenues of real-time configuration deployment to satisfy in-the-moment business/editorial needs, while providing breathing room to the development team to ensure codebase stability.

(As long as all you need is eventual consistency, of course, and are fine with these structures changing in the midst of a request or long-running operation, and are fine with not being able to read your writes if you ever change these values! If any of that sounds necessary, you'll need some notion of distributed consensus.)

[0] https://docs.statsig.com/server/introduction

[1] https://developers.transifex.com/docs/native


Does it though? If it’s a proxy it can support the SQLite read and the Postgres write syntax. If reads only ever go to SQLite they don’t need to work on Postgres.


How many flags are we talking here? I implemted a similar system and we just replace the whole sqlite DB file by downloading it from the centralized storage whenever it changes.

Even with 1M flags it's still only a few 100 kB compressed.

I wouldn't replicate per user flags to the edge to keep size under control.


It's nice to see another pg proxy using the pgx parser (their src[1]) - I built one using this lib too. However, this implementation is missing a lot of low level features to be considered close to compatible, including: multi-query transactions, auth, TLS, extended query mode, query cancellation.

[1]: https://github.com/zknill/sqledge/blob/main/pkg/pgwire/postg...


I wonder, how does this handle a single transaction that contains both reads and writes? Maybe it just says "within a transaction, all reads and writes go through the Postgres proxy, SQLite is ignored"?


One use case I can see this being valuable for is for a client based application and Postgres being a centralized database. The client would just query SQLite and not need to write Postgres SQL.


Ff are mostly written centrally and cached to memory.


This is pretty neat! One question, if all your queries have to be SQLite-compatible, doesn't that defeat the purpose of using PG in the first place? Maybe SQLite supports more PG features than I thought, but if for example your app uses pgvector or pgcrypto you might have issues here.


Yes, absolutely, and this is going to be one of the hardest tech challenges to solve. I've thought a little about it, and it's probably unrealistic to think that we can translate every single PG statement into a SQLite one, especially when PG has extensions. So we're probably destined to use the local SQLite database for queries we can parse and understand, and forwarding all the others (both reads and writes) to the upstream PG server.

This slightly breaks model of having a local copy serve data faster, but if only the minority of queries use a format that we don't understand in SQLite then only that minority of queries will suffer from the full latency to the main PG server.


> doesn't that defeat the purpose of using PG in the first place

PG can scale higher than SQLite, especially considering concurrent writers. So even without the PG syntax and extensions, it's still useful. Also, maybe you can use PG syntax for complex INSERT (SELECT)s?


I'm super excited for this -- it seems like it's perfect as an app-local cache of things that can be a drop-in replacement for some high-cost queries.

Are there any plans to support which tables get copied over? The main postgres database is too big to replicate everywhere, but some key "summary" tables would be really nice to have locally.


And how do you manage conflicts?

edit

> The writes via SQLedge are sync, that is we wait for the write to be processed on the upstream Postgres server

OK, so, it's a SQLite read replica of a Postgres primary DB.

Of course, this does mean that it's possible for clients to fail the read-your-writes consistency check.


How "edgy" can real PostgreSQL be? Seems to me that this is all in lieu of using real PostgreSQL replication, on the basis that real Postgres is too heavy / complex to run on the edge. Can a true PostgreSQL replica be configured in a light weight way to serve a similar purpose?


I have the same question. There have been demos of local Postgres-on-web using wasm, which would not solve this issue (browser is way heavier than SQLite), but maybe it demonstrates how portable Postgres can be with some effort.


A commercial offering (although also open source): https://www.powersync.co/


Does anyone know of a tool that will export a Postgres database to a SQLite database file? Seems like a handy way of exporting and passing around smallish DBs. I feel like this tool must exist, but I haven’t found it yet. (Supporting imports and data transformations would be even better!)


I wrote a tool to do that: https://datasette.io/tools/db-to-sqlite

You can see an example of it in use here: https://github.com/simonw/simonwillisonblog-backup/blob/main...


We replicated our MySQL database to a SQLite edge at Segment in ctlstore: https://github.com/segmentio/ctlstore

We considered tailing binlogs directly but there's so much cruft and complexity involved trying to translate between types and such at that end, once you even just get passed properly parsing the binlogs and maintaining the replication connection. Then you have to deal with schema management across both systems too. Similar sets of problems using PostgreSQL as a source of truth.

In the end we decided just to wrap the whole thing up and abstract away the schema with a common set of types and a limited set of read APIs. Biggest missing piece I regret not getting in was support for secondary indexes.


Is there a reason you didn't add them when making the shared API?


If you're asking about secondary indexes, it was just seen as a "later" feature we'd implement as a follow-up. It was definitely asked for, just never prioritized before I moved off the project.


This space is starting to get crowded. Can anyone compare this with some of the other solutions coming out recently?


List a couple?


#. SQLite WAL mode

From https://www.sqlite.org/isolation.html https://news.ycombinator.com/item?id=32247085 :

> [sqlite] WAL mode permits simultaneous readers and writers. It can do this because changes do not overwrite the original database file, but rather go into the separate write-ahead log file. That means that readers can continue to read the old, original, unaltered content from the original database file at the same time that the writer is appending to the write-ahead log

#. superfly/litefs: a FUSE-based file system for replicating SQLite https://github.com/superfly/litefs

#. sqldiff: https://www.sqlite.org/sqldiff.html https://news.ycombinator.com/item?id=31265005

#. dolthub/dolt: https://github.com/dolthub/dolt :

> Dolt is a SQL database that you can fork, clone, branch, merge, push and pull just like a Git repository. [...]

> Dolt can be set up as a replica of your existing MySQL or MariaDB database using standard MySQL binlog replication. Every write becomes a Dolt commit. This is a great way to get the version control benefits of Dolt and keep an existing MySQL or MariaDB database.

#. github/gh-ost: https://github.com/github/gh-ost :

> Instead, gh-ost uses the binary log stream to capture table changes, and asynchronously applies them onto the ghost table. gh-ost takes upon itself some tasks that other tools leave for the database to perform. As result, gh-ost has greater control over the migration process; can truly suspend it; can truly decouple the migration's write load from the master's workload.

#. vlcn-io/cr-sqlite: https://github.com/vlcn-io/cr-sqlite :

> Convergent, Replicated SQLite. Multi-writer and CRDT support for SQLite

> CR-SQLite is a run-time loadable extension for SQLite and libSQL. It allows merging different SQLite databases together that have taken independent writes.

> In other words, you can write to your SQLite database while offline. I can write to mine while offline. We can then both come online and merge our databases together, without conflict.

> In technical terms: cr-sqlite adds multi-master replication and partition tolerance to SQLite via conflict free replicated data types (CRDTs) and/or causally ordered event logs.

yjs also does CRDTs (Jupyter RTC,)

#. pganalyze/libpg_query: https://github.com/pganalyze/libpg_query :

> C library for accessing the PostgreSQL parser outside of the server environment

#. Ibis + Substrait [ + DuckDB ] https://ibis-project.org/blog/ibis_substrait_to_duckdb/ :

> ibis strives to provide a consistent interface for interacting with a multitude of different analytical execution engines, most of which (but not all) speak some dialect of SQL.

> Today, Ibis accomplishes this with a lot of help from `sqlalchemy` and `sqlglot` to handle differences in dialect, or we interact directly with available Python bindings (for instance with the pandas, datafusion, and polars backends).

> [...] `Substrait` is a new cross-language serialization format for communicating (among other things) query plans. It's still in its early days, but there is already nascent support for Substrait in Apache Arrow, DuckDB, and Velox.

#. ibis-project/ibis-substrait: https://github.com/ibis-project/ibis-substrait

#. tobymao/sqlglot: https://github.com/tobymao/sqlglot :

> SQLGlot is a no-dependency SQL parser, transpiler, optimizer, and engine. It can be used to format SQL or translate between 19 different dialects like DuckDB, Presto, Spark, Snowflake, and BigQuery. It aims to read a wide variety of SQL inputs and output syntactically and semantically correct SQL in the targeted dialects.

> It is a very comprehensive generic SQL parser with a robust test suite. It is also quite performant, while being written purely in Python.

> You can easily customize the parser, analyze queries, traverse expression trees, and programmatically build SQL.

> Syntax errors are highlighted and dialect incompatibilities can warn or raise depending on configurations. However, it should be noted that SQL validation is not SQLGlot’s goal, so some syntax errors may go unnoticed.

#. benbjohnson/postlite: https://github.com/benbjohnson/postlite :

> postlite is a network proxy to allow access to remote SQLite databases over the Postgres wire protocol. This allows GUI tools to be used on remote SQLite databases which can make administration easier.

> The proxy works by translating Postgres frontend wire messages into SQLite transactions and converting results back into Postgres response wire messages. Many Postgres clients also inspect the pg_catalog to determine system information so Postlite mirrors this catalog by using an attached in-memory database with virtual tables. The proxy also performs minor rewriting on these system queries to convert them to usable SQLite syntax.

> Note: This software is in alpha. Please report bugs. Postlite doesn't alter your database unless you issue INSERT, UPDATE, DELETE commands so it's probably safe. If anything, the Postlite process may die but it shouldn't affect your database.

#. > "Hosting SQLite Databases on GitHub Pages" (2021) re: sql.js-httpvfs, DuckDB https://news.ycombinator.com/item?id=28021766

#. >> - bittorrent/sqltorrent https://github.com/bittorrent/sqltorrent

>> Sqltorrent is a custom VFS for sqlite which allows applications to query an sqlite database contained within a torrent. Queries can be processed immediately after the database has been opened, even though the database file is still being downloaded. Pieces of the file which are required to complete a query are prioritized so that queries complete reasonably quickly even if only a small fraction of the whole database has been downloaded.

#. simonw/datasette-lite: https://github.com/simonw/datasette-lite datasette, *-to-sqlite, dogsheep

"Loading SQLite databases" [w/ datasette] https://github.com/simonw/datasette-lite#loading-sqlite-data...

#. awesome-db-tools: https://github.com/mgramin/awesome-db-tools

Lots of neat SQLite/vtable/pg/replication things


A few I know: rqlite, dqlite, SQLite wasm, litestream.


rqlite[1] creator here, happy to answer any questions.

[1] https://www.rqlite.io


(I guess I can test or ask in the rqlite Slack etc)

Can I try to use a UNIX socket to TCP socket so I can use rqlite as a drop-in sqlite replacement (where a client expects a file)? for ex:

socat TCP-LISTEN:4000,reuseaddr,fork UNIX-CLIENT:/path/to/sqlite


No, rqlite doesn't support that kind of access. No reason it couldn't, but it currently only exposes a HTTP API.

https://rqlite.io/docs/faq/#is-it-a-drop-in-replacement-for-...


Why SQLite instead of a standard hub and spoke replication? What benefit does this provide? Being able to run your database on the client? That seems like it would be risky


Because it’s running in the app’s address space, SQLite is obscenely fast for reads, so much so that you can often just write N+1 queries with it.


In this case though it appears that SQLEdge is running as its own process, distinct from the app, and the app sends all queries to the SQLEdge proxy.


That makes less sense to me then, though the IPC cost for those reads is still much lower than a typical n-tier Postgres setup.


How would mutually incompatible upstream changes from multiple SQLite edge instances be resolved? You'd need user input for that, right?


The writes via SQLedge are sync, that is we wait for the write to be processed on the upstream Postgres server. So it operates as if SQLedge wasn't in the request path from application to Postgres. The writes to Postgres are only reflected in SQLite when the data is received back from the Postgres server on the replication slot.

This means writes are eventually consistent, currently, but I intend to include a feature that allows waiting for that write to be reflected back in SQLite which would satisfy the 'read your own writes' property.

SQLedge will never be in a situation where the SQLite database thinks it has a write, but that write is yet to be applied to the upstream Postgres server.

Basically, the Postgres server 'owns' the writes, and can handle them just like it would if SQLedge didn't exist.


So, the advertised performance is just for the read-part, not the write-part?

As far as I understand: writing would still exhibit the same characteristics as before, while reads would never be affected by other users.


What about migrations?


I don't think that's a problem here, because the write queries aren't handled by the local SQLite databases - instead, the proxy forwards them directly to the PostgreSQL instance, so presumably the resulting changes show up in SQLite when they are replicated back down again.


> SQLedge serves reads from it's local sqlite database, and forwards writes to the upstream postgres server that it's replicating from.

I don't think SQLite data is ever written back to the postgres DB so this shouldn't be an issue


Very interesting! I have question ( out of my experience in https://github.com/maxpert/marmot ) how do get around the boot time, specially when a change log of table is pretty large in Postgres? I've implemented snapshotting mechanism in Marmot as part of quickly getting up to speed. At some level I wonder if we can just feed this PG replication log into NATS cluster and Marmot can just replicate it across the board.


We're currently grappling with trying to build a system similar to https://stripe.com/sigma with a single multi-tenant Postgres db as the source and an SQLite read replica per tenant. Currently we re-generate the world every night. We need fine-grained control over the SQLite schema (it's a public api we let users write sql against). Any related projects would be great to point me towards!


We need a standardized WAL format. Too many people trying to write software to simulate it.


WAL = write-ahead logging?


Yep, and the world is now full of apps that read one WAL format and copy the data somewhere else. This app, Kafka, “data lake” replication to heterogeneous databases, several other flavors of database replication, on and on. And every bit of it is custom to the WAL format of the source database.


I’ve been thinking a lot about this problem of distributed databases, and so far the easiest and most innovative solution I’ve seen is PolyScale[1], basically a CDN cache at the SQL layer with smart automatic invalidation. I just don’t want to deal with the hairiness of logical replication and now sqlite<>postgres incompatibility.

[1] Not associated with PolyScale in any way, just an interested potential customer.


Very cool. I've always found it hard to write any slightly complex SQL (i.e. not just a simple join) in a way that's fast on multiple DBMSes. Even if both support the same syntax you need, subtle things can affect the performance a lot. But there are probably use cases where this isn't an issue and you care a lot more about using something compatible with limited edge devices, like SQLite.


If I was convinced of the Edge I would use this. I tried to challenge this concept because I don't like the Edge hype. But I must say, this looks like one of the best solutions in terms of cost, simplicity and maturity (PG, SQlite). The major downsides are the limits in data size and the SQL features. Overall I think it's really well thought out and done.


Somewhat orthogonal, but I was just thinking of looking for or writing a MySQL compatible frontend for SQLite. There’s a third party app that has removed SQLite support in its latest version and I really don’t want a MySQL database for something SQLite is more than capable of for my use case.


I'm a bit confused by this. Since SQLEdge is a proxy, you lose all the IO benefits of running an embedded in-process DB.

At that point, why not replicate to a real Postgres on the edge?

Maybe the expectation is that the application also opens the SQLite file directly? (But in that case, what is the point of the SQLEdge proxy?)


This looks neat. Any support for postgres schemas? Would be cool if this supported each SQLite chard to write to separate schema - giving each "share" a single tenant "view" while giving access to all data in the postgres instance?

Or is this only public or all schemas?


this seems neat, I'll definitely give it a look. this seems like a very suitable trade-off for a lot of applications I have worked on.

does anyone know if there is a postgres-postgres version of this that is easy to run in ephemeral environments? ideally I'd like to be able to run Postgres sidecars along my application containers and eliminate the network roundtrip using the sidecar as a read replica, but haven't seen this being done anywhere. maybe it wouldn't be fast enough to run in such scenarios?


In PG only one can use logical or streaming replication. Then all reads to the replica and writes to main. App needs two connections - one for read, one for write.


I can’t be the only person who doesn’t know what ‘at the edge’ means in this context. Any useful links to educate myself?


"At the edge" means closer to the end user. Maybe you have a central server with the master database in Virginia, USA and edge servers with read-only follower databases near various big cities where your customers live. The point of edges is to provide lower-latency access due to their proximity. They also reduce load on the master server, but you don't need edges to do that.


For one moment I thought this was the equivalent of SQLite for EdgeDB and the idea made me very happy.


What does on the edge mean in this context? What is the use case ?


How does this work with custom postgres types?


I am sick of reading "on the edge"




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: