The notifier pattern for applications that use Postgres (brandur.org)
185 points by kiyanwang 16 days ago | 55 comments



Take a look at Materialize, Noria, and the family of Differential/Timely Dataflow technologies. It's the same concept on steroids: you can subscribe to arbitrary queries and efficiently receive any changes to that view. You can also efficiently maintain any materialized view for extremely fast reads on known queries.

An automatic stream processing pipeline for maintaining caches and listening to complex real-time events.

Quite underrated, it has so much promise. The concept is not new but it's still semi-stuck in Rust-land. It's becoming more mainstream with Materialize, which is technically open-source, but they aggressively push their expensive cloud offering and obfuscate on-prem usage.

https://github.com/MaterializeInc/materialize

https://github.com/mit-pdos/noria

https://timelydataflow.github.io/differential-dataflow/

https://timelydataflow.github.io/timely-dataflow/


(Materialize CTO here.)

> It's becoming more mainstream with Materialize, which is technically open-source, but they aggressively push their expensive cloud offering and obfuscate on-prem usage.

Quick but important clarification: Materialize is source available, not open source. We've been licensed under the BSL [0] from the beginning. We feel that the BSL is the best way to ensure we can build a sustainable business to fund Materialize's development, while still contributing our research advances back to the scientific community.

> Quite underrated, it has so much promise.

I'm glad you think so. We think so too. One of the best parts of my job is watching the "aha" moment our prospects have when they realize how much of the complex code they've been writing is neatly expressed as a SUBSCRIBE over a SQL materialized view.

[0]: https://github.com/MaterializeInc/materialize/blob/main/LICE...


It's crazy to me that the most frequently updated file in your repository is the license - pushing back the open source date by a day every day.


Those updates are not retroactive. They apply on a go-forward basis. Each day's changes become Apache 2.0 licensed on that day four years in the future.

For example, v0.28 was released on October 18, 2022, and becomes Apache 2.0 licensed four years after that date (i.e., 2.5 years from today), on October 18, 2026.

[0]: https://github.com/MaterializeInc/materialize/blob/76cb6647d...


I love this concept. Did you all come up with this or is there prior art? Is there a name for this concept?


We did not originate the Business Source License (BSL/BUSL). It was originally developed by the folks behind MariaDB. Wikipedia has a good article that covers the history: https://en.wikipedia.org/wiki/Business_Source_License

Other large projects using the BSL include CockroachDB and (somewhat infamously) Terraform.

We're very glad to have been using the BSL for Materialize since our very first release. Relicensing an existing open source project under the BSL can be a painful transition.


I was actually asking about the automatic timed re-license to Apache :)


Ah, I misunderstood! Yes, we may have invented that. I whipped up the cron job a few years back in response to concerns from our legal team. I’m not aware of any prior art for automatically advancing the change date for the BSL.


Hey Benesch, is Materialize used by TimescaleDB to create materialized views? I noticed a similar approach.


Not to my knowledge. I believe TimescaleDB has their own incremental view maintenance engine.


OK, so I was wondering if your solution is faster. I noticed their materialized views are not as fast for real-time data.


We haven't benchmarked TimescaleDB, so I can't say. Results tend to vary heavily by workload, too.

What I can say is that the research at the heart of Materialize (https://dl.acm.org/doi/10.1145/2517349.2522738) allows us to efficiently maintain computations that are more complex than what a lot of other IVM systems can handle.

Your best bet is to run your own benchmark of both systems using data that's representative of your workload. We offer a free seven day playground if you'd like to run such a benchmark: https://console.materialize.com/account/sign-up

We also have a community Slack where a number of Materialize employees hang out and answer questions: http://materialize.com/s/chat


Thanks!

All cool stuff.

But to my mind, the main advantage for this postgres workflow is that you get something simple without adding a ton of stuff to your stack.


Indeed! It would be so much better if this were a Postgres extension instead.

There are some efforts, but they're still quite immature: https://github.com/sraoss/pg_ivm

But at least Materialize does have Postgres wire compatibility, and the same goes for Noria and MySQL. It's a plug-and-play switch; it's not as complex as adopting other pub-sub, job-queue, or KV-caching solutions.


> It would be so much better if this were a Postgres extension instead.

I've thought about this counterfactual a lot. (I'm a big part of the reason that Materialize was not built as a PostgreSQL extension.) There are two major technical reasons that we decided to build Materialize as a standalone product:

1. Determinism. For IVM to be correct, computations must be strictly deterministic. PostgreSQL is full of nondeterministic functions: things like random(), gen_random_uuid(), pg_cancel_backend(), etc. You can see the whole list with `SELECT * FROM pg_proc WHERE provolatile <> 'i'`. And that's just scratching the surface. Query execution makes a number of arbitrary decisions (e.g., whether results come back ordered) that can cause nondeterminism in results. Building an IVM extension within PostgreSQL would require hunting down every one of these nondeterministic moments and forcing determinism on them: a very long game of whack-a-mole.

2. Scale. PostgreSQL is fundamentally a single node system. But much of the reason you need to reach for Materialize is because your computation is exceeding the limit of what a single machine can handle. If Materialize were a PostgreSQL extension, IVM would be competing for resources (CPU, memory, disk, network) with the main OLTP engine. But since Materialize is a standalone system, you get to offload all that expensive IVM work to a dedicated cluster of machines, leaving your main PostgreSQL server free to spend all of its cycles on what it's uniquely good at: transaction concurrency control.

So while the decision to build Materialize as a separate system means there's a bit more friction to getting started, it also means that you don't need to have a plan for what happens when you exceed the limits of a single machine. You just scale up your Materialize cluster to distribute your workload across multiple machines.

One cool thing we're investigating is exposing Materialize via a PostgreSQL foreign data wrapper [0]. Your ops/data teams would still be managing two separate systems, but downstream consumers could be entirely oblivious to the existence of Materialize—they'd just query tables/views in PostgreSQL like normal, and some of those would be transparently served by Materialize under the hood.

[0]: https://www.postgresql.org/docs/current/postgres-fdw.html


Timely dataflow in Python: https://github.com/bytewax/bytewax


For folks using Typescript, we're building this for web development at Triplit. We support incrementally updating subscribed queries in real-time from server to client over web-sockets.

https://www.triplit.dev/


This post misses the most important part of LISTEN / NOTIFY: transactions

The notify piece respects your current transactional state. So if you issue a NOTIFY within a transaction, it's only delivered if the transaction commits. If the transaction rolls back, the notification is discarded.

This leads to the common pattern of combining the notification with insertion into a work queue. That solves the “listener was not listening yet” problem too.
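
A minimal sketch of that combination, assuming a hypothetical `jobs` table and channel name, using pgx (jackc/pgx/v5):

    package notify

    import (
        "context"

        "github.com/jackc/pgx/v5"
    )

    // enqueueJob inserts a job and notifies in the same transaction. The
    // NOTIFY is delivered only if the INSERT commits; on rollback, both vanish.
    func enqueueJob(ctx context.Context, conn *pgx.Conn, payload string) error {
        tx, err := conn.Begin(ctx)
        if err != nil {
            return err
        }
        defer tx.Rollback(ctx) // safe no-op after a successful commit

        _, err = tx.Exec(ctx,
            "INSERT INTO jobs (state, payload) VALUES ('ready', $1)", payload)
        if err != nil {
            return err
        }
        // pg_notify is the function form of NOTIFY; it queues until commit.
        _, err = tx.Exec(ctx, "SELECT pg_notify('jobs_ready', $1)", payload)
        if err != nil {
            return err
        }
        return tx.Commit(ctx)
    }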


Author here. The behavior of notify with respect to transactions is indeed notable, and definitely a great feature that makes them distinct from pub/sub in other systems. Notifies fire only when data is ready, after the transaction is committed, and they're also deduplicated based on payload within a transaction so listeners don't have to react to many copies of the same message unnecessarily.

That said, NOTIFY isn't really what this post is about. It concerns itself with the other half of listen/notify by describing a "notifier" pattern, one which listens via `LISTEN` statements and distributes notifications to application subcomponents to help maximize economy around the use of Postgres connections.
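
A compressed sketch of the shape (not the post's exact code; names illustrative, mutexes and reconnect handling omitted, pgx v5):

    package notifier

    import (
        "context"

        "github.com/jackc/pgx/v5"
    )

    // Notifier owns one dedicated connection that LISTENs on every channel
    // the process cares about and fans notifications out to subscribers.
    type Notifier struct {
        conn *pgx.Conn
        subs map[string][]chan string // channel name -> subscriber channels
    }

    func NewNotifier(conn *pgx.Conn) *Notifier {
        return &Notifier{conn: conn, subs: make(map[string][]chan string)}
    }

    func (n *Notifier) Subscribe(ctx context.Context, channel string) (<-chan string, error) {
        // channel must be a trusted identifier; LISTEN takes no bind parameters
        if _, err := n.conn.Exec(ctx, "LISTEN "+channel); err != nil {
            return nil, err
        }
        ch := make(chan string, 64) // buffered: one slow reader shouldn't stall the loop
        n.subs[channel] = append(n.subs[channel], ch)
        return ch, nil
    }

    // Run blocks, receiving notifications one by one and distributing them.
    func (n *Notifier) Run(ctx context.Context) error {
        for {
            notification, err := n.conn.WaitForNotification(ctx)
            if err != nil {
                return err
            }
            for _, ch := range n.subs[notification.Channel] {
                select {
                case ch <- notification.Payload:
                default: // drop rather than block every other subscriber
                }
            }
        }
    }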


Ha! I just realized that the “notifier” in the post title refers to the handling on the client side.


Also, because there are no authorization controls on who can `NOTIFY` on a given channel (indeed, channels aren't even `CREATE`d objects) you can't really rely on the notification payload, so you really do have to go query a work queue.


I was recently very annoyed by my immich server on my NAS doing constant writes to disk. I discovered that this was due to the use of the Postgres-notify-backed socket.io plugin. It turns out that while the notify itself does not use the WAL for any information, one needs to trigger a WAL flush for the notify to propagate. In my case this led to a lot of unnecessary empty WAL writes. If you scale up applications or anyway have constant writes, this won't matter. But for a home lab server it's suboptimal.


I'm not an expert in postgres performance, but I would think that excessive WAL flushing would also slow down workloads with constant writes.


Other than the space for past notifications and/or having to issue a DELETE, are there significant reasons to prefer this over the typical table-based approach with SKIP LOCKED queries to poll the queue?

It seems to me that if the listener dies, notifications in the meantime will be dropped until a listener resubscribes, right? That seems prone to data loss.

In the SKIP LOCKED topic-poller style pattern (for example, query a table for rows with state = 'ready' on some interval and use SKIP LOCKED), you can have arbitrary readers and if they all die, inserts into the table still go through and the backlog can later be processed.
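
For reference, a sketch of that claim query against a hypothetical jobs table (pgx v5):

    package queue

    import (
        "context"

        "github.com/jackc/pgx/v5/pgxpool"
    )

    // claimJob atomically takes one ready row; SKIP LOCKED means two workers
    // polling concurrently never block on (or double-claim) the same row.
    const claimJob = `
        UPDATE jobs SET state = 'running'
        WHERE id = (
            SELECT id FROM jobs
            WHERE state = 'ready'
            ORDER BY id
            LIMIT 1
            FOR UPDATE SKIP LOCKED
        )
        RETURNING id, payload`

    func claimNext(ctx context.Context, pool *pgxpool.Pool) (int64, string, error) {
        var id int64
        var payload string
        err := pool.QueryRow(ctx, claimJob).Scan(&id, &payload)
        return id, payload, err // pgx.ErrNoRows when the queue is empty
    }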


In Postgres listen/notify are inherently lossy channels — if a notification goes out while a listener wasn't around to receive it, it's gone, so they should never be relied upon in cases where data consistency is at stake.

I find that the main thing they're useful for is notifying on particular changes so that components that care about them can decrease the time until they process those changes, and without sitting in a hot loop constantly polling tables.

For example, I wrote a piece here [1] describing how we use the notifier to listen for feature flag changes so that each running program can update its flag cache. Those programs could be sitting in loops reloading flags once a second looking for changes, but that's wasteful and puts unnecessary load on the database. Instead, each listens for notifications indicating that some flag state changed, then reloads its flag cache. They also reload every X seconds so that some periodic synchronization happens in case an update notification was missed (e.g. a notifier temporarily dropped offline).
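
Roughly, that reload loop looks like the following sketch (names illustrative; `sub` is the channel a notifier subscription hands back):

    package flags

    import (
        "context"
        "log"
        "time"
    )

    // maintainFlagCache reloads on every notification and also every 30s as
    // a backstop, in case a notification was missed while disconnected.
    func maintainFlagCache(ctx context.Context, sub <-chan string, reload func(context.Context) error) {
        ticker := time.NewTicker(30 * time.Second)
        defer ticker.Stop()
        for {
            select {
            case <-ctx.Done():
                return
            case <-sub: // some flag changed; payload ignored, just reload
            case <-ticker.C: // periodic resync
            }
            if err := reload(ctx); err != nil {
                log.Printf("flag reload failed: %v", err)
            }
        }
    }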

Job queues are another example. You'll still be using `SKIP LOCKED` to select jobs to work, but listen/notify makes it faster to find out that a new job became available.

[1] https://brandur.org/fragments/instant-feature-flags


Got it, thanks for the reply. The feature flag cache reload use case seems like a reasonable one to me.


I implemented a queue using both LISTEN/NOTIFY for notifications to the task processor and SKIP LOCKED to sift through the pending tasks in the tasks table.

I think you can eliminate polling if you don't need to retry tasks, by simply processing pending tasks at startup and then just responding to LISTEN events. However, I'm curious if there are any alternatives to polling the queue in cases where you need to support retrying tasks at a given timestamp.


I personally think polling the queue/table via queries is a very sensible pattern and not something I have a desire to remove. In theory, you could go at it via a push approach by wiring into the WAL or something, but that comes with its own rat's nest of issues.
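
For the retry-at-a-timestamp question specifically: the usual shape is a scheduled_at column that the poll compares against now(), so retrying is just pushing that timestamp into the future. A sketch, assuming a hypothetical jobs(id, state, attempt, scheduled_at, payload) schema:

    package queue

    // Claim only rows whose time has come.
    const claimDue = `
        UPDATE jobs SET state = 'running'
        WHERE id = (
            SELECT id FROM jobs
            WHERE state = 'ready' AND scheduled_at <= now()
            ORDER BY scheduled_at
            LIMIT 1
            FOR UPDATE SKIP LOCKED
        )
        RETURNING id, payload`

    // On failure, reschedule with capped exponential backoff
    // instead of deleting the row.
    const retryLater = `
        UPDATE jobs
        SET state = 'ready',
            attempt = attempt + 1,
            scheduled_at = now() + make_interval(secs => least(2 ^ attempt, 3600))
        WHERE id = $1`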


One nice thing about `NOTIFY` is that the system is very fast and scales to many `LISTEN`ers that can all get notifications with very little latency. I.e., it's a C10K system.

Because there are no access controls on who can NOTIFY to what channel, you can't rely on the payload, so you really do have to look at a work queue. But if it's just one user, and all you're trying to do is broadcast data fast, then NOTIFY works great.


The only tradeoff here is that the pure NOTIFY approach (if you don't care about losing notifications) can sit there on a single connection, and probably performs a bit better than having a bunch of workers in contention for that connection (at which point you don't really need SKIP LOCKED anyway). But ultimately, tuning the level of parallelism of your worker pool and how many connections to dedicate to it doesn't seem a huge hardship.


Tbh I didn't know about SKIP LOCKED until now, but it looks like you have to hold a xact open the entire time the worker runs, which can be a problem. What I've done before is timestamp cols for start/end. A worker takes any job whose end time is null and start time is not too recent, which makes retries natural and flexible.
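
A sketch of that claim, assuming a hypothetical jobs(id, started_at, finished_at, payload) schema:

    package queue

    // A row is claimable if it never finished and either never started or
    // started long enough ago that its worker presumably died; retries fall
    // out naturally. The repeated predicate in the outer WHERE guards the
    // race where two pollers pick the same id under READ COMMITTED.
    const claimStale = `
        UPDATE jobs SET started_at = now()
        WHERE id = (
            SELECT id FROM jobs
            WHERE finished_at IS NULL
              AND (started_at IS NULL OR started_at < now() - interval '5 minutes')
            LIMIT 1
        )
        AND (started_at IS NULL OR started_at < now() - interval '5 minutes')
        RETURNING id, payload`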

A pubsub pattern like pg_notify can definitely make sense depending on the requirements, but I wouldn't jump to it first. The few times I've used pubsub elsewhere, it was when subscribing to some other team's service, not via a shared DB.


Yeah, you can avoid holding the xact open with the approaches you mentioned, e.g. SKIP LOCKED and set some value to PROCESSING, then do your processing, then update to DONE at the end. Or, as you mentioned, timestamps.

I think the SKIP LOCKED part is really only useful to avoid contention between two workers querying for new work simultaneously.


Yeah, I can imagine SKIP LOCKED being faster if used that way. Just hasn't been an issue for me yet, so I haven't tested.


They can be used in conjunction.

Great post!

Postgres.js does this implicitly through a simple API[1] mimicking the postgres way, thereby using only a single dedicated connection for listening per process.

Listen/notify is also super useful with triggers.

[1] https://github.com/porsager/postgres#listen--notify


"Instead, a received notification is immediately sent into buffered channel, which means it’s discarded if the channel is full"

Shouldn't the channel rather block than discard if full?

Anyway, nice and relevant article for me as I've recently added a few listeners to my app. I chose the naive approach since I only have two topics and a surplus of connections


Author here. The Go channel send behavior could certainly be altered depending on the particular semantics of the application, but the reason I chose to use a non-blocking buffered channel is so that no particular subcomponent can slow down the distribution of notifications for everybody.

> Shouldn't the channel rather block than discard if full?

In Go, a blocking channel is one that's initialized without a size (see [1]). You could have a blocking channel where the sender uses a `select/default` to discard after it's full, but that leaves very little margin of error for the receiver. If it's still processing message 1, and then message 2 comes in and the notifier tries to send it, message 2 is gone.

IMO, better to use a buffered channel with some leeway in terms of size, and then write receivers in such a way that they clear incoming messages as soon as possible, i.e. if messages are expected to take time to process, the receiver spins up a goroutine to do so, or has another internal queue of its own where they're placed so that new messages from the notifier never get dropped.
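
Concretely, the receiver shape I mean is something like (names illustrative):

    package consumer

    import "context"

    // consume drains the notifier subscription promptly and hands payloads
    // to an internal queue so the notifier never finds our channel full.
    func consume(ctx context.Context, sub <-chan string, handle func(string)) {
        work := make(chan string, 1024) // internal queue with generous slack
        go func() {
            for payload := range work {
                handle(payload) // slow processing happens off the hot path
            }
        }()
        for {
            select {
            case <-ctx.Done():
                close(work)
                return
            case payload := <-sub:
                work <- payload // fast handoff keeps sub drained
            }
        }
    }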

---

[1] https://gobyexample.com/channels


We probably both understand the underlying concepts correctly. But just in case: buffered channels will also block if full.

So I don't see how notifications get discarded (meaning lost). But somehow that's what your text says?


They'll get lost when using a non-blocking send with select/default:

    messages := make(chan string)
    select {
    case messages <- "hi":
        fmt.Println("sent message")
    default:
        // no receiver is ready (and there's no buffer space), so the
        // message is dropped rather than blocking the sender
        fmt.Println("no message sent")
    }
The reason you'd use a non-blocking send is to make sure that one slow consumer can't slow down the entire system.

Imagine a scaled-out version of the notifier in which it's listening on hundreds of topics and receiving thousands of notifications. Each notification is received one by one using something like pgx's `WaitForNotification`, and then distributed via channel to subscriptions that were listening for it.

In the case of a blocking send without `default`, one slow consumer that was taking too much time to receive and process its notifications would cause a build up of all other notifications the notifier's supposed to send, so one bad actor would have the effect of degrading the time-to-receive for all listening components.

With buffered channels, a poorly written consumer could still drop messages for itself, which isn't optimal (it should be fixed), but all other consumers will still receive theirs promptly. Overall preferable to the alternative.


Right, I didn't know it was possible to bypass full channels like that with select/default. Thanks for spelling it out


I built a service using LISTEN/NOTIFY a while ago, but the fact that Postgres will instantly drop all messages to a given channel if there are no registered listeners made me nervous.

This felt like a big caveat to me, turning what could be an extremely well-guaranteed approach (e.g. a db trigger that writes a row to an event table, which is being polled by an application process) into an ops risk, where transient network issues could result in data loss. You'd want idempotent tasks and zero-downtime deployments, but it feels like the benefits don't really justify the risk. If you want a topic exchange, just have the polling process forward to a real message broker.

I like the concept of sending messages straight from the db, and we did run my service in production (for non-critical stuff), but unless something has changed with the way this works I would still be wary.


When I've used it, I've used it in conjunction with a table that maintains state.

So basically the notify keeps other processes from having to poll the table for updates, but the table itself still acts as the queue.


> [...] was somewhat shocked recently looking into MySQL and SQLite to learn that even in 2024, no equivalent exists.

I'm not sure how one would make `NOTIFY` work on SQLite3. Maybe one could have a system table (`sqlite_temp_notifies`?) that would store notifies, and then have a directory associated with the database where one can create pipes/AF_LOCAL sockets that a process/thread running `NOTIFY` would attempt to write to in a non-blocking manner, then delete the notifies from that system table. The system table would only be needed to make notifies reliable while "connected" to the database and LISTENing (i.e., having registered the pipe/socket).

I agree that that would be very useful.

A couple of things:

- LISTEN/NOTIFY lacks authorization on the channel. Indeed, channels are not even CREATEd. This means you cannot trust NOTIFY payloads if users you don't trust have access to the database.

- NOTIFYs are lost if no client is LISTENing.

In practice this means that NOTIFYs are only good for waking clients that then have to check what's up with a query.


Any post from brandur is a must read. Love the post


Thanks Shayon!!! Very kind of you :)


As someone who came into a project that uses Postgres notify, it was pretty confusing, as your app would connect to another app's DB to listen for changes. There was also no documented or enforced data format, so it was very annoying to deal with in practice.


Isn't this pattern already implemented in JavaScript by pg-promise? https://github.com/vitaly-t/pg-promise/wiki/Robust-Listeners


There’s also the payload size limit for notification messages (just under 8,000 bytes by default), requiring consumers to do a fetch in order to get the complete message. There’s still the potential there, if your message payloads are large, for consumers to contend for connections with the main application.


I thought that listen was bad because it holds connections unnecessarily and defeats connection pooling.


Listen does hold a connection, but that doesn't mean it defeats connection pooling.

That's what I was trying to convey in this blog post: you'll keep a fixed number of connections open for use with listen, but as few as possible by reusing a single connection per program to simultaneously listen on all channels that your application cares about (with the notifier distributing messages to each internal component that subscribed).

With your dedicated listen connections accounted for, the rest of the connection pool can operate normally, with programs checking connections in and out only as long as they need them.

So the net-net is that you have a handful of connections dedicated for listen, and the remaining ~hundreds are part of the connection pool for shared use.
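
In pgx terms, that accounting looks something like this sketch (names illustrative):

    package db

    import (
        "context"

        "github.com/jackc/pgx/v5"
        "github.com/jackc/pgx/v5/pgxpool"
    )

    // setup opens the shared pool for normal query traffic, plus one
    // long-lived connection dedicated to LISTEN that lives outside the pool.
    func setup(ctx context.Context, dsn string) (*pgxpool.Pool, *pgx.Conn, error) {
        pool, err := pgxpool.New(ctx, dsn) // checked out and returned per query
        if err != nil {
            return nil, nil, err
        }
        listenConn, err := pgx.Connect(ctx, dsn) // never returned to any pool
        if err != nil {
            pool.Close()
            return nil, nil, err
        }
        return pool, listenConn, nil
    }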


Thank you. The more you know.


My application tends to open a single, extra connection to the database server (not through the main pool) just for receiving notifys. That connection then feeds them to whatever part of the application needs them.

Seems to work "ok" in practice. No real issues, apart from listen/notify not supporting two-phase commit (2PC), which we'd been thinking of using but haven't investigated since.


I looked into using LISTEN/NOTIFY recently and I must be misunderstanding. How is it different to polling a table? It seemed like it had all the same machinery behind it.


Well, it's the same difference as between polling an API vs. setting up a webhook.


