I've built three distributed job systems at this point. A handy rule of thumb which I have promoted for years is "build for 10x your current scale."
If you need to handle 70 requests/second, design for 700. If you need to handle 20 servers running batch jobs, design for 200 servers. If you're in a startup that grows 100% per year, you'll be at 8x scale in 3 years. So you'll have time to rewrite as you grow!
Out of three job systems I built, the first one tried to avoid SQL for "scalability." Then we hit a bunch of edge cases that needed transactional integrity, and we were in a world of pain.
My two more recent distributed job systems use PostgreSQL as a coordinator. They're literally built around SELECT FOR UPDATE SKIP LOCKED. One of them routinely controls 350 workers and the other does elaborate prioritization for thousands of jobs. Both of them will continue to run just fine until they're earning millions of dollars a year—for example, the one controlling 350 workers should scale to about 2,000 CPUs with a little work.
Hyperscale technologies are fairly cheap right up until you discover you need transactions. At that point, faking transactional semantics on top of an eventually consistent data store becomes an engineering nightmare.
So sit down, and do the math. If your company was earning $100 million/year, how big would your distributed system need to be? Can you easily get a PostgreSQL instance big enough to handle that load? (Or could you just shard per client?) If so, strongly consider using PostgreSQL. It makes a hundred things trivial.
Similar experience here. Multiple times, I've pushed an SQL-based queue a couple orders of magnitude past the scale where others say SQL craps out and a distributed solution is an absolute requirement. And the SQL solution is typically simpler, requires fewer compute resources, and is easier to support in production.
But, to make it work, you've got to know the database well enough to know that things like SELECT FOR UPDATE SKIP LOCKED exist in the first place. Which is a kind of knowledge that's getting quite rare these days, because more and more engineers grow up never having known a world where they aren't walled off from their DBMS's true capabilities by a heavyweight ORM.
> Multiple times, I've pushed an SQL-based queue a couple orders of magnitude past the scale where others say SQL craps out and a distributed solution is an absolute requirement.
What about availability, though? The distributed solution is also useful to avoid downtime in case of single node failure.
Is there an off-the-shelf solution that lets me do that with Postgres? I know the newest version (16) just added active-active replication, but I wouldn’t know how to use that to achieve resilience.
But if you're using Postgres as your queuing system because you're already using it as your core database technology for your app, you've got the same issue. If your single Postgres instance is down then your app is, too, and won't be enqueuing more jobs.
And unless your jobs are trivial then it's highly likely that they interact with your app in some way so it doesn't really matter if your workers are distributed and up, they're not able to complete their work because your app is down because of a single-node Postgres.
What you're pointing out is an architectural constraint that's unrelated to how and where one queues jobs.
If background jobs need to be available while some other core application is down, that needs to be designed for, and that design can be achieved with any queue technology. Simply separate the queue system stack from the core application system stack.
> But if you're using Postgres as your queuing system because you're already using it as your core database technology
Note your own use of "database technology" and not "database server". It's common to have separate application and queue database servers when such an architectural constraint is present. Of course, this sacrifices the benefit of transactional guarantees when the application and background jobs run on the same server.
Like I said in the post, technology (and architectural) choices are tradeoffs all the way down :)
What the article writes about scalability also applies to availability. Does the queue need 99.999% or 99.9999% uptime? Or is the Service Level Objective actually 99.99%, 99.9% or even 99.5%?
With 99.99% you can have 4 minutes of downtime a month. If failover to a hot standby takes a minute then that shouldn't be a problem to achieve a 99.99% uptime SLO.
I don't know PostgreSQL as well. I mostly use MSSQL in production, and it's had good replication for ages, so I just wasn't really that worried about single-node failure for the critical stuff.
And, frankly, even for the less-critical stuff that was only running on a single node, I still dealt with fewer availability problems back in the day than I do now that everything's gone distributed. I think that a thing that's been forgotten over the years is that a lot of this stuff that distributed systems do to be reliable was more about digging oneself out of the hole that was created by running on lots of cheap hardware instead of using a single server with redundancy built-in. I acknowledge that, past a certain scale, that's the only option that makes sense. But if you're not operating at that scale then there's a good chance it's all just verschlimmbessern (German for "making something worse by trying to improve it").
Design (and test) for 10x your current scale, build for what you need now. The system has to be able to handle peak loads, and if you don't know what those are, build in a safety margin or a way to shed or defer work if you need to.
Everything is a tradeoff, optimize for the things that need optimizing, and determining what those are is the hallmark of a good engineer.
I've found the same. You should understand what your 10x / 100x growth solution would look like (assuming that that's relevant - obviously if you have no intent to hit that scale, don't bother). Build your system to handle your 1-1.5x, maybe 10x scale, and make sure you're not blocking the 10-100x solutions by doing so.
"faking transactional semantics on top of an eventually consistent data store becomes an engineering nightmare"
PREACH
If this is news to you, you just justified a week's worth of browsing HN on the clock reading this statement. Scribble it on a scrap of paper and keep it taped to your ATM card.
Advisory locks are slower to run, but when you keep the Postgres connection open you know the job is still running, while with FOR UPDATE SKIP LOCKED you'd basically need a status column and a job_timeout.
So pg_try_advisory_lock/pg_advisory_unlock can hold a lock across transactions while FOR UPDATE SKIP LOCKED can't; thus you'd either need to keep a transaction open or use status+job_timeout (and in Postgres you should not use long transactions).
We use C#, but we looked into https://github.com/que-rb/que which uses advisory locks. Since our jobs take anywhere from 1 minute to 2 hours, it was a no-brainer to use advisory locks. It's just not the best thing if you have thousands of fast jobs per second, but for a more moderate queue, say 10,000 jobs per minute/10 minutes/30 minutes where each takes 1 minute to 2 hours, it's fine.
We also don't delete jobs; we don't care about storage since the job table doesn't take up much. And we have a lot of time to catch up at night since we're only in Europe.
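For anyone curious, a minimal sketch of the session-scoped calls involved (the job id 42 is a stand-in; this isn't que's actual implementation). The lock lives with the connection rather than with a transaction, so a worker can hold it for the full two hours without keeping a transaction open:

    -- try to claim job 42; returns true iff no other session holds the lock
    SELECT pg_try_advisory_lock(42);
    -- ... run the job; no transaction needs to stay open ...
    -- release when done (the lock is also freed if the connection drops)
    SELECT pg_advisory_unlock(42);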
Here's my current favorite recipe for building complex job systems on PostgreSQL. I'm not thinking about "send an email"-type jobs, but bigger jobs that do complex tasks.
The usual trick I use is to have a `jobs.state` field containing "pending", "running", "done", or "error" (or whatever that job system needs). I only hold SELECT FOR UPDATE SKIP LOCKED long enough to:
1. Transition from "pending" to "running". Or a second time, to transition from "running" to either "done" or "error".
2. Store the current worker ID (often the Kubernetes pod name).
Then, I can build a watcher that wakes up every 5 minutes, looks for "running" jobs with no corresponding Kubernetes pod, and marks them as "error". I try to never hold a lock for more than a second or two, and to never lock more than one job at once. This gets me 80% of the way there.
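To make that concrete, here's a rough sketch of the claim step (column names like state and worker_id are assumptions, not a fixed schema; $1 is the worker ID bind parameter):

    -- claim exactly one pending job; the row lock lasts only for this statement
    UPDATE jobs
    SET state = 'running', worker_id = $1
    WHERE id = (
      SELECT id FROM jobs
      WHERE state = 'pending'
      ORDER BY created_at
      FOR UPDATE SKIP LOCKED
      LIMIT 1
    )
    RETURNING id;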
The reason I don't hold a transaction open for the entire job is because every transaction requires a PostgreSQL connection, and connections are surprisingly expensive. In fact, you may want to run connections through pgbouncer or a stateless REST API to avoid holding open hundreds or thousands of connections. Everything except PostgreSQL itself should ideally be stateless and restartable.
You might also have a retry system, or jobs that recursively queue up child jobs, or jobs that depend on other jobs, or more elaborate state machines. You might have timeouts. Most of these things are solvable with some mix of transactions, some SQL, CREATE INDEX or CREATE VIEW. A database gives you so many things for free, if you're just a little careful about it.
And since Grafana supports SQL, you can easily build really nice dashboards for support and ops by just querying the tables used by the job system.
There are other ways to do it! But I'm fond of this general strategy for coarse-granularity jobs.
IMO the main issue with it: advisory locks in Postgres require an open connection to be held the entire time the lock is taken. Combine that with a thread-per-connection model...
For several projects I’ve opted for the even dumber approach that works out of the box with every ORM/Query DSL framework in every language: using a normal table with SELECT FOR UPDATE SKIP LOCKED
I've done something even simpler, without locks (and no transaction logic): I select a row, then try to update a field marking it as taken. If 1 row is affected, it's mine. If 0, someone else got it before me and I select a new row.
I've used this for tasks at big organizations without issue. No need for any special deployments or new infra. Just spin up a few worker threads in your app. Perhaps a thread to reset abandoned tasks. But in three years this never actually happened, as everything was contained in try/catch that would add it back to the queue, and our Java app was damn stable.
PSA: This is a read-modify-write pattern, thus it is not safe under concurrency unless a transaction isolation level of SERIALIZABLE is specified, or some locking mechanism is used (select for update etc).
The part about checking the number of affected rows hints at using `UPDATE ... WHERE ...` which should act as an atomic CAS regardless of isolation level.
Edit: To clarify, I mean `SELECT id WHERE used = 0` followed by `UPDATE ... SET used = 1 WHERE id = ... AND used = 0`
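In other words, something like this hypothetical pair (table and column names invented for illustration; :id is a bind parameter):

    -- optimistic read; may race with other workers
    SELECT id FROM tasks WHERE used = 0 LIMIT 1;
    -- atomic compare-and-swap: only one worker will see rows affected = 1
    UPDATE tasks SET used = 1 WHERE id = :id AND used = 0;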
I don't get it :(. Why could the same task be executed more than once? From my understanding, if the UPDATE is atomic, only one worker will be able to set `used = 1`. If the update statement is not successful (affected != 1), then the worker should drop the task and do another select.
With a transaction isolation level below SERIALIZABLE you can have two transactions that both read the old row (with `used = 0`) at the time they perform the update (but before they commit the transaction). In that case, both transactions will have performed an update (rows affected = 1).
Why would both transactions see `used = 0`? The DB server tries to isolate transactions and actively hides effects of other transactions that have not committed yet.
This is not true in postgres. When the second transaction tries to update the row, it will wait for the first transaction to commit first and then recheck the WHERE.
This should be safe under SI (other than the ABA issue, which isn't even fixed with serializable). The update forces a W-W conflict, which is sufficient to make the behavior serializable under SI (and therefore, I think but am not sure, PG's RR level too).
I guess you update it with the assigned worker id, where the "taken by" field is currently null? Does it mean that workers have persistent identities, something like an index? How do you deal with workers being replaced, scaled down, etc?
Just curious. We maintained a custom background processing system for years but recently replaced it with off the shelf stuff, so I'm really interested in how others are doing similar stuff.
No, just UPDATE ... SET taken=1. If the row changed, you claimed it. If it didn't, someone updated it before you.
Our tasks were quick enough so that all fetched tasks would always be able to be completed before a scale down / new deploy etc, but we stopped fetching new ones when the signal came so it just finished what it had. I updated above, we did have logic to monitor if a task got taken but never got a finished status, but I can't remember it ever actually reporting on anything.
I would set the taken field to a timestamp. Then you could have a cleanup job that looks for any lingering jobs aged past a reasonable timeout and null out the field.
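A sketch of that cleanup, assuming a taken timestamp and a done flag (both invented names):

    -- reset claims older than a reasonable timeout so the work gets retried
    UPDATE tasks
    SET taken = NULL
    WHERE done = false
      AND taken < now() - interval '15 minutes';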
We have a "status flag" column which is either Available, Locked or Processed (A, L and P), an Updated column with a timestamp of when it was last updated, and a Version counter.
When grabbing a new message it selects "Available or (Locked with Updated timestamp older than configured timeout)". If successful it immediately tries to set the Locked status, Updated timestamp and bumps the Version counter, where the previous values of Status and Version has to match. If the update fails it retries getting a new message.
If the Version counter is too high, it moves the message to the associated dead-letter table, and retries getting a new message.
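Roughly, the guarded grab looks like this (column names assumed; zero rows affected means another consumer won the race and we fetch a new candidate):

    UPDATE messages
    SET status = 'L', updated = now(), version = version + 1
    WHERE id = :id
      AND status = :seen_status   -- the 'A' or timed-out 'L' we selected
      AND version = :seen_version;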
This isn't for high performance. I tested it and got 1000 messages/sec throughput with a handful of producers and consumers against a test db instance (limited hardware), which would be plenty for us.
I wrote it to be simple so we could easily move to something AMQP-ish like RabbitMQ or Azure Service Bus when needed. Overall quite easy to implement, and it has served us well so far.
It won't work with a timestamp, because each write will report one affected row (the writes happen at different times); setting a boolean is static.
That is the sort of thing that bites you hard when it bites. It might run perfectly for years but that one period of flappy downtime at a third party or slightly misconfigured DNS will bite you hard.
But compared to our rabbit setup where I work now, it was dead stable. No losing tasks or extra engineering effort on maintaining yet another piece of tech. Our rabbit cluster acting up has led to multiple disasters lately.
Agreed, I've had my own rabbit nightmares. But setting up a more robust queue on postgresql is easy, so you can easily gain a lot more guarantees without more complexity.
I've done this successfully with a web service front that retrieves jobs to send to workers for processing, by using a SQL table queue. That web service ran without a hitch for a long time, serving about 10 to 50 job consumers for fast and highly concurrent queues.
My approach was:
- Accept the inbound call
- Generate a 20 character random string (used as a signature)
- Execute a SQL query that selects the oldest job without a signature and writes the signature, returning the primary key of the job that was updated.
- If it errors for any reason, loop back and attempt again, but only 10 times, since at that point some underlying issue must exist (10 collisions is statistically improbable for my use case)
- Fetch the job by the primary key returned by that SQL query, comparing its signature to my random one.
- If a hit, return the job to the caller
- If a miss, loop back and start again, incrementing attempts by 1.
The caller has to handle the possibility that a call to this web service won't return anything, either due to no jobs existing, or the collision/error threshold being reached.
In either case, the caller backs off for its configured time, then calls again.
Callers are usually in 'while true' loops, only exiting if they get an external signal to close or an uncontrolled crash.
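A sketch of the claim-and-verify steps (schema invented for illustration; :sig is the random signature):

    -- stamp the oldest unclaimed job with this call's random signature
    UPDATE jobs
    SET signature = :sig
    WHERE id = (
      SELECT id FROM jobs
      WHERE signature IS NULL
      ORDER BY created_at
      LIMIT 1
    )
    RETURNING id;
    -- then re-read that job and hand it out only if its signature matches :sig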
If you take this approach, you will have a function or a web service that converts the SQL table into a job queue service. When you do that, you can build metrics on the amount of collisions you get while trying to pull and assign jobs to workers.
I had inbuilt processes that would sweep through jobs that were assigned (had a job signature) and weren't marked as complete, it actioned those to handle the condition of a crashed worker.
There are many, many other features that proper job queues offer, but that usually means more dependencies and code libraries / containers, so just build in the functionality you need.
If it is accurate, fast enough, and stable, you've got the best solution for you.
The reason why you want to use skip locked is so that Postgres can automatically skip rows that are being concurrently accessed for updating the "status". You are right, if you update a "status" field you don't really need to worry about advisory locks and skipping rows that are locked but it still helps with performance if you have a decent amount of concurrent consumers polling the table.
I recently got introduced to this system at work, and also built a new job using it. It works fine, but since I had to implement work stealing to deal with abandoned jobs in a timely manner, I wouldn't dare to use it for actions that absolutely must not happen twice.
Exactly-once is only meaningfully possible if you have a rollback for tasks of unknown completion state - for example if the task involves manipulating the same database as the one controlling the task execution. Otherwise, it becomes the (impossible to solve) two-generals problem between updating the task status and performing the task.
There is actually another possibility: there must be a way to check whether the receiving system has received the message. But this only works if there are no "rogue" senders.
Agenda uses this, and we found the hard way on mongo 4 that it can lead to mongo spinning the CPU at 100% if it gets too many at once. No idea if they've fixed it in later versions.
I recently published a manifesto and code snippets for exactly this in Postgres!
delete from task
where task_id in
( select task_id
from task
order by random() -- use tablesample for better performance
for update
skip locked
limit 1
)
returning task_id, task_type, params::jsonb as params
Holding a transaction open for the duration of a request to an external service makes me nervous. I've seen similar code lock up the database and bring down production. Are you using timeouts and circuit breakers to control the length of the transactions?
Yes, you absolutely need to set a reasonable idle transaction timeout to avoid a disaster (bugs in the code happen) - this can also be done globally in the database settings.
This article was written in 2015, a year before the idle_in_transaction_session_timeout parameter was added (in Postgres 9.6). It's unfortunately still disabled by default, but it's the easiest way to make sure no transaction sits idle for too long.
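Enabling it is a one-liner, per session or globally via ALTER SYSTEM; for example:

    -- terminate any session that sits idle inside a transaction for over a minute
    SET idle_in_transaction_session_timeout = '60s';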
In my experience, a queue system is the worst thing to find out isn't scaling properly, because once you find out your queue system can't architecturally scale, there's no easy fix that avoids data loss. You talk about "several thousand background jobs" but generally, queues are measured in terms of Little's Law [1], for which you need to be talking about rates: namely the average task arrival rate and the average time each task spends in the system. Raw numbers don't mean that much.
In the beginning you can do a naive UPDATE ... SET, which locks way too much. You can make your locking more efficient by doing UPDATE with SELECT subqueries for dequeues and SELECT FOR UPDATE SKIP LOCKED, but eventually your dequeue queries will throttle each other's locks and your queue will grind to a halt. You can try to disable enqueues at that point to give your DB more breathing room, but you'll have data loss on lost enqueues, and it'll mostly be your dequeues locking each other out.
You can try to quickly shard out your task tables to avoid locking, and that may work, but it's brittle to roll out across multiple workers and can result in data loss. You can of course drop a random subset of tasks, but this will cause data loss too. Any of these options is not only highly stressful in a production scenario but also very hard to recover from without a ground-up rearchitecture.
Is this kind of a nightmare production scenario really worth choosing Boring Technology? Maybe if you have a handful of customers and are confident you'll be working at tens of tasks per second forever. Having been in the hot seat for one of these I will always choose a real queue technology over a database when possible.
> and are confident you'll be working at tens of tasks per second forever.
It's more like a few thousand per second, and enqueues win, not dequeues like you say... on very small hardware without tuning. If you're at tens of tasks per second, you have a whole lot of breathing room: don't build for 100x current requirements.
This link is simply raw enqueue/dequeue performance. Factor in workers that perform work or execute remote calls and the numbers change. Also, I find when your jobs have high variance in times, performance degrades significantly.
> This doesn't really make sense to me. To me, the main problem seems to be that you end up with having a lot of snapshots around.
The dequeuer needs to know which tasks to "claim", so this requires some form of locking. Eventually this becomes a bottleneck.
> don't build for 100x current requirements
What happens if you get 100x traffic? Popularity spikes can do it, so can attacks. Is the answer to just accept data loss in those situations? Queue systems are super simple to use. I'm counting "NOTIFY/LISTEN" on Postgres as a queue, because it is a queue from the bottom up.
> Factor in workers that perform work or execute remote calls and the numbers change.
These don't occur on the database server, though... This merely affects the number of rows currently claimed.
> The dequeuer needs to know which tasks to "claim", so this requires some form of locking. Eventually this becomes a bottleneck.
These are just try locks, though-- the row locks are not contended. The big thing you run into is having lots of snapshots around and having to skip a lot of claimed rows for each dequeue.
> What happens if you get 100x traffic? Popularity spikes can do it, so can attacks.
If you get 100x the queueing activity for batch jobs, you're going to have stuff break well before the queue. It's probably not too easy to get 100x the drain rate, even if your queue system can handle it.
This scales well beyond 100M batch tasks per day, which gets you to 1M users with 100 tasks/day each.
Throttle the inputs. Rate-limiting doesn’t belong in the data layer.
While throttling due to organic popularity isn’t great, I’d argue the tradeoffs might be worthwhile. If it looks like the spike will last, stand up Redis during the throttling, double-write, and throttle down the Postgres queue until it’s empty. If you really need to, take a 15 minute outage to just copy data over.
What happens when you get 500x the traffic or 50x?
How does the system behave when the traffic rate is higher than what it was designed for or can currently handle? Because that number will always exist, even in a "scalable" system. One won't be able to add capacity at the same rate that work will increase.
NOTIFY/LISTEN isn't a queue; it has broadcast semantics. Postgres queueing is really just the SELECT FOR UPDATE SKIP LOCKED; NOTIFY/LISTEN lets you reduce the latency a bit but isn't essential.
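i.e., the two pieces combine roughly like this (channel name made up):

    -- consumer connection: subscribe to wakeups
    LISTEN new_job;
    -- producer, right after inserting a job row:
    NOTIFY new_job;
    -- consumers still claim work with SELECT ... FOR UPDATE SKIP LOCKED;
    -- the notification just saves them from tight polling loops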
If you find yourself in that situation, migrating to a more performant queuing solution is not that much of a leap. You already have an overall system architecture that scales well (async processing with a queue).
_Ideally_ the queuing technology is abstracted from the job-submitters/job-runners anyway. It's a bit more work if multiple services are just writing to the queue table directly.
I agree that the _moment_ the system comes to a screeching halt is definitely not fun.
You are going to have the same scaling issues with your datastore. I don't really understand why you say that your dequeue queries will throttle each other's locks and grind it to a halt? Isn't that the whole point of SKIP LOCKED?
>Applied to job records, this feature enables simple queue processing queries, e.g. SELECT * FROM jobs ORDER BY created_at FOR UPDATE SKIP LOCKED LIMIT 1.
Also, postgres partial indexes can be quite helpful in situations where you want to persist and query intermediate job lifecycle state and don't want multiple rows or tables to track one type of job queue
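For example, a sketch with assumed names: an index covering only rows that are still claimable, so the hot set stays small no matter how large the table grows:

    -- partial index: only rows with state = 'pending' are indexed
    CREATE INDEX jobs_pending_idx ON jobs (created_at)
    WHERE state = 'pending';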
Skip locked is useful until you have to maintain order for a group of messages with some "group_id", so that a set of related messages is sent one after the other.
Then you probably have to write complicated queries or use partitions of some sort.
The article says he also uses an ORDER BY clause. I wonder if that will severely limit throughput, since all messages need to be sorted on each lookup, but that can probably be solved by introducing an index.
It seems strictly worse to use ORDER BY in this case, since if you're using SKIP LOCKED you should be doing parallel processing anyway, and if you're doing parallel processing, ordering is already going out the window.
Unless you can guarantee that the processing time of each job is exactly the same, if you have multiple workers processing the same queue, you can’t order anything except the start time.
You can use locks to effectively break the queue into sub queues so that each sub queue is only being processed by 1 worker. Then you can order that sub queue.
Jobs should be attempted in the same order/priority in which they are enqueued; that's the meaning of the word "queue". That they take varying amounts of time is another matter.
Queue can clearly mean "work that needs to be completed" not necessarily 'work completed in order'. Your definition is much stricter than it needs to be for most use cases.
There is clearly a conceptual difference between a set of things from which you pull things out randomly, and a queue. A queue always has intrinsic criteria to select the next item to be pulled out.
There are many times when the start order doesn’t really matter, and the additional sorting overhead isn’t worth it. In those cases people will still tend to refer to the entity holding the jobs to be processed as a queue despite the fact that it doesn’t strictly follow FIFO order.
If they are being technically precise, queue isn’t the correct term, but language changes with context and time. Either way the implementation isn’t wrong if strict start order has been considered and isn’t important.
I’m not confusing anything. I’ve seen random selection “job queues” implemented many times. As long as you truly don’t care about start order, it’s fine to trade it for increased throughput.
It means that you are telling pg that you don’t care about order, so it is free to optimize the query in whatever way it wants to. The order can change query to query depending on numerous external factors.
I’m not using pg itself as an example. I’m using a specific implementation of a “job queue” built with pg.
I’ve seen and you can search for and find many implementations of “job queues” using relational databases where job start order guarantees are traded away for throughput.
You have no ordering guarantees, so how can order be important? If 4 work items are scheduled on 4 independent workers, you have no guarantee which will start first or finish first.
I think the order matters at least because you want some FIFO approximation; otherwise some tasks can be stuck in the queue forever and never get picked up.
I often see the "engineers copy FAANG infrastructure because they want to be cool, even though their needs are completely different" take as a kind of attack on engineers.
But I think a lot of it is also about knowledge and documentation. If I want to copy FAANG or another startup, and set up an infinitely scalable queue-based architecture, I can find dozens of high quality guides, tutorials, white papers etc, showing me exactly how to do it. Yes maintenance is higher, but I can get set up with redis, SQS, any of the 'scalable' solutions within a few hours of copy-pasting commands and code and configuration from a reputable source.
If I want to use NOTIFY in postgres? I googled "SQLALchemy notify listen postgres" and I find a few unanswered stackoverflow questions and a github gist that has some code but no context.
I would honestly love to use this approach for a side project, but I don't have 2-3 days to figure it out on my own. The direct choice for me might seem to be
* simple, but not scalable (ie just use postgres)
* complex, but scalable (ie redis, sqs, whatever)
and then it's a tradeoff, and the argument goes that I am blinded by cool tech and FAANG and I'm choosing complex but scalable, even though I don't need scalable.
But taking into account guides and other resources, the choice for me is actually
* complex and not scalable (this, because I don't know how to implement it and I can't predict what pitfalls I might face if I try)
* simple and scalable (what everyone actually does)
and that makes the engineer's choice to follow faang look a lot more reasonable.
Python, Flask, SQLAlchemy, and Postgres all have great documentation individually, but if I am building an application at the intersection often a guide on exactly how to join them all up is much faster than using each individually and trying to figure out the interactions in four places.
AWS white papers and engineering blogs tend to give me everything I need in one place, and I don't think there are any for apps built with NOTIFY.
SQLAlchemy is an extra abstraction blocking your path here. While you probably should still use an ORM for your regular relation queries, you are not gaining anything significant by trying to use SQLAlchemy for implementing a queue backend. You can write raw SQL with psycopg2 (which is already a dependency in your project thanks to SQLAlchemy), and wrap these raw queue management SQL in a nice little Python module which you can later reuse for other applications as well.
Without being rude, what are the nice features? I've worked with it a bit and constantly found myself wishing it was just SQL whenever I've bumped into it
If SQLAlchemy’s documentation doesn’t explain its use with LISTEN/NOTIFY, perhaps it’s the wrong tool for the job? You are presumably not going to use it with Redis or SQS queues, so why are you so hung up on it here?
IMHO - and this probably why I’ll never launch a product - you should understand each piece of your infra. Not necessarily to the metal on each, but I don’t think it’s unreasonable to be able to explain why each piece is necessary, what it’s doing, and how to troubleshoot it when it breaks.
With your mentioned list, three of them are Python, so that significantly reduces the breadth.
Scalability comes at a price. Unless you need it, it makes you less flexible. And that is exactly what you don't want to be as a startup.
For instance, if you use postgres with a low load, it is almost trivial to migrate schemas, add new constraints, do analytics etc.
If you use SQS, Cassandra, whatever, then you now get scalability/availability, but it becomes much more time-consuming to change things if you figure out that your original design doesn't work. Say the business comes and says "please add constraint X: all users of type foo must never combine value bar at the same time."
It is possible to implement that without postgres, but it is not easy or simple, especially if you need to make changes.
Therefore, my take is that you either use postgres to stay flexible or you use both postgres and something else on top of it when you know that you won't have to change things. Of course this means additional infrastructure/maintenance overhead.
In the end it's always a trade-off, you just need to know when to trade which thing off against what.
This is all true, important and often misunderstood, but beside the point made to which you reply.
There's a (sort of) objective trade-off to be made, but another dimension is how familiar you are with the solution and/or how quick can you implement it using documentation and examples.
If you happen to know exactly how to create a horizontally scalable microservice based hairball with nodejs, then maybe you are quicker with that than with some traditional django monolith using a nicely normalized sql database (or whatever).
In a startup, you are almost always squeezed for time, so making the objectively right tradeoff for your context is usually secondary to the simple question of 'when can you ship?' If the scalable-yet-inflexible option is what Stack Overflow abundantly recommends and documents, maybe that's quicker to get done now, whatever the consequences are in the longer run.
Then maybe I just don't understand your post. To me it sounds like you're saying "FAANG-technology" is chosen because of documentation. But I don't think that the documentation of e.g. SQS is better than Postgres's (if you can even compare the two).
If someone says "I choose X over Y because I used X before (or because X is better documented)" then fair enough - but I rarely hear that as an argument when choosing "FAANG-technology".
> If I want to copy FAANG or another startup, and set up an infinitely scalable queue-based architecture, I can find dozens of high quality guides, tutorials, white papers etc, showing me exactly how to do it.
I'm not sure about this either, though from reading typical developer blogs and listening to the hivemind, you do get the feeling that you must be scalable. Devs often don't really know when (usually not) that becomes important and how far the vast majority of apps can go with monoliths in big boxes (quite far).
But my response would then be that this is a stupid example in the context of this whole submission, because the submission talks about postgres, and trying to get postgres to scale "infinitely", let alone fulfill other properties like extremely high uptime, is just... insane. No one in their right mind tries to do that with postgres. It is one thing to do queueing with it, but "infinitely scalable" is a totally different one.
Therefore I can only say: yeah, to set up "an infinitely scalable queue-based architecture" you should not use postgres and the author in the submission says the same thing.
> Devs often don't really know when (usually not) that becomes important and how far the vast majority of apps can go with monoliths in big boxes (quite far).
Right, they make the wrong trade-offs. That is exactly what I wanted to express with my response.
I often find that the tooling I have at work helps speed up the development of more complicated solutions. Saying that FAANG solutions are easy to use and fast to build with is easy when you have FAANG-level support. Even just non-FAANG but large enterprises allow for that, but in a startup it's easy to forget how much the environment (including tooling) speeds up all of that work.
So yeah, I find a lot of the more complicated solutions to be simple, but mostly because it's well supported and not by just me.
But that is not what we are discussing here. From the submission:
> There’s a good chance that you’re already using a relational database, and if that relational database is Postgres, you should consider it for queues before any other software
The point is, if you are already using postgres, then the question is: should I use postgres for queueing as well as the rest, or should I use postgres for the rest and a FAANG solution for queueing on top of it?
Now the thing is that the FAANG solutions are great in certain ways: they let you scale a lot and have extremely high availability. But it comes at a cost; for example, those solutions don't support transactions like postgres does. So if you need those (and often you don't know in advance how the business of a startup will develop), you now have to build some technical solution on top of the FAANG solution, which is much, much slower and more complicated compared to doing it in postgres.
Even if you say that it's more difficult to set up and understand the queueing in postgres (and I agree), I would argue that in the end it is still faster, because you don't need to set up and maintain all the extra infrastructure (yeah, even if it runs in the cloud), unless this is a prototype and you don't care about security, documentation and all of that, and throw it away in the end anyway.
> Scalability comes at a price. Unless you need it, it makes you less flexible. And that is exactly what you don't want to be as a startup.
This is a valid comment. I’ve chosen Postgres in the past for the features, not the performance. For example guaranteed at most once delivery (via row locks) and filtering of jobs based on attributes (it’s a database after all).
> If I want to use NOTIFY in postgres? I googled "SQLALchemy notify listen postgres" and I find a few unanswered stackoverflow questions and a github gist that has some code but no context.
Author here. I would say that my post is less targeted at someone like you (application developer, presumably) and more targeted at library developers.
I don't think it's ideal for everyone to be implementing bespoke, Postgres-backed (or any other queue for that matter) background job workers in their applications. There's a lot of nuance and implementation details to get wrong with background jobs, and for that reason I think background work should generally be done by more comprehensive, dedicated libraries or frameworks.
If every Rails application didn't have Sidekiq/Active Jobs and instead had bespoke background worker implementations, Rails applications would likely have a much less rosy reputation on account of their unreliability.
I love the article's point, and I tend to feel that the "chasing the cargo cult of 'scale'" is maybe the biggest problem I see in development teams today. It is certainly the biggest problem that I rarely hear anybody talking about.
> Author here. I would say that my post is less targeted at someone like you (application developer, presumably) and more targeted at library developers.
I think the article might benefit from clarification on this point.
Reading the HN comments, I see that I'm not the only person who came away with a misunderstanding there.
This is what kills you if you're a small startup. Of course it gives you a lot too. But if you're belly up then it doesn't matter.
Of course go for whatever solution gives you the most benefits while not distracting you too much from your main goal.
I've seen a startup where devs spent around 80% of the time fighting their tools and infrastructure. They had a 3 month runway and today there's a massive hole at the end of that runway. I still shudder from just the thought of it.
Another point is: you don't need scalable now, but may (or even hope) to need it later, and you know that when you will need it you probably won't have time to invest into migrating this component.
Also: you may think that you may one day want to be hired by a FAANG.
How relevant is it to be hired by a FAANG? I have some experience with "web scale" systems, but I tend to reject FAANG recruiters because Leetcode makes me want to become an apple farmer (no pun).
> I tend to reject FAANG recruiters because Leetcode
I understand the pain of leetcode interviews. They’re terrible. But optimizing your career based on the interview process seems… backwards?
FAANG companies (for example) are very relevant if you want to make a lot of money and live in Silicon Valley without being a successful founder/VC. Apple farmers… not so much. If you live in Tokyo, then FAANG companies might be less relevant.
Either way, doesn’t seem like the interview is where you should draw the line.
I guess my career is pretty close to optimal. I get to work on interesting problems from anywhere I want and save a ton of money. If you are an EU candidate, FAANG companies want you to relocate to some city in the UK or Ireland, which would obliterate my savings rate, and are worse places to live than most mainland EU areas. I understand not everyone is as fortunate as me, which increases their motivation to grind Leetcode and the like.
Hard disagree. Some things are difficult to change later on, others not so much, and you can't do everything for v1. The product has to launch at some point. Your choice of queue is one of the things you'll be able to change. Don't complicate things unless you've run the numbers and know you'll need to. A lot of very large companies do just fine with using relational databases as queues.
If your first point holds, then all app components should be “scalable” from the beginning, because you may not have time to make it so later.
And that’s terrible advice, of course. You very likely will have time to scale things up (customer count almost never increase dramatically from one day to the next), and even if you don’t you’ll most likely never deliver a useable product if all components need to be “scalable” from the beginning.
This is all a matter of balancing constraints. I wrote "you know that when you will need it you probably won't have time to invest into migrating this component." I didn't write "always go for the more scalable".
For starters, "the most scalable component is always the most difficult to integrate and use" isn't true, and neither is "whatever your team knows or doesn't know, the challenges tied to integrating and then operating a given component are always the same". There are many parameters. In some contexts, taking the team's subjective preferences into account is crucial.
There is no universal rule, à la "always go for the most scalable, neglecting any other consideration" or "the minimal immediate effort is always the best option".
Building things you don't need in hopes that you'll need them because things you aren't spending time on will grow to demand them is like hiring an investment manager when you're in debt.
> but I can get set up with redis, SQS, any of the 'scalable' solutions within a few hours of copy-pasting commands and code and configuration from a reputable source
> [...] and that makes the engineer's choice to follow faang look a lot more reasonable.
I also agree with the linked article's overall point, but I think the specific "job queue" example from the article is actually a bad example because:
- "rolling your own" job queue is not rocket science but is nontrivial and easy to get wrong w.r.t. locking etc.
- the argument against taking additional dependencies is that now you have one more tool to master, understand, and manage. but my experience is that job queues like Sidekiq are not a significant overhead in terms of developer burden.
That's a great point, that often people misunderstand.
It's even worse than you say, though. As someone who has used neither Postgres nor Redis for queueing, how am I supposed to know what the "simple" solution is here and whether it really solves my problem?
Almost everyone uses solution X. A few people are saying "no, just use solution Y, it's obviously enough and far simpler". Even if it is far simpler, how am I supposed to know whether there are some hidden gotchas here?
Much safer to bet on technology that is proven to work, given that large amounts of people are using it in production for this purpose.
1. The main downside to using PostgreSQL as a pub/sub bus with LISTEN/NOTIFY is that LISTEN is a session feature, making it incompatible with statement level connection pooling.
2. If you are going to do this, use advisory locks [0].
Other forms of explicit locking put more pressure on the database while advisory locks are deliberately very lightweight.
My favorite example implementation is que [1] which is ported to several languages.
At my previous company, they switched from using NOTIFY/LISTEN for Postgres notifications to a custom solution built on top of logical replication. As I understand it, part of the reason was reliability. I never touched that part of the code, but I believe the idea was to subscribe to logical replication updates and send out notifications based on these.
Skype used postgres as a queue, with a small plugin, to process all their CDRs many years ago. I have no idea if it's used these days, but it was 'web scale' 10 years ago. Just working, while people on the internet argued that using a database as a queue is an anti-pattern.
The software works excellently in a development environment and performs well when running as a single instance. However, I encountered issues when scaling it up for high availability in a clustered setup. The system would fail inconsistently, with two masters consuming messages simultaneously, which wasn't ideal for my use case. Eventually, I switched to Kafka and haven't revisited the original solution since.
It's worth noting that these issues might have been due to my improper configuration. Nevertheless, if the configuration process is fraught with pitfalls, that's problematic in itself. I've had these experiences more than once.
Additionally, I found a critical race condition in the Python library, rendering it practically unusable for me. I submitted a bug report with a minimal example demonstrating the issue. I considered fixing it myself, but since using RabbitMQ wasn't crucial for my project, I switched to ZeroMQ, which didn't require a broker. The issue was acknowledged and fixed about a year later. At the time, I had to assume that nobody else was using the Python bindings.
Three years ago, I worked on a project that used the software for a Celery queue. Messages would occasionally go missing, although this could have been a configuration issue on our part. Ultimately, we replaced it with a Redis queue (not the best practice, I admit) and didn't look back. This was for a lower-availability use case where a single instance of Redis sufficed.
I used RabbitMQ for a while and had nothing but problems.
Admittedly I probably shouldn't have used it the way I did. I dumped many millions of tasks into it, then fanned out processes pulling from that queue that took a variable amount of time to run. Some ran in seconds, some hours.
I had picked RabbitMQ because I wanted that queue to be durable and to resist workers dying or being restarted. However, long-lived tasks like this are not really what it was designed for (in my opinion). I kept running into issues where it would take a long time to restart, or stop answering connections and need a restart to continue. I ended up having to write monitoring code to check for this and handle it to make it slightly reliable.
I'm sure it works well for smaller, short-lived messages, but considering the issues I bumped into I would be hesitant to try it. I'd probably reach for Redis first, with wrappers allowing me to swap in any other queue as required.
I can share our experience with RabbitMQ/SQS/Sidekiq. Our two major issues have been around the retry mechanism and resource bottlenecks.
The key retry problem is "What happens when a worker crashes?".
RabbitMQ solves this problem by tying "unacknowledged messages" to a tcp connection. If the connection dies, the in-flight messages are made available to other connections. This is a decent approach, but we hit a lot of issues with bugs in our code that would fail to acknowledge a message and the message would get stuck until that handler cycled. They've improved this over the past year or so with consumer timeouts, but we've already moved on.
The second problem we hit with RabbitMQ was that it uses one-erlang-process-per-queue and we found that big bursts of traffic could saturate a single CPU. There are ways to use sharded queues or re-architect to use dynamically created queues but the complexity led us towards SQS.
Sidekiq solves "What happens when a worker crashes?" by just not solving it. In the free version, those jobs are just lost. In Sidekiq Pro there are features that provide some guarantees that the jobs will not be lost, but no guarantees about when they will be processed (nor where they will be processed). Simply put, some worker sometime will see the orphaned job and decide to give it another shot. It's not super common, but it is worse in containerized environments where memory limits can trigger the OOM killer and cause a worker to die immediately.
The other issue with Sidekiq has been a general lack of hard constraints around resources. A single event thread in redis means that when things go sideways it breaks everything. We've had errant jobs enqueued with 100MB of json and seen it jam things up badly when Sidekiq tries to parse that with a lua script (on the event thread). While it's obvious that 100MB is too big to shove into a queue, mistakes happen and tools that limit the blast radius add a lot of value.
We've been leaning heavily on SQS the past few years and it is indeed Simple. It blocks us from doing even marginally dumb things (max message size of 256KB). The visibility timeout approach for handling crashing workers is easy to reason about. DLQ tooling has finally improved so you can redrive through standard AWS tools. There are some gaps we struggle with (e.g. firing callbacks when a set of messages are fully processed) but sometimes simple tools force you to simplify things on your end, and that ends up being a good thing.
Do you have any experience with NATS, and how would you compare it to RMQ/SQS?
The authors claim it guarantees exactly-once delivery with its JetStream component, and it looks very alluring from the documentation, but looks can be deceiving.
> The authors claim it guarantees exactly-once delivery
I find this definition has morphed from one meaningful to developers into one queue implementations like to claim. I've learned this generally means "multiple inserts will be deduped into only one message in the queue".
The only guarantee this `exactly-once` delivery provides is that I won't have two workers given the exact same job. Which is a nice guarantee, but I still have to decide on my processing behavior and am faced with the classic "at most once or at least once" dilemma around partially failed jobs. If I'm building my system to be idempotent so I can safely retry partially failed messages it doesn't do much for me.
It has multiple modes. One of them is explicit-acknowledgement mode. If the worker finishes processing the job but doesn't ack, the message will appear again.
SQS limits you further in other ways. For instance, scheduled tasks are capped to 15m (the DelaySeconds knob), so you'll be stuck when implementing the "cancel account if not verified in 7 days" workflow. You'll either reenqueue a message every 15m until it's ready (and eat the SQS costs), or build a bespoke solution only for scheduled tasks using some other store (the database usually) and another polling loop (at a fraction of the quality of any other OSS tool). This is a problem well solved by Sidekiq, despite the other drawbacks you mention.
If you wanted to handle this scenario with the serverless AWS stack, my recommendation would be to push records to Dynamo with TTLs, and then when they expire have a Lambda push them onto the queue. It would cost almost nothing to do this. If you had 10 million requests a month, your Lambda cost would be ~$150 to run this (depending on duration, but just pushing to a queue should be quick). Dynamo would be another ~$50 to run, depending on how big your tasks are.
Granted now you need 3 services instead of 1. I personally don't find the maintenance cost particularly high for this architecture, but it does depend on what your team is comfortable with.
I've explored this space pretty thoroughly, including the Dynamo approach you've described. Dynamo does not have a strict guarantee on when items get deleted:
> TTL typically deletes expired items within a few days. Depending on the size and activity level of a table, the actual delete operation of an expired item can vary. Because TTL is meant to be a background process, the nature of the capacity used to expire and delete items via TTL is variable (but free of charge). [0]
Because of that limitation, I would not use that approach. Instead I would do Scheduled Lambdas to check for items every 15 minutes in a Serverless Aurora and then add them to SQS with delays.
I've had my eye on this problem for a few years and keep thinking that a simple SaaS that does one-shot scheduled actions would probably be a worthy side project. Not enough to build a company around, but maintenance would be low and there's probably some pricing that would attract enough customers to be sustainable.
> Sidekiq solves "What happens when a worker crashes?" by just not solving it. In the free version, those jobs are just lost.
I've been using Sidekiq for 11+ years in production and I've never seen this happen. Sidekiq (free version) has a very robust retry workflow. What are you talking about here?
He is talking about the case where the worker itself dies for some reason, for example when the worker uses too much memory, hits a segfault, or whatever.
Yep. OOMs are the most common cause. It's definitely low frequency. On the order of one in a billion. For some systems that's once a year. For us that's once a week. If that's an important job and it just gets dropped, then you've got a problem.
With the paid features to keep jobs from getting dropped, things can still be painful. We have a lot of different workers, all with different concurrency settings and resource limits. A memory-heavy worker might need a few GB of memory and be capped at a concurrency of 2, while a lightweight worker might only need 512MB and have a concurrency of 20. If the big memory worker crashes, its jobs might get picked up by the lightweight worker (possibly hours later), which will then OOM, and all of its 19 other in-flight jobs end up in the orphanage. And now your alerts are going off saying the lightweight worker is OOMing, and your team is scratching their heads because that doesn't make any sense. It just gets messy.
Sidekiq probably works great outside of containerized environments. Many swear to me they've never encountered any of these problems. And maybe we should be questioning the containerization rather than Sidekiq, but ultimately our operations have been much simpler as we've moved off of Sidekiq.
Sidekiq will drop in-progress jobs when a worker crashes. Sidekiq Pro can recover those jobs but with a large delay. Sidekiq is excellent overall but it’s not suitable for processing critical jobs with a low latency guarantee.
> This said, I'd use a dedicated queue these days.
I agree, primary reason being that if you're in the cloud (thus this applies to a lot of people but obviously not everyone), all the cloud providers have extremely easy to use, and cheap, hosted queueing tech. Even if you're worried about vendor lockin, queueing primitives are so small (basically push and pop), that it's relatively easy to write things in a way so it would be easy to migrate if necessary.
I commented elsewhere, but in most cases I think it would be a bad idea to host your queue tables and logic in the same instance that hosts your primary data. Thus, if you spin up another PG instance in the cloud, it could very well end up costing you more than a default cloud-hosted queue service.
I don’t think that’s best for small things. Unless you can’t vertically scale your instance to handle to load, being able to join and keep transactions within one data store is massively valuable. I wouldn’t want to open myself to distributed systems problems unless I’m absolutely forced to.
> Skype used postgres as queue with a small plugin to process all their CDR many years ago. I have no idea if it used these days but it was 'web scale', 10 years ago. Just working, while people on the internet argued about using a database as a queue is an anti-pattern.
It works great until it doesn't, and the way it breaks puts you in a state that's very difficult to recover from. And if your excuse for using a database as a queue was that you were already running a database, that cuts both ways: congratulations, your queue mess has now brought down your primary datastore too.
> It works great until it doesn't, and the way it breaks puts you in a state that's very difficult to recover from.
This is similar to saying, 'if I mess up all the tables in one database I wreck the rest'. Just my opinion, but this is not actually a thing in databases.
Maybe the performance of one database could be compromised by another loading things up? I think databases are developed with this as an important consideration. I can't say I have seen this, but I admit it's a possibility.
Also, if you run one postgres, you won't have an issue running another if you have the experience in production.
> This is similar to saying, 'if I mess up all the tables in one database I wreck the rest'. Just my opinion, but this is not actually a thing in databases.
If you mess up the tables in one database it doesn't affect others, but if you lock up the server where it can't respond to queries, that affects every database running on that server.
> Also, if you run one postgres, you won't have an issue running another if you have the experience in production.
We're talking about using a different feature that you presumably haven't used before, so you won't necessarily know about the admin side of that.
> If you mess up the tables in one database it doesn't affect others, but if you lock up the server where it can't respond to queries, that affects every database running on that server.
How is it different from putting multiple queues on the same redis, where if one queue locks up, other queues are affected?
If that's a real risk, you can always put them into different instances.
The solution is exactly the same for redis or postgresql.
> How is it different from putting multiple queues on the same redis, where if one queue locks up, other queues are affected?
Queues by nature tend to be for tasks that you can tolerate delaying a bit. If your queues can impact your "live"/online processing, that's worse than just impacting other queues.
Also something like redis tends to be a lot simpler and less prone to locking up than the monster that is postgresql.
> No, you already know how to run and manage a postgres database.
Well, maybe. Postgres is large and has lots of features. NOTIFY/LISTEN is a distinct thing with its own quirks, and just because you've been running a postgres database up until now doesn't mean you're going to know about the admin side of these other features.
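For anyone who hasn't touched it, the basic mechanics are tiny, but the quirks are real - a minimal sketch:

    -- session A (the worker):
    LISTEN new_jobs;

    -- session B, inside the enqueueing transaction:
    NOTIFY new_jobs, '42';  -- only delivered when this transaction commits
    -- equivalent: SELECT pg_notify('new_jobs', '42');

    -- quirk: notifications are not stored; a listener that was
    -- disconnected when the NOTIFY fired never sees it, so workers
    -- still need a catch-up poll on (re)connect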
> congratulations, your queue mess has now brought down your primary datastore too.
Just don't put your queue tables/logic in the same DB instance as your datastore. There are still a lot of benefits to using the same tech even if you have segregated instances.
One of the biggest benefits imo of using Postgres as your application queue, is that any async work you schedule benefits from transactionality.
That is, say you have a relatively complex backend mutation that needs to schedule some async work (eg sending an email after signup). With a Postgres queue, if you insert the job to send the email and then a later part of the transaction fails and the transaction rolls back, the email is never queued to be sent.
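A minimal sketch of that flow, assuming hypothetical users and jobs tables:

    BEGIN;
    INSERT INTO users (email) VALUES ('new@example.com');
    -- the job row commits or rolls back together with the user row
    INSERT INTO jobs (kind, payload)
        VALUES ('send_signup_email', '{"to": "new@example.com"}');
    COMMIT;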
Worth being clear that bridging to another non-idempotent system necessarily requires you to pick at-least-once or at-most-once semantics. So for emails, if you fail awaiting confirmation of your email you still need to pick between failing your transaction and potentially duplicating the email, or continuing and potentially dropping it.
The big advantage is for code paths which async modify your DB; these can be done fully transactionally with exactly-once semantics since the Job consumption and DB update are in the same transaction.
That's kind of missing the parent's point. If you wanted to ensure emails arrive, that sounds like another queue that could be backed by a different table that is also produced into as part of the original transaction.
> One of the biggest benefits imo of using Postgres as your application queue, is that any async work you schedule benefits from transactionality.
This is a really important point. I often end up using a combination of Postgres and SQS since SQS makes it easy to autoscale the job processing cluster.
In Postgres I have a transaction log table that includes columns for triggered events and the pg_current_xact_id() for the transaction. (You can also use the built-in xmin of the row, but then you have to worry about transaction wraparound.) Inserting a row triggers a NOTIFY.
A background process runs in a loop. It selects all rows in the transaction table with a transaction id between the last run's xmin and the current pg_snapshot_xmin(pg_current_snapshot()), maps those events to jobs and submits them to SQS, records the xmin, then LISTENs to await the next NOTIFY.
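A rough sketch of the loop's query, with hypothetical table and column names (:last_xmin is the bound recorded on the previous pass):

    SELECT event, payload
    FROM transaction_log
    WHERE xact_id >= :last_xmin
      AND xact_id <  pg_snapshot_xmin(pg_current_snapshot());
    -- every transaction below that xmin bound has finished, so no
    -- still-in-flight insert can later appear in a range we've skipped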
Good point. We alleviate that a bit by scheduling our queue adds to not run until after commit. But then we still have some unsafety, and if the connection to Rabbit is down we're in trouble.
I agree - having to tell a database that something was processed, and fire off a message into RabbitMQ, say, is never 100% transactional. This would be my top reason to use this approach.
> With a Postgres queue, if you insert the job to send the email and then in a later part of the transaction, something fails and the transaction rollbacks, the email is never queued to be sent.
This is true - it's definitely worth isolating what should be totally separate database code into different transactions. On the other hand, if your user was never created in the DB, you might not want to send the signup email anyway. It just depends on the situation.
Another benefit of this is that you're guaranteed that the transaction is completed before the job is picked up. With redis-backed queues (or really anything else), you very quickly run into the situation where your queue executes a job depending on a database record existing prior to the transaction being committed (and the fix for this is usually awkward / complex code).
I'm not sure this is really an issue with transactionality, since a single request can obviously be split up into multiple transactions. Rather, even if you correctly flag the email as pending/errored, you either need to process those manually or have some other kind of background task that looks for them - at which point, why not just process them asynchronously?
> With a Postgres queue, if you insert the job to send the email and then in a later part of the transaction, something fails and the transaction rollbacks, the email is never queued to be sent.
An option could be to use a second connection and a separate transaction to insert data into the queue table.
One thing I love about Kafka is... It's just an append-only log, and a client is essentially just holding an offset. This is conceptually very simple to reason about. It's also persistent and pretty fault-tolerant (you can just go back and read any offset).
Unfortunately, Kafka carries with it enough complexity (due to the distributed nature) that it ends up not being worth it for most use-cases.
Personally I'd love something similar that's easier to operate. You'd probably be able to handle hundreds (if not thousands) of events per second on a single node, and without distributed complexity it'd be really nice.
And yes, in theory you could still use postgres for this (and just never delete rows). And maybe that's the answer.
Considering that you have a native "offset" (the auto-incrementing id) and the ability to partition by date, I would say postgres is a great candidate for a simple Kafka replacement. It will also be significantly simpler to set up consumers if you don't really need the whole consumer group, partition, etc. functionality.
Unfortunately `serial` is not sufficient on its own for that use case. If you observe the values 1, 2, and 4, you can't actually conclude whether 3 exists or not. That transaction may have failed after incrementing the sequence, meaning 3 does not exist, or it may be an ongoing transaction which hasn't yet committed, meaning 3 exists but is not yet visible to you.
So if you update your offset to 4 before the transaction for 3 commits, you'll lose 3 forever (unless you control for this, eg by periodically detecting & requeueing orphaned jobs, or by using a strictly serial/gap free integer rather than the built in auto incrementing type).
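The gap-free variant can be as simple as a single counter row, at the cost of serializing enqueuers on that row lock - a sketch with a hypothetical counters table:

    UPDATE counters SET value = value + 1
    WHERE name = 'queue_offset'
    RETURNING value;
    -- the row lock holds until the enqueueing transaction commits, so
    -- committed offsets are dense and consumers can trust the sequence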
SERIAL/SEQUENCE/IDENTITY increment immediately, not at commit. They’re just reading from a generator. You can also get the current key from it if you’d like.
Advisory locks also exist, if you want to implement logic in the application to inform you of various row conditions without having the DB care about it.
But for the example given, you could do many things (a sketch of the first two follows below):
* Add some boolean columns for ack and complete. Performance will eventually start to suck if these low-cardinality columns are being indexed, but by that point (millions of rows) you'll have had time to think of another solution.
* Add ctime and atime columns, each of which can be handled by Postgres natively to update when created / written, respectively. This has the advantage of lending itself nicely to partitioning by date range, if that becomes necessary.
* Have three tables - available, in_progress, and completed. Use triggers or application logic to move entries in an atomic manner.
None of this is necessarily normalized, but if you’re receiving JSON payloads for the job, 1NF went out the window anyway.
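A sketch of the first two options combined, with hypothetical column names, using SKIP LOCKED so concurrent workers never block on each other:

    UPDATE queue
    SET ack = true, atime = now()
    WHERE id = (
        SELECT id FROM queue
        WHERE NOT ack AND NOT complete
        ORDER BY ctime
        FOR UPDATE SKIP LOCKED
        LIMIT 1
    )
    RETURNING id, payload;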
We use exactly this for Windmill (OSS Retool alternative + modern Airflow) and run benchmarks every day. On a modest GitHub CI instance where one Windmill worker and Postgres run as containers, our benchmarks run at 1200 jobs/s. Workers can be added, and it will scale gracefully up to 5000 jobs/s. We are exploring using Citus to cross the barrier of 5000 jobs/s on our multi-tenant instance.
We used postgres for some of our queues back when we were at ~10 msg/s. It scaled quite a bit, but, honestly, setting up SQS or some other queue stack in AWS, GCP, or Azure is so simple and purpose built for the task (with DL queues and the like built in), I don’t know why you wouldn’t just go that route and not have to worry about that system shitting the bed and affecting the rest of the DB’s health.
It seems foolish. I am a big fan of “use the dumbest tool”, but sometimes engineers take it too far and you’re left with the dumbest tool with caveats that don’t seem worth it given the mainstream alternative is relatively cheap and simple.
What I've settled on is "store most job state in the DB, use task queues just to poke workers into working on the jobs".
Storing the job state in the DB means you can query state nicely. It's not going to exactly show the state of things but it's helpful for working through a live incident (especially when most job queues just delete records as work is processed).
And if you make all the background tasks idempotent anyways then you're almost always safe with running a thing like "send a job to the task queue to handle this job".
If you rely _just_ on message queues, there are a lot of times where you can have performance issues, yet have a lot of trouble knowing what's going on (for example, rabbitMQ might tell you the size of your queues, but offer little-to-no inspection of the data inside them).
Ultimately you have to figure out the separation of concerns of the job state and other core state. Ranging from “all state stored in message and will never become out of sync” to “no state stored in message and will never become out of sync”. In between you have “some state stored in db and some in message” and what I’ve found to be useful is keeping stuff in the db that needs to have high end state integrity (or as you said just making sure jobs are cancellable/idempotent).
Tangible example:
We have a video transcoder queue. The state of the video model in our db can change as the video is being finalized in various ways. The transcoder generates thumbnails and assets from the video and also updates its state in the db. So we store job information in the message about what thumbnails we want to generate and the video ID but nothing else. This allows us to look up the video row, see if the same media was already transcoded from the video (and cancel the job), and, if not, run the job and update the video row.
Also (and I know you’re not saying this), but I’ve never understood the argument that keeping queues in Postgres leads to higher data integrity via transaction guarantees. The job is still running on another process outside of the db. The only time this could be true is if the job itself mostly updates state in the db, in which case it’s the small minority of queued workloads (with the majority needing to do non-db compute work).
You can't do better than At Least Once if you're having side effects outside the database, so it's not clear that SQS's weaker semantics have any practical effect.
> I don’t know why you wouldn’t just go that route and not have to worry about that system shitting the bed
Because different software has different requirements. Not having an external service requirement other than Postgres might be a feature of an on-prem/b2b appliance.
Because some software may be projected to never outgrow the capabilities of Postgres, and if it does, moving to another service can be made very easy.
Because you want a transitional job system and the simplicity of doing it in Postgres.
Yep, we process hundreds of thousands and sometimes a few million jobs daily inside Postgres, using Oban in Elixir.
Having transactional semantics around background jobs is incredibly convenient for things like scheduling email only if the transaction is successful, and so on.
You do need to do a little bit of autovacuum tuning, but once sorted it’s been great for us.
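For the curious, that tuning can be as small as per-table storage parameters on the job table (numbers here are illustrative, not our exact values):

    ALTER TABLE oban_jobs SET (
        autovacuum_vacuum_scale_factor = 0,
        autovacuum_vacuum_threshold = 1000
    );
    -- vacuum kicks in after a fixed number of dead tuples instead of
    -- waiting for a percentage of a large table to die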
I'm always surprised that when I see people talk about queues I never see anyone mention beanstalkd. I've been using it for basically everything for 10 years and it's solid as a rock, incredibly simple and requires basically no maintenance. It Just Works™
Yes, I liked it when I encountered it a few years back. Lately been using redis and rq on a recent project since we're already using redis for caching. Is there much difference other than that?
We process around 1 million events a day using a queue like this in Postgres, and have processed over 400 million events since the system this is used in went live. Only issue we've had was slow queries due to the table size, as we keep an archive of all the events processed, but some scheduled vacuums every so often kept that under control.
We just have a single table, with a column indicating if the job has been taken by a worker or not. Probably could get a bit more performance out of it by splitting into two tables, but it works as it is for now.
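A partial index is one cheap trick for that setup - hypothetical names, but the idea is that the index only contains rows still waiting for a worker, so the claim query stays fast no matter how big the archive gets:

    CREATE INDEX events_pending_idx
        ON events (created_at)
        WHERE NOT taken;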
I do enjoy using https://github.com/graphile/worker for my postgresql queuing needs. Very scalable, the next release 0.14 even more so, and easy to use.
Running this exact implementation with 47M jobs processed and counting. SKIP LOCKED is great for VACUUM, and having durable storage with indexes makes otherwise expensive patterns like delayed jobs, retries, status updates, "at least once", etc. really easy to implement.
I'm sure Redis is much faster than an RDBMS with all the ACID features turned on. The biggest concern I always have with Redis is simply overwhelming the in-memory storage limits when someone wants to process a large number of good-sized messages at an inconvenient time. #tradeoffs
During my tenure as CTO at a fintech company I built a banking engine on a Postgres-backed queue system using Elixir / Phoenix. It's still in use today. The company processed large volumes of transactions, and we were able to do payments in real time. Our system reached a point where I realized that we could scale almost infinitely with just a 2-tier architecture (Elixir / Phoenix / Oban and PostgreSQL).
The industry standard for real-time was anything under 30s end-to-end. We hit many scaling limits as our platform grew. We worked on improving the system instead of looking at postgresql as a problem. We measured and measured and postgresql was never the problem when it came to scaling. Most of the problems that existed was mostly due to design decisions in the system or some technical debt that needed to be solved. When we solved those problems the system flew.
There was one incident where our PostgreSQL instance's CPU usage went up to 100% during peak hours and would cause problems in the system. We wondered why, and there was a lot of debate around the topic: PostgreSQL doesn't scale, we need a larger instance, we should move to a dedicated queue system, etc.
I didn't believe the hypothesis that PostgreSQL didn't scale. So I conducted a performance audit on the system myself and discovered that one of the engineers had forgotten to add an index on one of the columns of a major table with > 60m records. There was a lot of back and forth between engineers about whether we should add the index. I simply conducted a small experiment, and PG analyze clearly showed a missing index on one of the key tables. After we added a single line of code and deployed the index, DB CPU usage during peak hours would not even exceed 20%, and we had to scale down our DB to save some $$.
PostgreSQL backed queue system does work.
I've been building distributed systems at large and small scales for over a decade. Before building systems with Elixir / Phoenix I also used to work with Rails, where the default, as mentioned in this article, is to just adopt Redis. While Redis does add overhead in managing the infrastructure, it also worked well.
Having been through both, I'd say a simple PostgreSQL-backed solution is the clear winner when it comes to not having to manage a third dependency on top of an already complex system. There is no 3rd or 4th piece of the puzzle to theorize, hypothesize, or experiment with. You have just 2 pieces: it is either your application or your database that's the problem. That simplicity saved us a lot of time otherwise spent on debate, and doing experiments also squashed any dogmatic opinions that just didn't hold water.
Interesting how the immediate reaction is “postgres does not scale” when there is a single table lacking an index.
This also tells how important competence and knowledge of the system is. People that came in new and didn’t know the system like you do probably lacked the confidence/skills to just “get in” like that.
Yeah, though I think what happened in this scenario probably happens a lot everywhere else too. In my entire career, this type of scenario has been very typical. Lots of meetings, discussions, and standups, talking uselessly without getting face to face with the actual problem. Things get in the way of the science and the facts. Which is why it's important to remove fear, think from first principles, break things down, and get your hands dirty.
There was a fear that creating an index on a table that large would take a long time, and I think some of it was also ego: "I intentionally didn't add it, because of so-and-so reason." This was why I dug in and did my thing, debunking all the fear / opinions / rationalization. Sometimes you just gotta be able to tell people they're wrong, supported with empirical evidence. That's how the team will grow. There is just no need to dance around facts. I remember having to tell the team, "taking a long time to build an index is no reason to avoid creating the index".
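For what it's worth, the long-build fear has a standard answer these days (hypothetical names; it's slower and can't run inside a transaction block, but writes keep flowing while it builds):

    CREATE INDEX CONCURRENTLY payments_account_id_idx
        ON payments (account_id);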
>> I simply conducted a small experiment and PG analyze clearly showed a missing index in one of the key tables.
Based on this sentence, I interpreted that part as representing that the engineers did not believe the missing index was causing the problem (until the experiment was run).
Yes, one of the theories was that the index wasn't the problem because there was already a multi-column index covering that particular column. However, PG analyze showed that a particular query didn't utilize that index, so there needed to be a separate index just for that column.
There are a surprising number of reasons why an RDBMS - especially Postgres - can choose not to use an index. Sometimes it's your fault, sometimes it's the table statistics' fault.
Good on you for actually empirically determining reality.
For running queues on Postgres with Node.js backend(s), I highly recommend https://github.com/timgit/pg-boss. I'm sure it has its scale limits. But if you're one of the 90% of apps that never need any kind of scale that a modern server can't easily handle, it's fantastic. You get transactional queueing of jobs, and it automatically handles syncing across multiple job-processing servers using Postgres locks.
You don't even need a database to make a message queue. The Linux file system makes a perfectly good basis for a message queue since file moves are atomic.
My guess is that many people are implementing queuing mechanisms just for sending email.
You can see how this works in Arnie SMTP buffer server, a super simple queue just for emails, no database at all, just the file system.
This is true, and I’ve worked on systems that use this, but it’s a lot more work than just a rename.
I’d recommend that, if you have a Postgres database already, definitely use that instead. Your queues will be transactional and they will get backed up when the rest of your database does.
Well, if you have multiple writers then you need to decide who’s responsible for rotating the queues, and you need to serialise writes; or if each writer has its own queue then the reader has to do more work. And then you need to worry about fsync, and backup. And of course you need to be careful to flush to the queue after each write to avoid partial writes.
Basically I’m saying that there are just a number of potential footguns when using files as queues - I speak from experience! - which are trivially taken care of by a database, especially if you have one already.
I’m not saying that it’s not possible, just that for non trivial applications, it’s certainly more complex than just an atomic file move.
To do anything safe and interesting you’ll need transactions. Using SKIP LOCKED won’t be your bottleneck, your application will. Job queues are about side effects and the rest of your application needs to keep up.
Pretty common advice for scaling Postgres is to deploy pgbouncer in transaction mode in front of it to handle connection pooling.
Advisory locks don’t work in this setup (and will start behaving in strange ways if you do try to use them.) Something to consider if you go this route.
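If you do need advisory locks behind a transaction-mode pooler, the transaction-scoped variant is the one that behaves - a minimal sketch:

    BEGIN;
    -- released automatically at COMMIT/ROLLBACK, so it never leaks
    -- across the pooler's connection reuse the way session-level
    -- pg_advisory_lock can
    SELECT pg_advisory_xact_lock(42);
    -- ... do the work the lock guards ...
    COMMIT;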
Maybe it's changed in the last year or so, but from benchmarking and writing / running queue software for Postgres - SKIP LOCKED was/is significantly faster. Is that different for MySQL?
Another point is that task queue technology is highly fungible. There's nothing stopping you from starting with cron, adding in Postgres/Redis, then graduating to Kafka or something as need arises. And running all three in parallel, with different jobs in each. I would be surprised if the average Kafka shop didn't also have a bunch of random cron jobs doing things that could be implemented on Kafka or vice versa.
At some point you may want to refactor things to reduce tech debt, but it really is a "and" rather than "or" decision.
I agree with all of this except for the part about “cron”. Cron jobs in my experience quickly become hard to manage and effectively invisible over time.
Use almost anything else to manage job scheduling….
This is exactly what the Oban https://getoban.pro/ Elixir library uses, and combining Postgres plus actors for queues scales great for 90% of the needs out there. I have used it at my last few jobs at pretty decent scale, and I would take it over my 10 years using Celery to manage queues + supervisord, or setting up RabbitMQ or Redis. It's so simple: you only need Elixir and Postgres, not 3 or 4 infrastructure pieces, to manage a queue.
Temporal, which AFAIK was made by the Uber Cadence team, which was also involved in SQS, uses postgres as a backend.
I used it for a web automation system for an accounting client (automatically read files from a network share, look up the clients in a database, submit the documents to government websites using headless browsers, and put the resulting files back in the directory). It allows for completely effortless deterministic programs that call workers running the non-deterministic code, with built-in configurable retries (react to certain exception types, exponential back-off), so you can write code that works almost as if there were no issues with API connections, the filesystem, etc.
This code has been running for 5 or more years, with barely any maintenance, with 0 issues so far. It keeps everything in postgres, so even full reboots and crashes have no impact, it will just move the work back to the queue and it will run when there's an available worker.
Temporal is a pretty complicated system. It has sharding built in, stores the entire activity history and runs multiple queues for timers and events. I’m a big fan (worked at Uber) but it’s definitely not just postgres with a few indices.
I maintain QueueClassic (https://github.com/QueueClassic/queue_classic) for Rails/Ruby folks, which is basically what you're talking about - a queuing system for Postgres. A bonus reason, and why I originally wanted this, was the ability to use transactions fully - i.e. I can start one, do some stuff, add a job into the queue (to send an email)... and either commit, or roll back - avoiding sending the email. If you use resque, I found sometimes either you can't see the record (still doing other stuff and it's not committed), or it's not there (rollback) - so either way you had to deal with it.
QC (and equivs) use the same db, and same connection, so same transaction. Saves quite a bit of cruft.
> For example, this Hacker News comment stated that using Postgres this way is “hacky” and the commenter received no pushback. I found the comment to be a load of BS and straw-man arguments. This thinking seems to be “the prevailing wisdom” of the industry – if you want to talk about queue technology in public, it better not be a relational database.
I don't think that there's anything wrong with using a database as a queue, however, I think that there probably could have been better ways to get across the idea, rather than just dismissing an honest opinion as BS. I don't necessarily agree with all of what was said there, but at the same time I can see why those arguments would be reasonable: https://news.ycombinator.com/item?id=20022572
For example:
> Because it is hacky from the perspective of a distributed system architecture. It's coupling 2 components that probably ought not be coupled because it's perceived as "convenient" to do so. The idea that your system's control and data planes are tightly coupled is a dangerous one if your system grows quickly.
To me, this makes perfect sense, if you're using the same database instance for the typical RDBMS use case AND also for the queue. Then again, that could be avoided by having separate database instances/clusters and treating those as separate services: prod-app-database and prod-queue-database.
That said, using something like RabbitMQ or another specialized queue solution might also have the additional benefit of bunches of tutorials and libraries, as well as other resources available, which is pretty much the case whenever you have a well known and a more niche technology, even when the latter might be in some ways better! After all, there is a reason why many would use Sidekiq, resque, rq, Hangfire, asynq and other libraries that were mentioned and already have lots of content around them.
Though whether the inherent complexity of the system or the complexity of your code that's needed to integrate with it is more important, is probably highly situational.
1. Many queueing solutions struggle with at most once delivery. I've used a queue I built on top of RDBMS to solve this before. Ironically, when I was a junior we used a key/value store to lock certain messages for at most once delivery, which doubly compounds my next point.
2. A queue is more infrastructure to manage, and hard to troubleshoot infrastructure at that.
3. Many queueing systems struggle with deserialization into native types, which makes the code on either end unnecessarily obtuse. Using RDBMS queues makes your serialization as good as your drivers'.
One issue with Redis as a queue backend seems to be that persistence is quite expensive, at least for managed Redis instances. Using PG seems like it could be much cheaper, especially if you already have an instance with room to spare.
I thought it was an interesting article, and I'd love to hear more from people using PG for queues in production (my intuition would say you'd get a lot of table bloat and/or vacuum latency, but I haven't tested it myself), but when it comes to the conclusion - "choosing boring technology should be one’s default choice" - I can't think of anything more boring (in a good sense, mostly) than Sidekiq + Redis for a Rails app.
Same here. Sidekiq + Redis in a Rails app is a powerhouse, simple and reliable, but I do worry about losing the queue in Redis. It would be great to have that in Postgres as well.
There are a few mentions of Oban [1] here. Most people don't realise that Oban in fact uses SKIP LOCKED [2] as well.
Oban's been great, especially if you pay for Web UI and Pro for the extra features [3]
The main issue we've noticed, though, is that with its simple lock-based fetching, jobs aren't distributed evenly across your workers because of the greedy `SELECT...LIMIT X` [2]
If you have long-running and/or resource-intensive jobs, this can be problematic. Let's say you have 3 workers with a local limit of 10 per node. If there are only 10 jobs in the queue, the first node to fetch available jobs will grab and lock all 10, with the other 2 nodes sitting idle.
The way I implement my queues (usually as part of my monolith application) is as goroutines. Each instance of the app launches with a unique id and a role: it can be a worker or the app itself. When the app generates a queue item, it simply adds it to a table as pending. A worker will then, in a transaction, update a set of items to add its instance id as well as an expiration for this lock. If that succeeds, no other worker will pull a queue item whose instance id is set to a different worker and whose lock hasn't expired. The worker can then start processing and update item status accordingly. If it crashes, another worker will just repeat the process after the lock expires.
The code that does this is maybe 100 lines at most. It’s very effective especially if you deploy your app in kubernetes where you can expect instances to be ephemeral. It’s one of the components of my apps that has never needed any updates since I first wrote it circa 2017.
They poll. It's easy, however, to add pub/sub via Redis. Queues are categorized by frequency. So for example a notification queue might process every minute or 5 minutes, while a reconciliation queue might run every 12 hours, or at fixed times like daily at 2am and 6pm, etc. Each queue runs on its own goroutine on its own schedule, and every worker runs all queues. I can also adjust how many items a worker can pull each time so queues do not get backed up. Say there is some criteria an account needs to meet to be eligible for some feature. If that criteria requires some expensive db queries, I can run a daily job that pulls 100 accounts each time to check that they meet the criteria. But if the instances can process more, I can configure it to pull 1000 accounts. Or I can add 9 workers and each pulls 100 accounts.
Usually what I do is pull the records that were updated least recently (as in, they should be at the head of the queue). So if a previous worker locked the oldest X records, the second worker will pull the next batch, because the condition excludes the previously updated (locked) records. There's a lot of flexibility you can get from just these controls: schedule, frequency, batch size, number of workers.
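A SQL sketch of that claim step, with hypothetical names - note the null-or-expired guard is repeated in the outer WHERE so a row claimed by a concurrent worker in the meantime can't be overwritten:

    UPDATE queue_items
    SET instance_id = 'worker-7f3a',
        lock_expires_at = now() + interval '15 minutes'
    WHERE (instance_id IS NULL OR lock_expires_at < now())
      AND id IN (
          SELECT id FROM queue_items
          WHERE status = 'pending'
            AND (instance_id IS NULL OR lock_expires_at < now())
          ORDER BY updated_at ASC
          LIMIT 100
      )
    RETURNING id;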
I've implemented queues with tables in RDBMSs a few times and it's always great and usually all you need. Worried about future scale? Make a class to wrap the queue behind a decent interface and swap it for RabbitMQ or whatever you want down the road. The implementation stays opaque and you have an easy upgrade path later on.
In my Amazon team, we use PostgreSQL as a queue using skip-locked to implement transactional outbox pattern for our database inserts. People commenting 'just use a queue' are totally missing the need for transactional consistency. I agree with the author, it's an amazing tool and scales quite well.
> As an industry, we’ve become absolutely obsessed with “scale”.
I wish the industry was even half as concerned with efficiency as it was with scale. Bitcoin? Electron? 5MB web pages? 5/10/25GB downloads to run 20yr old CD-sized games on modern software?
I have been involved in a few projects using postgres-backed queues for a few years; scale hasn't been a problem so far.
On the other hand, I have done a few experiments with postgres LISTEN/NOTIFY, and while the feature seems nice at first glance, I concluded that it wasn't worth it for our use cases. Maybe it is different in other languages, but on the JVM you have to allocate one thread for polling these results, which also keeps a connection busy.
What I ended up doing is leveraging akka-stream to stream the queue data directly from the db, which makes it simple to define throttling rules. This is super simple and effective.
It's about the challenge of matching up transactions with queues - where you want a queue to be populated reliably if a transaction completes, and also reliably NOT be populated if it doesn't.
Brandur's pattern is to have an outgoing queue in a database table that gets updated as part of that transaction, and can then be separately drained to whatever queue system you like.
If you're on a cloud provider, I'd say just use their offering. For small/medium amounts of messages (single digit millions a day) the cost will be trivial.
A lot of frameworks already have queue/job libraries with adapters (so you're not really locked in) and cloud providers are highly scalable and fault tolerant.
It seems silly to try to build into Postgres something that is already cheap and readily available unless you find yourself in a situation where standing up additional infra is hard (embedded, certain on premise)
> If you're on a cloud provider, I'd say just use their offering. For small/medium amounts of messages (single digit millions a day) the cost will be trivial.
It's a good compromise but not suitable for every use case.
The thing I really don't like is that you need to be connected to the cloud even for local development and test.
The AWS SDK can stub responses. In general, I usually recommend creating local/in-memory implementations of infrastructure so you don't end up running a gigantic stack locally. You can write tests against the real integration that selectively run against a real environment, and write contract tests that cover the real implementation and your local stubs (or maybe interfaces in strongly typed languages are sufficient).
Some languages/frameworks have this built in (a lot of ORMs do this out of the box). A lot of frameworks also have facilities for conditionally wiring in different implementations based on runtime config.
When I wrote my own background task queue I looked at Postgres, because it was already in use in the stack. Postgres would work for a simple queue, but supporting queue priorities, delayed/ETA tasks, and broadcast tasks was too complicated. I decided on Redis, and it's scaled very well over the last year.
I've done the Postgres skip locked thing at least three times and I'm currently doing it, but IMO it is actually more maintenance and overhead, not less -- at least when compared with the queues made available by the major cloud providers. Compared with Pubsub or SQS you need to handle:
* Metrics, monitoring, alarming on depth, message age
* Autoscaling on your custom metrics
* Managing retries, dead lettering, backoff
* Managing the DB workload: it's update-heavy and may easily be more intensive than the rest of your app. You may need to repeatedly scale your tiny, startup-scale DB, causing customer disruptions, because of your homemade queue.
The arguments for it are either avoidance of lock-in, or the hand-wavy one in this article/many comments: "we're a small startup, we need to minimize complexity, and make things as simple as possible, maintenance will kill us!".
Lock-in makes sense (though other queue alternatives should still be considered), but the latter argument gets used to justify all kinds of harebrained, superficially-simple but actually-complicated engineering schemes. I generally put the PG skip locked approach in that bucket, particularly when the alternative on hand is Pubsub or SQS. If it's between a Postgres table and ActiveMQ I might feel more conflicted.
Something that always bothers me about "Use Postgres as a queue" (something I would suggest, even) is that there are shockingly few people publishing numbers around this. How am I supposed to know what scale I can handle with this solution? I've seen so, so few benchmarks, and maybe no recent benchmarks - a problem since performance has significantly changed across versions.
I just want to commend OP - if they’re here - for choosing an int64 for job IDs, and MD5 for hashing the payload in Neoq, the job library linked [0] from the article.
Especially given the emphasis on YAGNI, you don't need a UUID primary key, with all the problems it brings for B+trees (that thing the RDBMS is built on), nor do you need the collision resistance of SHA256 - the odds of you creating a dupe job hash with MD5 are vanishingly small.
As to the actual topic, it’s fine IFF you carefully monitor for accumulating dead tuples, and adjust auto-vacuum for that table as necessary. While not something you’d run into at the start, at a modest scale you may start to see issues. May. You may also opt to switch to Redis or something else before that point anyway.
EDIT: if you choose ULID, UUIDv7, or some other k-sortable key, the problem isn’t nearly as bad, but you still don’t need it in this situation. Save yourself 8 bytes per key.
Some time ago, I wrote a queue using SQLite[0]. Instead of SKIP LOCKED, you can use RETURNING to lock-and-read a message and ensure only one worker is going to pick it up:
UPDATE ... SET status = 'locked' ... RETURNING message_id
Or you can just use an IMMEDIATE transaction, SELECT the next message ID to retrieve, and UPDATE the row.
On top of that, if you want to be extra safe, you can do:
UPDATE Queue SET status = 'locked' WHERE status = 'ready' AND message_id = '....'
To make sure that the message you are trying to retrieve hasn't already been locked by another worker.
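Putting those two pieces together (SQLite 3.35+ for RETURNING; hypothetical columns) - zero rows back means another worker won the race, so just retry:

    UPDATE queue
    SET status = 'locked'
    WHERE message_id = (
        SELECT message_id FROM queue
        WHERE status = 'ready'
        ORDER BY created
        LIMIT 1
    )
    AND status = 'ready'
    RETURNING message_id;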
Could it be people are choosing over-engineered solutions because AWS has bad documentation and that's what the solutions architects tell them to do?
Could it be because microservices and so-called "server-less" have been sold as cost-saving measures that increase the business' flexibility and decrease capital investment?
When... in reality a single deployed Docker container is way more manageable than a distributed system constructed with "lambdas" and requires fewer engineers in the long run?
What I'm trying to say is that FAANG cargo-culting is only part of why developers choose to build solutions that scale larger than they need. Another large part is the cloud development ecosystem writ-large and the consulting culture that has built up around it.
All of shortwave.com is built on this concept. The super powerful bit here, as mentioned, is that enqueuing jobs is transactional with your other normal transactions, so you don't have to architect around using a different system for the queue than for the rest of your data.
Alternatively if you just want to quickly hack something into your application, here is a complete solution in one Python function with retries (ask ChatGPT to tell you what the table structure is):
import psycopg2
import psycopg2.extras
import random

db_params = {
    'database': 'jobs',
    'user': 'jobsuser',
    'password': 'superSecret',
    'host': '127.0.0.1',
    'port': '5432',
}

conn = psycopg2.connect(**db_params)
cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)

def do_some_work(job_data):
    # stand-in for real work; fails half the time to exercise the retry path
    if random.choice([True, False]):
        print('do_some_work FAILED')
        raise Exception
    else:
        print('do_some_work SUCCESS')

def process_job():
    # atomically claim and remove the oldest 'new' queue entry; SKIP LOCKED
    # lets concurrent workers claim different rows without blocking
    sql = """DELETE FROM message_queue
    WHERE id = (
        SELECT id
        FROM message_queue
        WHERE status = 'new'
        ORDER BY created ASC
        FOR UPDATE SKIP LOCKED
        LIMIT 1
    )
    RETURNING *;
    """
    cur.execute(sql)
    queue_item = cur.fetchone()
    if queue_item is None:
        # queue is empty; nothing to do
        conn.commit()
        return
    print('message_queue says to process job id: ', queue_item['target_id'])
    sql = """SELECT * FROM jobs WHERE id = %s AND status = 'new_waiting' AND attempts <= 3 FOR UPDATE;"""
    cur.execute(sql, (queue_item['target_id'],))
    job_data = cur.fetchone()
    if job_data:
        try:
            do_some_work(job_data)
            sql = """UPDATE jobs SET status = 'complete' WHERE id = %s;"""
            cur.execute(sql, (queue_item['target_id'],))
        except Exception:
            sql = """UPDATE jobs SET status = 'failed', attempts = attempts + 1 WHERE id = %s;"""
            # if we want the job to run again, insert a new item into the message queue with this job id
            cur.execute(sql, (queue_item['target_id'],))
    else:
        print('no job found, did not get job id: ', queue_item['target_id'])
    conn.commit()

process_job()
cur.close()
conn.close()
I feel like one of the problems with using Postgres as a queue is that it's hard to get started. There's a lot you need to know. Getting started with something like Pub/Sub on GCP is much easier for many developers.
I’ve experimented with making this easier via libraries that provide high-level APIs for using Postgres as a queue and manage the schemas, listen/notify, etc for you: https://github.com/adriangb/pgjobq
It seems like listen/notify doesn't play well with a serverless architecture. Would it make sense for Postgres to make a web request when there's work in the queue? Is that a thing?
For those using simple SELECTs, what kind of WHERE clause are you using that works well with lots of qualified pending messages and (somewhat) guarantees the most appropriate (oldest?) message?
The official JDBC driver for PostgreSQL provides no way for the database to push events to the client. So you need a dedicated connection to continuously poll the database to see if there are any events available. This seems half-baked and does not give me the warm and fuzzy feeling I crave when making architectural choices. Not to mention, it causes undesirable latency in event delivery unless I flood the database with queries.
This has its place. I would gladly use Postgres queues in places where I use redis queues today; but I would not consider replacing my SQS queues with Postgres.
I've been thinking a lot lately about how much of tech, and life more broadly, is ruined by the pursuit of scale.
Taking on problems you don't (and will never) have because some vanishingly small minority has experienced them is nuts. Over-engineering is as incorrect as under-engineering. The correctly sized, correctly complicated answer is what we're after.
> Over-engineering is as incorrect as under-engineering.
I know what you mean. The catch is figuring out 'the correctly sized' answer. Uncertainty is a huge factor here.
I think a lot of projects that are called 'over-engineered' have the benefit of hindsight. At least some of such projects were rationally designed based on what people knew at the time. There is often significant uncertainty. When engineering resources are hard to acquire, the powers-that-be often want to 'get it right' the first time. Translated: over-engineer it, because we don't know when we'll get resources for the next iteration.
I suppose my point is this: the next time somebody finds some 'over-engineered' code, it can be insightful to learn about the project history. (Example: some projects have overly optimistic prediction of usage. This perhaps suggests that pessimism is one antidote to over-engineering.)
>Postgres queue tech is a thing of beauty, but far from mainstream.
A small nitpick. FOR UPDATE SKIP LOCKED was mainstream (in a sense) way before it was copied from Oracle to Postgres. It's used under the hood in most non-trivial Oracle deployments. Like for replication or refreshing materialized views in parallel.
Not sure this particular article brings anything new to the table, but it's nice to spread the word.
Been meaning to build an Orleans stream provider for Postgres.. I believe that's the main missing component that would allow everything to "JustWork" with Postgres until you outgrow it.
I wrote a hacky version of SKIP LOCKED using advisory locks and a recursive CTE before it was released for a job queue. It worked splendidly, along with the transactional semantics of a proper database. I’m surprised more systems don’t realize they need it.
I don't have a problem with this approach at all, but I would argue that for many use cases a redis backed job library like Celery or Sidekiq might be even easier, especially if you are on a cloud provider that offers managed redis.
I’m not against using Postgres for this. But I am against rolling your own distributed task queue. It always seems like a simple task but snowballs in complexity. Any gains you get simplifying your stack will be wiped out by the fact that things like Celery (for example) don’t support using Postgres as a broker, so now you have to build your own DIY Celery instead of, say, just using Celery with the SQS broker (which… since we’ve established scale isn’t being considered here, SQS costs shouldn’t be an issue either).
Anyone know if there are Celery or Celery-like tools that support Postgres as a broker?
As a side-note, if you want a simple no-frills task scheduler ap-scheduler is a dead simple option. It’s even more limited than the solution described in OP (you can only run one worker so it’s not distributed at all) but often it is all you need especially for toy projects.
> I’m not against using Postgres for this. But I am against the rolling your own distributed task queue.
Good thing I didn't listen to your advice... my DIY background task queue saved my website when Celery couldn't scale. Why are you against rolling your own task queue besides it seeming complicated?
I hate to point out something uncomfortable, but this guy's combative writing style and penchant for abstract art reminds me of the articles by a certain infamous creator of the ReiserFS file system for linux.
I'm in the market for a Postgres-backed queue system with client libraries in NodeJS _and_ Python. Clients in both languages need to be able to read and write from the queue. Can anybody suggest one?
PGMQ does not require a client library (https://github.com/tembo-io/pgmq), so long as your language of choice can run SQL. All the functions live in Postgres, and you just call them with SQL statements. Very similar feel and semantics to SQS.
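From memory, the calls look roughly like this - check the README for exact signatures, since the API has shifted between versions:

    SELECT pgmq.create('tasks');
    SELECT * FROM pgmq.send('tasks', '{"hello": "world"}');
    -- read one message, hiding it from other readers for 30 seconds
    SELECT * FROM pgmq.read('tasks', 30, 1);
    SELECT pgmq.delete('tasks', 1);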
there's an important dimension of scalability that I think gets overlooked in a lot of these discussions about database-as-a-queue vs queue-system-as-a-queue:
are you queuing jobs, or are you queuing messages?
that's a fuzzy distinction, so somewhat equivalently, what's the expected time it takes for a worker to process a given queue item?
at one end, an item on the queue may take several seconds to a minute or longer to process. at the other end, an item might take only a few milliseconds to process. in that latter case, it's often useful to do micro-batching, where a single worker pulls 100 or 1000 items off the queue at once, and processes them as a batch (such as by writing them to a separate datastore)
the "larger" the items are (in terms of wall-clock processing time, not necessarily in terms of size in bytes of the serialized item payload) the more effective the database-as-a-queue solution is, in my experience.
as queue items get smaller / shorter to process, and start to feel more like "messages" rather than discrete "jobs", that's when I tend to reach for a queue system over a database.
for example, there's a RabbitMQ blog post [0] on cluster sizing where their recommendations start at 1000 messages/second. that same message volume on a database-as-a-queue would require, generally speaking, 3000 write transactions per second (if we assume one transaction to enqueue the message, one for a worker to claim it, and one for a worker to mark it as complete / delete it).
can Postgres and other relational databases be scaled & tuned to handle that write volume? yes, absolutely. however, how much write volume are you expecting from your queue workload, compared to the write volume from its "normal database" workload? [1]
I think that ends up being a useful heuristic when deciding whether or not to use a database-as-a-queue - will you have a relational database with a "side gig" of acting like a queue, or will you have a relational database that in terms of data volume is primarily acting like a queue, with "normal database" work relegated to "side gig" status?
1: there's also a Postgres-specific consideration here where a lot of very short-lived "queue item" database rows can put excessive pressure on the autovacuum system.
I’ve used PG as a message queue, actually it was used as a transactional front end to Kafka; we’d push messages to a PG table during a transaction, which would then be snarfed up to Kafka by a separate process after the transaction completed.
I’ve seen very high transaction rates from this arrangement, more than 20k messages/second.
We have our own queue, because it was easy, fun, and above all else has been exceedingly reliable. Far more so than other things we had tried. Cough Gearman cough SQS cough.
One endpoint accepts work to a named queue, writes it to a file in an XFS directory. Another locks a mutex, moves the file to an in progress directory and unlocks the mutex before passing the content to the reader. A third and final endpoint deletes the in progress job file. There is a configurable timeout, after which they end up at a dead letter box. I am simplifying only a little bit. It's a couple hundred lines of Go.
The way this is set up means a message will only ever be handed to one worker. That simplifies things a lot. The workers ask for work when they want it, rather than being constantly listening.
It took a little tuning but we process a couple billion events a day this way and it's been basically zero maintenance for almost 10 years. The wizards in devops even figured out a way to autoscale it.
They ask for work after they finish the previous job (or jobs, they can ask for more than one). Each worker is a single process built just for one task.
If there's no work for them there's a small timeout and they ask for more. Simple loop. It's all part of a library we built for building workers. For better or worse, it's all done over http.
You are right, though, it is one XFS volume per queue instance.
We just run multiple instances (EC2) behind a load balancer. Each instance of the queue gets its own set of workers, though, so the workers know the right server to report done to.
We want a way to have a single pool of workers, rather than a pool per queue instance, and have them talk to the load balancer rather than directly, but we haven't come up with a reasonable way to do that.
I like how GCP cloud tasks reverses the model. Instead of workers pinging the server asking for work, have the queue ping the worker and the worker is effectively a http endpoint. So you send a message to the server, it queues it and then pings a worker with the message.
Ooh, that's kind of interesting. Am I reading this right that it holds the HTTP connection open for up to thirty minutes waiting for the work to complete? That's kind of wild.
Indeed. If you're hitting AppEngine or GCP Functions, they auto scale workers up for you to manage long running tasks. Ideally though, you finish as quickly as possible by breaking the work down into more tasks. That way, you can parallelize as much as possible.
It is all configurable, but I've scaled up to hundreds of workers at a time to blast through tasks and it wasn't expensive at all.
Workers being an HTTP endpoint makes them super easy to implement and even better... write tests for.
I love Task Queues. We are using them extensively. Also, they give you deduplication for free, plus a lot of other nice features like delayed tasks, storing tasks for up to 30 days, extremely detailed rate limits, etc.
Yea, this is the only thing I don't like about them, that I can't test them locally.
More generally, is there something like an "on-prem cloud" which just replicates, say, Cloud Tasks (but also other Cloud APIs) using local compute as well as, say, a local db? For testing / development this would be very cool.
Built very similar but on S3. Jobs have statuses, land in /jobs, indexed by status at /indexes-jobs/PENDING, etc. Scheduler polls for jobs in PENDING index, acquire lock, pass job to processor, change its status to COMPLETE or DEAD.
~300 LOC or so and fairly easy to test. Wouldn't take that approach every time, but definitely worth it when you're aiming for a simple architecture.
Why files though, and why move them into different directories? You said billions a day. With files, the physical drive must be taking a beating. Not to mention potential issues with directory file limits (depending on OS and file system). Why not use some KV db?
As I understand (correct me if I'm wrong, it's been forever since I've worked with filesystems) - file renames are very cheap as the actual data does not get moved, simply a journal gets updated
Our industry is full of shysters pushing their own technology. Time and again, it turns out that a RDBMS will handle that job just fine. That's really the premise of this article.
So please, go on and back up your bold statements with some specifics. Why specifically is it not OK to use a database as a message queue?
Good counter-point. I see these main points over there:
1. It doesn't scale (there it is again)
2. Queuing with Postgres is super fiddly to get right
3. You're hacking a queue on top of something that isn't a queue
4. Running redis or rabbit isn't all that complicated
#1 as TFA argues, premature concern about scaling is the root of so much needless complexity. You should make scaling decisions like this: 1) assume boring tech like PG will satisfy your needs; 2) if it demonstrably does not, then find something that does.
#2 is obviously true; just look at this thread. There are battle-tested queuing libraries in most popular languages, but you do have to dig into the details of how they interact with things like pgbouncer.
#3 I guess so? But if the queue abstraction works and isn't leaky, what does it matter?
#4 can be debated. I've been running a moderately complicated setup with 2 databases, redis, and kafka for several years. There's no way I'm going to add another piece of tech unless there's no other choice. The cognitive cost is too high.
The main debate is in the tradeoff between #2 and #4. Personally, if I can use an existing piece of tech to solve a problem to avoid having to ops another piece of tech, then I'm going to do that every time.
Terrible idea. Using a database for queues means using a storage format that's not optimized for this, so unless you never delete rows (leaving an ever-growing table) you'll run into performance issues. Also the index on a processed flag will cause contention.
I'm certain you can work around those issues, but why if you can use a proper queue?
My main issue with pretty much all queue approaches is that they don't work across platforms. They are built for one technology stack, be it Python/NodeJS/etc. This is fine if you've only got one stack, but in a microservices world it doesn't work where jobs can span multiple systems. You might be able to find some abandoned library that supports that queue tech on the other platforms you need, but now you've basically become a queue tech maintainer.
> My main issue with pretty much all queue approaches is that they don't work across platforms.
What technology stack are you working in which doesn't support postgres?
> but in a microservices world it doesn't work where jobs can span multiple systems
The point of the queue system is to be able to span said microservices. You can have an OCaml service picking up from one queue, processing, then writing into another queue. That queue could then be processed by a TypeScript service.
There are queues like SQS and Google Pub/Sub where there are already official clients written for most popular languages. They also have an HTTP API, so you can use them in any language environment that can make HTTP requests.
I don't usually downvote posts, but this article is just garbage - a rant about "the cargo cult of scale" but no actual arguments as to why Postgres is better than redis or sqs or anything else. The main reason that people don't use postgres for this kind of thing isn't some kind of misguided obsession with scalability, it's because postgres is way more complicated to deploy / manage and harder to use for application developers than any of the other options.
The argument is stated-- you shouldn't adopt too many technologies. The author merely says that if you're running postgres already, you should seriously consider it for your queue. They also point out that many of the arguments against psql-as-queue are flawed.