
Transactionally Staged Job Drains in Postgres - johns
https://brandur.org/job-drain
======
brandur
(Author here.)

I've taken fire before for suggesting that any job should go into a database,
but when you're using this sort of pattern with an ACID-compliant store like
Postgres it is _so convenient_. Jobs stay invisible until they're committed
with other data and ready to be worked. Transactions that roll back discard
jobs along with everything else. You avoid so many edge cases and gain so much
in terms of correctness and reliability.
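
To make the shape concrete, here's a minimal sketch in the same pseudo-ORM
style as the article's example (the model names and job here are made up):

    # Hypothetical sketch: the staged job commits or rolls back
    # atomically with the business data it belongs to.
    DB.transaction do
      user = User.create(email: "jane@example.com")

      # Invisible to the enqueuer until COMMIT; a rollback discards
      # it along with the user row.
      StagedJobs.create(
        job_name: "send_welcome_email",
        job_args: [user.id]
      )
    end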

Worker contention while locking can cause a variety of bad operational
problems for a job queue that's put directly in a database (for the likes of
delayed_job, Que, and queue_classic). The idea of staging the jobs first is
meant as a compromise: all the benefits of transactional isolation but with
significantly less operational trouble, and at the cost of only slightly
delayed jobs as an enqueuer moves them out of the database and into a job
queue.

I'd be curious to hear what people think.

~~~
koolba
> I've taken fire before for suggesting that any job should go into a
> database, but when you're using this sort of pattern with an ACID-compliant
> store like Postgres it is so convenient.

+1 to in-database queues that are implemented correctly. The sanity of
transactionally consistent enqueuing alone is worth it. I've used similar
patterns as a staging area for many years.

This allows for transactionally consistent error handling as well. If a
job is repeatedly failing you can transactionally remove it from the main
queue and add it to a dead letter queue.
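
A sketch of that, in the same illustrative style as the article's code (the
dead-letter table and the failing-job bookkeeping are made up):

    # Hypothetical sketch: retire a repeatedly failing job atomically,
    # so it either moves to the dead letter queue or stays put.
    DB.transaction do
      job = StagedJobs.where(id: failing_job_id).first

      DeadLetterJobs.create(
        job_name: job.job_name,
        job_args: job.job_args,
        last_error: last_error_message
      )

      StagedJobs.where(id: failing_job_id).delete
    end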

~~~
brandur
> _This allows for transactionally consistent error handling as well. If
> a job is repeatedly failing you can transactionally remove it from the main
> queue and add it to a dead letter queue._

Totally. This also leads to other operational tricks that you hope you never
need, but are great the day you do. For example, a bad deploy queues a bunch
of jobs with invalid arguments which will never succeed. You can open a
transaction and go in and fix them in bulk using an `UPDATE` with jsonb select
and manipulation operators. You can then even issue a `SELECT` to make sure
that things look right before running `COMMIT`.
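
A rough sketch of what that might look like (assuming a Sequel-style `DB`
handle; the table layout, job name, and broken argument are all invented):

    # Hypothetical sketch: repair bad job arguments in bulk, check the
    # result, and only commit once it looks right.
    DB.transaction do
      DB.run(<<~SQL)
        UPDATE staged_jobs
        SET job_args = jsonb_set(job_args, '{account_id}', '42')
        WHERE job_name = 'sync_account'
          AND job_args->>'account_id' = '0'
      SQL

      # eyeball before committing; raise here to roll back instead
      DB.fetch("SELECT * FROM staged_jobs WHERE job_name = 'sync_account'") do |row|
        p row
      end
    end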

Again, something that you hope no one ever does in production, but a life
saver in an emergency.

------
heinrichhartman
Content aside. I've never seen a blog article as carefully typeset as this one:

- Font choices and sizes

- TOC

- Figures

- Code samples

... all look perfect. It even includes a carefully spaced initial.

I'd love to be able to replicate this on my Jekyll blog. But it looks like
most of this is hand-crafted HTML/CSS:
[https://github.com/brandur/sorg](https://github.com/brandur/sorg)

~~~
sandGorgon
+1 - I wish it were made in Hugo (which is Golang as well and 100x as fast as
Jekyll).

~~~
mikeatlas
Is static page generation performance really a thing for you?

~~~
sandGorgon
Here's the thing - I think the definition of fast is in two different
dimensions, and both of them count.

One is generation: I actually run my startup's website on Hugo. There are
quite a few pages - including landing pages - and Jekyll was sloooow.

Second, we have marketing people on Windows + Mac, and dev people on Linux.
The time it takes someone to set up Hugo on their laptop is 5 minutes: it's a
single binary.

Jekyll - upwards of 30 minutes after fighting Ruby and setting %PATH%
variables on Windows.

------
memracom
I think it is great that PostgreSQL is strong enough to allow people to build
robust queuing systems, but I still think you are better off in the long run
using a real message queuing system like RabbitMQ to do this job.

Start out by running RabbitMQ on the same server as PostgreSQL but do limit
its use of cores and RAM. Then when your business grows you can easily scale
to a separate RabbitMQ server, to a cluster of MQ servers and to a distributed
RabbitMQ service using clusters in multiple data centers with global queues
synchronized using a RabbitMQ plugin.

The benefit of using RabbitMQ is that you begin to learn how message queuing
fits into a system architecture and that you will not run into corner cases
and weird behaviors as long as you heed the advice of moving to a dedicated
RabbitMQ server when your usage gets large enough.

An additional benefit is that when you learn how to integrate functionality
by using a message queue (actor model) rather than a link editor, you can
avoid the monolithic big ball of mud problem entirely and easily integrate
both monolithic functions and microservices in your app.

Background jobs are just one part of what a robust message queue gives you. In
my opinion, the desire for background jobs is a design smell that indicates a
flaw in your architecture which you can fix by adding a message queue system.

~~~
netcraft
I tend to agree with you but I haven't found any good way to put things in a
queue from within Postgres - from a trigger, for instance. Doing so would open
up a lot of possibilities - do you have any suggestions, even if it's just for
things to google?

~~~
Deimorz
I haven't personally used either of these, but they look interesting and I'm
hoping to test them out at some point:

[https://github.com/gmr/pgsql-listen-exchange](https://github.com/gmr/pgsql-listen-exchange)

[https://github.com/subzerocloud/pg-amqp-bridge](https://github.com/subzerocloud/pg-amqp-bridge)

In theory, these let you use postgres NOTIFY to add messages to queues (which
can be done from inside triggers).
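
The trigger half of that is small enough to sketch (channel and table names
are invented; assumes a Sequel-style `DB` handle):

    # Hypothetical sketch: publish every new staged job as a NOTIFY
    # payload that a bridge or any LISTENing process can pick up.
    DB.run(<<~SQL)
      CREATE FUNCTION notify_staged_job() RETURNS trigger AS $$
      BEGIN
        PERFORM pg_notify('staged_jobs', row_to_json(NEW)::text);
        RETURN NEW;
      END;
      $$ LANGUAGE plpgsql;

      CREATE TRIGGER staged_jobs_notify
        AFTER INSERT ON staged_jobs
        FOR EACH ROW EXECUTE PROCEDURE notify_staged_job();
    SQL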

------
pnathan
Interesting.

I'm working on delivering a Postgres-based job system right now; we cycle
through states from an ENUM, landing eventually on a terminal state. Worker
jobs (containers on a cluster) don't directly manipulate the state of the
table; there's a controller system for that. Each controller in the (3-node)
cluster has 2 connections to Postgres. Old jobs are DELETE'd when it's been
"long ago enough".

Prior to addressing deadlocks from doing too much per transaction, initial
load testing for _this_ system suggested that the database was not the
bounding factor in the system throughput, but rather worker throughput.
Initial load is estimated to be under 500/day (_yawn_), but pushing the
load to 100K/day didn't alter the outcome, although it made the cluster admin
mildly annoyed.

One key reason I prefer the state machine / enum approach is that it's
_logically_ obvious. At a certain point, I am sure it'd have to
change. I'm not sure how many concurrent mutations to separate rows a Postgres
table can tolerate, but that serves as a hard upper bound.

Author: what kind of volume do you tolerate with this kind of design?

------
rraval

        loop do
          DB.transaction do
            # pull jobs in large batches
            job_batch = StagedJobs.order('id').limit(1000)
    
            if job_batch.count > 0
              # insert each one into the real job queue
              job_batch.each do |job|
                Sidekiq.enqueue(job.job_name, *job.job_args)
              end
    
              # and in the same transaction remove these records
          StagedJobs.where('id <= ?', job_batch.last.id).delete
            end
          end
        end
    

Isn't this essentially a busy loop? You can achieve something much more
performant by using `LISTEN` and `NOTIFY` to fire an event every time a row is
inserted.

Then the enqueuer can do a preliminary scan of the table when it boots up and
then just a `LISTEN` instead of polling the DB.
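
A sketch of that approach with the raw `pg` gem (the `drain` helper, standing
in for the article's batch-move loop, is assumed):

    require "pg"

    # Hypothetical sketch: wake on NOTIFY instead of polling.
    conn = PG.connect(dbname: "myapp")
    conn.exec("LISTEN staged_jobs")

    drain # catch up on rows inserted before we started listening

    loop do
      # blocks until a notification arrives, then drains again
      conn.wait_for_notify do |_channel, _pid, _payload|
        drain
      end
    end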

~~~
brandur
Note the code snippet is mostly meant as a demonstration rather than something
that's really production-grade.

Using `LISTEN`/`NOTIFY` is certainly workable. It's worth considering though
that once you've got non-trivial traffic, you're going to have new jobs
showing up all the time, so the loop is likely going to be cycling endlessly
no matter what mechanism you use to feed it.

At the very least though you'd want a sleep statement with exponential backoff
so that it's not churning away on an empty database at the low end.
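
Something like this, say (illustrative only; `drain_batch` is assumed to move
one batch and report whether it found anything):

    # Hypothetical sketch: poll, but back off exponentially while the
    # staging table is empty so an idle system isn't spinning.
    delay = 0.1
    loop do
      if drain_batch
        delay = 0.1 # activity: reset the backoff
      else
        sleep(delay)
        delay = [delay * 2, 30.0].min # cap the wait at 30 seconds
      end
    end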

------
bgentry
This pattern would basically be a clean migration away from a pure Postgres
queue if either table bloat or locking becomes a performance problem. You
maintain the benefits of transactional job enqueueing while only slightly
worsening edge cases that could cause jobs to be run multiple times.

Just be sure to run your enqueueing process as a singleton, or each worker
would be redundantly enqueueing lots of jobs. This can be guarded with a
session advisory lock or a Redis lock.
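
With a session advisory lock that might look like this (the lock key is
arbitrary; sketch assumes a Sequel-style `DB` handle):

    # Hypothetical sketch: only one enqueuer may hold the lock. A
    # session-level lock is released if the connection drops, so a
    # crashed enqueuer can't wedge the system.
    locked = DB.fetch("SELECT pg_try_advisory_lock(123456) AS locked")
               .first[:locked]

    if locked
      run_enqueuer_loop # assumed: the drain loop from the article
    else
      abort "another enqueuer already holds the lock"
    end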

Knowing that this easy transition exists makes me even more confident in just
using Que and not adding another service dependency (like Redis) until it’s
really needed.

~~~
Deimorz
> Just be sure to run your enqueueing process as a singleton, or each worker
> would be redundantly enqueueing lots of jobs. This can be guarded with a
> session advisory lock or a Redis lock.

If you're using PostgreSQL 9.5+ you can also use the new SKIP LOCKED
capability, which is perfect for this sort of usage:
[https://blog.2ndquadrant.com/what-is-select-skip-locked-
for-...](https://blog.2ndquadrant.com/what-is-select-skip-locked-for-in-
postgresql-9-5/)
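
Roughly (a sketch; each competing enqueuer locks and drains a disjoint batch):

    # Hypothetical sketch: SKIP LOCKED lets several enqueuers run
    # concurrently, each claiming rows no one else has locked.
    DB.transaction do
      batch = DB.fetch(<<~SQL).all
        SELECT id, job_name, job_args
        FROM staged_jobs
        ORDER BY id
        LIMIT 1000
        FOR UPDATE SKIP LOCKED
      SQL

      batch.each { |job| Sidekiq.enqueue(job[:job_name], *job[:job_args]) }
      DB[:staged_jobs].where(id: batch.map { |j| j[:id] }).delete
    end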

------
meritt
Sequence allocations occur globally and outside your transaction.

    
    
        StagedJobs.where('id <= ?', job_batch.last.id).delete

This can end up deleting a job whose id was reserved inside a still-open
transaction: the enqueuer kicks off and fetches the current batch, the open
transaction then commits its job to the staged_jobs table, just in time for
the enqueuer to delete it without ever queueing it.

You need to delete the specific ids you enqueued, not a numeric range.
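
In the article's pseudo-code, that would be something like:

    # delete exactly what was enqueued, not an id range
    StagedJobs.where(id: job_batch.map(&:id)).delete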

------
njharman
> by selecting primed jobs in bulk and feeding them into another store like
> Redis

Doesn't this just mean a bunch of lost jobs when Redis fails?

Why not keep jobs, with job states like waiting, done, etc., in the reliable
ACID store?

~~~
brandur
> _Doesn't this just mean a bunch of lost jobs when Redis fails?_

You need to be a little careful about configuration, but believe it or not, as
long as Redis is configured with an append-only file (AOF) [1] it will provide you
with a durability guarantee just like Postgres would. If it crashes, any jobs
that were in there are recovered the next time it starts up.

From the docs:

> _From now on, every time Redis receives a command that changes the dataset
> (e.g. SET) it will append it to the AOF. When you restart Redis it will
> replay the AOF to rebuild the state._

---

[1] [https://redis.io/topics/persistence](https://redis.io/topics/persistence)

------
sandGorgon
This is so awesome. For a small team building infrastructure on the cheap,
building background jobs on Postgres is so much nicer than using more complex
tools like RabbitMQ, etc.

Are you planning on productizing this?

