

Asynchronous Processing in Web Apps, Part 1: A Database Is Not a Queue - nesquena
http://blog.gomiso.com/2012/11/15/asynchronous-processing-in-web-applications-part-1-a-database-is-not-a-queue/

======
fdr
I think the article is good, however, I feel compelled to weigh in from the
countervailing general direction:

I am a message queue skeptic. Not that they should never be used, but rather a
general feeling that complex, dedicated message queue software is often used
for engineering problems between two or three orders of magnitudes too small
before they deliver value. And, for most projects, queue replacement is not
too difficult, especially if one's use of a database-backed queue is
relatively naive.

As such, I will -- with an open mind -- suggest that general purpose database
management software masquerading as queues is not an outright antipattern, and
the times to use them in this way is probably more commonly seen than the
opposite, where a dedicated queue delivers clear value.

Here are my main reasons for thinking that, which almost entirely have
something to do with being able to address one's queue and one's other data in
the same transaction:

* Performance: a pedantically unsound but basically reasonable rule of thumb (as experienced by me) would suggest that one needs to be processing somewhere between hundreds or thousands of messages per second with at least fifty (and maybe up to a hundred or two) parallel executors processing or emitting messages before there are performance issues where the lower constants of a dedicated queuing package become attractive. Below this kind of throughput, one starts to experience some more pain than gain.

* Correctness: Most database + queue integrations do a lousy job of what is effectively two-phase commit between different data storage instances (so that would include two+ RDBMSes, even if they are of the same kind, e.g. 2xPostgreSQL), and frequently the queue has to be counted among these (exception: when the queue contents can be lost/can be rebuilt/is idempotent at all times). Systems do a lousy job of making this work because it's pretty finicky to do a good job in many situations, i.e. expensive and time consuming.

* Constants, when dealing with other systems: When one _does_ do a good job and has interesting requirements in the 'correctness' case, it often means doing forms of two-phase commit, whether explicitly supported by the system (e.g. PREPARE TRANSACTION) or a spiritual equivalent via carefully thought out state machines. In principle these could be relatively cheap, but typically to avoid complexity more expensive approaches are employed, such as a couple extra UPDATE requests to poke at some home-grown state machine.

Also, my experience indicates that as systems evolve, there will be inevitable
bugs in these state machines that, by nature, span systems. Be vigilant and
make sure you get more value than pain, and try to avoid having too many of
them.

* HA is still hard: clustering is generally in principle possible, but make sure you read the fine print. For example, many people use Redis as a queue, but it is not really unlike any other monolithic database most generally -- its main draw as a vanilla queue is good execution-time constants. The same could be said of Apache ActiveMQ in its least byzantine configuration. One might think that one would get a lot of leverage 'for free' given the simpler semantics of queues vs the diversity of access methods in most general purpose databases, but so far I have not seen that to be the case, for the very good reason that a lot of people expect a lot of reliability out of their queues (no less than the transactional nature of some databases), and doing that is either most natural in a monolithic system or slow, or complicated, or both in a multi-master distributed system.

All in all, if you think you need dedicated queuing software to send a few
dozen emails a second (that's a lot of email for most people!), think twice:
it might still be a good idea, but brace yourself for these pitfalls or
convince yourself that they probably mostly don't apply to you.

~~~
chris_wot
Interestingly, that's why Microsoft created SQL Server Service Broker. They
had te infrastructure for reliable message queuing and transactional support,
so they created SQL Server Service Broker!

Not sure I ever took off though...

~~~
cico71
Unfortunately it didn't really took off, but it's a shame because it's a good,
polished and complete implementation that allows for some advanced scale-out
topologies (it can also be used as a foundation for data dependent routing and
map/reduce scenarios).

<rant>I guess it's not much used because people like to reinvent the wheel
every time (by manually implementing queues using tables with all the
traditional concurrency problems) instead of learning something a bit more
complex.</rant>

Anyway, it's not going to be thrown away anytime soon as it's used in other
parts of the engine (e.g. SMTP mail integration, Server Events and Query
Notifications).

------
icebraining
_With a traditional database this typically means a service that is constantly
querying for new processing tasks or messages._

Traditionally, but not necessarily; PostgreSQL supports the LISTEN and NOTIFY
commands for asynchronous notifications, without polling.

~~~
nesquena
I agree and if polling was the only consideration then this would mitigate the
issue. Admittedly I didn't cover this as much in my article but there's also a
lot of other flexibility afforded by a good message queue around delivery
strategies, consumption strategies, error handling, etc that you would have to
do much more manually and painfully trying to shoehorn it into a database
(even a good one). I am not saying PostgreSQL notify commands are not useful
for certain tasks, just encouraging people to understand the tools available
and pick the right one for the job.

------
lsh123
1) The article claims that queues is a replacement for a database thus it
solves all the problems with DB solution. In reality, if you have to have
reliability and data integrity guarantees, then a messaging queue system will
be using a DB under the hood (in the best case, it will be off-the-shelf DB,
in the worst, something homegrown). All the same problems and considerations
apply. You run into the same DB problems with the need to manually optimize
queue tables, etc. because message queues rarely do a good job there. Of
course, all the commit issues discussed above apply to.

2) The issue with polling is easily solved by using modern DBs (e.g. Postress
with listen/notify) or just plain old triggers/stored procedures. Yes, it will
be a custom solution but it will be simpler from operational perspective than
a "generic" do-it-all solution.

3) Lastly, it all depends on the task. At WePay we use gearmand backed by a
DB. We have seen it working really well and helping us handle 1000x spikes
from normal load w/o a single problem. However, we did quite a few
modifications to get there from the "stock" code to customize it to our use-
case and have 200% data integrity and reliability guarantees.

~~~
rhettg
I understood the point as "Don't use a database as a replacement for a queue".
Use the right tool for the job.

The fact that gearmand is backed by a DB is not at all the same as using a DB
for a queue directly. Gearmand just uses a database as a backup that it can
reload tasks when it get's restarted.

~~~
lsh123
Yes. However, you have same issues with DB tables receiving tons of
inserts/deletes/selects. This is the worst possible DB load :)

~~~
seunosewa
DB's are designed for this purpose.

------
PaulHoule
I dunno. I've seen companies go through 4 or 5 different message queue systems
and find that none of them work quite right.

12 years or so ago I developed a few systems that used qmail as a message
queue and I was pretty happy with that.

Asynchronous processing is a necessary evil, but I think a lot of people
underestimate the difficulty. It's one thing to compress a video in the
background, but if you have one asynchronous task that spawns a bunch of
asynchronous tasks and they spawn asychronous tasks and someday they all come
together... Well maybe they come together someday. There's a definite
"complexity barrier" you hit when asynchronous applications rapidly become
harder to maintain.

There are ways around this, but I've frequently seen MQ-based systems that
never get "done".

~~~
JeffJenkins
I believe this is the problem that Storm was supposed to solve (
<https://github.com/nathanmarz/storm> ). Or Amazon's SWF (
<http://aws.amazon.com/swf/> ). They work in the case where you can define the
flow between all of the tasks.

------
jhartmann
Nice article. One thing you should talk about in my opinion are systems like
Redis. Redis can be used as a generalized key value store, but it can also be
used as a messaging platform. In fact many systems like Storm (which would be
another great topic) have easy integration with redis pub subs. While Redis
and other solutions like it probably are not a good fit for all your data,
they are great for mixed supporting data, caching and messaging. There are
also really nice integrations if you like Java, these days you can make Spring
message driven beans to consume messages from a redis pub sub very easily with
minimal configuration. ZeroMQ is probably another technology that is worth
discussing, either using it with a layer like Storm on top or by itself. Not
every messaging system has to be heavyweight and cumbersome like the good ole'
days.

~~~
nesquena
Definitely, thanks for your feedback. Will be bringing up redis in the next
post. Redis has some messaging functionality baked right in and can be a good
solution as seen with <https://github.com/defunkt/resque> in Ruby as well.
ZeroMQ is also an important technology to understand in the context of this
discussion.

------
blcArmadillo
Look forward to reading the future articles. Right now I'm working on a side
project that has a need for some asynchronous tasks. I was planning on using
beanstalkd but the one thing that concerns me is that if the queue goes down
the outstanding jobs are not persisted. Any recommendations on the best way
around this?

~~~
nesquena
Yes beanstalkd has solid persistence support now in later versions. You can
use the “-b” option, and beanstalkd will write all jobs to a binlog. If the
power goes out, you can restart beanstalkd with the same option and it will
recover the contents of the log.

------
aidos
Nice clear introduction.

You can write a simple Async DB system without running into deadlocks.
Deadlocks tend to occur when you have 2 processes trying to lock 2 different
resources in differing orders. Not a case you'll run into here.

Also, you could do something like (correct me if I'm wrong, but I think it
should be safe):

    
    
      update queue set owner = '1_time_use_rand_key' where owner is null order by id limit 1;
    

Not that I'm advocating this, you should definitely use a proper system to
manage these types of tasks! :)

------
matticakes
disclaimer: co-author of NSQ [1] here

Agreed. Message queues play an important role for us (bitly) in being a layer
of fault tolerance, buffering, and a means to perform various operational
tasks.

They're so important to us that we decided to build something that worked
exactly like we wanted.

NSQ is a realtime distributed message processing system where we've taken the
approach of focussing on making it ops friendly and easy to get started.

IMO, solutions that make it easy to develop on and administrate are most
important... because things break. NSQ is straightforward to deploy (limit
dependencies), simple to configure (runtime discovery), and client libraries
provide a lot of functionality important for handling failures (like backoff,
deferred messages, etc.) for a variety of use cases.

We've written an in-depth introductory blog post [2] about NSQ that has more
details.

[1]: <https://github.com/bitly/nsq> [2]:
<http://word.bitly.com/post/33232969144/nsq>

~~~
nesquena
Interesting, thanks for sharing the links. I think building your own message
queue is sometimes the only way to get things working exactly as you might
want. For people looking for a highly configurable framework for building a
tailored MQ, be sure to check out <http://www.zeromq.org/> as a basis.

------
randomtree
If you would consider using NoSQL, then MongoDB might be a good fit. It has a
messaging queue. It's called capped collections with tailable cursors
(<http://www.mongodb.org/display/DOCS/Tailable+Cursors> ). It's persistent,
you don't need polling, and you don't need to remove processed messages.

------
eternalban
An in-(or out of) process (async) driver (exposing an API supporting future
semantics) can provide DB specific queueing to some extent. MQs and MOMs are
very useful but not always necessary and the additional latency due to node
hops is sometimes not acceptable.

------
kamakazizuru
great post! this is what I wish I had as a starting point when I was learning
how to get web apps to do work smarter rather than diving straight into celery
documentation cause someone told me that would be the way to go!

~~~
nesquena
Thanks! That's exactly why I am writing this series. I plan to take people
slowly through everything about asynchronous processing, message queues,
handling job processing, etc. Message queues are underused in modern web apps,
and I think also not understood as well as they should be. Planning to include
plenty of diagrams along the way to help with that.

