
5 subtle ways you’re using MySQL as a queue, and why it’ll bite you - abredow
http://www.engineyard.com/blog/2011/5-subtle-ways-youre-using-mysql-as-a-queue-and-why-itll-bite-you/
======
cellularmitosis
I've been met with looks of disgust for using a filesystem to implement a
queue, but I feel the disgust is unjustified. A modern Unix filesystem is
surprisingly well suited to this task: you get atomicity "for free", inotify
allows the queue to be interrupt-driven rather than polled, it inherently
supports multiple processes (so different parts of the system can be
implemented in different languages), there's no need for locking as long as
you implement the queue using directories and 'mv', and it's extremely quick
to implement, understand, and modify.

The only caveats are performance (with a traditional server I wouldn't worry
about performance until you need to process hundreds of items per second, but
on EC2 nodes that threshold is closer to dozens per second) and the need to
regularly archive the "done" directory (cron solves this nicely).
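
A minimal sketch of the interrupt-driven consumer side, assuming the
third-party inotify_simple package; directory names are illustrative, not
prescribed:

    import os
    from inotify_simple import INotify, flags   # pip install inotify_simple

    IN_DIR = "/var/queue/in"          # hypothetical layout
    ACTIVE_DIR = "/var/queue/active"

    def process(path):
        print("processing", path)     # stand-in for real work

    watcher = INotify()
    # MOVED_TO fires when a ticket is mv'd in; CREATE covers direct creation
    watcher.add_watch(IN_DIR, flags.MOVED_TO | flags.CREATE)

    while True:
        for event in watcher.read():  # blocks until a ticket arrives
            src = os.path.join(IN_DIR, event.name)
            dst = os.path.join(ACTIVE_DIR, event.name)
            os.rename(src, dst)       # atomic claim, same trick as 'mv'
            process(dst)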

~~~
snprbob86
I recently read through news.arc (the source to Hacker News itself) and was
dumbfounded by how such a simple, filesystem-backed system was able to
cleanly and performantly handle many of the use cases of a document store or
key-value store. Are there any good resources on the DOs and DON'Ts of
building apps in this "Hey... dummy... Just use the file system!" style?

~~~
cellularmitosis
Watch out for the 32,000 subdirectory limit (ext3's link-count ceiling). If
your job tickets are complex enough to be implemented as a directory instead
of a file, you'll get bitten by this; by contrast, the number of plain files
in a directory is only limited by the number of inodes in the entire
filesystem.

If you are really lucky and your tickets only need to represent a single
piece of data (some sort of ID, for example), you can use the name of the
file itself for data storage and deal only with empty files. Because this
uses only a single inode, it represents the best-case scenario for speed and
scalability in terms of the number of tickets that can accumulate before you
need to archive. More likely, though, you'll have to worry about ticket
namespace collisions (unless you have some sort of set-like requirement where
each ID can only be in the queue once), which means using something like
mktemp to create the file and then storing the ID inside it (sketched below).

Another key is to make sure you create new jobs in a "staging" dir, and then
mv them into the "in" dir. Otherwise you have a race condition between your
queuing system and whatever creates the tickets.

Here's a basic layout: /stage, /in, /active, /done. Some process on your
system creates a ticket (which could be a single file or a dir) in /stage and
then moves it into /in. This wakes up your queue, which moves it to /active
when it starts processing it, and then moves it to /done and moves on to the
next ticket in /in.

Another nice thing this gives you is that recovering from a crash / unclean
state amounts to running ls on /stage, /in, and /active.
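
A sketch of the producer side of that layout, with mkstemp handling the
namespace-collision problem mentioned above; paths and names are
illustrative:

    import os
    import tempfile

    STAGE_DIR = "/var/queue/stage"
    IN_DIR = "/var/queue/in"

    def enqueue(ticket_id):
        # mkstemp gives us a unique, collision-free filename in /stage
        fd, path = tempfile.mkstemp(dir=STAGE_DIR)
        with os.fdopen(fd, "w") as f:
            f.write(ticket_id)        # the ID lives inside the file
        # rename within one filesystem is atomic, so the consumer can
        # never observe a half-written ticket
        os.rename(path, os.path.join(IN_DIR, os.path.basename(path)))

    enqueue("order-12345")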

~~~
arethuza
"32,000 subdirectory limit"

One top tip from personal experience is to make the resulting structure
reasonably straightforward to browse manually - having huge numbers of
subdirectories is going to be a barrier to this.

------
Woost
Percona always seems to have good articles.

I think, as he said, not everyone should run out and replace the MySQL job
queue behind their WordPress blog. In a great many cases it doesn't matter.

I also like how he never said "Don't use MySQL as a queuing system" but "be
careful of these things". I've used MySQL as a queuing system, and it works
fine. I looked at replacing it with a different database, but in that
situation it wasn't worth the investment.

Signaling MySQL, archiving completed work, and using no locks beyond the
exact row being updated (while also preventing concurrent workers from acting
on the same task) will take a MySQL-backed queuing system far. I've set up a
system that processes well over 5,000 tasks/day using it.

Do I think everyone should use MySQL as their queuing backend? No. People
should probably use a queuing library, with persistence to a database
(Redis?) enabled for critical tasks. Of course, as the article said, be
careful about the choice of backends.

~~~
fhars
That is about one task per 50 billion CPU cycles; you would be hard pressed
to write a queue implementation that is too slow for that. Numbered files in
a single directory on a synchronously mounted filesystem designed for few
large files that only allows linear directory scans might qualify, but I am
not even sure about that.

------
andrewvc
This is why Redis / *SQL is my favored stack. It covers so many bases: you
get safe queuing, caching, pub/sub, and the weird high-performance
low-durability cases from Redis, and great, safe relational support from SQL.

For best results, it's good to have at least two Redis servers: one with
snapshotting as a cache (fast, less durable), and one with the append-only
file fsynced every second (still fast, but slower) for data you care more
about.
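
A sketch of how that split might look from the client side, using the
redis-py package; host names, keys, and payloads are invented for
illustration:

    import json
    import redis   # pip install redis

    # snapshot-persisted instance: fast, fine for cache data you can lose
    cache = redis.Redis(host="redis-cache")
    cache.setex("user:42:profile", 300, json.dumps({"name": "Ada"}))  # 5-min TTL

    # AOF instance (appendfsync everysec): for queue data you care more about
    durable = redis.Redis(host="redis-queue")
    durable.lpush("jobs", json.dumps({"task": "send_email", "to": 42}))

    _, job = durable.brpop("jobs")   # a worker blocks here until a job arrives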

~~~
rbranson
What happens when the queues back up and Redis runs out of RAM?

~~~
j_baker
If your queues are getting that backed up, you're either facing bigger
problems than your queueing system (most likely workers being down), or you're
big enough to afford more machines and/or a custom solution (such as kestrel).

~~~
rbranson
The parent post referred to using Redis as a multi-purpose store. If there is
a bunch of other non-queue data in Redis and the setup is only expecting it to
use 1GB or so, there's likely not a giant amount of room for queue entries
left. While everyone is sleeping, some crashed workers combined with broken or
poorly configured monitoring can fill up queues very quickly. Been there, done
that. Either the OOM killer hits and there is some data loss, or swapping
kicks in and brings everything down.

------
kogir
Queues in the DB are so common that MSSQL made them a first-class feature:
SQL Server Service Broker. Using it is an XML and T-SQL nightmare, but since
it guarantees in-order, exactly-once delivery, and supports routing and in-DB
worker activation, you can build some really robust and powerful stuff with
it.

MySpace used it to keep their partitioned databases in sync:
<http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=4000004532>

------
mojuba
> Instead of SELECT FOR UPDATE followed by UPDATE, just UPDATE with a LIMIT,
> and then _see if any rows were affected_

It should be noted that this is not necessarily a good solution: a concurrent
consumer, which may be another incarnation of a given script running with a
lag, may hijack the queue element locked this way; as a result you may end up
with two or more incarnations of the consumer handling the same queue
element.

The most universal approach to DB queues is to assign each consumer process a
unique ID, which it should use for locking queue elements in its UPDATE ...
LIMIT 1.
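
A minimal sketch of that pattern with the mysql-connector-python driver,
assuming an invented jobs table with id, payload, and claimed_by columns:

    import uuid
    import mysql.connector   # pip install mysql-connector-python

    consumer_id = uuid.uuid4().hex   # unique per consumer process

    conn = mysql.connector.connect(user="app", password="secret",
                                   host="db", database="queue")
    cur = conn.cursor()

    # claim exactly one unclaimed row; no SELECT FOR UPDATE involved
    cur.execute("UPDATE jobs SET claimed_by = %s "
                "WHERE claimed_by IS NULL ORDER BY id LIMIT 1",
                (consumer_id,))
    claimed = cur.rowcount           # did we actually win a row?
    conn.commit()

    if claimed == 1:
        cur.execute("SELECT id, payload FROM jobs WHERE claimed_by = %s",
                    (consumer_id,))
        job_id, payload = cur.fetchone()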

~~~
Limes102
I agree. I set a process identifier and the time of the update.

------
Ogre
I'm just going to count killing a SLEEP(100000) query as a means of signaling
a worker as the new thing I learned today. I'm not sure I've ever written
anything where implementing that would have had any real impact, but it's
filed away for the future.
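
For reference, a rough sketch of the trick (hedged: per the MySQL manual,
SLEEP() returns 0 on a normal timeout and 1 when its query is killed;
connection details are invented):

    import mysql.connector   # pip install mysql-connector-python

    def connect():
        return mysql.connector.connect(user="app", password="secret",
                                       host="db", database="queue")

    def worker_wait():
        # park inside MySQL instead of polling the queue table
        cur = connect().cursor()
        cur.execute("SELECT SLEEP(100000)")
        cur.fetchone()    # 0 = timed out, 1 = killed (i.e. signaled)
        # ...now go re-check the queue

    def wake_workers():
        # producer side, after enqueueing; needs the PROCESS privilege
        cur = connect().cursor()
        cur.execute("SELECT id FROM information_schema.processlist "
                    "WHERE info LIKE 'SELECT SLEEP%'")
        for (thread_id,) in cur.fetchall():
            cur.execute("KILL QUERY {}".format(int(thread_id)))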

------
k7d
The article didn't mention the main advantage of storing queues in the DB:
transactions. Say you need to update other records in the DB while processing
a job, with 100% consistency. If it's all in the same DB, you can update both
the job and the data in a single transaction.
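
A sketch of what that buys you, updating the job row and the business data it
touches in one atomic unit (schema invented, InnoDB assumed):

    import mysql.connector   # pip install mysql-connector-python

    amount, account_id, job_id = 5, 42, 7   # illustrative values

    conn = mysql.connector.connect(user="app", password="secret",
                                   host="db", database="app")
    cur = conn.cursor()
    try:
        # both statements commit or roll back together
        cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                    (amount, account_id))
        cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s", (job_id,))
        conn.commit()
    except mysql.connector.Error:
        conn.rollback()   # neither the data change nor the job completion sticks
        raise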

~~~
Devilboy
MySQL is not too hot on Transactions

~~~
mootothemax
MySQL != MyISAM. Check out InnoDB :)

~~~
moe
InnoDB isn't too hot on transactions either.

Google for "InnoDB deadlock".

~~~
mootothemax
I've googled, but am seeing the same kind of results that I see when I try
"oracle deadlock", "postgresql deadlock" and "db2 deadlock."

Still being relatively new to InnoDB, what particular deadlock dangers make it
more troublesome than other DBMS engines?

~~~
pkteison
There are three possible ways that I know of to handle concurrency issues on
shared data: use some sort of journaling approach (Google the keywords
'snapshots' or MVCC), use locks, or ignore the problem.

If it ignores the problem, it's not a database.

Locks suck for volume. Locks cause many more deadlocks than the other
options. Locks are fast in the simple case. Locks are easier to program and
use fewer resources.

InnoDB uses locks.

SQL Server also defaults to locks. People often specify 'ignore the problem'
mode (NOLOCK / READ UNCOMMITTED). There is a newer journaling approach
available (snapshot isolation), but it was only introduced in SQL Server 2005
and I don't think many people are using it yet. Which is a shame; it's a
great feature.

Oracle and Postgres both take a journaling approach. They will have fewer
deadlock problems because readers and writers don't need to block each other.
With InnoDB or default SQL Server, read locks block writes, which sucks. See
<http://dev.mysql.com/doc/refman/5.0/en/innodb-lock-modes.html>

~~~
mbirk
By default, InnoDB doesn't acquire read locks when doing queries. It "runs
queries as nonlocking consistent reads by default, in the style of Oracle."
See <http://dev.mysql.com/doc/refman/5.0/en/innodb-transaction-model.html>

~~~
pkteison
Wow, looks like I got that completely wrong w.r.t. innodb specifically. Wish I
could still edit my first post. :( Thank you.

<http://dev.mysql.com/doc/refman/5.0/en/innodb-consistent-read.html>

------
dmk23
The title of the article is not very representative of its contents. It should
be: "5 subtle ways you’re using MySQL as a queue, and how it COULD bite you IF
you use poor schema design NOT optimized for YOUR workload".

MySQL queues work just fine with the recommendations Baron provides himself:
"1) avoid polling; 2) avoid locking; 3) avoid mixing queue and archive
tables".

------
damir
OpenBSD folks used lpd to queue an mp3 playlist. CUPS is also an option, and
you get a full stack of goodies built in.

<http://patrick.wagstrom.net/weblog/2003/05/23/lpdforfunandmp3playing/>
<http://rendermania.com/building-a-renderfarm-with-cups/>

------
kingkilr
Is the page's rendering totally busted for anyone else? Chrome on Ubuntu.

~~~
BauerUK
Yes, rendering issues for me, too.

Chrome 13.0.782.220 Ubuntu 11.04 (Linux 2.6.38-11-generic) GNOME 2.32.1

Extensions:

- Adblock Plus for Google Chrome™ (Beta) - Version: 1.1.4

- Xmarks Bookmark Sync - Version: 1.0.16

- Reddit Enhancement Suite - Version: 3.4 (Disabled)

Examples:

1. <http://i.imgur.com/TD8UU.png>

2. <http://i.imgur.com/HIXbP.png> (with text selected)

------
Limes102
I've been in a situation where I've needed to queue about 100k messages, each
unique, with custom attributes also populated from MySQL.

I used to generate the messages and then insert them into the queuing system,
but for 100k messages I never managed to make that fast... I have since
managed to queue all of these messages in less than half a second using just
one MySQL query.

If anyone has any better ideas, please let me know!
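
One way to collapse such an enqueue into a single statement is INSERT ...
SELECT, which fans the attributes out server-side; this is purely a guess at
the shape, with an invented schema:

    import mysql.connector   # pip install mysql-connector-python

    template_id = 7   # illustrative

    conn = mysql.connector.connect(user="app", password="secret",
                                   host="db", database="app")
    cur = conn.cursor()
    # one round trip: every subscribed user gets a queue row built from a template
    cur.execute("INSERT INTO message_queue (recipient_id, subject, body) "
                "SELECT u.id, CONCAT('Hello ', u.name), t.body "
                "FROM users u JOIN message_templates t ON t.id = %s "
                "WHERE u.subscribed = 1", (template_id,))
    conn.commit()
    print(cur.rowcount, "messages queued")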

------
lunaru
MongoDB offers findAndModify, which makes for a good synchronized queue up to
a point. If anyone's using PHP and Mongo, feel free to take a look at
MongoQueue: <https://github.com/lunaru/MongoQueue>

Once you start hitting hundreds of jobs per second you'll want to scale
horizontally, but that shouldn't be necessary in 99% of use cases.
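
Roughly what the findAndModify-based claim looks like from Python with
pymongo (this is not MongoQueue's API, just the underlying idea; names are
invented):

    from datetime import datetime
    from pymongo import ASCENDING, MongoClient   # pip install pymongo

    jobs = MongoClient("mongodb://localhost")["app"]["jobs"]

    # atomically claim the oldest pending job; two workers can't get the
    # same document because the find+update happens as one server-side op
    job = jobs.find_one_and_update(
        {"status": "pending"},
        {"$set": {"status": "running", "started": datetime.utcnow()}},
        sort=[("created", ASCENDING)])
    if job is not None:
        print("claimed job", job["_id"])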

------
IgorPartola
And here is one explicit way to use MySQL as a queue:
<https://www.pingbrigade.com/blog/entry/selector-workers-recorder-pattern>

~~~
molesy
Hi! Please read TFA. You should not be advocating this pattern to anyone. It
sucks, it will break very quickly, and Baron explains why.

~~~
IgorPartola
Hey there. The pattern in TFA is somewhat different: in the SWR pattern not
every worker talks to the DB. Instead, only the selector does. It then hands
out the work to the workers via a fast local queue. The ratio I set up for
Ping Brigade is 1:1000 selector to workers. Thus a handful of selectors can
feed a few thousand workers.
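
In-process, the selector-feeds-workers shape might look something like this;
pure illustration of the 1:N dispatch, not Ping Brigade's actual code:

    import queue
    import threading
    import time

    local_q = queue.Queue(maxsize=1000)   # the "fast local queue"

    def claim_jobs_from_mysql():
        # placeholder: in real life, one claim query as in TFA's pattern
        return []

    def selector():
        # the only process that talks to the DB
        while True:
            batch = claim_jobs_from_mysql()
            if not batch:
                time.sleep(1)             # nothing claimed; back off briefly
            for job in batch:
                local_q.put(job)          # blocks if workers fall behind

    def worker():
        while True:
            job = local_q.get()
            # ...do the actual work here...
            local_q.task_done()

    threading.Thread(target=selector, daemon=True).start()
    for _ in range(1000):                 # the 1:1000 ratio described above
        threading.Thread(target=worker, daemon=True).start()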

~~~
molesy
This may be working great in your application, but you're implying that it
scales nicely, and you're wrong about that. Hand-waving may work in your
case, but Baron's whole point was that there are easy solutions that will
make things better if and when an application grows to the point at which
it's an issue.

I've personally been down this road many times, and the last time I made the
mistake of relying on SELECT FOR UPDATE in a queueing system, it broke down
somewhere on the road between 1 msg/sec and 50 msgs/sec. That application
committed before it dispatched to the worker app, so I would consider it a
fairly similar access pattern to yours.

The solution I went with in that case was exactly what Baron describes at
"Locking is actually quite easy to avoid." - something along the lines of
UPDATE queue SET selected_by = dispatcher_id, selected_time = NOW().. and then
SELECT * FROM queue WHERE selected_by = dispatcher_id. I hate putting pseudo-
SQL because it's already setting bad ideas in some random reader's head.
Anyways, that scaled up to several thousand messages per second and ran
happily for years, long after I left that particular company. May still be
running depending on who you ask.
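
Fleshing out that pseudo-SQL into a runnable shape (column names as given
above, everything else invented):

    import uuid
    import mysql.connector   # pip install mysql-connector-python

    dispatcher_id = uuid.uuid4().hex
    conn = mysql.connector.connect(user="app", password="secret",
                                   host="db", database="queue")
    cur = conn.cursor()

    # mark: claim a batch of unclaimed rows in one statement, no FOR UPDATE
    cur.execute("UPDATE queue SET selected_by = %s, selected_time = NOW() "
                "WHERE selected_by IS NULL ORDER BY id LIMIT 100",
                (dispatcher_id,))
    conn.commit()

    # sweep: read back exactly the rows this dispatcher claimed
    cur.execute("SELECT * FROM queue WHERE selected_by = %s", (dispatcher_id,))
    for row in cur.fetchall():
        pass   # hand each row off to a worker here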

Long story short, it's great that your solution is working for you but the
weight of public knowledge suggests it's not a great solution for anyone else
to pick up on. Ping Brigade looks nifty, I hope it works great for you. Please
don't suggest this pattern to other people.

Personally the system I work on day-to-day these days runs a Redis set-based
queue similar to Resque to send a few thousand emails per second and I'm ok
with it. Not thrilled, but happy enough that I don't read the Resque
introduction text and blanch in horror as I did reading your article,
especially as a reply to Baron's which is based on... lots and lots of real
world experience with many different applications.

~~~
IgorPartola
Fair points. This solution works for me for now, and I certainly know it is
not limitless. The solution of setting selected_by is something I thought
about and may implement at a later point. I'm also a big fan of using
GET_LOCK() for locking rather than relying on MySQL/InnoDB's built-in
locking, since you have finer-grained control over timeouts, etc.

I understand your concern about sharing this "dangerous" knowledge, but I
disagree that the solution is to hide it in a deep dark place. Would you find
it acceptable if I updated my post with a discussion on scalability and a link
to TFA? That way a reader will get more information about building such
systems, not less.

~~~
molesy
The point of the UPDATE .. SELECT if updated pattern is that it's a mark-and-
sweep completely outside of a transaction. Avoid locking > lock as little as
possible > lock as quickly as possible.

Your blog is your business, the Lorax speaks only for HN.. (hey, is Internet
Lorax a job?) It would of course be great to update your readers; it'd be
cooler to update your application and tell everyone how it worked out! Then
you've got a story you can actually submit again.

------
codehero
My project uses couchdb for task queuing:
<https://github.com/codehero/scheddesk>

Still kind of alpha, but working for my purposes.

------
iradik
I always wished that MySQL had a skip-locked-rows feature, so if you do a
SELECT FOR UPDATE it would skip any rows that are already locked. That way,
if you built a queueing system, you could run SELECT FOR UPDATE but skip the
rows that are already being processed (the locked rows).

I actually implemented this once, partially, on InnoDB, and it worked pretty
well (no waiting for locks), but I abandoned my efforts due to another
project.

