Sneakers is a high-performance background-job processing framework based on RabbitMQ.
It uses a hybrid process-thread model: many processes are spawned (like Unicorn) and many threads are used per process (like Puma), so all your cores max out and you get the best of both worlds.
It's being used in production for I/O intensive jobs as well as CPU intensive jobs.
On a recent 2012 MBP it reaches 7000 req/s in a silly microbenchmark, while Sidekiq stays in the hundreds (600-700 req/s).
Three problems. First, it is too tightly coupled to Redis. Second, Celluloid (the actor library behind Sidekiq) proved to be part of the problem here. Lastly, I wanted a combined process-and-thread processing model, which doesn't exist there.
Then I tried rigging Celluloid to do all of this, but after failing and speaking with Celluloid committers I understood that it is under "major revision" and that they are not happy with some of the core elements right now (the thread pool).
THEN, I opted to reinvent the wheel. Even here, I chose serverengine - a production-tested core - as the basic process-management infrastructure to build Sneakers on.
But I feel it might be missing the point: RabbitMQ was chosen to remove the typical broker bottleneck that exists in most Ruby background frameworks, and using Postgres would bring it back. All in all, RabbitMQ gives you a good, transparent HA story with active-active clustering, which is useful when you must not lose messages (Postgres gives you just failover).
Edit: to cover @artellectual's response - I think OP meant to swap RabbitMQ with Postgres as the backend but if OP meant "is it possible to use existing Rails models / etc" - then yes, @artellectual is completely right - easy to do.
* when you back up your main data store, your jobs are also backed up. If you use Sidekiq, are you remembering to do frequent backups of Redis?
* only one data store, not multiple. Simpler architecture.
* jobs can be inserted/updated in the same transactions as the rest of your data
* PostgreSQL supports listen/notify, don't have to poll for new jobs
* can use SQL to query jobs
* jobs can have foreign key constraints to the rest of your data
Obviously it really depends on what you're doing - whether the jobs are re-entrant, whether they're the fire-and-forget kind, and so on. Still, in most cases I would just store the job as a table entry and pass the table+id to the job processor. Even if PostgreSQL were the primary data store for the job processor, there would still be issues with external state that needs to be healed when restoring from backup.
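The table+id pattern above keeps the queue payload tiny and leaves Postgres as the source of truth. A minimal plain-Ruby sketch of such a payload (the `JobRef` name and JSON shape are made up for illustration, not from any library):

```ruby
require 'json'

# Hypothetical reference to a job stored as a table row. The worker
# receives only (table, id) over the queue and loads the row itself.
JobRef = Struct.new(:table, :id) do
  def to_json(*)
    JSON.generate('table' => table, 'id' => id)
  end

  def self.from_json(payload)
    data = JSON.parse(payload)
    new(data.fetch('table'), data.fetch('id'))
  end
end

ref     = JobRef.new('invoices', 42)
payload = ref.to_json              # what actually goes onto the queue
decoded = JobRef.from_json(payload)
```

Because the payload is just a pointer, a restored database backup and the queue can disagree at most about which rows still need processing - which is where the re-entrancy caveat above matters.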
For RabbitMQ - you can have jobs persisted to disk on all nodes in the cluster which is like a backup. And of course transactional messaging is a problem that needs special attention.
This is of course talking about the stand-alone version. And I am talking, of course, about using it in tandem with RabbitMQ for job storage.
I've been wanting to migrate away from it toward more of a micro-service approach, so the code doesn't live inside the monolithic Rails application.
Bunny looked like a start, but I really wanted something more. This seems like the answer - I will definitely be using it for the migration.
@jondot, great job! Looks like a solid project!
Without having looked at it yet, how does the pre-built UI compare? From an initial glance it looks like anything that's missing is available via the DSL, but Sidekiq gave me so much of what I needed out of the box (having come from a custom-made solution using SQS + open-source CFML) that it made my head explode.
Batch handling is actually "prefetch" in RabbitMQ/AMQP. Error reporting happens via logging and a "dead letter mailbox" (called a dead-letter exchange in RabbitMQ), which is a great enterprise integration pattern for properly handling errors and retries in jobs.
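As a concrete sketch, both knobs are set per worker in Sneakers' `from_queue` DSL. This is a configuration sketch only - the queue and exchange names are made up, and it assumes the `sneakers` gem plus a running RabbitMQ:

```ruby
require 'sneakers'

class OrderWorker
  include Sneakers::Worker

  # prefetch: how many unacked messages RabbitMQ hands this worker at
  # once (the "batch handling" above). Rejected messages get routed to
  # the 'orders.dlx' dead-letter exchange for inspection and retry.
  from_queue 'orders',
             prefetch: 10,
             ack: true,
             arguments: { 'x-dead-letter-exchange' => 'orders.dlx' }

  def work(msg)
    # ... process msg ...
    ack!   # explicit acknowledgement; reject! would dead-letter the message
  end
end
```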
RabbitMQ does the heavy lifting in another surprising facet - the UI. The management UI is excellent and at some point in time I decided not to compete with it by creating my own UI (you have to pick your fights :).
Here's some more info about the management plugin for RabbitMQ.
I suspect that if this project takes off someone will put together a front end that mirrors Sidekiq's in short order.
I do wonder how this one would work out though, will have to give it a try at some point.
Discussing reliability - this is something that sadly Sidekiq will never give you, by virtue of the fact that it uses Redis. RabbitMQ can be clustered in active-active mode, which means you win on reliability here just by using a cluster.
When comparing queue systems, comparing Sidekiq+Redis to RabbitMQ is a bit unfair - because RabbitMQ was born to do this. And that's why if you're doing proper background jobs and messaging it's better to pick the right tool.
That being said, I do keep using Sidekiq for small Rails apps for the typical background emailers, denormalizers, etc. But I keep an eye open for when I realize that I'm doing proper messaging - in which case I'll switch over to something like Sneakers.
Redis Clustering tutorial: http://redis.io/topics/cluster-tutorial
Redis persistence (using AOF or RDB or both): http://redis.io/topics/persistence
Even though it doesn't have clustering, it's rock solid in production, and I haven't experienced a drop in one of my Redis servers in around 3 years.
That being said, if you are building a system where reliability is an explicit requirement you can't take those risks.
I would start here: http://aphyr.com/tags/Redis
However, what was your concurrency level set to? I know that when I've set it to one, I've had good success with even the most demanding tasks. You essentially lose the threading benefit, but keep the other benefits.
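For reference, the concurrency level is just the thread count in the global configuration. A sketch, assuming the `sneakers` gem (the numbers are illustrative):

```ruby
require 'sneakers'

Sneakers.configure(
  workers:  4,   # number of worker processes
  threads:  1,   # one thread per process: serial, predictable job handling
  prefetch: 1    # don't buffer extra messages while a job is running
)
```

With `threads: 1` each process handles one job at a time, which trades throughput for the predictability described above.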
At least, as reliable as you can get with Redis...
I've used RabbitMQ for years in production, for the last two years with my Ruby background processing system Woodhouse. One of the nice things I got out of using RabbitMQ was the ability to expose job arguments as AMQP headers and then to use headers exchanges to segment queues based on that. This makes it a lot easier to allocate extra resources for high-priority jobs without having to explicitly create new priority queues.
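The headers-exchange trick rests on a simple matching rule: a binding matches a message when all (`x-match: all`, the default) or any (`x-match: any`) of the binding's header pairs appear in the message headers. A plain-Ruby simulation of that rule - the `matches?` helper is mine for illustration, not an AMQP API:

```ruby
# Simulates RabbitMQ headers-exchange binding semantics. binding_args
# holds the header pairs a queue was bound with, plus an optional
# 'x-match' => 'all' (default) or 'any'.
def matches?(binding_args, message_headers)
  mode  = binding_args.fetch('x-match', 'all')
  pairs = binding_args.reject { |k, _| k == 'x-match' }
  if mode == 'any'
    pairs.any? { |k, v| message_headers[k] == v }
  else
    pairs.all? { |k, v| message_headers[k] == v }
  end
end

# A "high-priority reports" binding, as described above: job arguments
# exposed as headers route to a dedicated queue without a priority queue.
binding_args = { 'x-match' => 'all', 'job' => 'report', 'priority' => 'high' }

matches?(binding_args, 'job' => 'report', 'priority' => 'high')  # => true
matches?(binding_args, 'job' => 'report', 'priority' => 'low')   # => false
```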
For the author: are the issues you had with Celluloid mostly due to your requirement to run on MRI? For a while I was maintaining a serviceable monkeypatch for Celluloid on MRI, but I eventually stopped needing it. It does unfortunately seem to be a bit of a moving target.
Yes, you nailed it. For MRI I had a bit of a different challenge. I already solved this problem a year and a half ago, and with the benefit of being able to use JRuby, performance was a bit easier to reach (by dropping down to the "bare" Java AMQP driver and Executors) - https://github.com/jondot/frenzy_bunnies
As always, the first question on my mind is "Why X, instead of A,B,C?" (sidekiq in this case). The OP's page is here:
The "auto-scaling" is still manually controlled, right? (Dynamic scaling, on-the-fly scaling?) Or does Sneakers actually change the number of processes/threads by itself depending on load?
For the less ops-savvy among us, what are some good heuristics for deciding on the balance between processes and threads?
The sad news is, I've gotten some feedback that I believe may be true: self-daemonizing processes are bad practice, and we should let the OS handle daemonization. This kind of autoscaling is bad practice in and of itself because of that. This is why I've started to deprecate this feature (passively, by just including a notice for now).
The question of the split between processes and threads is excellent. It is mostly based on the workload - and the good news is that it's all scientific. You first need to understand the peak job run time (always try to upper-bound your jobs with timeouts), which you can get from a few trial runs.
If a job takes 200ms (I/O bound), each thread can do 5 units of work per second. If you need 1000 jobs/sec, you need around 200 of these little guys at worst to do the work. You can then divide those into 4 processes on a dual-core machine (2 per core is a good rule of thumb), which leaves you with 50 threads per worker - pretty relaxed.
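The arithmetic above, spelled out (numbers taken straight from the example):

```ruby
job_time_s     = 0.200   # peak job run time: 200ms, I/O bound
target_rps     = 1000    # required jobs per second
cores          = 2       # dual-core machine
procs_per_core = 2       # rule of thumb: 2 processes per core

jobs_per_thread_per_sec = 1.0 / job_time_s                      # 5.0
threads_needed   = (target_rps / jobs_per_thread_per_sec).ceil  # 200
processes        = cores * procs_per_core                       # 4
threads_per_proc = (threads_needed.to_f / processes).ceil       # 50
```

Swap in your own measured job time and target rate; the upper-bounding via timeouts matters because one unbounded job skews the whole estimate.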
The punchline is - if you need 1000req/s - with Sneakers the question isn't "can the broker support 1000req/s" anymore, because RabbitMQ should virtually look down and laugh at those numbers :).
I've never used RabbitMQ - is it easy to set up on, say, an Ubuntu 12.04 VPS? How do you restart Sneakers gracefully when deploying? How do you monitor Rabbit/workers? (this is probably most important)
Thanks for this project - I'm looking forward to trying it out.
Are job batches on the roadmap? Something like:
FirstWorker, when done successfully > MySecondWorker x 30 in parallel > ClosingWorker
Once the ClosingWorker is finished, the batch is complete.