

Show HN: Sneakers – Fast background processing for Ruby - jondot
http://sneakers.io

======
jondot
Author here - Hi! :)

Sneakers is a high-performance background-job processing framework based on
RabbitMQ.

It uses a hybrid process-thread model where many processes are spawned (like
Unicorn) and many threads are used per process (like Puma), so all your cores
max out and you get the best of both worlds.

It's being used in production for I/O intensive jobs as well as CPU intensive
jobs.

On a recent 2012 MBP it reaches 7000 req/s in a silly microbenchmark, while
Sidekiq stays in the hundreds (600-700 req/s).
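The hybrid model described above can be sketched in plain Ruby (no Sneakers API used; the constants and job slicing here are illustrative): a supervisor forks worker processes, Unicorn-style, and each process drains work with a small thread pool, Puma-style. In Sneakers itself, each process would hold its own RabbitMQ connection instead of a pre-sliced local queue.

```ruby
PROCESSES = 2
THREADS_PER_PROCESS = 3

def run_thread_pool(jobs)
  threads = Array.new(THREADS_PER_PROCESS) do
    Thread.new do
      # pop returns nil once the queue is closed and drained
      while (job = jobs.pop)
        job.call
      end
    end
  end
  threads.each(&:join)
end

# Hand each forked process its own slice of work; with a real broker,
# every process would instead consume from the shared RabbitMQ queue.
work = (1..12).to_a
pids = work.each_slice(work.size / PROCESSES).map do |slice|
  fork do
    q = Queue.new
    slice.each { |n| q << -> { n * n } }
    q.close
    run_thread_pool(q)
  end
end

statuses = pids.map { |pid| Process.wait2(pid).last }
```

Since each process is a full fork, C-extension gems and the GIL stop being a bottleneck: concurrency across cores comes from processes, and I/O concurrency within a core comes from threads.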

~~~
judofyr
Have you considered pluggable backends (with a Sidekiq/Resque/Redis backend)
so you can easily upgrade a Sidekiq stack to Sneakers?

~~~
jondot
Yes. I dislike reinventing the wheel; I actually bumped my head against making
Sidekiq use a different backend.

Three problems: first, it is too tightly coupled to Redis. Second, Celluloid
(the actor library behind Sidekiq) proved to be part of the problem here.
Lastly, I wanted both a process and a thread processing model, which doesn't
exist there.

Then, I tried rigging Celluloid to do all of this, but after failing and
speaking with Celluloid committers I understood that it is under "major
revision" and that they are not happy with some of the core elements right now
(the thread pool).

THEN, I opted to reinvent the wheel. Even here, I chose serverengine as the
basic process-management infrastructure - a production-tested core to build
Sneakers on.

~~~
joevandyk
How hard would it be to use a PostgreSQL backend with sneakers?

~~~
jondot
Not hard, because most things are abstracted (though there's a trade-off
between too much abstraction and performance).

But I feel it might be missing the point: RabbitMQ was used to remove the
typical broker bottleneck that exists in most Ruby background frameworks, and
using Postgres would bring it back. All in all, RabbitMQ will give you a good,
transparent HA story with active-active, which is useful when you must not
lose messages (Postgres will give you just failover).

Edit: to cover @artellectual's response - I think OP meant to swap RabbitMQ
with Postgres as the backend, but if OP meant "is it possible to use existing
Rails models / etc" - then yes, @artellectual is completely right - easy to
do.

~~~
joevandyk
I meant using PostgreSQL as the backend.

Benefits:

* when you back up your main data store, your jobs are also backed up. If you use sidekiq, are you remembering to do frequent backups of redis?

* only one data store, not multiple. Simpler architecture.

* jobs can be inserted/updated in the same transactions as the rest of your data

* PostgreSQL supports listen/notify, don't have to poll for new jobs

* can use SQL to query jobs

* jobs can have foreign key constraints to the rest of your data
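A hypothetical sketch of what the setup above might look like (table and column names are all illustrative, not from any existing library): a jobs table with a foreign key into application data, enqueued in the same transaction as the business change, with NOTIFY waking workers instead of polling.

```sql
-- Illustrative schema: a jobs table with a FK into the rest of your data.
CREATE TABLE jobs (
  id         bigserial PRIMARY KEY,
  user_id    bigint REFERENCES users (id),
  payload    jsonb NOT NULL,
  state      text  NOT NULL DEFAULT 'new',
  created_at timestamptz NOT NULL DEFAULT now()
);

-- Enqueue atomically with the business change; workers run
-- LISTEN jobs_channel and are woken by NOTIFY instead of polling.
BEGIN;
UPDATE users SET plan = 'pro' WHERE id = 42;
INSERT INTO jobs (user_id, payload) VALUES (42, '{"job":"send_receipt"}');
NOTIFY jobs_channel;
COMMIT;
```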

~~~
zimbatm
It doesn't hurt to keep a higher-level job representation in DB and sync the
state at each step of the processing. I don't know about sneakers but you
might want to have more states than the usual new/processing/success|fail.

Obviously it really depends on what you're doing: whether the jobs are
re-entrant, whether they're the fire-and-forget kind, and so on. Still, in
most cases I would just store the job as a table entry and pass the table+id
to the job processor. Even if PostgreSQL were the primary data store for the
job processor, there would still be issues with external state that needs to
be healed when restoring from backup.

------
avitzurel
I have been using Sidekiq in production for over a year now, running over 11B
jobs over that year, peaking around 20m jobs per day.

I've been wanting to migrate away from it to use more of a micro-service
approach, so the code doesn't live inside the monolithic rails application.

Bunny looked like a start, but I really wanted something more. This seems like
the answer; I will definitely be using it for the migration.

@jondot, great job! Looks like a solid project!

~~~
avitzurel
11B == 1B, sorry

------
bdcravens
Looking forward to checking this out. The biggest win regarding Sidekiq for me
wasn't so much performance (which was definitely better than Resque) but its
retry logic, error reporting, batch handling (granted, that's paid), ease
of plugging into an existing app's security model, and more.

Without having looked at it yet, how does the pre-built UI compare? From an
initial glance, it looks like anything that's missing is available via the
DSL, but Sidekiq gave me so much of what I needed out of the box (having come
from a custom-made solution using SQS + open source CFML) it made my head
explode.

~~~
chadcf
Sidekiq's retry and robust handling of jobs (at least with reliable queuing in
Pro) was the big win indeed. I had an app I tried switching over to it,
though, and it did not go well. In my case I'm running background jobs that
take 4-5 minutes to complete and are pretty CPU intensive, and I suspect that
because Sidekiq is threaded it was only using one CPU core and completely
thrashing the machine. I can run about twice as many workers with Resque as I
can with Sidekiq, so I had to switch back. Sadly, now I've lost the reliable
queuing and automatic retries (the latter at least is not too hard to
implement in an ensure block).
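The hand-rolled retry the commenter mentions is usually written as a rescue with an attempt counter rather than an ensure block; a minimal plain-Ruby sketch (the helper name and the flaky job are illustrative):

```ruby
# Retry a block up to max_attempts times, re-raising on final failure.
def with_retries(max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
end

# Simulated flaky job: fails twice, then succeeds on the third attempt.
calls = 0
result = with_retries(max_attempts: 3) do
  calls += 1
  raise "transient failure" if calls < 3
  :done
end
```

In a real worker you'd likely restrict the rescue to known-transient errors and add a backoff sleep between attempts.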

I do wonder how this one would work out though, will have to give it a try at
some point.

~~~
jondot
More or less the same. Plus it will use all cores on MRI, so you can keep
using gems with C extensions (i.e. you don't have to run rbx or JRuby to max
out all cores).

Discussing reliability - this is something that, sadly, Sidekiq will never
give you, by virtue of the fact that it uses Redis. RabbitMQ can be clustered
in active-active mode, which means you get reliability just by running a
cluster.

When comparing queue systems, comparing Sidekiq+Redis to RabbitMQ is a bit
unfair - because RabbitMQ was born to do this. And that's why if you're doing
proper background jobs and messaging it's better to pick the right tool.

That being said, I do keep using Sidekiq for small Rails apps for the typical
background emailers, denormalizers, etc. But I keep an eye open for when I
realize that I'm doing proper messaging - in which case I'll switch over to
something like Sneakers.

~~~
MoOmer
I've got to say that the more I hear "Redis can never be reliable" the more I
cringe. It just seems like one of those things that's been said and repeated
without people stopping to fact-check along the way.

Redis Clustering tutorial: [http://redis.io/topics/cluster-tutorial](http://redis.io/topics/cluster-tutorial)

Redis persistence (using AOF or RDB or both):
[http://redis.io/topics/persistence](http://redis.io/topics/persistence)

~~~
jondot
Right now Redis clustering is not production-ready. I wish it were. As I
stated in the Wiki, you'll have to pry Redis from my dead body - I am very
happy with it, and for me it's a true Swiss Army knife and I've used it as
such.

Even though it doesn't have clustering - it's rock solid in production and I
haven't experienced a drop in one of my Redis servers in around 3 years.

That being said, if you are building a system where reliability is an explicit
requirement you can't take those risks.

------
djur
This looks great. I'm really happy to see more frameworks being built on top
of RabbitMQ. AMQP gets some deserved heat for its design-by-committee nature
but RabbitMQ makes something really good of it.

I've used RabbitMQ for years in production, for the last two years with my
Ruby background processing system Woodhouse[1]. One of the nice things I got
out of using RabbitMQ was the ability to expose job arguments as AMQP headers
and then to use headers exchanges to segment queues based on that. This makes
it a lot easier to allocate extra resources for high-priority jobs without
having to explicitly create new priority queues.

For the author: are the issues you had with Celluloid mostly due to your
requirement to run on MRI? For a while I was maintaining a serviceable
monkeypatch for Celluloid on MRI, but I eventually stopped needing it. It does
unfortunately seem to be a bit of a moving target.

[1]: [https://github.com/mboeh/woodhouse](https://github.com/mboeh/woodhouse)

~~~
jondot
djur - thanks :)

Yes, you nailed it. For MRI I had a bit of a different challenge. I already
solved this problem a year and a half ago, and with the benefit of being able
to use JRuby performance was a bit easier to reach (by dropping to "bare" Java
amqp driver and Executors) -
[https://github.com/jondot/frenzy_bunnies](https://github.com/jondot/frenzy_bunnies)

------
tdumitrescu
Looks totally sweet.

The "auto-scaling" is still manually controlled, right? (Dynamic scaling, on-
the-fly scaling?) Or does Sneakers actually change the number of
processes/threads by itself depending on load?

For the less ops-savvy among us, what are some good heuristics for deciding on
the balance between processes and threads?

~~~
jondot
tdumitrescu "auto-scaling" is exactly what Unicorn gives you. You can scale up
or down a running pack of worker processes by sending signals (kill -USRX) to
the supervisor.

The sad news is, I've gotten some feedback that I believe may be true:
self-daemonizing processes are a bad practice and we should let the OS handle
daemonization, and this kind of autoscaling is a bad practice in and of itself
because of it. This is why I've started to deprecate this feature (passively,
by just including a notice for now).

The question of the balance between processes and threads is excellent. It is
mostly based on the workload - and the good news is that it's all scientific.
You first need to understand the peak job run time (always try to upper-bound
your jobs with timeouts), which you can get from some trial runs.

If a job takes 200ms (I/O bound), each thread can do 5 units of work per
second. If you need 1000 jobs/sec, you need around 200 of these little guys at
worst to do the work. Now you can divide those into 4 processes on a dual-core
machine (2 per core is a good rule of thumb). You end up with 50 threads per
worker, which is pretty relaxed.
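The arithmetic above can be checked in a few lines of Ruby (the helper name and keyword arguments are illustrative, not part of Sneakers):

```ruby
# Back-of-the-envelope worker sizing: given a peak job time and a target
# throughput, derive the total threads needed and split them across
# processes (2 processes per core, per the rule of thumb above).
def sizing(job_time_sec:, target_jobs_per_sec:, cores:)
  jobs_per_thread = 1.0 / job_time_sec          # units of work per thread/sec
  threads_total   = (target_jobs_per_sec / jobs_per_thread).ceil
  processes       = cores * 2
  {
    threads_total:       threads_total,
    processes:           processes,
    threads_per_process: (threads_total.to_f / processes).ceil
  }
end

plan = sizing(job_time_sec: 0.2, target_jobs_per_sec: 1000, cores: 2)
# => { threads_total: 200, processes: 4, threads_per_process: 50 }
```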

The punchline is - if you need 1000req/s - with Sneakers the question isn't
"can the broker support 1000req/s" anymore, because RabbitMQ should virtually
look down and laugh at those numbers :).

------
bch
For a moment, I was confused, but _why's Ruby project is called "Shoes".

~~~
jondot
Sorry about that :)

------
callmeed
I'm a big sidekiq fan but you've got my interest piqued ... aside from the
nice approach to retries/failures, the best thing about sidekiq for me is ease
of production setup/deployment. Redis is easy to install and sidekiq has nice
capistrano tasks. They're all very easy to monitor.

I've never used RabbitMQ - is it easy to set up on, say, an Ubuntu 12.04 VPS?
How do you restart Sneakers gracefully when deploying? How do you monitor
Rabbit/workers? (This is probably most important.)

Thanks for this project - I'm looking forward to trying it out.

------
nat
I have more of a django background, so I've never used any of the libraries
that you compare this to. How does Sneakers compare to something like Celery
([http://www.celeryproject.org/](http://www.celeryproject.org/))? Does it let
you kick off async jobs and get results back, or is it just about throwing
messages over the wall and letting workers process them?

------
knes
Wow, I just finished deploying a Rails app that needs a lot of scrapers with
Sidekiq, and it was a great process, but Sneakers looks very nice!

Are batches of jobs on the roadmap? Something like:

Batch1: FirstWorker, when done successfully > MySecondWorker x 30 in parallel >
ClosingWorker

Once the ClosingWorker is finished, the batch is complete.

------
jeffblake
My biggest obstacle with Sidekiq is communicating back to the frontend client
when jobs are done (i.e. credit card processed, FacebookGraph friends cached,
etc). Right now I use Pusher (websockets) with a polling fallback, but it's
clunky to develop and who likes polling. Does this solution address that at
all? If not, what would you do?

------
danso
Kudos to the OP for a thorough Wiki/documentation.

As always, the first question on my mind is "Why X, instead of A,B,C?"
(sidekiq in this case). The OP's page is here:

[https://github.com/jondot/sneakers/wiki/Why-i-built-it](https://github.com/jondot/sneakers/wiki/Why-i-built-it)

~~~
jondot
Thanks danso! I'm happy that you find it useful. I also hope to integrate
conclusions from relevant discussion here back into the Wiki.

------
steveklabnik
Resque maintainer here: this looks really great! Congrats!

