
Rails 4 will establish a new background job queueing API - jroes
https://github.com/rails/rails/commit/adff4a706a5d7ad18ef05303461e1a0d848bd662
======
nateberkopec
Thanks for the accurate title - this is not a full queueing system, but a unified
API for hooking in bigger, badder queueing engines like Resque.

The point is to standardize the interface so other plugins/gems can simply
make calls to Rails.queue rather than try to accommodate every queueing engine
themselves.

~~~
phillmv
Someone please correct me if I am wrong.

Skimming through the code, this lets you register a Queue class to serialize
your jobs. So, if you use something like Delayed Job, you register the
(corresponding) DJ::Queue class that stores the jobs in whatever backend you
desire and then process it later via your daemon of choice.

So far so peachy keen. This is alright, I can get behind this - it will make
moving between queueing solutions more palatable which is not a feature I can
complain about.

My question then is: how will this work by default? Will the default Queue
have some sort of callback that executes after it returns the response? For
stuff like sending emails, for small apps, this is actually palatable - I'm
more concerned about user latency than sheer requests/second.

~~~
tomstuart
The default implementation is a stdlib Queue
(<http://www.ruby-doc.org/stdlib/libdoc/thread/rdoc/Queue.html>) which will be consumed by a
Rails::Queueing::ThreadedConsumer
([https://github.com/rails/rails/blob/602000b/railties/lib/rai...](https://github.com/rails/rails/blob/602000b/railties/lib/rails/queueing.rb#L34-63)).
You just drop job objects into the queue, and the consumer thread will call
#run on them.
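
A minimal sketch of the idea described above, not the actual Rails code: a stdlib `Queue` drained by one background thread that calls `run` on each job. The names `SketchConsumer` and `RecordingJob`, and the use of `nil` as a stop signal, are assumptions made for illustration.

```ruby
require "thread"

# Hypothetical consumer: one background thread drains a stdlib Queue
# and calls #run on every job it pops.
class SketchConsumer
  def initialize(queue)
    @queue = queue
  end

  def start
    @thread = Thread.new do
      # Queue#pop blocks until a job is available; nil is our stop signal.
      while (job = @queue.pop)
        begin
          job.run
        rescue => e
          # A failing job should not kill the consumer thread.
          warn "job failed: #{e.message}"
        end
      end
    end
    self
  end

  def shutdown
    @queue.push(nil)
    @thread.join
  end
end

# Hypothetical job: any object that responds to #run will do.
class RecordingJob
  def initialize(log, message)
    @log = log
    @message = message
  end

  def run
    @log << @message
  end
end

log = []
queue = Queue.new
consumer = SketchConsumer.new(queue).start
queue.push(RecordingJob.new(log, "welcome email sent"))
queue.push(RecordingJob.new(log, "report generated"))
consumer.shutdown # waits until both jobs have run
```

The caller returns from `push` immediately; the work happens on the consumer thread, which is why this only helps latency, not durability (jobs are lost if the process dies).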

------
shill
It would be awesome to have this in Django too.

Celery already does a great job, but it would be nice to have the batteries
included.

~~~
simonw
I agree - I see this as similar to Django's pluggable caching backends.

~~~
kingkilr
Let's not, Celery is doing an absolutely fantastic job in this space, let's
just stay out of their way and do what we can in terms of exposing APIs to
make their job easier. There's a reason django-core didn't write celery in the
first place, we didn't have the need or the expertise; there are other people
with both, let's let them do it.

~~~
jsdalton
> do what we can in terms of exposing APIs to make their job easier.

Isn't that exactly what Simon is suggesting above?

He's not saying Django should provide its own implementation of a background
queue, he's saying it should provide a base API which could be implemented by
any number of backends, of which assuredly Celery would be one -- just as
happens now with cache backends.

I agree with Simon and the commenter above, this would be a great addition to
Django. I think it fits with the Django "batteries included" philosophy -- in
this day and age, a background queue is practically a requirement for anything
but the most basic web app. It also encourages standalone Django application
developers to make use of background queuing without fear of forcing a
specific implementation on users.

~~~
simonw
Yes, that's exactly what I meant. Like you say: today, a background queue
should be part of the default stack for a web app (just like a template engine,
database, session storage and a cache have been in the past - components which
Django has provided since day one). No need to re-implement celery, but
encouraging the Django ecosystem to embrace offline queues (and letting
reusable apps know that they can push tasks in to an abstract queue of some
sort) would be very healthy.

~~~
megaman821
I agree that this would be a great addition to Django. Celery may not fit
everyone's needs. I would rather have something lighter weight for dev.

------
aaronjg
I'd love to see these job queuing platforms have better support for high
performance computing (HPC). Currently there are two paradigms of queuing
systems. Things like PBS/Torque and Sun/Oracle/Univa Grid engine which work
very well for small numbers of largish batch jobs, and things like Delayed Job,
Background Job and Resque which work well for huge numbers of small jobs.

When you start dealing with large jobs, system resources start to become an
issue. A job might take 48GB of memory, or it might take 1GB of memory, and
the scheduler needs to be aware of this so that it isn't scheduling jobs on
top of each other. Or you might have some low priority jobs that should only
be run when the queue is mostly full so as not to compete with the high
priority jobs. Or you might have jobs that depend on other jobs, and you want
to enqueue them all and let the scheduler handle the dependencies. HPC
schedulers deal with these requirements well.
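
As a rough illustration of the resource-awareness point above, here is a toy greedy pass that only starts a job if its memory still fits on the node. `BatchJob`, `schedule`, and the job names are hypothetical; real HPC schedulers such as the ones named above are far more involved (dependencies, backfill, fair share).

```ruby
# Hypothetical job description: name, memory requirement, priority.
BatchJob = Struct.new(:name, :memory_gb, :priority)

# Greedy sketch: consider jobs from highest priority down and start
# each one only if its memory still fits in what is left on the node.
def schedule(jobs, free_memory_gb)
  started = []
  jobs.sort_by { |job| -job.priority }.each do |job|
    next if job.memory_gb > free_memory_gb # would overlap a running job
    free_memory_gb -= job.memory_gb
    started << job.name
  end
  started
end

jobs = [
  BatchJob.new("index", 32, 1),
  BatchJob.new("assembly", 48, 3),
  BatchJob.new("align", 1, 2)
]
started = schedule(jobs, 64)
```

With 64GB free, the 48GB and 1GB jobs start and the 32GB job waits, which is the behaviour a plain FIFO web job queue would not give you.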

On the other hand, you might be in a situation where you have 10s of thousands
of jobs in the queue, and you need to add and remove jobs quickly. Things like
resque and delayed job handle these situations well.

HPC schedulers were built for research purposes, and background job schedulers
were built for the web applications. However there are more and more companies
dealing with large data problems that span both worlds. They have some large
jobs and tons of small jobs, and they don't want to manage two separate
clusters with two schedulers to handle the tasks.

------
danneu
I plucked the relevant points of discussion that reveal the thought process.

Q: "I've heard for years that pagination should remain outside rails since it
has to be lightweight, and now that !?"

homakov: good example, but "pagination" is a design-related thing (like a decal
on a car) but "queue" or delayed jobs (jquery-deferred for example) is a deep
engine built-in feature. As a car vendor you shouldn't choose decals for the driver
but you should install the best and most reliable stuff under its hood IMO

...

Q: What's the point?

josevalim: The point of the Queue is to be small and provide an API that more
robust engines like resque and sidekiq can hook in. So you can easily start
with an in memory queue (as you can see, the implementation does not even
reach 100LOC) which is also easy to test and then easily swap to another one.
Why this is good? By having an unified API, tools like Devise, Action Mailer
can simply use Rails.queue.push() instead of worrying with compatibility for
different plugins. So the goal here is provide an API for queueing and with a
simple in memory implementation. It is not meant to be a robust queue system.

...

Q: Why not make it into a gem?

josevalim: The implementation today is less than 100LOC, so there is no reason
to move it to an external gem. If the implementation actually grows a lot,
which I highly doubt, we can surely consider moving it to a gem.

...

Q: Why include it in Rails at all?

DHH: This is really very simple: Do most full-size Rails applications, think
Basecamp or Github, need to use a queue? If the answer is yes, and of course
it is, this belongs in Rails proper.

...

Q: Then, and I'm not just trolling, should Rails provide an API for user
authentication or authorization?

DHH: authentication, pagination, etc are all application-level concerns -- not
infrastructure. Think Person model vs ActiveRecord model. Another way to think
of it is, would two applications have materially different opinions on
queue.push depending on what they're doing? The answer is no. That is not the
case for authentication, pagination, and other application-level concerns
where the usage is often very different depending on what the application is
trying to do.

...

Q: Is Rails getting too big?

DHH: The size of Rails itself is not a first-order metric of neither progress
nor decline. The right question is: Does Rails solve more common problems than
before without making the earlier solutions convoluted? In other words, what
are the externalities of progress? Will introducing a queue API make it harder
to render templates? Or route requests? No. Its most direct influence will be
on things like ActionMailer, so a fair question will be: Is it harder or
easier to use ActionMailer in a best-practice way after we get this? That's a
fair question, but I'm absolutely confident that this will make using
idiomatic AM usage (queuing mail delivery outside of the request cycle) much
easier. Thus, progress.

------
lkrubner
I am curious, under what circumstances would one use this, rather than
something like Resque? And there is so much competition in this space, what
exactly is the argument for having this as part of Rails?

Or, let me put my question a little differently. Github did an awesome job
writing about their experiences, and the reasoning that led them to create
Resque. I'm wondering if anyone on the Rails team has posted an essay with as
much background info as what Github did here:

<https://github.com/blog/542-introducing-resque>

But I'm also thinking about a conversation that happened here on Hacker News
recently. 2 weeks ago: "Rails core killed ActiveResource"

<http://news.ycombinator.com/item?id=3818223>

and the original article touches upon the issue that I'd like to ask about
here:

"It's not that I hate you or anything, but you didn't get much attention
lately. There're so many alternatives out there, and I think people have made
their choice to use them than you. I think it's time for you to have a big
rest, peacefully in this Git repository."

Can't something similar be said about job queues? "There're so many
alternatives out there, and I think people have made their choice to use them
than you."?

So why create a new job queue system, and make it an official part of Rails? I
am not sure I understand the intent.

~~~
pdelgallego
> I am curious, under what circumstances would one use this, rather than
> something like Resque? And there is so much competition in this space, what

The goal is not to replace the existing queue solutions, but to create a
common API, so the rest of the gems can just treat all of them in a
uniform way.

Quoting Jose Valim:

"The point of the Queue is to be small and provide an API that more robust
engines like resque and sidekiq can hook in. So you can easily start with an
in memory queue (as you can see, the implementation does not even reach
100LOC) which is also easy to test and then easily swap to another one.

Why this is good? By having an unified API, tools like Devise, Action Mailer
can simply use Rails.queue.push() instead of worrying with compatibility for
different plugins.

So the goal here is provide an API for queueing and with a simple in memory
implementation. It is not meant to be a robust queue system. "

------
sheff
Rails 4 looks like it will have some nifty features - anyone have any
information on when the first Release Candidate will be?

~~~
xutopia
I heard that it would be when it is ready :-P

------
dancesdrunk
A feature that may very well make me finally jump over to RoR. I've recently
built quite a large site, and the only current bottleneck is when a few
emails need to be sent off at the same time with attachments. Being able
to add those to a "queue" and let the user continue browsing the site instead
of being stuck on a loading page (if only for a few seconds) would make the
current setup ideal.

Incidentally - if any one has any way of doing this in PHP without having to
setup cron jobs (and not using node or its derivatives), I'm really open to
any ideas!

~~~
mibbitier
Why wouldn't you want to use a DB queue or something, and have a separate
cronjob or process handle the outgoing email?

~~~
morgo
A queue is FIFO oriented, a database is least-recently-used (LRU). It works,
but is not going to be the most efficient tool.

Where a queue is really useful is converting from foreground to background, so
that you can optimize for throughput, rather than having to leave free
capacity for 'random arrivals' of your foreground servers. Think of it as the
same problem as the bursty traffic that a bank machine gets, and why you
always seem to have to line up.

The mathematical term is Poisson distribution:
<http://en.wikipedia.org/wiki/Poisson_distribution>

~~~
mibbitier
DB table "pending_emails" with a time field...

cronjob removes them based on time entered, and sends them.

 _shrug_
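
A minimal sketch of that approach, using a plain array as a stand-in for the "pending_emails" table so it runs without a database. `PendingEmail`, `process_pending`, and the `sender` callable are hypothetical names; a real version would use SQL and would need to handle concurrent workers and failures mid-send, which this does not.

```ruby
# Stand-in for a row in a "pending_emails" table with a time field.
PendingEmail = Struct.new(:recipient, :body, :entered_at)

# Hypothetical cron task: remove rows entered at or before `now`,
# then send them oldest first. Returns how many were processed.
def process_pending(table, now, sender)
  due, waiting = table.partition { |row| row.entered_at <= now }
  table.replace(waiting) # rows are removed before sending
  due.sort_by(&:entered_at).each { |row| sender.call(row) }
  due.size
end

table = [
  PendingEmail.new("b@example.com", "second", Time.at(200)),
  PendingEmail.new("a@example.com", "first", Time.at(100)),
  PendingEmail.new("c@example.com", "later", Time.at(900))
]
sent = []
count = process_pending(table, Time.at(500), ->(row) { sent << row.recipient })
```

Because rows are removed before they are sent, a crash between the two steps would lose mail; that is one of the trade-offs of treating a table as a queue.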

~~~
morgo
[http://www.engineyard.com/blog/2011/5-subtle-ways-youre-usin...](http://www.engineyard.com/blog/2011/5-subtle-ways-youre-using-mysql-as-a-queue-and-why-itll-bite-you/)

------
statictype
I don't use Rails but I often look to it for good/simple design ideas. I'm
interested in seeing how they implement simple, effective, reliable background
queuing.

------
edbloom
interesting - not sure if it's really needed though - I've used Redis and
Resque before and found its performance was blisteringly fast. (Resque was
made by Github <https://github.com/blog/542-introducing-resque>)

~~~
bradly
It isn't about speed or choice of queue, it's about a standardized API for
working with queues so you can focus on developing your application domain.
You will still be able to use Resque or Sidekiq or DJ or anything else, there
will just be a standard API for all of them to use.

~~~
Empact
Also, it's an ecosystem feature. If a library or a component of rails (e.g.
ActionMailer) wants to process something in a background queue, the choice
doesn't have to be between a host of bad options:

  * Forcing a dependency on a particular queue
  * Writing a wrapper for all possible queues
  * Falling back to queue-less behavior in the absence of a detected queue

They just use the Rails queue and it works on whatever real-world queue the
user picks. Definitely good infrastructure IMO.
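
To make the ecosystem point concrete, here is a sketch in which library code depends only on the queue responding to `push`, and the application decides which backend sits behind it. `InMemoryQueue`, `InlineQueue`, and `WelcomeMailer` are hypothetical names, not Rails classes; the only thing borrowed from the discussion is the `push` method and jobs that respond to `run`.

```ruby
# Hypothetical in-memory backend: stores jobs until something drains them.
class InMemoryQueue
  def initialize
    @jobs = []
  end

  def push(job)
    @jobs << job
    self
  end

  def drain
    @jobs.shift.run until @jobs.empty?
  end
end

# Hypothetical backend that runs each job immediately, e.g. in tests.
class InlineQueue
  def push(job)
    job.run
    self
  end
end

# Hypothetical library code: it knows only that the queue responds to #push.
class WelcomeMailer
  Job = Struct.new(:address, :outbox) do
    def run
      outbox << "welcome mail to #{address}"
    end
  end

  def initialize(queue, outbox)
    @queue = queue
    @outbox = outbox
  end

  def deliver_later(address)
    @queue.push(Job.new(address, @outbox))
  end
end

outbox = []
deferred = InMemoryQueue.new
WelcomeMailer.new(deferred, outbox).deliver_later("a@example.com")
pending_before_drain = outbox.dup # nothing has been delivered yet
deferred.drain

inline_outbox = []
WelcomeMailer.new(InlineQueue.new, inline_outbox).deliver_later("b@example.com")
```

The mailer is unchanged between the two runs; only the object passed in as the queue differs, which is the swap the comments above are describing.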

------
rjsamson
I'm sure there will be plenty of folks raging against it, but I for one am
glad to see the addition.

------
thibaut_barrere
Coupled with that, I would love to see Passenger support background workers
with the same lifecycle as front-end workers (but last time I suggested that,
it wasn't planned at all if I remember correctly).

~~~
pkmiec
We implemented something like this at the place I work.

We have a tiny ruby process, based on event machine, that subscribes to
various queues (we happen to use RabbitMQ). When a message arrives, the
process makes a request to the passenger instance passing along the message
data and waits for a response. The process limits the number of requests it
makes to prevent background requests from blocking out front-end requests (for
example, 20% of passenger_max_pool_size). We're also simulating priority by
using different prefetch values for different queues (for example, 10 messages
for high queue and 5 messages for low queue).
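
A toy model of the two ideas in this comment, with no EventMachine or RabbitMQ involved: capping background requests at a percentage of the pool, and approximating priority by taking more messages per round from one queue than another. `background_limit` and `take_round` are hypothetical helpers, and the arrays merely stand in for broker queues.

```ruby
# Hypothetical helper: cap background requests at a percentage of the
# pool size, but always allow at least one.
def background_limit(max_pool_size, percent = 20)
  [max_pool_size * percent / 100, 1].max
end

# Hypothetical round: take up to prefetch[name] messages from each queue,
# so a queue with a larger prefetch value gets a larger share per round.
def take_round(queues, prefetch)
  queues.flat_map { |name, messages| messages.shift(prefetch.fetch(name)) }
end

queues = { high: (1..12).to_a, low: (101..110).to_a }
prefetch = { high: 10, low: 5 }
first_round = take_round(queues, prefetch)
limit = background_limit(30)
```

With these numbers each round hands over twice as many high-queue messages as low-queue ones, and at most 6 of 30 pool slots would be used for background work.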

------
ecoffey
That's awesome to see. This was part of tenderlove's keynote.

------
dirkdk
separating API from actual implementation is always a good thing.

Like in Java, JMS API has many implementations

------
endlessvoid94
This strikes me as something that should be decoupled from rails.

~~~
j45
Maintaining interoperability between plugins (gems etc) is a perpetual
headache.

In some ways, I'm surprised this hasn't been there all along..

On the other hand I'm a little surprised something this simple is being
celebrated as a big deal.

It's nice to see rails continue to evolve; time will tell how much it ends up
looking like the over-arching frameworks it set out to under-do.

------
corwinstephen
Rails is literally unstoppable

------
andyl
Great news. A built-in background job queue should reduce the rails learning
curve - simpler to use a default option than research and test the various
custom options that are available now.

