The point is to standardize the interface so other plugins/gems can simply make calls to Rails.queue rather than try to accommodate every queueing engine themselves.
Skimming through the code, this lets you register a Queue class to serialize your jobs. So, if you use something like Delayed Job, you register the corresponding DJ::Queue class, which stores the jobs in whatever backend you desire, and then process them later via your daemon of choice.
So far so peachy keen. This is alright, I can get behind this - it will make moving between queueing solutions more palatable which is not a feature I can complain about.
My question then is: how will this work by default? Will the default Queue have some sort of callback that executes after it returns the response? For stuff like sending emails, for small apps, this is actually palatable - I'm more concerned about user latency than sheer requests/second.
In any event, the web shows it is actually possible to separate interfaces from implementations. J2EE had other issues.
Celery already does a great job, but it would be nice to have the batteries included.
Isn't that exactly what Simon is suggesting above?
He's not saying Django should provide its own implementation of a background queue, he's saying it should provide a base API which could be implemented by any number of backends, of which assuredly Celery would be one -- just as happens now with cache backends.
I agree with Simon and the commenter above, this would be a great addition to Django. I think it fits with the Django "batteries included" philosophy -- in this day and age, a background queue is practically a requirement for anything but the most basic web app. It also encourages standalone Django application developers to make use of background queuing without fear of forcing a specific implementation on users.
When you start dealing with large jobs, system resources start to become an issue. A job might take 48GB of memory, or it might take 1GB of memory, and the scheduler needs to be aware of this so that it isn't scheduling jobs on top of each other. Or you might have some low priority jobs that should only be run when the queue is mostly full so as not to compete with the high priority jobs. Or you might have jobs that depend on other jobs, and you want to enqueue them all and let the scheduler handle the dependencies. HPC schedulers deal with these requirements well.
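To make the scheduling requirements above concrete, here is a minimal sketch in Ruby of a selection step that respects memory, priority, and dependencies. Everything in it is hypothetical (the `Job` struct, the `schedulable` method, the job names and sizes); it is not how any particular HPC scheduler works, just an illustration of the bookkeeping involved.

```ruby
# Hypothetical job description: name, memory needed, priority (higher runs
# first), and the names of jobs that must finish before this one starts.
Job = Struct.new(:name, :memory_gb, :priority, :depends_on)

# Returns the jobs that can be started right now, given the names of the
# jobs that have already finished and the memory currently free.
def schedulable(jobs, finished, free_memory_gb)
  ready = jobs.reject { |job| finished.include?(job.name) }
              .select { |job| job.depends_on.all? { |dep| finished.include?(dep) } }

  picked = []
  # Highest priority first; the name is a tie-breaker so the order is stable.
  ready.sort_by { |job| [-job.priority, job.name] }.each do |job|
    next if job.memory_gb > free_memory_gb # would overlap with another job

    picked << job
    free_memory_gb -= job.memory_gb
  end
  picked
end

jobs = [
  Job.new("align",   48, 10, []),
  Job.new("index",   32, 8,  []),
  Job.new("report",  1,  5,  ["align"]),
  Job.new("cleanup", 1,  1,  [])
]

first_wave = schedulable(jobs, [], 64).map(&:name)
```

With 64GB free, the 48GB job is picked first, the 32GB job no longer fits, and the small low-priority job fills the gap, while "report" waits for its dependency.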
On the other hand, you might be in a situation where you have tens of thousands of jobs in the queue, and you need to add and remove jobs quickly. Things like Resque and Delayed Job handle these situations well.
HPC schedulers were built for research purposes, and background job schedulers were built for web applications. However, there are more and more companies dealing with large data problems that span both worlds. They have some large jobs and tons of small jobs, and they don't want to manage two separate clusters with two schedulers to handle the tasks.
Q: "I've heard for years that pagination should remain outside rails since it has to be lightweight, and now that !?"
homakov: Good example, but "pagination" is a design-related thing (like a decal on a car), whereas a "queue" or delayed jobs (jquery-deferred, for example) is a deep, built-in engine feature. As a car vendor you shouldn't choose decals for the driver, but you should install the best and most reliable parts under the hood, IMO.
Q: What's the point?
josevalim: The point of the Queue is to be small and provide an API that more robust engines like resque and sidekiq can hook into. So you can easily start with an in-memory queue (as you can see, the implementation does not even reach 100LOC), which is also easy to test, and then easily swap to another one. Why is this good? By having a unified API, tools like Devise and Action Mailer can simply use Rails.queue.push() instead of worrying about compatibility with different plugins. So the goal here is to provide an API for queueing along with a simple in-memory implementation. It is not meant to be a robust queue system.
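A rough sketch of what that unified API buys you, in plain Ruby with no Rails dependency. The only thing taken from the answer above is the idea of calling `push` on a queue; the class names (`InMemoryQueue`, `InlineQueue`, `WelcomeEmailJob`), the `run` and `drain` methods, and the `sign_up` helper are assumptions made up for illustration, not the actual Rails implementation.

```ruby
require "thread"

# Hypothetical job: any object that responds to #run.
class WelcomeEmailJob
  def initialize(address, outbox)
    @address = address
    @outbox = outbox
  end

  def run
    @outbox << "welcome mail to #{@address}"
  end
end

# Hypothetical in-memory backend: jobs wait in a thread-safe Queue until
# something (a worker thread, a test) drains them.
class InMemoryQueue
  def initialize
    @jobs = Queue.new
  end

  def push(job)
    @jobs.push(job)
    self
  end

  def size
    @jobs.size
  end

  # Runs every job currently queued and returns how many were run.
  def drain
    count = 0
    until @jobs.empty?
      @jobs.pop(true).run # non-blocking pop; safe because we checked empty?
      count += 1
    end
    count
  end
end

# Hypothetical alternative backend with the same #push contract: it runs
# the job immediately instead of deferring it.
class InlineQueue
  def push(job)
    job.run
    self
  end
end

# Library code only depends on #push, so the backend can be swapped freely.
def sign_up(address, queue, outbox)
  queue.push(WelcomeEmailJob.new(address, outbox))
end
```

A gem written against `push` alone would behave the same whether the application plugs in the in-memory queue, an inline one for tests, or an adapter around a real queueing system.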
Q: Why not make it into a gem?
josevalim: The implementation today is less than 100LOC, so there is no reason to move it to an external gem. If the implementation actually grows a lot, which I highly doubt, we can surely consider moving it to a gem.
Q: Why include it in Rails at all?
DHH: This is really very simple: Do most full-size Rails applications, think Basecamp or Github, need to use a queue? If the answer is yes, and of course it is, this belongs in Rails proper.
Q: Then, and I'm not just trolling, should Rails provide an API for user authentication or authorization?
DHH: authentication, pagination, etc are all application-level concerns -- not infrastructure. Think Person model vs ActiveRecord model. Another way to think of it is, would two applications have materially different opinions on queue.push depending on what they're doing? The answer is no. That is not the case for authentication, pagination, and other application-level concerns where the usage is often very different depending on what the application is trying to do.
Q: Is Rails getting too big?
DHH: The size of Rails itself is not a first-order metric of either progress or decline. The right question is: Does Rails solve more common problems than before without making the earlier solutions convoluted? In other words, what are the externalities of progress? Will introducing a queue API make it harder to render templates? Or route requests? No. Its most direct influence will be on things like ActionMailer, so a fair question will be: Is it harder or easier to use ActionMailer in a best-practice way after we get this? That's a fair question, but I'm absolutely confident that this will make idiomatic AM usage (queuing mail delivery outside of the request cycle) much easier. Thus, progress.
Or, let me put my question a little differently. Github did an awesome job writing about their experiences, and the reasoning that led them to create Resque. I'm wondering if anyone on the Rails team has posted an essay with as much background info as what Github did here:
But I'm also thinking about a conversation that happened here on Hacker News recently. 2 weeks ago: "Rails core killed ActiveResource"
and the original article touches upon the issue that I'd like to ask about here:
"It's not that I hate you or anything, but you didn't get much attention lately. There're so many alternatives out there, and I think people have made their choice to use them than you. I think it's time for you to have a big rest, peacefully in this Git repository."
Can't something similar be said about job queues? "There're so many alternatives out there, and I think people have made their choice to use them than you."?
So why create a new job queue system, and make it an official part of Rails? I am not sure I understand the intent.
The goal is not to replace the existing queue solutions, but to create a common API, so the rest of the gems can just treat all of them in a uniform way.
Quoting Jose Valim:
"The point of the Queue is to be small and provide an API that more robust engines like resque and sidekiq can hook in. So you can easily start with an in memory queue (as you can see, the implementation does not even reach 100LOC) which is also easy to test and then easily swap to another one.
Why is this good? By having a unified API, tools like Devise and Action Mailer can simply use Rails.queue.push() instead of worrying about compatibility with different plugins.
So the goal here is to provide an API for queueing along with a simple in-memory implementation. It is not meant to be a robust queue system."
I see this as a similar thing to having an interface for caching which can then be backed by memcached, redis or the filesystem. It strikes me as an excellent idea - pretty much every web application should have an offline queue of some sort these days.
Considering Rails has always been about best practices--and background job queueing is definitely a best practice--I think this is a great move.
This will also allow other gems/plugins to have an easy way to push their own jobs into the queue rather than trying to support a bunch of different queue implementations.
Incidentally - if anyone has any way of doing this in PHP without having to set up cron jobs (and not using node or its derivatives), I'm really open to any ideas!
This news isn't about Rails implementing its own background queue, but rather creating a unified API for interacting with background queuing systems, of which there are many. Resque (crafted at GitHub) is probably the most popular: https://github.com/defunkt/resque.
I've used it in production for a few different projects and highly recommend it (both the PHP and Ruby versions).
As someone pointed out in the OP comments, this is like Rack for queues.
Where a queue is really useful is converting from foreground to background, so that you can optimize for throughput, rather than having to leave free capacity on your foreground servers for 'random arrivals'. Think of it as the same problem as the bursty traffic that a bank machine gets, and why you always seem to have to line up.
The mathematical term is Poisson distribution: http://en.wikipedia.org/wiki/Poisson_distribution
cronjob removes them based on time entered, and sends them.
Alternatively, you can use curl to trigger a request inside of your page to another page (send_email.php) and don't wait for the response.
I ended up doing a little status page for my newsletter; I set it up to auto refresh in Opera, each one of the refreshes sends 10 emails, and prints their statuses/destination/titles as they go (it's also rate limited in memcached). I chuck that on the laptop or a third monitor and leave it for a couple of hours, keeping an eye on it as it goes.
Using something off the shelf I could trust would be much nicer.
So you will have to add some kind of throttling to make it work.
* Forcing a dependency on a particular queue
* Writing a wrapper for all possible queues
* Falling back to queue-less behavior in the absence of a detected queue
We have a tiny ruby process, based on event machine, that subscribes to various queues (we happen to use RabbitMQ). When a message arrives, the process makes a request to the passenger instance passing along the message data and waits for a response. The process limits the number of requests it makes to prevent background requests from blocking out front-end requests (for example, 20% of passenger_max_pool_size). We're also simulating priority by using different prefetch values for different queues (for example, 10 messages for high queue and 5 messages for low queue).
As in Java, where the JMS API has many implementations.
In some ways, I'm surprised this hasn't been there all along..
On the other hand I'm a little surprised something this simple is being celebrated as a big deal.
It's nice to see Rails continue to evolve; time will tell how much it ends up looking like the over-arching frameworks it was out to under-do.