

Ask HN: Suggestions for Writing Background Jobs - mabid

I need to implement a few background jobs that run continuously and make a lot of HTTP calls to some APIs to gather data and store it in MySQL. I would like your suggestions and comments on the best approach, architecture, and language to use so that the system is scalable.

I will need it to be threaded so that I can pull data simultaneously.

What language should I use?

1) Ruby
2) Scala
3) Java
4) PHP

I know Ruby, PHP, and Java well.

Thanks
======
JBerlinsky
RabbitMQ (or any AMQP broker) sounds like it would do the job, since it sounds
like you need to send very little data (a URL) to the background worker. I
believe there are some good AMQP/RabbitMQ gems for Ruby, but my knowledge
there is a little dated. The basic premise is that a broker (RabbitMQ server,
etc.) holds a queue of messages, which are distributed to N subscribers when
they ask for them. The more subscribers you have on the broker, the more jobs
can run at the same time.
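
To make the queue-and-subscribers idea concrete, here is a minimal in-process sketch in Java. It is not RabbitMQ: a `BlockingQueue` stands in for the broker and plain threads stand in for the subscribers. The class name `QueueWorkersSketch`, the `STOP` marker, and the `handler` parameter are all hypothetical names chosen for illustration; a real worker would make the HTTP call and the MySQL write inside the handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// In-process stand-in for the broker/subscriber pattern (assumption: this is
// only an illustration, not an AMQP client).
class QueueWorkersSketch {
    // Hypothetical marker that tells a worker there is no more work.
    private static final String STOP = "__STOP__";

    // Starts `workers` threads that pull URLs from a shared queue and pass each
    // one to `handler`. Returns the handler results in completion order.
    static List<String> process(List<String> urls, int workers,
                                Function<String, String> handler) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Queue<String> results = new ConcurrentLinkedQueue<>();
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String url = queue.take(); // blocks until a job arrives
                        if (STOP.equals(url)) {
                            break;
                        }
                        results.add(handler.apply(url));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }

        queue.addAll(urls);               // publish the jobs
        for (int i = 0; i < workers; i++) {
            queue.add(STOP);              // one stop marker per worker
        }
        for (Thread t : threads) {
            try {
                t.join();                 // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new ArrayList<>(results);
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
            "http://example.com/a", "http://example.com/b", "http://example.com/c");
        // The handler only tags the URL here; it is a placeholder for real work.
        List<String> done = process(urls, 2, url -> "fetched " + url);
        System.out.println(done.size() + " jobs processed");
    }
}
```

Raising the `workers` argument is the in-process equivalent of adding subscribers. With a real broker the workers can also live in separate processes or on separate machines, which this sketch does not cover.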

However, RabbitMQ might be too much overhead for what you want to do. With
Ruby, specifically, I've had a good deal of experience with Resque, which uses
Redis (key-value store) as a queueing system, much like RabbitMQ. It's easy to
set up, and gets the job done just as well.

------
drKarl
If you know Java and Scala, I would recommend giving the Play framework a
try. It is extremely simple and fast to set up and develop with, plus it uses
the Quartz scheduler library for jobs.

<http://www.playframework.org/documentation/1.2.3/jobs>

Alternatively you could simply use Quartz alone.

<http://www.quartz-scheduler.org>

~~~
drKarl
And for the rest of your needs, Play has you covered too...

You can use Play's WS class to make asynchronous HTTP calls from your server,
and it integrates nicely with MySQL or your database of choice too. I think it
uses Hibernate under the hood, but leverages all its power while simplifying
configuration and usage.

You can use Play with Scala, and if you do so, there is a nice database layer
called Anorm.

~~~
mabid
I see Play is a web dev framework. As far as my system is concerned, I don't
need a full web app. The system just needs to sit in the background, read
the database, request 2-3 APIs for data, and put that back in the db.
I expect a lot of writes to the database. I am reading about Play's support
for jobs. Do you still think Play is the way to go?

~~~
drKarl
I see your point. Play is a framework based on the MVC pattern. If you just
don't provide any View layer, you can use the Controller to access the Model,
and use the rest of the goodies Play gives you for free, like Jobs, the WS
class for easy HTTP calls, a RESTful interface, etc.

As I said, you can always use just Quartz for the jobs (that's what Play uses
anyway), and create your own Data Access Layer or use an ORM, or whatever you
like.
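
For a sense of what a recurring job looks like without any framework, here is a small sketch that uses the JDK's `ScheduledExecutorService` as a stand-in for Quartz. It is not the Quartz API. The class name `PollingJobSketch`, the method `runFor`, and the timing values are hypothetical; the `job` argument is where the read-from-db, call-the-APIs, write-back step would go.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// JDK-only stand-in for a scheduler library (assumption: illustration only).
class PollingJobSketch {
    // Runs `job` repeatedly with `periodMillis` between runs for roughly
    // `totalMillis`, then shuts down and returns how many runs were attempted.
    static int runFor(Runnable job, long periodMillis, long totalMillis) {
        AtomicInteger runs = new AtomicInteger();
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

        scheduler.scheduleWithFixedDelay(() -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                // An uncaught exception would suppress all later runs,
                // so failures are logged and swallowed here.
                System.err.println("job failed: " + e);
            } finally {
                runs.incrementAndGet();
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);

        try {
            Thread.sleep(totalMillis);   // let the job run for a while
            scheduler.shutdown();        // stop scheduling new runs
            scheduler.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            scheduler.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        int n = runFor(() -> System.out.println("polling APIs..."), 100, 500);
        System.out.println("attempted runs: " + n);
    }
}
```

A long-lived service would skip the sleep-and-shutdown part and leave the scheduler running. Quartz adds features such as cron-style triggers on top of this basic idea.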

------
ColinWright
It would be polite to put at least a hint of your question in the title so
people know what it's about.

~~~
mabid
I noticed that after I posted. I have been following HN for over a year, but
this was my first post, so that's why...

