

Ask HN: How should I manage recurring jobs? - Abundnce10

I use Python for most of my data processing jobs (e.g. Requests to fetch a webpage or hit an API, BeautifulSoup to parse HTML, Selenium to mimic browser actions). Most of my jobs go out to external sources, such as GoogleAnalytics or vendors we use that have their own reporting interface, and then bring the data back into our own PostgreSQL database.

I prefer to use Ruby on Rails when building a web app since I know it best. Right now I have a fairly basic app that authenticates my colleagues (w/ Devise) and displays a couple of reports. For jobs that need to run every hour/day I use a Linux cronjob.

Basically, the workflow goes like this: I create a new Rails model/view/controller for a report I want to build. I then create a Python script to fetch the data, and once that's working I set it up as a recurring cronjob.

However, this doesn't seem like a very manageable approach. Right now it's feasible because I only have a handful of jobs that run hourly/daily. If a job doesn't run one hour it's not the end of the world. I log any cronjob errors to a text file, but I don't have a system to alert me if something goes wrong, and I don't have a way to easily run a job again if it fails.

What I'm looking for is any advice on how I could tweak my approach to be more sustainable in the long run. Anything specific to Python/RoR would be ideal. Thanks!
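
To make the two gaps above concrete (no alert on failure, no easy re-run), here is a minimal sketch of a wrapper that a cron-launched Python script could call. It is only an illustration under stated assumptions: `run_with_retries`, `job`, and `alert` are hypothetical names, and `alert` stands in for whatever notification you would actually wire up (email, chat webhook, etc.).

```python
import logging
import time

logger = logging.getLogger("jobs")


def run_with_retries(job, attempts=3, delay_seconds=0, alert=None):
    """Run job() up to `attempts` times.

    Returns True as soon as one attempt succeeds. If every attempt raises,
    calls alert(message) once (when an alert callable is given) and
    returns False. All names here are hypothetical, not from any library.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            job()
            return True
        except Exception as exc:
            last_error = exc
            # Log the traceback so the text-file log still has full detail.
            logger.exception("attempt %d/%d failed", attempt, attempts)
            if attempt < attempts:
                time.sleep(delay_seconds)

    if alert is not None:
        name = getattr(job, "__name__", repr(job))
        alert(f"job {name} failed after {attempts} attempts: {last_error!r}")
    return False
```

Each script would then end with something like `run_with_retries(fetch_report, alert=send_email)`, where `fetch_report` and `send_email` are your own functions. This keeps cron as the scheduler and only adds retry and notification around the job body.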
======
maratd
Take a look at CasperJS. That will eliminate the Python side of things. You
can probably build your report within CasperJS too, but I would just export
the relevant data to a database and standardize your RoR codebase to the point
where adding extra reports is a simple proposition.

~~~
Abundnce10
Even if I use CasperJS, I'll still need to use Python to connect to the
GoogleAnalytics API, right? I guess I'm more interested in alternatives to
cron for managing/monitoring recurring jobs.

