
Ask HN: What do you use to manage chains of long-running jobs? - polm23
For running chains of multi-hour jobs with dependencies I've always used
Jenkins but always been unsatisfied with it. I've also heard of Luigi but
haven't given it a shot yet. Is there anything else similar I should look at?
I've had trouble finding a single name for programs like this to search with.

Luigi: https://github.com/spotify/luigi

Example jobs:

  - download a bunch of raw data from different sources
  - clean and transform raw data into a single format
  - push a lot of data into a DB
  - routine ML model updates
  - generate reports

These would have dependency relations, some would be short and some would be long, sometimes they fail (which may be important or not), etc...
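
To make the shape of the problem concrete, here is a rough sketch of such a chain using only the Python standard library (`graphlib`, Python 3.9+). The job names, the `DEPENDENCIES` map, and the `optional` parameter are hypothetical, chosen to mirror the list above. It is not how Jenkins, Luigi, or any other tool mentioned here works internally, and it leaves out scheduling, retries, and parallelism.

```python
from graphlib import TopologicalSorter

# Hypothetical job names; each maps to the set of jobs it depends on.
DEPENDENCIES = {
    "download": set(),
    "clean": {"download"},
    "load_db": {"clean"},
    "update_model": {"load_db"},
    "report": {"load_db", "update_model"},
}


def run_chain(jobs, dependencies, optional=frozenset()):
    """Run jobs in dependency order and return a status per job.

    A job is skipped when one of its dependencies did not succeed,
    unless that dependency is listed in `optional` (a failure that
    is not considered important).
    """
    status = {}
    # static_order() yields every job after all of its dependencies.
    for name in TopologicalSorter(dependencies).static_order():
        blocked = any(
            status[dep] != "ok" and dep not in optional
            for dep in dependencies[name]
        )
        if blocked:
            status[name] = "skipped"
            continue
        try:
            jobs[name]()
            status[name] = "ok"
        except Exception:
            status[name] = "failed"
    return status
```

Calling `run_chain` with a dict of callables keyed by the same names returns something like `{"download": "ok", "clean": "failed", ...}`, which is roughly the bookkeeping a workflow tool does on top of the actual scheduling.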
======
jasonmotylinski
Luigi, while great and simple, has been superseded by Apache Airflow:
[https://airflow.incubator.apache.org/](https://airflow.incubator.apache.org/).

Spotify built a scheduler system named Styx that handles orchestration
([https://github.com/spotify/styx](https://github.com/spotify/styx)) but it's
fairly beta for external use.

------
ditemis
The search term "batch processing" might give you some alternative solutions.

JBeret, an implementation of the Java Batch API (JSR 352), is a batch
processing framework that can run in a standalone Java process or within the
WildFly/JBoss application server. It provides a robust batch execution
runtime, XML configuration for jobs, a REST API for batch management, a
lightweight web frontend and various data readers/writers.
[https://github.com/jberet/jsr352](https://github.com/jberet/jsr352)

------
eb0la
I use Talend Open Studio a lot for this kind of stuff.

It might not be the best tool for the ML part, but for moving data and for
automatically ssh'ing into servers and executing commands it is really good.

I used it (a long time ago) to ssh to a set of Juniper routers, execute a
command, parse the results (in XML), update some db tables, and execute
further commands on other routers...

I really like the wait-for-file feature (
[http://imgur.com/a/3kKLC](http://imgur.com/a/3kKLC) )
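
For anyone not using Talend, the general idea of a wait-for-file step can be approximated with a simple polling loop. This is only a sketch of the concept, not Talend's implementation; the function name `wait_for_file` and its `timeout` and `poll_interval` parameters are made up for illustration.

```python
import time
from pathlib import Path


def wait_for_file(path, timeout=60.0, poll_interval=1.0):
    """Poll until `path` exists.

    Returns True if the file appeared before the timeout (in seconds)
    elapsed, False otherwise.
    """
    target = Path(path)
    deadline = time.monotonic() + timeout
    while True:
        if target.exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A downstream job would call something like `wait_for_file("/data/incoming/export.csv", timeout=3600)` (a hypothetical path) and only proceed on `True`. Note that existence alone does not guarantee the file has been fully written; a common workaround is to have the producer write to a temporary name and rename it when done.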

------
shoo
> For running chains of multi-hour jobs with dependencies I've always used
> Jenkins but always been unsatisfied with it

What in particular are you unsatisfied with?

What are you happy with that you wouldn't want to lose?

------
weirc
Rundeck?

