Nextflow was mentioned. I think what most people want is probably closer to Airflow, although it takes some
time to get it up to production in a cloud environment (there are managed offerings: astronomer.io and a GCP product).
HTCondor's DAGMan has existed for a long time, and there are even engines built on top of it (Pegasus, Wings).
There’s Swift (http://swift-lang.org/main/) and its successor Parsl. Cray has Chapel. These are a bit different, in that they are more like a distributed computer program. Of course, so is Julia, but built into these languages is the assumption that the underlying computing may be unreliable in some way. Makeflow and GNU Parallel are closer to this category too.
Then there’s Beam, but that’s dataflow.
The crappy thing about this is it’s hard to know when to use a given solution and when not to. Why are there so many solutions? Because there are a ton of different needs, and each
of these tends to focus on a few in particular:
Scalability of workers
Dynamic Scalability of workers
Integration with existing Schedulers
Workflow Code Management (container support)
Maintainability of very large DAGs
Testability of DAGs/Development support
Execution Management support/Web APIs
Error recovery (especially for long running workflows)
Data Management (next to data processing)
... the list goes on.
It's amazing it works at all in my opinion.
This file contains much of the complexity as a messy, stateful, monolithic block of Python. Having had to chase down deep bugs and limitations in this software, I'm now convinced that Python, with its GIL, weak typing, lack of concurrency primitives, and generally OOP/imperative style, is just the wrong tool for the job.
https://trio.readthedocs.io is an extremely good Python concurrency library based on the model of structured concurrency (https://vorpus.org/blog/notes-on-structured-concurrency-or-g...).
The typing issues are far improved in current Python with annotations and attrs/dataclasses.
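As a small illustration (generic, not tied to any particular workflow engine), a typed dataclass gives you checked, self-documenting records in place of ad-hoc dicts:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    """A typed, immutable task record; mypy/pyright can check every field."""
    name: str
    retries: int = 3
    depends_on: tuple[str, ...] = ()

t = Task(name="align", depends_on=("fetch",))
print(t)  # Task(name='align', retries=3, depends_on=('fetch',))
```

frozen=True also gives you hashability and equality for free, which is handy when tasks are used as graph nodes.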
Indeed! I am working on Cylc right now, which is a cyclic workflow system, for users who need more than a DAG.
It was created to automate weather forecast operations, but there are now a few cases of users applying its cyclic graphs to more general problems.
I don't have a reference at hand now, but I vaguely remember such a comment from the FOSDEM presentation.
So I guess it's just a set of macros for Guile (which I believe is a Scheme implementation, or contains a Scheme implementation, or something like that...).
This looks useful, but can it submit jobs to cloud compute clusters or HPC systems and operate locally? Maybe I’m missing the point in terms of the purpose.
I'm very excited that this is based on Scheme -- I've long wanted a lispy workflow language!
I'm a little concerned about the coverage of the Guix package manager though. I guess instead of writing Dockerfiles the user would have to learn to write Guix packages.
The "Getting Started" example uses samtools, so I guess this is oriented towards a similar bioinformatics audience. However, without HPC/cloud support it's probably not too practical yet.
Addendum: Listened to the FOSDEM 2019 talk, seems like it does support Docker and HPC. However I need AWS Batch support for it to be really useful to me, hopefully that will be implemented at some point.
Edit: adding gnu guix manual link
It's actually very cool since you can drop in any Java library you like, which is particularly nice in bioinformatics, where HTSJDK, Picard, and co. give you enormous power.
My efforts with BioNix (https://github.com/PapenfussLab/bionix) achieve reproducible pipelines by using Nix to capture the software and the workflow, and to handle execution locally, on a compute cluster, or on HPC.
Guixwl looks similar to BioNix, though BioNix is a thin layer of Nix expressions and Guixwl seems to be more than that, and could be more general. BioNix is targeted at bioinformatics and just builds on nixpkgs.
“Conceptually, Luigi is similar to GNU Make where you have certain tasks and these tasks in turn may have dependencies on other tasks.”
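That Make-style model — tasks declaring dependencies, with each dependency satisfied before the task itself runs — can be sketched in a few lines of plain Python (an illustrative toy, not Luigi's actual API):

```python
# Toy dependency-driven runner: run() completes a task's dependencies
# (depth-first) before the task itself, skipping anything already
# done -- the core of the Make/Luigi model.

tasks = {
    "report": {"deps": ["clean", "fetch"], "action": lambda: "report built"},
    "clean":  {"deps": ["fetch"],          "action": lambda: "data cleaned"},
    "fetch":  {"deps": [],                 "action": lambda: "data fetched"},
}

done: dict[str, str] = {}

def run(name: str) -> None:
    if name in done:
        return  # already satisfied, like an up-to-date Make target
    for dep in tasks[name]["deps"]:
        run(dep)
    done[name] = tasks[name]["action"]()

run("report")
print(list(done))  # execution order: ['fetch', 'clean', 'report']
```

Real systems like Luigi add the parts this toy omits: persisted outputs as completion markers, retries, scheduling across workers, and visualization.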
Having this type of functionality integrated directly into Guix (an alternative to Nix) looks very interesting. I’d encourage the Guix workflow developers to study Luigi and SciLuigi for design ideas.
I will be sure to follow this effort and see how it progresses.
Why take such a defeatist attitude? I'm sure you can get the pronunciation right with a little effort.
A good workflow manager builds on these ideas further by managing environments, job submission, parallelization, cloud/cluster submission, and other options that make processing large amounts of data a lot easier and more efficient.
Lisps can also support arbitrary input formats using reader macros, so it might be using that (I haven't looked at the implementation yet).
Depending on whitespace and indentation creates such fragile code that I can't understand why that trade-off would be made.
The venerable `make` is a DSL. awk is a DSL.