
Show HN: Afctl – CLI to manage and deploy Airflow projects faster and smoother - sunasra
https://github.com/qubole/afctl
======
chocks
Pretty cool! We use Airflow heavily here at Instacart. Some of our teams use a
managed service from Google for deployment and orchestration:
[https://cloud.google.com/composer/](https://cloud.google.com/composer/). For
companies that want a standard DAG structure and self-host their Airflow
deployments, your tool would be super helpful for getting started. One
suggestion: it would be cool to add separate deployments for the different
components of Airflow (webserver, workers, scheduler, etc.). Reading through
the readme, it looks like you deploy a single image to the Qubole Cloud?
Oftentimes, deploying code to Airflow just means updating the DAG files on
Airflow's file system.

~~~
sunasra
Thanks for the feedback.

The main motivation behind building this tool was to make onboarding onto
Apache Airflow easier. There was no standard structure for Airflow projects,
and setting one up locally can be a nightmare sometimes. The simple CLI tool
makes it very easy to create and test your project locally before deploying it
to your production or staging environment via your CI/CD.

Right now we are using a docker-compose file which brings up all the Airflow
services, but we are also working on providing a command to control
individual processes.

Qubole is not a cloud but a self-managed data platform. Deploying on Qubole
just means putting all the DAG files on the machine (AWS/GCP/Azure) where
Airflow is running. Qubole provides out-of-the-box solutions for running
Airflow on your cloud with the click of a button. We offer a bunch of different
things (Spark, Presto, notebooks, etc.) and have a great ecosystem built
around Airflow.
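
To illustrate the "just putting the DAG files on the machine" idea, here is a
minimal sketch in Python. It is not how afctl or Qubole actually deploys; the
function name `sync_dags` and both directory arguments are hypothetical, and it
assumes the Airflow DAGs folder is reachable as a local path.

```python
import shutil
from pathlib import Path
from typing import List


def sync_dags(project_dags: Path, airflow_dags: Path) -> List[str]:
    """Copy every .py file under project_dags into airflow_dags.

    Both paths are assumptions for this sketch: project_dags is the DAG
    folder of a local project, airflow_dags is the folder the Airflow
    scheduler reads from. Returns the copied paths relative to airflow_dags.
    """
    airflow_dags.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(project_dags.rglob("*.py")):
        dest = airflow_dags / src.relative_to(project_dags)
        # Preserve any sub-folder layout used inside the project.
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(dest.relative_to(airflow_dags).as_posix())
    return copied
```

A CI/CD job could call something like this after tests pass. Files removed
from the project are not deleted from the target folder in this sketch.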

------
ramraj07
What's the simpler solution analogous to Airflow that I can deploy on the same
EC2 instance as my webserver? I prefer to use the simpler tool until it
becomes unwieldy, and my current stack is just an Elastic Beanstalk
deployment that runs the webserver as well as a Celery worker. This makes it
easy to keep the site highly available (scaling instances up and down easily),
with managed RDS and a Redis broker taking care of managing state. A scheduled
task runner is the missing piece: it turns out Celery is just not designed for
long-running tasks, and both Celery and Airflow seem to require that only one
instance of their scheduler run at any given time. I'd much prefer a tool
where the service is robust enough to handle multiple redundant schedulers so
I can go home and sleep without having to bring in Kubernetes just to deal
with this.

I'm currently using some custom code built on a database table to do this, but
I'm well aware of how fickle and easy to screw up this method can be, and I'm
open to suggestions.
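
For readers wondering what a database-table approach might look like, here is
a minimal sketch, not the commenter's actual code. It assumes each scheduler
instance tries to insert a row keyed on (task name, scheduled time) and relies
on the primary-key constraint so that only one instance wins. The table name
`task_runs`, the function names, and the instance IDs are all hypothetical.
SQLite is used only to keep the example self-contained.

```python
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    # One row per (task, scheduled slot); the primary key is the "lock".
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS task_runs (
            task_name     TEXT NOT NULL,
            scheduled_for TEXT NOT NULL,
            claimed_by    TEXT NOT NULL,
            PRIMARY KEY (task_name, scheduled_for)
        )
        """
    )
    conn.commit()


def try_claim(conn: sqlite3.Connection, task_name: str,
              scheduled_for: str, instance_id: str) -> bool:
    """Return True if this instance claimed the slot, False if another did."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO task_runs (task_name, scheduled_for, claimed_by) "
                "VALUES (?, ?, ?)",
                (task_name, scheduled_for, instance_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Another scheduler already inserted the row for this slot.
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_schema(conn)
    slot = "2019-01-01T00:00"
    print(try_claim(conn, "nightly_report", slot, "scheduler-a"))
    print(try_claim(conn, "nightly_report", slot, "scheduler-b"))
```

With this shape, every redundant scheduler can evaluate the same schedule and
only the one whose insert succeeds runs the task. It does not handle a claimer
that crashes mid-task, which is one of the ways this method is easy to get
wrong.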

------
Gys
'Airflow' as in ....?

It does ring a bell, but 'airflow' gives a lot of hits on Google. It would be
helpful to add a link in your readme to 'your' Airflow.

~~~
lord_ozb
Apache Airflow :)

~~~
diroussel
As well as commenting here, it would make sense to update your readme.

~~~
lord_ozb
Thank you. It was updated.

------
verdverm
[https://astronomer.io](https://astronomer.io) offers a managed platform and
installation into your cloud or data center.

Great product experience, fully open source. It gives your people using
Airflow a sweet UI so they don't need to use the CLI.

~~~
chattarajoy
It depends on your preference really. Most devs I know would prefer a CLI over
UI, plus CLIs can be easily used to automate workflows like deployment and
pushing pipelines to production.

~~~
verdverm
Yes, devs prefer a CLI, but most users of Airflow are not devs. Just because
something is OSS does not mean it's geared towards devs.

------
chrisMyzel
Deploying Airflow workers took me loads of sweat to get used to - will
definitely try this.

