Great! Thanks for the questions. The Dhalion repo (https://github.com/Microsoft/Dhalion) contains only the general Dhalion APIs. These are not Heron-specific, which is why they are not part of Heron. In Heron, the healthmgr module integrates these Dhalion APIs with Heron. Currently, Heron contains two Dhalion policies. The first one automatically restarts instances that exhibit backpressure. The second one scales up or restarts instances depending on some topology characteristics. Please join Slack and we can show you how to use or modify the appropriate Heron YAML files so that you can use Dhalion. Dhalion does not require a separate installation; it is already integrated with Heron and just needs to be configured for your use case. We are currently working on more aggressive autoscaling policies, which should be part of the code base soon. Please let us know if you have other questions.
For us, the ability to self-tune is one of Heron's most attractive features. Tuning components/executors in Storm was a big pain point. Have you guys ever tried combining Dhalion autoscaling with hardware autoscaling from a cloud provider (e.g., Azure, AWS EC2)?
No, we have not used autoscaling with automatic hardware provisioning. We assume that the hardware is already there and we just request more containers (e.g., YARN containers). In theory, it should be possible if the autoscaling policy is extended to request additional hardware before changing the parallelism of a stage. But this functionality does not exist currently.
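
To make the ordering concrete, here is a rough sketch of what such an extension might look like: provision any missing capacity first, and only then raise the parallelism. Everything in it is hypothetical. `Cluster`, `provision`, `allocate`, and `scale_up` are made-up names for illustration, not Heron or Dhalion APIs, and it assumes one container per added instance.

```python
# Hypothetical sketch: none of these names exist in Heron or Dhalion.
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a resource manager plus a cloud provider."""
    free_containers: int
    log: list = field(default_factory=list)

    def provision(self, count: int) -> None:
        # Stand-in for asking a cloud provider for more hardware.
        self.free_containers += count
        self.log.append(f"provision {count}")

    def allocate(self, count: int) -> None:
        # Stand-in for requesting containers (e.g., YARN containers).
        if count > self.free_containers:
            raise RuntimeError("not enough capacity")
        self.free_containers -= count
        self.log.append(f"allocate {count}")


def scale_up(cluster: Cluster, current_parallelism: int, target_parallelism: int) -> int:
    """Provision any missing capacity first, then raise the parallelism."""
    needed = target_parallelism - current_parallelism
    if needed <= 0:
        return current_parallelism  # nothing to scale
    shortfall = needed - cluster.free_containers
    if shortfall > 0:
        cluster.provision(shortfall)  # hardware request comes first
    cluster.allocate(needed)  # assumes one container per new instance
    return target_parallelism
```

For example, scaling a stage from 2 to 5 instances on a cluster with one free container would first provision two more and then allocate three. A real policy would also need to wait for the new hardware to become available before allocating.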