I've also been a happy user of Windmill.dev. I guess ToolJet has more of a low-code focus with their drag and drop builder, while Windmill is more focused on developers who want to turn their scripts into production workflows.
Very excited to see all these open-source projects take off in the internal tooling space. I regret how much time I spent building custom DIY tooling at previous jobs!
I agree with you, and I hope Delight will be very useful!
We're showing new metrics (CPU & Memory) that you couldn't get in the Spark UI (you had to use something like Ganglia, and then jump back and forth between Ganglia and the Spark UI by comparing timestamps).
We also hope to display this new information in a way that makes it easier for most users to get insights: making problems more obvious, more quickly.
This post is about a free and partly open-source monitoring tool for Apache Spark that we just released.
It works by installing an open-source Spark agent (https://github.com/datamechanics/delight) on your Spark infrastructure — whatever it is: commercial or open-source, on Kubernetes or on YARN, in the cloud or on-premise.
This agent streams event metrics from Spark (metadata about your Spark applications) to our backend, which then serves a dashboard listing the Spark applications, and giving you access to the Spark UI (Spark History Server) for each of them. The blog post has a lot more details about the architecture and security of it.
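For context, installation is essentially a matter of adding the agent package and a Spark listener to your application's configuration. A sketch of what that looks like with spark-submit (the exact package coordinates and config keys are assumptions on my part; the GitHub README is the source of truth):

```shell
# Attach the Delight agent to a Spark application at submit time.
# Package coordinates, repository URL, and config keys are illustrative;
# refer to https://github.com/datamechanics/delight for the current ones.
spark-submit \
  --repositories https://oss.sonatype.org/content/repositories/snapshots \
  --packages co.datamechanics:delight_2.12:latest-SNAPSHOT \
  --conf spark.delight.accessToken.secret=<your-access-token> \
  --conf spark.extraListeners=co.datamechanics.delight.DelightListener \
  my_spark_app.py
```

The listener mechanism (`spark.extraListeners`) is a standard Spark extension point, which is why this works across Kubernetes, YARN, and commercial platforms alike.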
This release is just a first milestone for us. In our next release in January, we will add new screens to gradually replace the Spark UI with a new monitoring view; you can see a glimpse of it in the GIF at the bottom of this page (https://www.datamechanics.co/delight).
We’d love your feedback about it — is it easy to install and use? What would you like to see in following releases?
Thanks so much!
JY
Glad our post sparked some pretty deep discussions on the future of spark-on-k8s! The open-source community is working on several projects to address this problem. You've mentioned NFS (by Google), but there's also the possibility of using object storage: mappers would first write to local disks, and then the shuffle data would be asynchronously moved to the cloud.
Thanks for taking the time on this detailed and thoughtful feedback. We've implemented some of the points you mentioned (SparkOperator, Airflow connector, CLI is WIP) and have projects for the other points you mentioned, like how to make it easy to transition from local development to remote execution.
Sorry to hear about the layoffs. I'd like to follow up with you to get your feedback on specific roadmap items we have in mind. Would you email us at founders@datamechanics.co to schedule a call, or at least keep in touch for when we have an interesting feature/mockup to show you? Thanks and good luck as well!
Thanks for the wishes! Spark is heavily used and its adoption keeps growing, but there are indeed new frameworks like Dask that look promising and are on our radar. Our goal is to foster good practices in the distributed data engineering/science world, whatever the technologies involved, so we'd love to add support for new frameworks in the future.
Thanks for the detailed feedback. Spark can sometimes be frustrating. Automated tuning has a major impact, but it's no silver bullet; sometimes a stability/performance problem lies in the code or the input data (partitioning).
That's why we're working on a new monitoring solution (think Spark UI + node metrics) to give Spark developers the much-needed high-level feedback on the stability and performance of their apps. We'd like to make this work on top of other data platforms too (at least the monitoring part; automated tuning would be much harder).
Case studies: Thanks, we're working on them. Check our Spark Summit 2019 talk (How to automate performance tuning for Apache Spark) for the analysis of the impact at one of our customers.