I've also been a happy user of Windmill.dev. I guess ToolJet has more of a low-code focus with their drag and drop builder, while Windmill is more focused on developers who want to turn their scripts into production workflows.
Very excited to see all these open-source projects take off in the internal tooling space. I regret how much time I spent building custom DIY tooling at previous jobs!
I agree with you, and I hope Delight will be very useful!
We're showing new metrics (CPU & Memory) that you couldn't get in the Spark UI (you had to use something like Ganglia, and then jump back and forth between Ganglia and the Spark UI by comparing timestamps).
We also hope to display this new information in a way that makes it easier for most users to get insights: making problems more obvious, more quickly.
This post is about a free and partly open-source monitoring tool for Apache Spark that we just released.
It works by installing an open-source Spark agent (https://github.com/datamechanics/delight) on your Spark infrastructure — whatever it is: commercial or open-source, on Kubernetes or on YARN, in the cloud or on-premise.
This agent streams event metrics from Spark (metadata about your Spark applications) to our backend, which then serves a dashboard listing the Spark applications, and giving you access to the Spark UI (Spark History Server) for each of them. The blog post has a lot more details about the architecture and security of it.
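For context, installation is essentially a matter of adding the agent package and a Spark listener to your application's configuration. A sketch of what that looks like with spark-submit (the exact package coordinates and config keys are assumptions on my part; the GitHub README is the source of truth):

```shell
# Attach the Delight agent to a Spark application at submit time.
# Package coordinates, repository URL, and config keys are illustrative;
# refer to https://github.com/datamechanics/delight for the current ones.
spark-submit \
  --repositories https://oss.sonatype.org/content/repositories/snapshots \
  --packages co.datamechanics:delight_2.12:latest-SNAPSHOT \
  --conf spark.delight.accessToken.secret=<your-access-token> \
  --conf spark.extraListeners=co.datamechanics.delight.DelightListener \
  my_spark_app.py
```

The listener mechanism (`spark.extraListeners`) is a standard Spark extension point, which is why this works across Kubernetes, YARN, and commercial platforms alike.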
This release is just a first milestone for us. In our next release in January, we will add new screens to gradually replace the Spark UI with a new monitoring view; you can see a glimpse of it in the GIF at the bottom of this page (https://www.datamechanics.co/delight).
We’d love your feedback about it — is it easy to install and use? What would you like to see in following releases?
Thanks so much!
JY
Glad our post sparked some pretty deep discussions on the future of spark-on-k8s! The open-source community is working on several projects to address this problem. You've mentioned NFS (by Google), but there's also the possibility of using object storage: mappers would first write to local disks, and then the shuffle data would be asynchronously moved to the cloud.
Thanks for taking the time on this detailed and thoughtful feedback. We've implemented some of the points you mentioned (SparkOperator, Airflow connector, CLI is WIP) and have projects for the other points you mentioned, like how to make it easy to transition from local development to remote execution.
Sorry to hear about the layoffs. I'd like to follow up with you to get your feedback on specific roadmap items we have in mind. Would you email us at founders@datamechanics.co to schedule a call, or at least keep in touch for when we have an interesting feature/mockup to show you? Thanks and good luck as well!
Thanks for the wishes! Spark is heavily used and its adoption keeps growing, but there are indeed new frameworks like Dask that look promising and are on our radar. Our goal is to foster good practices in the distributed data engineering/science world, whatever the technologies involved, so we'd love to add support for new frameworks in the future.
Thanks for the detailed feedback. Spark can sometimes be frustrating. Automated tuning has a major impact, but it's no silver bullet; sometimes a stability/performance problem lies in the code or the input data (partitioning).
That's why we're working on a new monitoring solution (think Spark UI + node metrics) to give Spark developers the much-needed high-level feedback on the stability and performance of their apps. We'd like to make this work on top of other data platforms too (at least the monitoring part; automated tuning would be much harder).
Case studies: Thanks, we're working on them. Check our Spark Summit 2019 talk (How to automate performance tuning for Apache Spark) for the analysis of the impact at one of our customers.