Show HN: Prophecy Spark IDE public beta (prophecy.io)
12 points by ibains 9 days ago | 7 comments

Raj here, I'm the founder of Prophecy. Previously, I was the product manager for Apache Hive at Hortonworks (and wrote compilers for CUDA at NVIDIA, and Microsoft). The tooling for data engineers is really poor, and I saw many of my customers struggle with it. It's 2020, Elon just sent people to the Space Station, and we can't build semi-decent data engineering tools? We've got to do better. So we'll fix it! Prophecy is unique:

- Unlike all existing ETL tools, we're code-first and Git-centric. So perhaps we are the first Data Engineering tool.

- We want to give you Apple-like, high-quality design and an it-just-works experience.

We've built a unique CODE=VISUAL IDE where you can toggle between visual and code editors instantaneously. You can edit code, and the changes show up in the visual graph, and vice versa. So even if you use the visual drag-and-drop view to edit a workflow, you're doing Git commits. You can just go to the underlying Git folder and run a Maven build. Same for unit tests. We've added one-click spin-up of a Databricks cluster from the IDE and step-by-step execution.

We want to enable you to:

- Develop 10x better code for Spark

- Develop 10x faster on Spark

- Deploy to production 10x faster with CI/CD on Spark/Airflow.

For the beta, when you submit your e-mail, we create an account and send you login details. You can create Spark workflows, spin up clusters (built-in, on Databricks, free - please be kind with usage) and execute them. Please try it out and give feedback; help us make data engineering better!

This is great! Very cool that you can switch between visual and code environments so seamlessly.

Looking forward to seeing the launch of the beta release at the #ApacheSpark conference 2020!

Does it also support managed cloud offerings such as EMR?

For the public beta and later our developer version, we only support Databricks.

However, for our Enterprise version, we install within your network, and it supports EMR, Dataproc, and Databricks.

This is only relevant where we are spinning up (and down) Spark clusters from within the IDE; you can always connect to an existing cluster from any distribution.

Does it also support managed environments such as EMR?

Awesome - great to hear
