Show HN: Ploomber Cloud (YC W22) – run notebooks at scale without infrastructure

indoorskier · on June 29, 2022

Very nice, and well done on building an active community around it. I like the integration with the cached data, makes it feel like a build system :)

We were unaware of ploomber until now, but have been building meadowrun https://news.ycombinator.com/item?id=31694827 and share many goals with you. So obviously I think this is solving a real problem.

All the best with the cloud launch!

idomi · on June 29, 2022

Thanks for sharing! Yeah, definitely a problem. We've been building it mainly on the feedback we're getting from the open-source solution.

edublancas · on June 29, 2022

Hi all! This is Eduardo, Ploomber co-founder. I'm excited to show the HN community what we've been working on! We'd love to get your feedback, so please give it a try and let us know what you think!

etaioinshrdlu · on June 29, 2022

This seems somewhat similar to what Floydhub was doing: https://news.ycombinator.com/item?id=13659914

idomi · on June 29, 2022

You're right to some extent, at least on some of the concepts. We don't focus on a specific ML use case, like computer vision. In addition, we're oriented towards notebooks, and this approach allows us to break a notebook into smaller tasks, cache the results, and execute them in parallel (locally or via this new cloud service). BTW, we tried talking to the founders but couldn't get hold of them. If you or anyone knows them, we'd love to chat!
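
To make the "smaller tasks, cached results, parallel execution" idea concrete, here is a minimal sketch in plain Python. It is not Ploomber's API or implementation: `cache_key`, `run_task`, `square_sum`, and `run_parallel` are hypothetical names, the cache is an in-memory dict, and the cache key is assumed to depend only on the task name and its parameters.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor

_cache = {}   # in-memory result cache (assumption: a real tool would persist this)
calls = []    # records which tasks actually executed, to make cache hits visible


def cache_key(name, params):
    """Derive a stable key from the task name and its JSON-serializable parameters."""
    payload = json.dumps({"task": name, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def run_task(name, func, params):
    """Run func(**params) unless a result for the same name and params is cached."""
    key = cache_key(name, params)
    if key not in _cache:
        calls.append(name)
        _cache[key] = func(**params)
    return _cache[key]


def square_sum(chunk):
    """Hypothetical small task: sum of squares over one chunk of data."""
    return sum(x * x for x in chunk)


def run_parallel(chunks):
    """Submit one task per chunk to a thread pool and collect results in order."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(run_task, f"chunk-{i}", square_sum, {"chunk": chunk})
            for i, chunk in enumerate(chunks)
        ]
        return [future.result() for future in futures]
```

Calling `run_parallel` twice with the same chunks executes each task only once; the second call is served from the cache.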

mvoodarla · on June 29, 2022

This looks really cool. Are there ways to orchestrate jobs? Like having one notebook's output trigger another based on some logic? I'm imagining running a bunch of different deep learning models in separate notebooks, or running the same model on different chunks of a piece of data in parallel.

idomi · on June 29, 2022

Yes, you should follow best practices and isolate each job to the smallest task possible (and then reuse components). We have this functionality in two flavors: you can define hooks as part of your pipelines (https://docs.ploomber.io/en/latest/api/spec.html#id2), and you can define this dependency as part of your job's DAG (https://docs.ploomber.io/en/latest/get-started/basic-concept...), e.g., get the data, clean it, train the model, and test it.