Papermill is good, and Ploomber is one to watch.
Ploomber makes this systematic: store notebooks as .py files (py:percent format, for example), parameterize them with papermill, and execute them as a batch job. You can view the resulting Jupyter notebooks as .ipynb afterwards and produce HTML reports if wanted. It's really good already, and will get better as Ploomber develops further.
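For concreteness, here's a minimal sketch of what a py:percent notebook looks like, as plain Python (the file contents and parameter names are made up); the cell tagged `parameters` holds the defaults that papermill overrides at run time:

```python
# %% tags=["parameters"]
# Default parameter values; papermill injects replacements below this cell.
input_path = "data.csv"
n_rows = 100

# %%
# The rest of the notebook uses the parameters as ordinary variables.
print(f"would process {n_rows} rows from {input_path}")
```

Because it's just a Python file with `# %%` cell markers, it diffs cleanly in version control and opens as a notebook in Jupyter via jupytext.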
The whole reason it works is that it's easy to open the .py notebook and work on it interactively in Jupyter.
The main idea, jupytext for .py notebooks and papermill for parameters and execution, is already "stable" and easy for anyone to use for their own purposes.
Maybe I haven't gotten far enough with Ploomber to tell yet! It works nicely, but I know I'll learn more and my eyes will open further as I go.
As a first impression, I eventually found meta.extract_upstream = False, which I think is important. The reason: the code for each step should be a Lego piece, a black box with inputs and outputs. That code should not itself hardcode what its predecessor in the pipeline is; you connect the pieces in pipeline.yaml. (extract_upstream = False is not by itself enough to solve this, since you also need to be able to rename a notebook's inputs/outputs for it to be fully reusable as a Lego piece, but it's good enough for now.)
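To make that concrete, here is a minimal pipeline.yaml sketch with the dependency declared in the YAML rather than extracted from the notebook code (the task file names here are made up):

```yaml
meta:
  # Don't parse the notebooks to discover dependencies;
  # declare them explicitly below instead.
  extract_upstream: false

tasks:
  - source: clean.py
    product: output/clean.ipynb

  - source: analyze.py
    product: output/analyze.ipynb
    # The wiring lives here, not inside analyze.py.
    upstream: [clean]
```

With this setup, swapping a different upstream task in front of analyze.py only requires editing the YAML, not the notebook.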
I also, for my own sanity, need to know more about how the Jupyter extension part works, i.e., how it decides whether or not to load injected-parameters. But maybe I could learn that from the docs.
In general I want components that are easy to understand and plug together, with less magic (though the whole Jupyter ecosystem's source code feels that way to me unfortunately, with lots of hard-to-follow abstractions passing things around). But it's developing rapidly and is already very useful, thank you so much!
I'll make sure we display extract_upstream more prominently in the docs; we've gotten this feedback a couple of times now :)
Re: the Jupyter extension: it injects the cell when the file you're opening is declared in the pipeline.yaml file. You can turn the extension off if you prefer.
Feel free to join our community, this feedback helps us make Ploomber better!