
Automated Reports with Jupyter Notebooks (Using Jupytext and Papermill) - JA7Cal
https://medium.com/capital-fund-management/automated-reports-with-jupyter-notebooks-using-jupytext-and-papermill-619e60c37330
======
brummm
I don't think Jupyter notebooks should be used for automated jobs. They're
great for exploratory stuff but once things are getting fleshed out and
cleaned up, one should move to proper Python files that can be unit tested and
versioned without having to go to crazy lengths...
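
A minimal sketch of what that move could look like, assuming the notebook cell computed some summary statistics (the function name `summarize_returns` and the sample numbers are hypothetical, not taken from the article). The cell's logic becomes a plain function, and the test lives right next to it:

```python
from statistics import mean, pstdev


def summarize_returns(returns):
    """Return the mean and population standard deviation of a list of returns."""
    if not returns:
        raise ValueError("returns must not be empty")
    return {"mean": mean(returns), "volatility": pstdev(returns)}


def test_summarize_returns():
    # Symmetric inputs, so the mean should be zero and the spread positive.
    summary = summarize_returns([0.01, -0.01, 0.02, -0.02])
    assert abs(summary["mean"]) < 1e-12
    assert summary["volatility"] > 0


if __name__ == "__main__":
    test_summarize_returns()
    print("ok")
```

A file like this diffs cleanly in git and can be picked up by a test runner, which is the whole point.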

~~~
_coveredInBees
To be honest, I've always been puzzled by how extensively people use Jupyter
notebooks, despite how deficient they can be in many situations. Don't get me
wrong, I have used them a lot and I still use them on occasion, but pretty
much the only times I would really reach for them are when I want code +
documentation to reside together.

So if I am prototyping things that also would benefit from Markdown + LaTeX
comments/documentation, it is far superior to a script with text comments. The
only other situation where I've found them to be pretty useful is when
utilizing their interactive widgets to let the end user explore the data in
interesting ways.

But most people seem to use it as an IDE when it is quite deficient compared
to something like PyCharm. Even for quick prototyping, I find PyCharm to be
far more useful because a) I can directly run things in the IPython console,
b) Examine variables in the variable display widget, c) Attach a debugger to
the console at any time and start debugging with an unmatched debugging
experience in Python land, d) Easily have the correct venv be utilized by the
IPython console, e) Have outstanding linting + code introspection +
autocompletion, f) Have sane git diffs that make it easy to use version
control appropriately and frequently unlike with Jupyter notebooks, and
probably a bunch of other benefits that come with having access to a powerful
IDE. All of these make rapid prototyping much faster than anything I can
achieve in a Jupyter notebook.

I've also been pretty unimpressed with the quality of the average Jupyter
notebook that I find in GitHub repos. They encourage dumping everything into
global state, rely on state across code cells in non-obvious ways and in
general result in ugly scripts that would need a lot more work to be
refactored into modular packages/modules.
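
A small sketch of the hidden-state problem, with notebook cells simulated as plain functions (every name here is made up for illustration). In the notebook style the answer depends on which cells were run and how many times; in the module style the inputs are passed in explicitly:

```python
# Notebook style: every "cell" reads and mutates module-level state.
prices = [100.0, 110.0, 121.0]


def cell_scale():
    """Simulates a cell that rescales `prices` in place."""
    global prices
    prices = [p / 100.0 for p in prices]


def cell_total():
    """Simulates a cell that reads whatever `prices` currently holds."""
    return sum(prices)


# Module style: same logic, but inputs and outputs are explicit.
def scale(values, factor=100.0):
    return [v / factor for v in values]


def total(values):
    return sum(values)


if __name__ == "__main__":
    cell_scale()
    cell_scale()  # re-running the cell silently changes the answer
    print(cell_total())
    print(total(scale([100.0, 110.0, 121.0])))  # same answer on every run
```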

Running automated jobs from Jupyter seems a bit crazy and I hope people stop
to think whether that is the appropriate path to take when trying to write
automated jobs.

~~~
the_watcher
The reason Jupyter notebooks are so popular is that people who only learned to
code or are learning to code to enable some kind of analysis can easily do it
without needing to get much of anything else set up. If all you really need is
to read in some data, do some statistical tests, and make some plots, I'd
argue it's by far the simplest solution. Many academics don't need much more
than this. Similarly, when a data scientist or product analyst is just doing
analysis, not much more is needed.

Your comments about what Jupyter notebooks incentivize are definitely true
(I'd argue it's part of the appeal - people new to programming often don't
immediately grok types of state and Jupyter kind of just says "eh, work around
it"). I certainly fall victim to it and often wish for a week of "the
expectation is that you will go through your notebooks and create modules or
packages for them".

I also agree that running automated jobs from Jupyter isn't an optimal
solution, but for many companies, if it means a data scientist (whose coding
skills are primarily statistical) can get a report into production without a
single engineering hour, it's often worth it.

------
KyleOS
Awesome article - I'm wondering, for "Publishing the Notebook" part of the
workflow, have you ever seen Kyso ([https://kyso.io](https://kyso.io)) -
disclaimer, I'm a founder. We started Kyso to make it easier to communicate
insights gained from analysis to non-technical people by converting data
science tools (e.g. Jupyter Notebooks) into conversational tools in the form
of blog posts. You can make public posts or have an internal "data blog" for
your team, where you push your work to GitHub and it is reflected on Kyso.
Would love to hear your thoughts on how it could fit into existing workflows.

------
mafm
Great article. The package that ties notebooks to git-friendly Markdown
equivalents sounds like it solves one of the main problems with using Jupyter.
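
To illustrate why a text pairing helps with git, here is a rough standard-library sketch of the idea only. It is not Jupytext's implementation or API; `notebook_to_markdown` and the sample notebook are hypothetical. It keeps the cell sources and drops outputs and execution counts, which are a large part of what makes `.ipynb` diffs noisy:

```python
import json

FENCE = "`" * 3  # Markdown code fence, built here to keep this example readable


def notebook_to_markdown(nb_json):
    """Render an .ipynb JSON string as Markdown, keeping only cell sources."""
    nb = json.loads(nb_json)
    parts = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # The notebook format allows source to be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if cell.get("cell_type") == "code":
            parts.append(FENCE + "python\n" + source + "\n" + FENCE)
        elif cell.get("cell_type") == "markdown":
            parts.append(source)
    return "\n\n".join(parts) + "\n"


# A tiny hand-written notebook: one Markdown cell, one executed code cell.
sample = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Report\n", "Daily summary"]},
        {"cell_type": "code", "execution_count": 7,
         "outputs": [{"output_type": "stream", "text": ["42\n"]}],
         "source": ["print(6 * 7)"]},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
})

if __name__ == "__main__":
    print(notebook_to_markdown(sample))
```

Re-running the notebook changes the outputs and the execution count in the JSON, but the text version stays the same unless the code or prose actually changed.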

------
xvilka
I dream about a native version of Jupyter, without the need for a browser,
JavaScript, HTML, and CSS. Probably with Vulkan-rendered widgets. It would
improve performance by an order of magnitude.

~~~
Ftuuky
I think Visual Studio Code will add that feature.

~~~
diffeomorphism
Visual Studio Code itself is "a browser, JavaScript, HTML, and CSS", no?

------
tommaho
What library is the author using, containing the sundial visualization?

~~~
the_watcher
Looks like Plotly to me.

UPDATE: Plotly confirmed by looking at the notebook source repo:
[https://github.com/CFMTech/jupytext_papermill_post/blob/mast...](https://github.com/CFMTech/jupytext_papermill_post/blob/master/plots.py)

~~~
tommaho
Thank you!

