I love working with notebooks but hate the .ipynb file format with the burning passion of a thousand suns.
For those of you who are similarly affected, you may want to try jupytext. It is a tiny package that converts three ways between ipynb files, plain python files (whose comment blocks are interpreted as markdown cells), and plain markdown files (whose code blocks are interpreted as code cells). Moreover, if you have jupytext installed, jupyter will read and write python and markdown files transparently as notebooks, so you don't need to deal with the stupid ipynb files anymore. This is nice because you can then track the evolution of notebooks easily in git, and your local graybeards can edit notebooks with a plain text editor, with no need for a web browser at all.
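For illustration, here is roughly what a plain python file looks like in jupytext's "percent" format (the cell markers are real jupytext conventions; the content itself is a made-up example):

```python
# %% [markdown]
# # Exploring the data
# This comment block becomes a *markdown* cell when the file
# is opened as a notebook in jupyter.

# %%
# This block becomes a code cell.
totals = [n * n for n in range(5)]
print(sum(totals))
```

Opening such a file in jupyter (with jupytext installed) shows it as a two-cell notebook, and saving it writes the same plain-text format back.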
Of course, the output cells are not saved, but those were always in bad taste anyway. If you want a nice html file with the cell outputs, you can run "jupyter nbconvert --execute --to html" from the command line (and yes, it will work on plain python or markdown files if you have jupytext installed).
Why this is not part of jupyter proper is beyond me.
EDIT: regarding nbdev, I would like to learn what the advantages are over the minuscule jupytext thing. I'm particularly concerned by the soundness of the quarto dependency, which is a scary behemoth made by accretion of haskell and javascript.
The fact that everyone I talk to agrees that this format sucks and then turns around and asks us to use yet another formatting tool (jupytext, nbconvert, blah blah) is equally vexing.
Rmarkdown (+rstudio) has shown that one can have one's cake and eat it too. Ironically, I prefer rstudio for doing exploratory coding with python. Hopefully their recent pivot beyond R will save us from this .ipynb mess.
> [...] ask us to use yet another formatting tool (jupytext [...]
Heh. Notice that jupytext is not merely a file format converter. It is a plugin to jupyter that allows it to treat python and markdown files as notebooks. This works both for input and output, with transparent and idempotent round-trips. Once you have this plugin you don't see ipynb files ever again (unless for some bizarre reason you specifically save your notebooks into this format).
They have an extension that watches a qmd file for changes and regenerates the preview, while also integrating with the language-specific extensions so that you can continue to get intellisense, run code cells, etc despite the file's language being Quarto and not (say) Python.
Thanks for enlightening a jupyter novice (who never understood why people use them, given the git garbage) on how to actually use notebooks with a team. I've been manually moving everything between notebooks and other files.
Hamel here, one of the core developers on this project. I just want to say that we are really excited about this new release of nbdev and the added functionality it brings to users. The thing I'm most excited about is that you can use nbdev for more things than ever before, such as documenting existing codebases (even if they aren't written in nbdev), blog posts, books, presentations, and more.
We also have an amazing notebook runner, as well as many other quality of life improvements. We will be adding more tutorials, walkthroughs and examples in the coming days. If you are interested in using nbdev please get in touch!
Jeremy Howard is definitely a visionary; his approach to ML, development, and the learning process in general is revolutionary. It's people like him who can instill passion in a young generation of data practitioners.
I don't dislike notebooks, but I definitely have one foot in the "don't use notebooks for serious software" camp. I recently worked on a project where we're trying a notebook as the development platform for a data visualization report. The designer uses the notebook + bokeh to iterate on the report, and then we use nbconvert with some environment variables to create reports for different datasets.
My biggest issue with this paradigm is that we actually had a lot of problems getting notebook development to work consistently and bug-free across the various environments we have (Windows, Linux, VS Code vs in-browser Jupyter, etc.). It seemed like it would've been so much easier to just use a vanilla python script that generates the html report files. With hot reloading, the iteration could be just as fast.
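To make the comparison concrete, the vanilla-script alternative can be as small as this stdlib-only sketch (all names and the data literal are made up; in the real setup the data would come from the large json files mentioned below):

```python
import html
import json
import pathlib

def render_report(title: str, data: dict) -> str:
    """Render a dict of summary values as a tiny standalone html page."""
    rows = "".join(
        f"<tr><td>{html.escape(str(k))}</td><td>{html.escape(str(v))}</td></tr>"
        for k, v in data.items()
    )
    return (f"<html><body><h1>{html.escape(title)}</h1>"
            f"<table>{rows}</table></body></html>")

# Stand-in for loading the real dataset.
data = json.loads('{"rows processed": 1234, "errors": 0}')
page = render_report("Nightly report", data)
pathlib.Path("report.html").write_text(page)
```

Pointing a hot-reload tool (or just re-running the script) at `report.html` gives a similar edit-view loop without any notebook machinery.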
The other issue is that everything was horribly slow with the amount of data we were dealing with (~150MB of json). This is probably more related to python/bokeh than the notebooks themselves, but it meant that re-executing some cells was painful and would often hang or block the IDE.
I did run into some problems with nbconvert from time to time. It's worth noting that nbdev doesn't use nbconvert at all - it uses Quarto instead (which AFAICT does everything nbconvert does plus a lot more, and is faster, more extensible, and more "batteries included").
Having said that, it's possible that, given your experiences with needing to re-run some slow cells and having trouble making that work well, you might prefer to use "pure Quarto" instead of nbdev. With Quarto you can write your report as a .qmd file directly: https://quarto.org/ .
Personally, I quite like the notebook environment for situations like this where there are some really slow cells -- I mainly do deep learning, and some of my cells take many hours to run -- since that state is cached and I can easily manipulate it and visualise it afterwards. I generally will then add some kind of serialization or caching once it's working so I don't have to re-run the slow bits every time. I'll often also use nbdev to export a .py script from the notebook so it's easy to re-run the whole thing from scratch.
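That serialization/caching step can be very small in practice. Here is a sketch of the idea (`cached` is a made-up helper, not part of nbdev or fastai):

```python
import pathlib
import pickle

def cached(path, compute):
    """Return the pickled result at `path` if it exists;
    otherwise run `compute`, save the result, and return it."""
    p = pathlib.Path(path)
    if p.exists():
        return pickle.loads(p.read_bytes())
    result = compute()
    p.write_bytes(pickle.dumps(result))
    return result

# First run executes the slow computation; re-running the cell
# (or the exported .py script) just loads the file from disk.
stats = cached("stats.pkl", lambda: sum(i * i for i in range(1000)))
```

This keeps the notebook re-runnable from scratch while skipping the slow bits on subsequent runs.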
(BTW we also released something today that's particularly helpful for this workflow: https://fastai.github.io/execnb/ . Basically, it's a parameterised notebook runner. It doesn't rely on Jupyter or nbclient or nbconvert. It's in the same general category as Papermill, but it's much more lightweight and requires learning far fewer new concepts.)
Let me know if you have any questions or comments -- I'm the lead dev on the project. (If you're in the "I don't like notebooks" camp, please watch this first: https://www.youtube.com/watch?v=9Q6sLbz37gk )
Thanks Jeremy et al for a great piece of software.
I have used nbdev numerous times to introduce data scientists to good practices in software development.
Getting them to go from zero to a well tested, documented, CI/CD-ready code in an hour and seeing their faces light up always brings me joy. Keep up the great work.
When I looked into it a couple years ago, it was possible to configure nbdev to use e.g. GitLab instead of GitHub for those of us who can't use GitHub for whatever reason. Is this still possible with the rewrite? Any major things we'd be missing out on?
And thanks for putting together such an awesome resource, I'm excited to try kicking the tires on it again!
Frankly it's not something I've been working on -- the GitLab support in v1 was added and maintained by the community. I certainly want it to work (even though I'm not a GitLab user myself), so if you try it and find it doesn't, please send us PRs/issues if you can.
It's not focused on collaboration, but it does add some critical pieces that otherwise make Jupyter development frustrating when working with a team. Specifically: `nbdev_prepare` ensures that diffs are as small as possible, by removing and standardising notebook metadata; and `nbdev_fix` fixes merge conflicts so that they are cell-level, rather than line-level, so they can be opened and fixed in notebooks.
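The metadata problem this addresses can be seen in miniature below. This is only a sketch of the idea, not nbdev's actual implementation: volatile fields in the raw .ipynb json (execution counts, outputs, per-cell metadata) change on every run, so stripping them leaves diffs that show only real edits.

```python
def strip_volatile(nb: dict) -> dict:
    """Drop execution counts, outputs, and per-cell metadata
    so that git diffs only show genuine source changes."""
    for cell in nb.get("cells", []):
        cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            cell["execution_count"] = None
            cell["outputs"] = []
    return nb

nb = {"cells": [{"cell_type": "code",
                 "execution_count": 7,
                 "metadata": {"scrolled": True},
                 "outputs": [{"output_type": "stream", "text": "42\n"}],
                 "source": ["print(42)"]}]}
clean = strip_volatile(nb)
```

Running something like this before every commit is what keeps notebook diffs reviewable.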
Something else we've found helpful for collaboration (not associated - just happy users) is this: https://www.reviewnb.com/ . It means we can get a nice notebook-based PR workflow.
This looks great! I'm curious though if there are ways to get a bit more control over page layout. For example, can you put three charts on one page and have them resize appropriately? Or put two charts next to each other (without relying on the charting library itself)? This functionality is pretty important for our use case, but I've been assuming that notebook-style formats, when they output to pdfs or a printed page, are just going to output html that gets split up by the rendering engine.
This kind of thing is coming soon. It will work with Shiny for Python [1], which is being integrated with Quarto (which nbdev is built on top of). When it's more stable, we will look into integrating it.
In the meantime, the home page for nbdev https://nbdev.fast.ai/ is built with a notebook, and as you can see it is reactive and resizes appropriately. You could follow that example if you wanted to build something today.