Jupyter notebooks were fun to use for a bit, then I hit the inevitable wall of "ok, now let's turn this into a real, properly built script, but now everything is breaking for inexplicable reasons".
Notebooks are fine for early-stage experimenting, but if you've got to the point where you are relying on a notebook for anything, or your workflow consists of "start up notebook, run cells until I get to the one I'm working on", then you need to stop and build it into something proper.
Experimentation is fine, but it should not come at the cost of writing things properly when the time comes, and it is not an excuse for not knowing some good software engineering. Serious alarm bells go off in my head when I read tweets like "Data science code doesn't need to follow the rules of good software engineering".
Edit: that remote notebook - is that honestly not one of the most terrifying things you've seen? That runs so counter to almost every good bit of software design and engineering practice about clarity, maintainability, good practice, security, etc that I can think of. It's the very definition of indecipherable, inscrutable hidden state and unknown side effects.
The only thing I would use notebooks for is demonstration / teaching.
But if you do use libraries instead of just a huge mess of notebooks, you're stuck if you want to change the code in a well-supported way. You can ask your notebook to monkey-patch the code (which is an even bigger mess), or you can use an unreliable magic extension, "%autoreload".
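For what it's worth, the standard library has an explicit alternative to the magic: `importlib.reload`. A minimal sketch, where the module name `mylib_demo` and its `answer` function are made up purely for illustration:

```python
import importlib
import pathlib
import sys
import tempfile

# Skip .pyc files so a quick rewrite of the source is never masked by stale bytecode.
sys.dont_write_bytecode = True

# Hypothetical throwaway library, written to a temp dir so the sketch is self-contained.
workdir = pathlib.Path(tempfile.mkdtemp())
module_path = workdir / "mylib_demo.py"
module_path.write_text("def answer():\n    return 1\n")
sys.path.insert(0, str(workdir))

import mylib_demo

first = mylib_demo.answer()

# Edit the library on disk, as you would in your editor...
module_path.write_text("def answer():\n    return 42\n")

# ...then reload it explicitly instead of relying on %autoreload.
importlib.invalidate_caches()
importlib.reload(mylib_demo)
second = mylib_demo.answer()

print(first, second)
```

The usual caveat applies: names pulled in with `from mylib_demo import answer` and objects created before the reload keep pointing at the old code, which is a large part of why reloading feels unreliable.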
When you're in a job that requires running experiments, analyzing results, and summarizing findings, all within an hour, shortcuts are required.
I think that analysts/scientists using python and Jupyter notebooks are doing IT a favor. If your job is to refactor/re-engineer python notebooks and corresponding workflows, you've either forgotten how much worse it could be with excel+vba or you fortunately arrived after trench warfare.
- Code and document mixed. You have full markdown syntax, chunks, titles, table of contents. You can also convert RMarkdown into a script with comments if you need.
- Rendering an RMarkdown document runs it in a separate environment, starting from scratch. The assumption is that you make it reproducible, which is good for reports and sharing.
- I also use RMarkdown to write code, and run code chunks interactively in a session. I can write plans, notes, references, and TODOs in the document, test code in chunks, and execute code chunks in any order. Basically you just code and document. In the end you can turn it into a report, or refactor the tested code into functions and scripts. I kept the original RMarkdown as a design document, which has all the original notes and previous versions of the code.
- It's plain text so version control is fully supported.
RMarkdown started to support python, but it may be preliminary for now.
The problems with state felt sort of overstated (heehee), but it obviously is an issue. I always wonder why the menu option Kernel -> Restart & Run All is not a first class citizen, with a big red button at the top of the window. It's my main way of interacting with the notebook for anything that takes under a minute to run. Running cells one by one throughout the notebook isn't very useful. You either want to rerun the cell you're working on, or clear the state just in case and run everything.
Module versioning is another thing that should be a no-brainer.
I also wish the notebooks diffed better when it comes to SCM, without all the JSON artifacts.
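A rough sketch of the kind of filter that makes the diffs bearable: drop outputs and execution counters before committing. It assumes the nbformat 4 layout (`cells`, `cell_type`, `outputs`, `execution_count`); the `strip_outputs` name and the sample notebook are made up for illustration, and dedicated tools such as nbstripout do this more thoroughly.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 notebook dict without outputs or counters."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny hand-written notebook standing in for a real .ipynb file.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}

stripped = strip_outputs(example)
print(json.dumps(stripped, indent=1, sort_keys=True))
```

With a real file you would `json.load` the `.ipynb`, strip it, and write it back, typically from a git filter or pre-commit hook.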
Proper integration with existing Python tools/IDEs would be wonderful.
Also yeah, kernel sharing is one of the most horrific ideas I've heard recently.
This is one of my biggest qualms as well. I built a DevOps tool that uses Notebooks, and not being able to diff and review Notebooks was a pain. I have decided to solve it with a GitHub marketplace app. Wrote more about it here: https://medium.freecodecamp.org/how-to-handle-version-contro...
Jupyter Notebook is presentation software, for demonstrating something to yourself or others. There is hardly anything comparable for building interactive demonstrations. You'd have to implement a (multi-paged?) GUI application, or a web application. Just plain HTML output may cut it for certain use cases, but it's still more painful.
And I've done more stupid stuff than that. For example I have a demo recording and showing EEG data. And a snake game implemented as a widget with Bokeh output. And live-update graphs from FlightGear. Lately I've started experimenting with controlling my 3D printer from the notebook.
But the problem the OP states is that notebooks encourage unreproducible demonstrations. Markdown + code cells make data scientists think a single notebook file is enough to upload to GitHub. Missing dependency information and the like, as the OP states, make it very hard for others to reproduce the notebooks. You can see tons of GitHub repos that contain only notebooks and are impossible to reproduce out of the box.
His criticism is valid but the main lesson is to use the appropriate tool for the job. He himself uses VSCode + iPython, which is a better tool more generally for the desired use case of code development.
It doesn't have to be an either/or. The granddaddy of the modern Notebook, Literate Programming, _was_ about implementing and organizing software. Just because today's Notebooks like Jupyter aren't currently sufficient to implement and organize software doesn't mean that they cannot be (again) in the future.
A lot of the problems pointed out in the slides are solvable. The DevOps of Notebooks is stuff we can absolutely sink our teeth into as an industry. We can make good Notebook formats that source control well. (Some folks have filters already for Jupyter.) We can make better bridges to (incrementally, per user interest) move Notebooks into source control, CI, testing, etc. Jupyter should already have some idea of the environment it is running in, it could certainly build things like requirements.txt or even full Docker containers. We can build beyond the single cell or single Notebook page and ask deeper questions about how do we organize Notebooks, how do we organize software in Notebooks, how do we interoperate with maybe some code that has a strong narrative to live in a Notebook alongside code that doesn't have a strong narrative or doesn't need one (or wasn't written with one in mind and is legacy code in the project). It could be great to take an existing Python codebase and say "this feature is best explained in a Notebook" and just build it that way. Similarly it could be great to say "this Notebook I found is already a great module, I'm going to build a more traditional app around it" and getting the Notebook's own help in bootstrapping that effort.
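As a small illustration of the requirements.txt idea, here is a sketch that pins whatever is installed in the running interpreter using the standard library's `importlib.metadata`. The `freeze_environment` helper is a made-up name, and this is only a rough stand-in for what `pip freeze` reports, not something Jupyter does today.

```python
from importlib import metadata


def freeze_environment() -> list[str]:
    """Return sorted 'name==version' pins for every installed distribution."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that carry no name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


requirements = freeze_environment()
print("\n".join(requirements))
```

Writing the result next to the notebook is then one `pathlib.Path("requirements.txt").write_text(...)` call away.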
We certainly have the technology and the opportunities to do interesting software development in/with/alongside Notebooks and Notebook-like tools. The questions are certainly more ones of what are our priorities? Knuth argued in the 80s that all software development was best when embedded in a human-focused narrative. I'm not that extreme myself, but I certainly see some great opportunities for pragmatic middle grounds where you can mix-and-match as opportunity/interest/need warrants. Having Notebooks as a tool in software development _can_ make us better software developers. It's cool to have a lot of tools in your toolbelt so you can pick and choose the best ones for the jobs at hand.
Is anyone working on bridging the gap between notebook sketches and production ready code?
To be able to have a scratchpad and play around with techniques while keeping a sort-of record is also amazing.
I see it like I see Excel: it's a fantastic tool for data exploration and some visualisation. It's not something that should be used in any final workflow or production system.
Like Excel, it can be badly misused, but it unlocks ways of working that simply weren't possible before.
The criticisms about hidden state are fair, I think it would be better if previous step data was more explicitly wrapped in the following cells so you could choose to use the "wrapped package" of the previous data or choose to use a re-evaluating version.
I think it works best when most cells have few side-effects, even if that means repeating previous calculations.
Play with your code in the notebook, when you know what the right way to split it is, turn cells into functions. When you have a decent body of functions turn them into a module.
Notebooks excel as ad hoc interfaces to libraries and modules. They don't replace them.
I'm not sure the problems with scientists' code are enabled by notebooks to a significant degree. We used to exclusively use PyCharm and I was still debugging spaghetti scripts with massive global state interacting in weird ways.
Notebooks give people more foot-guns with hidden state. I think they are neutral when it comes to modularity. And I think they are a positive for encouraging documentation.
It's certainly not a Jupyter problem per se. I went through a similar progression when I learned BASIC in high school, in 1981. Projects grow to a size where they become unmanageable without some structure.
In my view, the importance of documentation is huge. I've been programming for a long time. I don't write software. I use programming as a problem solving tool. Jupyter has made a profound difference in my ability to pick up work that I did a month ago, or years ago, and figure out what I did, mistakes and all. For me it's more about reproducible problem solving than software development.
I always view them as an execute in order abstraction. I suspect what we really need to improve them is a subset of python which only deals in immutable data structures somehow - and then let notebooks branch off that immutable state but still be "execute forward" only.
The only problem is that these good coding practices aren't that exact, and tend to go on and on all the way to infinity.
Data science practices are similarly inexact - a lot of good decision making comes from experience, knowing when to apply each tool to a specific problem, when to just throw in a hack etc.
For the same reason we can't just get good programmers and train them in biology or chemistry or structural engineering. Sure they exist, as do data scientists that are really good programmers, it's just that they're more rare and in very high demand.
Often much easier to find a domain expert and a programmer and have the programmer rework the code done by the domain expert. In fact that used to be my job for a while (working with physicists), and it was actually quite fun.
It's basically what people are now calling 'Research Software Engineers'
That said, I think it goes both ways: people from the sciences tend to be cavalier coders, and people from software background tend to be cavalier about the underlying mathematics.
Seems to me that the solution needs to be a stronger culture of both increased scientific and software engineering rigor.
My main point is that too many people in data science don't care at all. They don't care about the repeatable results, about the code quality, even about units (they can even use `mb`, `Mb` and `MB` for megabytes in the same document).
I hope companies will learn that if you have code, it really matters that it is good-quality code, not a randomly gathered set of lines.
One thing though--one of the best books I've read, Trefethen's ATAP, was written as a collection of .m files, which when run would produce a pdf of each chapter. The .m files were filled with small formatting details that were simply omitted from the generated book. The slides suggest something equivalent is not possible with notebooks. That's unfortunate.
I’ve been moving functionality to modules when I can too which helps minimize the amount of code actually in the notebooks, and I also will break up code into different notebooks (occasionally saving/loading specific variables between kernels) when it makes sense to. Maybe all this is helping a lot, have you all needed notebooks with dozens of cells as this presentation mentions?
On topic: I applaud the effort to set things straight, but I'm afraid that ship has sailed. Hacky "Kaggle notebook solutions" are now Data Science. That's what the term evolved to mean, and all the rest of the impedance highlighted in this presentation follows.
If your work involves creating well-designed, well-factored, tested, reusable ML software that is meant to be integrated, picked apart, extended and applied over time in practice (as opposed to be submitted once to an oracle in order to claim "SOTA"), you better come up with another term. And obviously Notebooks are not a good fit, beyond as a tool for documentation & reporting (which Joel correctly calls out as genuinely useful).
I'll add some positive notes, though:
-They're excellent for beginner tutorials that explain an abstract concept or a library's use cases
-They're very well suited for academic peer-review. It's basically a way to say: here's what we obtained, here are the exact steps we did to obtain it, you can obtain it as well if you do exactly as we did. You'd think this reproducibility requirement would be common sense but in practice you're usually thankful if the data is available, or the software is released and works.
They are really practical for situations where you want to play around (or even outright work) principally not with code, but with that code's output.
You see, I give lots of trainings.
Notebooks offer me an excellent way to mix commands, their output, and explanations into a single document with little effort.
I'm able to show my students exactly what happens (including the literal messages), going step-by-step.
They are wonderful to create exercises.
However, for my use case, the notebook is the output (perhaps rendered as PDF).
Rules I've adopted for my own training notebooks:
* the first lines are to print the versions of all things I'm using, e.g. "git --version" for git trainings
* I use "restart&run all" frequently
* obviously, notebooks are version-controlled, including their output
* before checking in, prove that "restart&run all" provides exactly the desired result
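A sketch of what such a "print the versions first" cell could look like in a Python-kernel notebook. The `tool_version` helper is a made-up name, and the set of tools to report is an assumption to adapt to the training at hand.

```python
import platform
import shutil
import subprocess
import sys


def tool_version(command: list[str]) -> str:
    """Run e.g. ['git', '--version'] and return its first output line."""
    if shutil.which(command[0]) is None:
        return f"{command[0]}: not installed"
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    lines = (result.stdout or result.stderr).strip().splitlines()
    return lines[0] if lines else f"{command[0]}: no version output"


versions = {
    "python": sys.version.split()[0],
    "platform": platform.platform(),
    "git": tool_version(["git", "--version"]),
}

for name, value in versions.items():
    print(f"{name}: {value}")
```

Because the output is checked in together with the notebook, a later reader can see which versions produced the recorded results.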
The tool is just a detail. My point was rather that they are very much "code", and deserve to be treated as carefully as any other.
My notebooks also get refactored on occasion.
1) It's claustrophobic! Trying to work in notebooks always felt like trying to do a math problem with too little paper when you're used to big empty sheets.
2) Readability is weird! I want to see all the code and then see the plots, not little crazy crunched up snippets with little crazy crunched up plots that might allow interactivity.
3) I might actually want to do something that involves looking at more than one plot at a time! Seriously? This always struck me as ridiculous. I have screen real estate, I want to use it! ESPECIALLY for interactive data analysis.
4) They're not as portable as they should be! There's always some drama when you open up someone else's notebook locally.
5) They encourage people to write weird ass code! Scientists already have a tendency to be messy if they're not CS types, this just makes it worse.
vim + tmux + ipython REPL + vim/tmux slime for shipping stuff from the editor to the REPL + matplotlib in QT mode
It's not as good as MATLAB but it gets close.
The advantage of a notebook over a repl is that the code you typed stays there and can be re-run and modified later. Re-executing all cells in order ensures that that actually works.
You can do away with the cell concept, and instead have some way to annotate which lines display their output. Then the distinction between an editor and a notebook almost disappears. An editor plus a way to annotate which lines' outputs are displayed, plus a way to type rich text, becomes a notebook.
Even better, you could allow users to display the output of lines inside functions, and have a way to select which concrete call is actually displayed. Sean McDirmid has already implemented such an editor. This removes the incentive to avoid abstraction, because it allows you to display outputs even if you put code inside a function or class. It's even better than a repl in this regard, and it doubles as a powerful debugger that can navigate through the execution.
The navigation works similar to an IDE's go to definition, except that when you go to definition on a call, it sets the concrete execution context of the call. For instance,
y = x+2
Since most people work with small data sets or can use a subset of their dataset during development, I think this would be a better default for a notebook.
And if he hadn't included it, all the comments would be "yeah, and who is he to talk".
And, for serious stuff, of course, python in text files with modules, tests (pytest) etc.
The strength that kept me coming back to notebooks was their power at iterating on a problem but I was continuously frustrated at the difficulty of extracting my solution / tracking it in git / collaborating with colleagues etc. Also I didn't enjoy the editor experience from a UX point of view.
I've since started using Hydrogen, a plugin for Atom which (via a Jupyter kernel) seems to get what I wanted from both worlds - it's just a python file but I get most of the notebook fun!
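A minimal sketch of what such a file can look like, assuming the `# %%` cell-marker comments that Hydrogen recognises for running one chunk at a time. The data is a tiny made-up sample. Run as a plain script, the markers are just comments.

```python
# %% Load data (a tiny hypothetical sample instead of a real dataset)
temperatures_c = [18.5, 21.0, 19.5, 23.0]

# %% Transform: convert Celsius to Fahrenheit
temperatures_f = [c * 9 / 5 + 32 for c in temperatures_c]

# %% Summarise
mean_f = sum(temperatures_f) / len(temperatures_f)
print(f"mean temperature: {mean_f:.1f} F")
```

Since it is an ordinary `.py` file, it diffs cleanly in git and can be imported or tested like any other module.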
BA: We found this duplicate field error in our database, do you know about this.
DS: I told you about this 2 years ago.
BA: Oh so you have a script to produce all the duplicate records?
DS: No I did this 2 years ago I don't know what version of Python or any of the libraries I used for the notebook.
In my experience a lot of data scientists like this work alone on business problems - if they were in a more collaborative environment they'd probably run into the problems that Joel is discussing a lot more and be more open to build libraries.
I had actually thought that they would be great for learning until I read this.
(I am a bit disappointed that the presentation doesn't actually explain the reason for out of sequence execution, which seems to be one of the central complaints).
Even the fast.ai library, which is a wonder, has broken notebooks. For those who try to follow the course at home, trying to run the notebooks is frustrating, as things are out of order and so errors pop up all the time. Jeremy is a wonderful teacher, but compare following a Fast.ai course video, which uses notebooks, to following a python video from, e.g., Raymond Hettinger, which uses sphinx and a shell. While the documentation style and ugly shell don't look nearly as cool, they are so much clearer and better structured.
Notebooks became popular because they fill one gap that was left uncovered. As the scientific community moves away from Matlab into Python and R, reading code - pushed, amongst others, by the popularity of Github - becomes a day-to-day activity. Matlab scripts were easily explorable because users would load them, set breakpoints here and there, and look at results interactively - exactly what notebooks aim to provide.
The difference is that what used to be breakpoints now become cells, comments now turn into Markdown and figures are inlined to add an extra layer of convenience. Yet all the awful problems of sloppy Matlab development are now masked, marketed as something fancy and start to pollute the Python dev. environment. Reproducibility and testing are gone, dependency management (which is not required for Matlab) breaks down completely, documentation is non-existent, and sharing becomes heavily constrained.
Notebooks may be suitable for scratching things, but not much else, at least nothing serious. Hopefully the slideshow above gets the attention it deserves. And in all cases, kudos for the Jupyter dev team for fighting the good fight, even with the drawbacks of their experiment.
The jupyter implementation of notebooks has two serious problems, however:
2. The editor component has a lot of strange quirks to the point of being unusable.
My notebooks are usually set up as:
1) a cell to load external libraries
2) a cell to load any of my own modules
3) a cell to set model parameters
4) a cell containing the functions that I use to load data (to be honest these are not always encapsulated in functions in the early stages of analysis)
5) cells containing functions that do stages of analysis that I'm happy with (but unique to this analysis and therefore not turned into modules).
6) A "main" cell that runs the cells in  and gets everything ready for
7) working cell[s] where I mess around with new analysis
While I'm working I only execute from  onwards unless I need to reload my data.
This is pretty easy to turn into production if needed because when the analysis is done, you can clean up the working cells and incorporate them into your main function.
If I have to get data that takes 15 minutes to query, another 10 to preprocess, and then maybe a few more steps before it's ready to put into my algorithm and start poking around, notebooks shine. In an IDE you could (and I often do) have cached datasets partway through the process saved to disk, as mentioned in OP. But that's a hack too. And it still takes a minute or so to load and process anyway, even if cached.
So the options are
1) Shitty practices, but I can try new changes on the data immediately.
2) Better practices, wrapped in caching hacks, but I have to wait minutes between every single change.
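For reference, the "caching hack" in option 2 usually looks something like this sketch. The `cached` helper, the `slow_query` stand-in and the temp-dir cache location are all made up for illustration; pickle is just the simplest choice, not necessarily the right format for large datasets.

```python
import pickle
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp())  # assumption: any writable directory works
calls: list[str] = []  # records how often the slow step really runs


def slow_query() -> list[int]:
    """Stand-in for the 15-minute query."""
    calls.append("slow_query")
    return [n * n for n in range(10)]


def cached(name: str, compute):
    """Load `name` from disk if present, otherwise compute it and store it."""
    path = CACHE_DIR / f"{name}.pkl"
    if path.exists():
        with path.open("rb") as handle:
            return pickle.load(handle)
    value = compute()
    with path.open("wb") as handle:
        pickle.dump(value, handle)
    return value


first = cached("raw_data", slow_query)   # runs the query
second = cached("raw_data", slow_query)  # served from disk
print(len(calls), first == second)
```

The trade-off is as described: the script is reproducible from scratch, but every rerun still pays the load time, and the cache has to be invalidated by hand when the query changes.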
LMAO. Using a notebook with executable code is literally the modern version of Knuth's literate programming.
While I personally don't do much of it, I loved Peter Norvig's python notebooks. Go through one of them and tell me it's not a good aid in teaching people concepts.
i mean i get that theyre nice for doing data stuff and presenting it, but it just seems a bit crazy to use it as some sort of development platform.
I'd like to highlight that Notebooks do have a use in pre-production, while you are exploring a data subset and trying out your initial hypothesis. But please, don't throw away decades of software development best practices and reliability for mere development convenience.
We are providing machine learning tooling and our CTO published a blog post talking about the same exact issues we keep encountering with dozens of customers. (https://blog.valohai.com/leveling-up-your-ml-code-from-noteb...)
I teach Python to adult beginners including those with heavy Excel background.
Beginners catch on quickly that you need to run all the cells once to get reproducible results.
If worst comes to worst, you just restart the kernel.
With Anaconda I can start teaching immediately on all 3 platforms (Windows, macOS, and there are even Linux Python beginners) without worrying about differing versions or troubleshooting installations.
My students just git clone/pull my projects and start following around and experimenting themselves.
What would be the alternative, give them black on white REPL?
That said, for personal projects that I do not intend to share I reach for my trusty text editor (VS Code is pretty good these days).
This avoids much of the annoyances that the slides point out about Notebook.
I use notebooks all the time, but I put repetitive code in a module, try to put all functions at the start etc. It is much easier to share data science experiments (provided you share the modules as well), but indeed notebooks can be used and presented in horrible ways.
So... now we write a presentation on how we can horribly abuse VSCode + an iPython console and call it even?
Emacs + ESS all the way.
Maybe a "this is potentially out of date" marker on out-of-order cells would help with the problem.
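A rough sketch of how such a marker could be computed from the saved notebook alone, assuming the nbformat 4 fields `cell_type` and `execution_count`. The `stale_cells` name and the sample notebook are made up. A code cell is flagged when some cell above it was run more recently.

```python
def stale_cells(notebook: dict) -> list[int]:
    """Return indices of code cells whose last run predates a run of a cell above them."""
    flagged = []
    highest_seen = 0
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        count = cell.get("execution_count")
        if count is None:
            continue  # never executed, nothing to compare
        if count < highest_seen:
            flagged.append(index)
        highest_seen = max(highest_seen, count)
    return flagged


# Hand-written sample: the third cell (index 2) was last run before the second was re-run.
sample = {
    "cells": [
        {"cell_type": "code", "execution_count": 1, "source": ["x = 1"]},
        {"cell_type": "code", "execution_count": 5, "source": ["y = x + 2"]},
        {"cell_type": "code", "execution_count": 3, "source": ["print(y)"]},
        {"cell_type": "markdown", "source": ["Some notes"]},
        {"cell_type": "code", "execution_count": 6, "source": ["z = y * 2"]},
    ]
}

print(stale_cells(sample))
```

This only catches ordering problems visible in the counters. It says nothing about cells that were edited or deleted after they ran.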
Just... roasted Jupyter notebooks.
Also they are lighthearted and fun and life is short so why not have some fun even when being serious?
They aren't fun, don't have anything to do with technical content and just put me off, especially if the presenter thinks s/he could have a 2nd job as a stand-up comedian.
>don't have anything to do with technical content
Most public presentations / speeches contain a few jokes. They don't have much to do with the technical content either, but they help people pay attention and ease in the speech.
It's not like we don't have enough boring powerpoint presentations already. It's also not like those going into the trouble of compiling the slides and sharing their knowledge owe us anything...
Most would welcome one or more jokes in a developer speech. So much so, that it's common advice for any kind of public speaking and presentation to add a few jokes to lighten the mood (you can find thousands of articles, books, and public speaking training sessions advising about this).
Some of the best technical speakers add humor in their presentations (often lots of it, e.g. Raymond Hettinger).