
JupyterCon: I don't like Notebooks [slides] - tosh
https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUhkKGvjtV-dkAIsUXP-AL4ffI
======
FridgeSeal
I seriously cannot agree more.

Jupyter notebooks were fun to use for a bit, then I hit the inevitable wall of
"ok, now let's turn this into a real, properly built script, but now
everything is breaking for inexplicable reasons".

Notebooks are fine for early stage experimenting, but if you've got to the
point where you are relying on a notebook for anything, or your
workflow consists of "start up notebook, run cells until I get to the one I'm
working on" then you need to stop, and build it into something proper.

Experimentation is fine, but it should not come at the cost of writing things
properly when the time comes, and notebooks are not an excuse for not knowing some
good software engineering. Serious alarm bells go off in my head when I read
tweets like "Data science code doesn't need to follow the rules of good
software engineering".

Edit: that remote notebook - is that honestly not one of the most terrifying
things you've seen? That runs so counter to almost every good bit of software
design and engineering practice about clarity, maintainability, good practice,
security, etc that I can think of. It's the very definition of indecipherable,
inscrutable hidden state and unknown side effects.

~~~
daveguy
I don't even like it for initial experimentation and exploration. Much prefer
the command line repl (the jupyter/ipython one). Faster feedback and easier
navigation, and you can just export the history when you're ready to capture
and make permanent some workflow. It still requires going through and picking
out the important bits, but reading the history like a story of commands is
generally enough to pick out the important parts.

The only thing I would use notebooks for is demonstration / teaching.

~~~
cdancette
If you work with image processing, notebooks are very handy as they can
display images directly. Also to display tables nicely with pandas, or for any
data visualization actually.

~~~
srean
Doesn't the IPython repl running in a QTconsole terminal do that?

~~~
cdancette
Yeah but it's less nice, and if you're running on a distant server you have to
deal with X forwarding

------
dracodoc
RMarkdown has all the advantages of a notebook without most of its
problems.

\- code and document mixed. You have full markdown syntax, chunks, titles,
table of contents. You can also convert RMarkdown into a script with comments
if you need.

\- Rendering an RMarkdown document runs it in a separate environment, starting
from scratch. It's assumed you should make it reproducible, which is good for
reports/sharing.

\- I also use RMarkdown to write code, and run code/code chunks interactively
in a session. I can write plan, notes, references, TODO in document, test code
in chunks, execute code chunks in any order. Basically you just code and
document. In the end you can turn it into a report, or refactor the tested
code into functions and scripts. I keep the original RMarkdown as a design
document, which has all the original notes and previous versions of code.

\- It's plain text so version control is fully supported.

RMarkdown started to support python, but it may be preliminary for now.

------
j_4
I absolutely love Jupyter as a computer science student who uses it a lot to
fiddle with new concepts, create visualizations, or write markdown reports
with annotated code. I do agree with some points, though.

The problems with state felt sort of overstated (heehee), but it obviously is
an issue. I always wonder why the menu option Kernel -> Restart & Run All is
not a first class citizen, with a big red button at the top of the window.
It's my main way of interacting with the notebook for anything that takes
under a minute to run. Running cells one by one throughout the notebook isn't
very useful. You either want to rerun the cell you're working on, or clear the
state just in case and run everything.

Module versioning is another thing that should be a no-brainer.

I also wish the notebooks diffed better when it comes to SCM, without all the
JSON artifacts.

Proper integration with existing Python tools/IDEs would be wonderful.

Also yeah, kernel sharing is one of the most horrific ideas I've heard
recently.

~~~
amirathi
> I also wish the notebooks diffed better when it comes to SCM, without all
> the JSON artifacts.

This is one of my biggest qualms as well. I built a DevOps tool [1] that uses
Notebooks & not being able to diff and review Notebooks was a pain. I have
decided to solve it with a GitHub marketplace app. Wrote more about it here:
[https://medium.freecodecamp.org/how-to-handle-version-control-and-reproducibility-with-jupyter-notebook-e1fbc0b8f922](https://medium.freecodecamp.org/how-to-handle-version-control-and-reproducibility-with-jupyter-notebook-e1fbc0b8f922)

[1] [https://nurtch.com](https://nurtch.com)

~~~
camel_Snake
Might want to use the full url [0] when sharing your work - the one you linked
gives me problems with the ssl cert.

[0] [https://www.nurtch.com/](https://www.nurtch.com/)

------
bayesian_horse
Notebooks were not created as a way to implement and organize software. You
can do that with code files.

Jupyter Notebook is a presentation software, for demonstrating something to
yourself or others. There is hardly anything comparable to build interactive
demonstrations. You'd have to implement a (multi-paged?) GUI application, or a
web application. Just plain html output may cut it for certain use cases, but
still more painful.

And I've done more stupid stuff than that. For example I have a demo recording
and showing EEG data. And a snake game implemented as a widget with Bokeh
output. And live-update graphs from FlightGear. Lately I've started
experimenting with controlling my 3D printer from the notebook.

~~~
hatmatrix
Exactly - it's a classic case of a tool being used for a different purpose
("IDE") than its original intention ("presentation software") and getting
criticized for not holding up to misguided expectations.

His criticism is valid but the main lesson is to use the appropriate tool for
the job. He himself uses VSCode + iPython, which is a better tool more
generally for the desired use case of code development.

~~~
kbumsik
Although the OP criticizes missing IDE features, those are not his main
points.

~~~
hatmatrix
But his main points aren't really showstoppers if you're using it as
presentation software.

------
gaff33
The thing that bothers me about notebooks is that they don't seem to be
getting better. Splitting the data from the code so that they play nice with
revision control should already be implemented by default! Allowing notebooks
to also be libraries should be doable. Testing mechanisms on notebooks should
be standardised etc.

Is anyone working on bridging the gap between notebook sketches and production
ready code?

------
WorkLifeBalance
When I first found notebooks I was amazed at their utility. The ability to
quickly display algorithms and processes to a wider (potentially layman)
audience is amazing.

To be able to have a scratchpad and play around with techniques while keeping
a sort-of record is also amazing.

I see it like I see excel, it's a fantastic tool for data exploration and some
visualisation. It's not something that should be used in any final workflow or
production system.

Like excel, it can be badly mis-used but they unlock ways of working that
simply weren't possible before it.

The criticisms about hidden state are fair, I think it would be better if
previous step data was more explicitly wrapped in the following cells so you
could choose to use the "wrapped package" of the previous data or choose to
use a re-evaluating version.

I think it works best when most cells have few side-effects, even if that
means repeating previous calculations.

------
Certhas
I have taught Python with notebooks to scientists and we always emphasize the
pipeline:

Play with your code in the notebook, when you know what the right way to split
it is, turn cells into functions. When you have a decent body of functions
turn them into a module.

Notebooks excel as ad hoc interfaces to libraries and modules. They don't
replace them.

I'm not sure the problems with scientists' code are enabled by notebooks to a
significant degree. We used to exclusively use PyCharm and I was still
debugging spaghetti scripts with massive global state interacting in weird
ways.

Notebooks give people more foot-guns with hidden state. I think they are
neutral when it comes to modularity. And I think they are a positive for
encouraging documentation.
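
The cell-to-function-to-module pipeline described above can be sketched roughly as follows. This is only an illustration: the names `drop_outliers`, `readings`, `baseline`, `tolerance` and the module name `cleaning.py` are made up for the example and do not come from the comment.

```python
# Stage 1: notebook-style cell, relying on names left behind by earlier cells:
#   cleaned = [r for r in readings if abs(r - baseline) < tolerance]

# Stage 2: the same logic as a function whose inputs are all explicit.
def drop_outliers(readings, baseline, tolerance):
    """Keep only the readings within `tolerance` of `baseline`."""
    return [r for r in readings if abs(r - baseline) < tolerance]

# Stage 3: once a few such functions exist, they would move into a module
# (a hypothetical cleaning.py) and the notebook would simply import them.
if __name__ == "__main__":
    print(drop_outliers([9.8, 10.1, 42.0, 10.0], baseline=10.0, tolerance=1.0))
```

The point of stage 2 is that nothing is read from notebook-global state any more, so the function can be tested and reused outside the notebook.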

~~~
analog31
This seems like a decent progression for easing beginners into the basics of
good programming practices.

It's certainly not a Jupyter problem _per se._ I went through a similar
progression when I learned BASIC in high school, in 1981. Projects grow to a
size where they become unmanageable without some structure.

In my view, the importance of documentation is huge. I've been programming for
a long time. I don't write _software._ I use programming as a problem solving
tool. Jupyter has made a profound difference in my ability to pick up work
that I did a month ago, or years ago, and figure out what I did, mistakes and
all. For me it's more about reproducible problem solving than software
development.

~~~
XorNot
Notebooks for me were something I found after iPython Qt console, when I was
getting frustrated at how hard it was to replay a set of scripts to get back
to my state.

I always view them as an execute in order abstraction. I suspect what we
really need to improve them is a subset of python which only deals in
immutable data structures somehow - and then let notebooks branch off that
immutable state but still be "execute forward" only.

------
pleasecalllater
The code written by a couple of "data scientists" I was working with is the
worst code I have ever seen. They don't care, they just want to have
experimental results. The problem starts when their experimental "code" needs
to be used in production or they are asked to describe how it works. Why
cannot we just get good programmers and train them as data scientists?

~~~
dagw
_Why cannot we just get good programmers and train them as data scientists?_

For the same reason we can't just get good programmers and train them in
biology or chemistry or structural engineering. Sure they exist, as do data
scientists that are really good programmers, it's just that they're more rare
and in very high demand.

Often much easier to find a domain expert and a programmer and have the
programmer rework the code done by the domain expert. In fact that used to be
my job for a while (working with physicists), and it was actually quite fun.

~~~
physicsguy
> Often much easier to find a domain expert and a programmer and have the
> programmer rework the code done by the domain expert. In fact that used to
> be my job for a while (working with physicists), and it was actually quite
> fun.

It's basically what people are now calling 'Research Software Engineers'

------
Fomite
Had a project once where the "super helpful notebook" ended up being nicknamed
"The Wall of Madness". Tons of weird out of order errors, #DONT RUN BELOW THIS
LINE WITHOUT TALKING TO JIM, etc.

------
gp7
My experience with notebooks matches the one presented here. For me the most
disappointing thing is that notebooks are JSON, and not just marked up .pys.
The cons of this decision outweigh the pros for me.

One thing though--one of the best books I've read, Trefethen's ATAP, was
written as a collection of .m files, which when run would produce a pdf of
each chapter. The .m files were filled with small formatting details that were
simply omitted from the generated book. The slides suggest something
equivalent is not possible with notebooks. That's unfortunate.

------
sgillen
I feel like I haven’t really dealt with this out of order execution madness
this guy and some others are talking about here, usually I end up with like
2-5 big cells that I use, typically just a “init everything cell” a “run
simulation/training cell”, and then some plotting cells and other “utility
cells” that I use to poke at things.

I’ve been moving functionality to modules when I can too which helps minimize
the amount of code actually in the notebooks, and I also will break up code
into different notebooks (occasionally saving/loading specific variables
between kernels) when it makes sense to. Maybe all this is helping a lot, have
you all needed notebooks with dozens of cells as this presentation mentions?

------
pablobaz
One thing not mentioned was that version controlling notebooks is horrible -
try diffing a big json mess that mixes input and output.

~~~
stadeschuldt
I would encourage everyone to use
[https://github.com/kynan/nbstripout](https://github.com/kynan/nbstripout)
before committing.
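
For readers wondering what "stripping output" amounts to, here is a minimal standard-library sketch of the idea. It is not nbstripout's actual implementation; it only assumes the nbformat 4 layout, where a notebook is a JSON object with a top-level `cells` list and code cells carry `outputs` and `execution_count`. The function name `strip_outputs` is made up for the example.

```python
import json

def strip_outputs(notebook_json):
    """Return notebook JSON text with code-cell outputs and counters cleared."""
    nb = json.loads(notebook_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results, images, tracebacks
            cell["execution_count"] = None  # drop the In[n] counter
    # Stable key order and indentation keep later diffs small.
    return json.dumps(nb, indent=1, sort_keys=True) + "\n"

if __name__ == "__main__":
    demo = {
        "nbformat": 4,
        "nbformat_minor": 5,
        "metadata": {},
        "cells": [
            {"cell_type": "code", "source": ["1 + 1"], "metadata": {},
             "execution_count": 7,
             "outputs": [{"output_type": "execute_result", "execution_count": 7,
                          "data": {"text/plain": ["2"]}, "metadata": {}}]},
        ],
    }
    print(strip_outputs(json.dumps(demo)))
```

With outputs gone, a diff of the remaining JSON mostly reflects changes to the source cells, which is what a reviewer usually wants to see.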

------
tolmasky
I’m glad these concerns are becoming more widespread, since they are somewhat
subtle and when we originally shipped RunKit many people didn’t know why we
didn’t “just write a js backend for Jupyter”. The reality is that solving
these problems is a huge engineering challenge and we spent the entire first
year of development at RunKit on unifying the “module” and “REPL”
environments. Our litmus test was that notebooks would be logical to work with
once they could literally be required by other packages as if they were just
modules with no modifications. The solution we came up with was VM-level time
traveling: if you modify a previous cell, you should rewind the entire state
of the machine (including undoing changes made to the file system, spawned
processes, etc) and “pick up from there”. In RunKit, if cell 3 deletes a file,
you can still read the file if you modify cell 2, since we snapshot the entire
computer, and thus you don’t have to “pretend” you’re modifying cell order
like with Jupyter. In this way you really do get the best of both worlds: a
notebook never has out of order cells or is displaying its contents in an
unintuitive way because it behaves “as if” you had just rerun the entire
notebook from the start on every change - _but_ with the feel and speed of
iteratively appending cells. Additionally, since this isn’t done at the
“language level”, computer-level side effects don’t become “out of sync” with
your notebook - “oops I dropped a table in the database” - don’t worry just
rerun the cell with corrected contents, it’ll always run with the same state
it started with. You can read more about this in a blog post we wrote when we
initially released (even though it reads like a direct response to slide 25 of
this presentation):
[https://blog.runkit.com/2015/09/10/time-traveling-in-node-js-notebooks/](https://blog.runkit.com/2015/09/10/time-traveling-in-node-js-notebooks/)

------
Radim
Fascinating presentation! Joel is such a meme factory :-) Has anyone done a
"rejected memes" section at the end of a technical talk before?

On topic: I applaud the effort to set things straight, but I'm afraid that
ship has sailed. Hacky "Kaggle notebook solutions" _are_ now Data Science.
That's what the term evolved to mean, and all the rest of the impedance
highlighted in this presentation follows.

If your work involves creating well-designed, well-factored, tested, reusable
ML software that is meant to be integrated, picked apart, extended and applied
over time in practice (as opposed to be submitted once to an oracle in order
to claim "SOTA"), you better come up with another term. And obviously
Notebooks are not a good fit, beyond as a tool for documentation & reporting
(which Joel correctly calls out as genuinely useful).

------
aluren
See a previous post of mine
([https://news.ycombinator.com/item?id=17840216](https://news.ycombinator.com/item?id=17840216))
for my general (frustrating) experience with notebooks.

I'll add positive notes, though:

-They're excellent for beginner tutorials that explain an abstract concept or a library's use cases

-They're very well suited for academic peer-review. It's basically a way to say: here's what we obtained, here are the exact steps we did to obtain it, you can obtain it as well if you do exactly as we did. You'd think this reproducibility requirement would be common sense but in practice you're usually thankful if the data is available, or the software is released and works.

------
InternetOfStuff
I love notebooks! They are an excellent tool, if used judiciously.

They are really practical for situations where you want to play around with (or
even outright work with) principally not code, but that code's output.

You see, I give lots of trainings.

Notebooks offer me an excellent way to mix commands, their output, and
explanations into a single document with little effort. I'm able to show my
students exactly what happens (including the literal messages), going step-by-
step.

They are wonderful to create exercises.

However, for my use case, the notebook _is_ the output (perhaps rendered as
PDF).

Rules I've adopted for my own training notebooks:

    
    
      * the first lines are to print the versions of all things I'm using, e.g. "git --version" for git trainings
      * I use "restart&run all" frequently
      * obviously, notebooks are version-controlled, including their output
      * before checking in, prove that "restart&run all" provides exactly the desired result
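
For a Python-based notebook, the "print the versions first" rule could look something like the sketch below. The helper name `report_versions` and the package list are placeholders, not something from the comment; the only library calls used are `platform.python_version()` and `importlib.metadata.version()` from the standard library (Python 3.8+).

```python
import platform
from importlib import metadata

def report_versions(packages):
    """Return one line per component: the interpreter first, then each package."""
    lines = [f"python {platform.python_version()}"]
    for name in packages:
        try:
            lines.append(f"{name} {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Record the gap instead of failing, so the cell always runs.
            lines.append(f"{name} (not installed)")
    return lines

if __name__ == "__main__":
    # Placeholder list; use whatever the notebook actually imports.
    for line in report_versions(["numpy", "pandas"]):
        print(line)
```

Because the output is checked in together with the notebook, the recorded versions travel with the results.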
    

Having said all that, I'd never use a notebook to write actual programs. It
feels weirdly impractical, to the point that I was wondering if the
presentation was actually presenting reality, or a strawman (I'm not doubting
the veracity of the description, I just had a hard time accepting it as real).

~~~
ragebol
What do you use to version-control your notebook?

~~~
InternetOfStuff
Just git.

The tool is just a detail. My point was rather that they are very much "code",
and deserve to be treated as carefully as any other.

My notebooks also get refactored on occasion.

------
a-dub
Agree 100%! The best use of notebooks seems to be making demos of very simple
things that demonstrate the idea of notebooks. My gripes:

1) It's claustrophobic! Trying to work in notebooks always felt like trying to
do a math problem with too little paper when you're used to big empty sheets.

2) Readability is weird! I want to see all the code and then see the plots,
not little crazy crunched up snippets with little crazy crunched up plots that
_might_ allow interactivity.

3) I might actually want to do something that involves looking at more than
one plot at a time! Seriously? This always struck me as ridiculous. I have
screen real estate, I want to use it! ESPECIALLY for interactive data
analysis.

4) They're not as portable as they should be! There's always some drama when
you open up someone else's notebook locally.

5) They encourage people to write weird ass code! Scientists already have a
tendency to be messy if they're not CS types, this just makes it worse.

Alternative:

vim + tmux + ipython REPL + vim/tmux slime for shipping stuff from the editor
to the REPL + matplotlib in QT mode

It's not as good as MATLAB but it gets close.

------
jules
What if notebooks re-executed all cells in order as you type? That would solve
the ordering and hidden state problem. To speed that up you could take a
snapshot of the program state at each cell and re-execute from the snapshot of
the cell preceding the cell you're modifying.

The advantage of a notebook over a repl is that the code you typed stays there
and can be re-run and modified later. Re-executing all cells in order ensures
that that actually works.

You can do away with the cell concept, and instead have some way to annotate
which lines display their output. Then the distinction between an editor and a
notebook almost disappears. An editor plus a way to annotate which lines'
outputs are displayed, plus a way to type rich text, becomes a notebook.

Even better, you could allow users to display the output of lines _inside
functions_ , and have a way to select which concrete call is actually
displayed. Sean McDirmid has already implemented such an editor. This removes
the incentive to avoid abstraction, because it allows you to display outputs
even if you put code inside a function or class. It's even better than a repl
in this regard, and it doubles as a powerful debugger that can navigate
through the execution.

The navigation works similar to an IDE's go to definition, except that when
you go to definition on a call, it sets the concrete execution context of the
call. For instance,

    
    
        function foo(x)
          y = x+2
          return 3*y
        end
    
        foo(5)
        foo(6)
    

If you click on the foo(5) call then it jumps to the definition of foo and
sets x=5 and displays the outputs of expressions you've annotated, such as y =
x+2. A similar mechanism allows you to pick the iteration of a loop. It even
works fine in the presence of lambdas, allowing you to debug through callbacks
(unlike conventional debuggers).
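
The snapshot idea from the first paragraph of this comment can be sketched in a few lines of Python. This is a toy under loud assumptions: the class `SnapshotNotebook` is invented for illustration, snapshots are plain `copy.deepcopy` copies of the namespace (so they only cover in-memory, deep-copyable values, not files, processes, or functions that close over the old namespace), and nothing here reflects how Jupyter or any existing editor works.

```python
import copy

class SnapshotNotebook:
    """Toy notebook: cells are source strings, always run top to bottom."""

    def __init__(self, cells):
        self.cells = list(cells)
        self.snapshots = []   # snapshots[i] = namespace just before cell i ran
        self.namespace = {}
        self.run_from(0)

    def run_from(self, index):
        # Resume from the state saved just before `index`, never from later state.
        namespace = copy.deepcopy(self.snapshots[index]) if index else {}
        del self.snapshots[index:]
        for source in self.cells[index:]:
            self.snapshots.append(copy.deepcopy(namespace))
            exec(source, namespace)
            # exec() adds __builtins__; drop it so snapshots hold user state only.
            namespace.pop("__builtins__", None)
        self.namespace = namespace

    def edit(self, index, source):
        """Change one cell, then re-run it and everything after it."""
        self.cells[index] = source
        self.run_from(index)

if __name__ == "__main__":
    nb = SnapshotNotebook(["x = 1", "y = x + 1", "z = y * 10"])
    nb.edit(1, "y = x + 5")   # cell 0 is not re-run; cells 1 and 2 are
    print(nb.namespace["z"])
```

Editing cell 1 restores the state as it was right after cell 0 and re-executes cells 1 and 2, so the visible results always match a clean top-to-bottom run. The cost is the memory and time spent copying state at every cell, which is where the objections about large data sets apply.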

~~~
wting
This causes issues when working with large data sets or running expensive
operations, which is an advantage of modifying a single step independently
from the rest.

~~~
reilly3000
As an example, loading up a dataset I was working with into RAM from my NVME
drive took about 9 minutes each time. Rerunning that step on each script run
is productivity-prohibitive as I am just trying to explore and clean the data.

------
tw1010
I wish we didn't have a culture of being bitterly angry in our talks.

~~~
kkitay
I agree. Found the "I know a few things" spiel distasteful, and the general
tone to be patronizing and rude. Attitudes like this are negatively impactful
and points for or against notebooks can be made without the toxicity.

~~~
coldtea
> _Found the "I know a few things" spiel distasteful_

And if he hadn't included it, all the comments would be "yeah, and who is he
to talk".

------
carlosvega
Agree with most of the things in this presentation. BUT. I generally use
Jupyter Notebooks for these things:

\- Play around with some libs and charts

\- Draw charts for my papers or whatever

\- In short: draft stuff

And, for serious stuff, of course, python in text files with modules, tests
(pytest) etc.

------
rwasher
Great presentation, does anyone know if it was recorded?

The strength that kept me coming back to notebooks was their power at
iterating on a problem but I was continuously frustrated at the difficulty of
extracting my solution / tracking it in git / collaborating with colleagues
etc. Also I didn't enjoy the editor experience from a UX point of view.

I've since started using hydrogen [1], a plugin for Atom which (via a Jupyter
kernel) seems to get what I wanted from both worlds - it's just a python file
but I get most of the notebook fun!

[1] [https://github.com/nteract/hydrogen](https://github.com/nteract/hydrogen)

------
jackgolding
Ironic this pops up today when a colleague has this scenario happen...

BA: We found this duplicate field error in our database, do you know about
this.

DS: I told you about this 2 years ago.

BA: Oh so you have a script to produce all the duplicate records?

DS: No I did this 2 years ago I don't know what version of Python or any of
the libraries I used for the notebook.

In my experience a lot of data scientists like this work alone on business
problems - if they were in a more collaborative environment they'd probably
run into the problems that Joel is discussing a lot more and be more open to
build libraries.

------
collyw
I'll be honest and admit that I have barely used a Notebook, but I can see
they have their place. They are very popular with scientists, where they are
writing experimental code, much of which will be thrown away. For stuff that
needs to be used in production then they seem like the wrong tool.

I had actually thought that they would be great for learning until I read
this.

(I am a bit disappointed that the presentation doesn't actually explain the
reason for out of sequence execution, which seems to be one of the central
complaints).

------
dangom
Notebooks may be suitable for scratching things, but not much else. The
problems noted by the OP are very serious, especially in the scientific world,
where many do not have proper software engineering skills. The real issue is
that newcomers do not know better, and so they don't realize the damage
they're inflicting upon themselves and others before it's too late
(irreproducible code, hidden states, dep. management, etc).

Even the fast.ai library, which is a wonder, has broken notebooks. For those
who try to follow the course at home, trying to run the notebooks is
frustrating, as things are out of order and so errors pop up all the time.
Jeremy is a wonderful teacher, but compare following a Fast.ai course video [1],
which uses notebooks, to following a python video from, e.g., Raymond
Hettinger [2], which uses sphinx and a shell. While the documentation style
and ugly shell don't look nearly as cool, they are so much clearer and better
structured.

Notebooks became popular because they fill one gap that was left uncovered. As
the scientific community moves away from Matlab into Python and R, reading
code - pushed, amongst others, by the popularity of Github - becomes a day-to-
day activity. Matlab scripts were easily explorable because users would load
them, set breakpoints here and there, and look at results interactively -
exactly what notebooks aim to provide. The difference is that what used to be
breakpoints now become cells, comments now turn into Markdown and figures are
inlined to add an extra layer of convenience. Yet all the awful problems of
sloppy Matlab development are now masked, marketed as something fancy and
start to pollute the Python dev. environment. Reproducibility and testing are
gone, dependency management (which is not required for Matlab) breaks down
completely, documentation is non-existent, and sharing becomes heavily
constrained.

Notebooks may be suitable for scratching things, but not much else, at least
nothing serious. Hopefully the slideshow above gets the attention it deserves.
And in all cases, kudos for the Jupyter dev team for fighting the good fight,
even with the drawbacks of their experiment.

[1]
[http://course.fast.ai/lessons/lesson3.html](http://course.fast.ai/lessons/lesson3.html)
[2]
[https://www.youtube.com/watch?v=9zinZmE3Ogk](https://www.youtube.com/watch?v=9zinZmE3Ogk)

------
enriquto
I love the notebook concept, it is the best way to show to other people pieces
of working code and their results.

The jupyter implementation of notebooks has two serious problems, however:

1\. The notebook source is not natively stored as a text file that you can
easily edit with a text editor. You are forced to use the javascript
interface.

2\. The editor component has a lot of strange quirks to the point of being
unusable.

------
Mvandenbergh
I use notebooks a lot but when I've got a piece of code that I re-use more
than once I pull it out into a module or two.

My notebooks are usually set up as:

1) a cell to load external libraries

2) a cell to load any of my own modules

3) a cell to set model parameters

4) a cell containing the functions that I use to load data (to be honest these
are not always encapsulated in functions in the early stages of analysis)

5) cells containing functions that do stages of analysis that I'm happy with
(but unique to this analysis and therefore not turned into modules)

6) a "main" cell that runs the cells in [5] and gets everything ready for

7) working cell[s] where I mess around with new analysis

While I'm working I only execute from [7] onwards unless I need to reload my
data.

This is pretty easy to turn into production if needed because when the
analysis is done, you can clean up the working cells and incorporate them into
your main function.

------
imh
Some of this is interactive development vs static development. A lot of the
IDE features and stuff you get in "static" development, you could imagine
getting into notebooks one day. The thing I don't know any good way to deal
with is cached state during development.

If I have to get data that takes 15 minutes to query, another 10 to
preprocess, and then maybe a few more steps before it's ready to put into my
algorithm and start poking around, notebooks shine. In an IDE you could (and I
often do) use cached datasets partway through the process saved to disk,
as mentioned in OP. But that's a hack too. And it still takes a minute or so
to load and process anyways if cached.

So the options are

1) Shitty practices, but I can try new changes on your data
_immediately_.

2) Better practices, wrapped in caching hacks, but I have to wait minutes
between every single change.
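
The "cached datasets saved to disk" approach from option 2 usually boils down to something like the sketch below. The helper name `cached` and the pickle format are assumptions made for illustration; it does not remove the load time the comment complains about, it only skips the expensive query and preprocessing on repeat runs.

```python
import pickle
from pathlib import Path

def cached(path, compute):
    """Load a pickled result from `path`, or compute it once and store it."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result

if __name__ == "__main__":
    import tempfile

    def slow_query():
        # Stand-in for the 15 minute query plus preprocessing.
        return {"rows": list(range(5))}

    with tempfile.TemporaryDirectory() as tmp:
        first = cached(Path(tmp) / "dataset.pkl", slow_query)   # computes and saves
        second = cached(Path(tmp) / "dataset.pkl", slow_query)  # loads from disk
        print(first == second)
```

The usual catch is invalidation: nothing here notices that the query or the preprocessing code has changed, so stale cache files have to be deleted by hand.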

------
bitL
I use notebooks to test initial model performance, add some meaningful stats,
descriptions of parts of computation performed and reasoning behind it, links
to external (arxiv/github) papers/sources with detailed method descriptions,
visualization of preliminary results of initial computation or hyperparameter
search; that is a vital piece of information I provide to my clients who can
then make an informed decision which way(s) they should move forward or abort
completely. Then a production-ready code is developed outside notebook and
productionalized including appropriate services (SaaS, serverless etc.). I
found this workflow pretty good in convincing clients as they can play with
results early in their decision process.

------
mbrumlow
I think notebooks are neat, and can be a useful tool, especially for learning.
But until the last few days I had no clue people were trying to use them for
actual software development. To me this is a mistake, and adds a lot of
tooling to what was and should be a fairly simple process of opening up a text
editor (of your choice, mine will be emacs). I am also fairly shocked there is
a conference around this idea. I feel that this is a fad and distraction :/.
It is a shame so much effort is being spent on this idea as a tool for
building software. I think its focus should be on becoming a good aid in
teaching people concepts. The last thing I want required to write software is
a web browser.

~~~
Cmerlyn
> I feel that this is a fad and distraction

LMAO. Using a notebook with executable code is literally the modern version of
Knuth's literate programming.

While I personally don't do much of it, I loved Peter Norvig's python
notebooks[1]. Go through one of them and tell me it's not a good aid in
teaching people concepts.

[1]: [https://github.com/norvig/pytudes/](https://github.com/norvig/pytudes/)

------
ruksi
Joel is definitely not alone, I agree wholeheartedly with everything mentioned
in the slides.

I'd like to highlight that Notebooks do have use in pre-production while you
are exploring a data subset and trying out your initial hypothesis. But please,
don't throw away decades of software development best practices and
reliability for mere development convenience.

We are providing machine learning tooling and our CTO published a blog post
talking about the same exact issues we keep encountering with dozens of
customers.
([https://blog.valohai.com/leveling-up-your-ml-code-from-notebook](https://blog.valohai.com/leveling-up-your-ml-code-from-notebook))

------
sireat
Notebooks are a fantastic teaching tool despite some valid criticisms in the
slides.

I teach Python to adult beginners including those with heavy Excel background.

Beginners catch on quickly that you need to run all the cells once to get
reproducible results.

If worst comes to worst you just restart the kernel.

With Anaconda I can start teaching immediately over all 3 platforms(Win,MacOS
and there are even Linux Python beginners) without worrying about differing
versions or troubleshooting installations.

My students just git clone/pull my projects and start following along and
experimenting themselves.

What would be the alternative, give them a black-on-white REPL?

That said, for personal projects that I do not intend to share I reach for my
trusty text editor (VS Code is pretty good these days).

------
anonu
For the longest time I was looking for a "highlight and run" environment for
Python. I've got Spyder IDE connected via SSH-forwarded ports to a Python
kernel on a remote server.

This avoids many of the annoyances that the slides point out about notebooks.

------
spicymaki
This was a well crafted and engaging slide deck. I enjoyed this thoroughly. I
hope the issues presented are addressed by the Jupyter team. We should always
promote good software development habits regardless of the target audience.

------
teekert
Hmm, agreed. But before you share a notebook, restart kernel and rerun all
cells, fix all errors, and then share... Solves many of his gripes.
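
One rough way to check that a notebook was saved after a "restart kernel and
rerun all cells" pass is to look at the execution counts stored in the .ipynb
JSON. The sketch below is only an illustration: the function name
`is_clean_run` is made up here, and it assumes that empty code cells carry no
execution count and can be skipped.

```python
import json


def is_clean_run(notebook_json):
    """Return True if non-empty code cells are numbered 1, 2, 3, ... top to bottom."""
    nb = json.loads(notebook_json)
    counts = [
        cell.get("execution_count")
        for cell in nb.get("cells", [])
        # "source" may be a string or a list of strings; join handles both.
        # Assumption: empty code cells have no execution count, so skip them.
        if cell.get("cell_type") == "code" and "".join(cell.get("source", [])).strip()
    ]
    return counts == list(range(1, len(counts) + 1))
```

Calling it on the contents of a saved notebook, e.g.
`is_clean_run(open("analysis.ipynb").read())` (hypothetical file name), gives a
quick yes/no before sharing. It says nothing about whether the cells ran
without errors.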

I use notebooks all the time, but I put repetitive code in a module, try to
put all functions at the start etc. It is much easier to share data science
experiments, but indeed notebooks can be used and presented in horrible ways.

So... now we write a presentation on how we can horribly abuse VSCode + an
IPython console and call it even?

------
tr0ut
This was great, bravo! While I am not necessarily the intended audience, I've
been following some Notebook development. I have always thought of it as
convoluted and pedantic. I don't mean to distract from some very solid and
well intentioned work. Just that it added a layer of complexity and turned out
not to be as portable as advertised.

------
usgroup
I only use notebooks in R (via knitr). Unless you choose to cache specific
blocks by flagging them, it re-runs the whole notebook from scratch every
time... So personally I haven't had any of these problems before. Further, the
rendering is static, so you can't really choose to run things out of order
either.

~~~
wodenokoto
If you use notebooks inside RStudio (as I imagine 99% do), you press
Ctrl+Enter inside a code block and that block, and only that block, is
executed, and the output of that block is rendered below it.

~~~
usgroup
Absolutely don’t :)

Emacs + ESS all the way.

------
riffraff
When I used notebooks the first time my mind automatically assumed the cells
would work as a sort of data-flow variable and auto-update when a previous
one changed.

Maybe a "these are potentially out of date" marker on out-of-order cells would
help with the problem.
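
A minimal sketch of such a marker, working only from the execution counts
saved in the .ipynb JSON: a code cell is flagged when some cell above it was
run more recently. The function name `potentially_stale_cells` is hypothetical,
and this is a heuristic rather than real data-flow tracking, since it does not
know which variables a cell actually reads.

```python
import json


def potentially_stale_cells(notebook_json):
    """Return indices of code cells that last ran before a cell above them did."""
    nb = json.loads(notebook_json)
    stale = []
    latest_above = 0  # highest execution count among the cells seen so far
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        count = cell.get("execution_count")
        if count is None:
            continue  # never executed, so there is no output to be out of date
        if count < latest_above:
            stale.append(index)  # an earlier cell was re-run after this one
        latest_above = max(latest_above, count)
    return stale
```

The returned indices refer to positions in the notebook's full cell list,
markdown cells included.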

------
craigmcnamara
I love notebooks, but I agree with you. I like to use notebooks to instrument
and debug modular models. Top-down design, which is what I was taught years
ago when I was learning C/C++, seems harmful when paired with out-of-order
execution.

------
drej
Previous discussion under a submission from the conference where this was
presented.
[https://news.ycombinator.com/item?id=17839188](https://news.ycombinator.com/item?id=17839188)

------
fiatjaf
Has anyone mentioned [https://observablehq.com/](https://observablehq.com/)? I
hate notebooks too, but I like this one a little.

------
phenomax
Although I am a huge fan of Jupyter, Joel made some serious points there.
First, we should think about embedding used package versions in our notebook
to enhance usability.
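
As a rough illustration of what that could look like, a first cell might print
the interpreter and package versions so they are saved with the notebook's
output. The helper name `environment_report` and the package list are
placeholders, not anything from the slides.

```python
import sys
from importlib import metadata


def environment_report(packages):
    """Return 'name==version' lines for Python and each named package."""
    lines = [f"python=={sys.version.split()[0]}"]
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"{name}: not installed")
    return lines


# Example package names only; list whatever the notebook actually imports.
print("\n".join(environment_report(["numpy", "pandas"])))
```

This records versions but does not pin them; a requirements or environment
file is still needed to recreate the setup.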

------
wodenokoto
They are not perfect, but they are still great

------
bionsystem
So, as a sysadmin who is on the verge of deploying Jupyter massively (and
supporting it), what should I say to my management?

------
kbumsik
Yes. Reproducibility is a real problem. I've seen friends who study ML
struggle to reproduce notebooks on GitHub.

------
honkycat
This person is my hero.

Just... roasted Jupyter notebooks.

------
genofon
Serious question, I don't want to attack this presentation, but in general am I
one of the few that finds memes in slides annoying?

~~~
gdfasfklshg4
I disagree. The memes give me memory cues when I think about the talk later. I
have a visual memory so maybe this is why?

Also they are lighthearted and fun and life is short so why not have some fun
even when being serious?

~~~
tw1010
They only work in that capacity if you already watch a lot of memes, so they
effectively act as a shibboleth (in-group out-group separator) for separating
people who spend a lot of time on reddit from those that don't.

~~~
gdfasfklshg4
I'm not a Reddit user nor do I keep up with memes, so I hadn't seen most of
these memes until looking at this slide deck.

------
tree_of_item
The problem here is not notebooks, it's Python. Notebooks don't have tons of
hidden state, _Python does_. Fix the language and notebooks are incredible.

------
tobyhinloopen
I also prefer a Desktop, but I can't take it with me.

