
Nbdev: Use Notebooks for Literate Programming - pcr910303
https://github.com/fastai/nbdev
======
foxes
I'm sorry, slight rant, but I do not think notebooks are good for a completed
piece of software. Maybe they're good for development/brainstorming, but once
you are done the code needs to be extracted into a more traditional format.

In research I have seen countless mathematica/python notebooks which have been
released into the wild, and they are all over the place. Maybe software
developers will structure it better, but I'm kind of sick of seeing an
incoherent spaghetti state. Especially since there is no strict ordering of
the evaluation of cells.

Maybe it would be nice if notebooks could break from the traditional "page of
working" format. It could display as some tree/graph which orders the
state/dependency in evaluation. Otherwise it needs an "export" button.
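
As a rough illustration of the tree/graph idea, here is a minimal sketch that derives an evaluation order from which names each cell defines and reads. The cell ids and sources are hypothetical, and the name analysis is deliberately naive (it ignores imports, functions, and cells that redefine a name); it is not how any existing notebook tool works.

```python
import ast
from graphlib import TopologicalSorter  # Python 3.9+


def names_defined_and_used(source):
    """Return (defined, used) sets of variable names for one cell."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
    # Simplification: a name the cell assigns is not counted as an input.
    return defined, used - defined


def evaluation_order(cells):
    """cells: dict of cell id -> source. Returns cell ids in dependency order."""
    info = {cid: names_defined_and_used(src) for cid, src in cells.items()}
    # Map each variable name to the cell that defines it.
    definer = {}
    for cid, (defined, _) in info.items():
        for name in defined:
            definer[name] = cid
    # Each cell depends on the cells that define the names it reads.
    graph = {
        cid: {definer[n] for n in used if n in definer and definer[n] != cid}
        for cid, (_, used) in info.items()
    }
    return list(TopologicalSorter(graph).static_order())


# Hypothetical cells, listed in a deliberately unhelpful order.
cells = {
    "plot": "result = model * scale",
    "load": "data = [1, 2, 3]",
    "fit": "model = sum(data)",
    "config": "scale = 10",
}
order = evaluation_order(cells)
print(order)
```

Whatever order the cells appear on the page, "load" comes out before "fit", and both "fit" and "config" come out before "plot".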

~~~
dr_zoidberg
What you complain about is more about the author of the notebook being
disorganized rather than the format. The format also allows that to happen,
but it's a lack of discipline on the author that creates that mess.

Many people don't care about reproducibility, or having a paper that can be
simply read and understood. They just care about getting their paper out. Part
of the blame also falls on the reviewers, who don't really evaluate this as part of
the paper (assuming they're given access and the opportunity to evaluate the
code, that is!).

I recently reviewed a paper for a conference and it was a mess in every
aspect. I wish they had included the .ipynb file. Instead, they provided
mangled (both by cropping and jpeg compression) screenshots of the notebook.
The code was an utter mess (instead of doing X_prime = X[10:100], they did
X_prime = [X[10], X[11], X[12], ..., X[98], X[99]]).
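
To make the contrast concrete, a minimal sketch with a hypothetical stand-in list for X: the slice and the element-by-element version select the same 90 items, since the upper bound of a slice is exclusive.

```python
X = list(range(200))  # hypothetical stand-in data, not from the paper

# Idiomatic: one slice, upper bound exclusive.
X_prime = X[10:100]

# What the hand-written version amounts to: every index from 10 to 99.
X_verbose = [X[i] for i in range(10, 100)]

assert X_prime == X_verbose
assert len(X_prime) == 90
```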

I gave that mess the strong reject it deserved (the rest of the paper was on
par with the code, if not worse). I'm still waiting to see if the other
reviewers will do the right thing, or they'll jump on the fad that """deep
learning""" (they used 90's machine learning methods at best) has become.

~~~
dr_zoidberg
Since I can't edit, I'll reply to myself:

The other reviewers are on the deep learning fad train. The paper has been
accepted, despite being a big pile of nonsense.

------
erikgaas
Been using this for work projects. A lot of raised eyebrows when people hear
Jupyter-first development, but the automated docs, flexibility with prose,
inline testing, out of the box pip packaging, and git integration make it well
worth it. A bit of a learning curve, but very rewarding.

~~~
smabie
I've written a lot of jupyter notebooks and honestly, emacs+org-mode is way
better. Maybe if jupyter weren't in a web browser and had vim or emacs
keybindings, it would be better. Also, I'm not sure notebooks are even a good
idea: it's very easy to get into an inconsistent state, and with no textual
source of truth it can be very difficult to get back.

Though, thinking about it, the real problem is that when I'm using a real
editor (emacs), I feel like a wizard, I know it like the back of my hand and
have any number of extensions and libraries I can use. With jupyter, I'm
always fighting something and there's no meaningful way to configure it to do
what you want. Also, the intellisense sucks. In addition, and maybe this is
silly, but I find using a web browser to write code to be distasteful.

~~~
ginko
I'm feeling the same way. I really like the idea of jupyter, but it should be
a native application, not something running in a browser. Maybe it's also me
not being used to working with notebooks, but I find it strange that I
manually have to reevaluate all following cells when I change an earlier one.
Shouldn't that just happen automatically?

I dabbled a bit with EIN[1], an emacs client for Jupyter, but it didn't work
all that well for me. In particular it didn't work well with my dark color
scheme and you still needed to run a jupyter server to connect to.

[1] [http://millejoh.github.io/emacs-ipython-notebook/](http://millejoh.github.io/emacs-ipython-notebook/)

~~~
rocqua
> I find it strange that I manually have to reevaluate all following cells
> when I change an earlier one. Shouldn't that just happen automatically?

I get where you are coming from but this would ruin a lot of my data analysis
stuff. These are cases where I have 30 minute queries in the lower cells. I
don't want those to fire every time.

What might be a nice addition is the ability to either 1) clear the output of
all those cells, or 2) mark those cells as inconsistent.
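
Option 2 could be sketched roughly as follows, assuming the notebook already knows which cells read from which (the `depends_on` map and cell names here are hypothetical). Changing a cell flags everything downstream as inconsistent without re-running anything, so the 30 minute queries only fire when asked to.

```python
from collections import deque


def stale_cells(depends_on, changed):
    """Return every cell that transitively depends on `changed`."""
    # Invert the map: cell -> cells that read from it.
    dependents = {}
    for cell, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(cell)

    # Breadth-first walk over everything downstream of the changed cell.
    stale, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in stale:
                stale.add(child)
                queue.append(child)
    return stale


# Hypothetical notebook: cell id -> cells it reads from.
depends_on = {
    "load": set(),
    "clean": {"load"},
    "slow_query": {"clean"},
    "plot": {"slow_query"},
    "notes": set(),
}
flagged = stale_cells(depends_on, "clean")
print(sorted(flagged))
```

Editing "clean" flags "slow_query" and "plot" but leaves "load" and "notes" alone.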

That being said, there are enough other foot-guns besides out-of-order
execution in jupyter notebooks. The number of times that persisted variables
defined in long-deleted cells have masked bugs is more than I'd care to admit.

~~~
smabie
> The number of times that persisted variables defined in long-deleted cells
> have masked bugs is more than I'd care to admit.

Definitely. It happens particularly often when you change a variable name (and
modify its definition) and forget to update the name for parameters further
down in the notebook. Suddenly, without any warning, you are using stale data
for your analysis, which can really throw a wrench in things. It would be nice
if changing a cell erased all the old definitions that cell had made.

And I don't think I'm the only one who doesn't have trust in their notebook
definitions: I've noticed a trend among pretty much anyone who uses them that
after they finish their analysis, they restart the kernel and rerun the entire
notebook from start to finish as they have little faith that the results in
the notebook are actually derived from the cells currently in the notebook.
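
The rename failure and the restart-and-rerun habit can both be reproduced outside Jupyter. This is a minimal sketch that stands in for a kernel with a plain dict and `exec`; the cell contents are made up for illustration.

```python
def run_cells(cells, namespace):
    """Execute cell sources in order against a shared namespace, like a kernel."""
    for source in cells:
        exec(source, namespace)


kernel = {}  # stands in for the live kernel's global namespace

# First version of the notebook.
run_cells(["prices = [10, 20, 30]", "total = sum(prices)"], kernel)

# The first cell is edited: the variable is renamed and its definition changed,
# but the second cell still refers to the old name.
edited = ["clean_prices = [10, 20]", "total = sum(prices)"]
run_cells(edited, kernel)
stale_total = kernel["total"]  # silently computed from the old `prices`

# "Restart and run all": the same cells against a fresh namespace.
try:
    run_cells(edited, {})
    fresh_error = None
except NameError as exc:
    fresh_error = exc

print(stale_total)
print(repr(fresh_error))
```

In the long-lived namespace the second cell runs without complaint on stale data; only the fresh run surfaces the NameError.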

------
spv
Can nbdev be used with other Jupyter kernels, e.g. Julia?

My main issue with notebooks is putting them under version control in git.
This seems like it could help with that.
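
For context on why notebooks diff badly: an .ipynb file is JSON that stores outputs and execution counts next to the source, so re-running a cell changes the file even when the code is identical. A minimal sketch of stripping that noise before committing, using only the standard library on a made-up notebook dict. This is an illustration of the general idea, not nbdev's own implementation.

```python
import json


def strip_outputs(notebook):
    """Return a copy of an nbformat-4 notebook dict without outputs or counts."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy via JSON
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical two-cell notebook in nbformat 4 layout.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}
cleaned = strip_outputs(notebook)
print(json.dumps(cleaned, indent=1, sort_keys=True))
```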

~~~
amirathi
Yes, nbdev can help with merge conflicts for notebooks [1]. Also check out:

- GitPlus[2] - JupyterLab extension for git version control

- ReviewNB[3] - For notebook diffs & commenting

Disclaimer: I built GitPlus & ReviewNB

[1] [https://nbdev.fast.ai/#Avoiding-and-handling-git-conflicts](https://nbdev.fast.ai/#Avoiding-and-handling-git-conflicts)

[2] [https://github.com/ReviewNB/jupyterlab-gitplus/](https://github.com/ReviewNB/jupyterlab-gitplus/)

[3] [https://www.reviewnb.com/](https://www.reviewnb.com/)

~~~
spv
reviewnb looks interesting. cool project.

------
bpesquet
Are there some example projects available somewhere? I cannot find any (apart
from nbdev itself).

