$ pip install --upgrade nbstripout # install nbstripout bin
$ nbstripout --install # set up the Git filter in the current repo
Then, any .ipynb files that you check in will have their output stripped in the index (without affecting your working copy).
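To make "output stripped" concrete, here is a minimal sketch of what it amounts to at the file level. This is not nbstripout's actual code; `strip_outputs` and the tiny hand-written notebook are hypothetical, and the only assumption is the usual nbformat 4 JSON layout (code cells carry `outputs` and `execution_count`).

```python
import copy
import json


def strip_outputs(nb: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    stripped = copy.deepcopy(nb)  # leave the caller's notebook untouched
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop results, plots, tracebacks
            cell["execution_count"] = None  # drop the In[n] counter
    return stripped


# A tiny hand-written nbformat-4 style notebook, for illustration only.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 3,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

clean = strip_outputs(notebook)
print(json.dumps(clean["cells"][1], indent=1))
```

The source stays in place and only the noisy, frequently changing parts disappear, which is what keeps the diffs readable.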
(Surprised it's not mentioned in the article.)
The problem with notebooks is that they get unwieldy: you want to keep a bunch of code around that's only useful in certain cases, or the notebook just starts doing "too much".
Sure, you can factor this code out into a library/function, but there's nothing that makes that easy, and once you've made it into a library, there's nothing that helps you easily make changes to that library in a different notebook.
Jupyter has great potential to be a new kind of IDE; it just needs more resources.
Export isn’t great atm but can be combined with pweave: http://mpastell.com/pweave/docs.html
I think VSCode has something similar.
This gives another advantage of using a proper editor and its entire ecosystem.
I'm still struggling to find a setup in which cells are auto-generated (or unnecessary like in RStudio) and the autocomplete works as well as in JupyterLab. If I could reliably see all methods/submodules/inline documentation + path autocomplete quickly and for all packages, I would switch to VSCode. (There's a good chance that this is just due to me not being fully aware of what's available in VSCode.)
edit: Atom IDE (that this package links to) was deprecated last week or so by Facebook; I'm not sure what dependencies packages like the above have on atom-ide-ui.
I use cells/notebooks in Python, so I can keep my code organized and run computationally intensive things once... Is this something that is not needed in R?
Then R also has RMarkdown, which lets you have notebooks with executable cells (code chunks), and these play much nicer with version control than .ipynb files.
What I was referring to in my previous post is working with a .R file (which is plain text) in RStudio. If my cursor is on a single line which is also one statement, ctrl/cmd + enter executes that statement and shows me the output in the console or in a separate pane for plots. If the cursor is within a multi-line expression such as a plot declaration, beginning of a loop, function declaration, then the interpreter figures out that I want to run multiple lines and executes the whole loop/declares function/creates plot. Or I can also select some code and run it.
Ideally, this is the kind of behaviour that I'd like to replicate with a .py file. It's a nice interactive workflow and also solves the problems that jupyter has with version control.
Jupytext includes a bit of YAML in the header of the text file (e.g. Python/R/Julia/Markdown).
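For anyone who hasn't seen it, a script in Jupytext's "percent" format looks roughly like the sketch below. The header values are illustrative (they vary with the kernel and Jupytext version) and the cell contents are made up. The `# %%` markers are also understood as cell boundaries by editors such as VS Code and Spyder, so the same plain .py file can be run cell by cell.

```python
# ---
# jupyter:
#   jupytext:
#     text_representation:
#       extension: .py
#       format_name: percent
#   kernelspec:
#     display_name: Python 3
#     language: python
#     name: python3
# ---

# %% [markdown]
# # Example analysis
# Markdown cells are stored as comments.

# %%
values = [1, 2, 3, 4]

# %%
total = sum(values)
mean = total / len(values)
print(mean)
```

Since everything outside the code is a comment, the file is still an ordinary Python script and diffs like one.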
If you need more than that, use the plain text file source code.
Actually, just forget the Jupyter Notebooks and use good old plain text source code like the rest of the programmers.
Caching to disk is cumbersome for data that's usually junk.
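For a sense of the bookkeeping involved, a bare-bones disk cache might look like the sketch below. It uses only the standard library; `disk_cache`, `CACHE_DIR` and `slow_square` are hypothetical names, and a temp directory stands in for wherever you'd really keep the files.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location; a throwaway temp directory for this sketch.
CACHE_DIR = Path(tempfile.mkdtemp(prefix="nbcache_"))


def disk_cache(func):
    """Cache a function's return value on disk, keyed by name and arguments."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Build a stable key from the function name and its (picklable) arguments.
        key_bytes = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
        path = CACHE_DIR / f"{hashlib.sha256(key_bytes).hexdigest()}.pkl"
        if path.exists():
            with path.open("rb") as fh:
                return pickle.load(fh)  # cache hit: skip the computation
        result = func(*args, **kwargs)
        with path.open("wb") as fh:
            pickle.dump(result, fh)  # cache miss: compute, then store
        return result
    return wrapper


calls = []  # records how often the real function body runs


@disk_cache
def slow_square(n):
    calls.append(n)
    return n * n


first = slow_square(12)
second = slow_square(12)  # served from disk, body not re-run
print(first, second, len(calls))  # prints: 144 144 1
```

Even this leaves out invalidation when the function body changes and cleanup of stale files, which is a lot of ceremony for intermediate results you'll mostly throw away.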
Cells and integrated vis is such a massive leap forward that using plain old text feels like banging rocks together.
Being able to quickly check the output while iterating on an algorithm, or visualise intermediate results, is irreplaceable.
This makes the notebook just a convenient way to visualize or share with non team members.
Spyder and Rodeo don't even come close at this point. Does PyCharm allow something similar?