
Loving Jupyter; I use it every day and will certainly try this out. What I do miss, though, is good separation between code and data. It is a pain when someone just takes a look at your notebook and it autosaves: the code block counters reset, which alters the file, and Git reports a lot of changes.



There is a pip package called "nbstripout" which tells git to ignore notebook output (https://github.com/kynan/nbstripout). It can really help establish good practice in a project with little effort:

    pip install --upgrade nbstripout
    nbstripout --install


This so much. It is my only real complaint about notebooks. It would be so much nicer if a notebook were split into two files, like `my-notebook.input.ipynb` and `my-notebook.output.ipynb`, where `my-notebook.input.ipynb` would only contain code and be editable with any text editor, similar to a .md file (and not some verbose JSON). The output file would contain all outputs, so they could easily be separated from the input if needed.

I can see the benefits of stuffing everything into a single file, but separating would be so much better IMHO. Version control is too important to mess with. Sometimes I want input and output to be version controlled, sometimes I only want the input. By splitting I can easily do that with simple .gitignore rules.
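
To make the idea concrete, here is a rough sketch of such a split, assuming the standard nbformat 4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`). The function names `split_notebook` and `merge_notebook` are made up for illustration and are not part of Jupyter.

```python
import copy


def split_notebook(nb):
    """Split a notebook dict into (input_nb, outputs).

    input_nb is a copy with outputs and execution counts cleared.
    outputs maps the cell index (as a string, so it survives JSON)
    to the data that was removed from that code cell.
    """
    input_nb = copy.deepcopy(nb)
    outputs = {}
    for index, cell in enumerate(input_nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        outputs[str(index)] = {
            "execution_count": cell.get("execution_count"),
            "outputs": cell.get("outputs", []),
        }
        cell["outputs"] = []
        cell["execution_count"] = None
    return input_nb, outputs


def merge_notebook(input_nb, outputs):
    """Rebuild the full notebook from the two halves."""
    merged = copy.deepcopy(input_nb)
    for index, saved in outputs.items():
        cell = merged["cells"][int(index)]
        cell["execution_count"] = saved["execution_count"]
        cell["outputs"] = saved["outputs"]
    return merged
```

Each half could be written with `json.dump` to `my-notebook.input.ipynb` and `my-notebook.output.ipynb`, and a `*.output.ipynb` line in .gitignore would then keep outputs out of version control when you don't want them. Note the merge only works as long as cells have not been reordered in between, since it keys on cell position.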


The notebook server has "contents managers" which decide how notebooks get stored. It is perfectly possible to write what you request, and some users have done it: https://github.com/aaren/notedown (this one works without the outputs, but that is easy enough to add).

The other possibility is to export a notebook as an actual tree of files and folders: https://github.com/takluyver/nbexplode so that rich objects (png, svg, etc.) are independently editable.

It can be challenging to make this work well, though, because of differences between filesystems.
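
As a rough illustration of the "rich objects as separate files" idea (this is not how nbexplode itself lays things out; the function name and file naming scheme below are invented), PNG outputs could be pulled out of a notebook dict like this, assuming the nbformat 4 convention that `image/png` data is stored base64-encoded:

```python
import base64
from pathlib import Path


def export_png_outputs(nb, out_dir):
    """Write every image/png output to its own file; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for cell_index, cell in enumerate(nb.get("cells", [])):
        for out_index, output in enumerate(cell.get("outputs", [])):
            png = output.get("data", {}).get("image/png")
            if png is None:
                continue  # stream/text outputs carry no image data
            if isinstance(png, list):
                # nbformat allows multi-line strings stored as lists
                png = "".join(png)
            # Hypothetical naming scheme: cell index plus output index.
            name = f"cell{cell_index:03d}_output{out_index:03d}.png"
            path = out_dir / name
            path.write_bytes(base64.b64decode(png))
            written.append(path)
    return written
```

The names here stick to lowercase ASCII, which sidesteps some of the filesystem differences mentioned above (case sensitivity, reserved characters).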

You can even go further and tell the server to store nothing on disk but in a database, postgres for example: https://github.com/quantopian/pgcontents


> Version control is too important to mess with.

Version control is too important to be left to the content-blind tools we typically use for it. In a perfect world, there'd be a core version control engine with content-specific plug-ins.


I feel your pain. Have been using git-filters for this, specifically this repo: https://github.com/toobaz/ipynb_output_filter

It strips all notebook output from *.ipynb-files before commit.
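
For anyone curious what such a filter boils down to, here is a minimal sketch (not the code from that repo), assuming nbformat 4 JSON. Git clean filters read the file on stdin and write the cleaned version to stdout, so the core is a stream-to-stream function; `filter_stream` is a name I made up.

```python
import io
import json


def filter_stream(src, dst):
    """Read notebook JSON from src, write it to dst without outputs."""
    nb = json.load(src)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    json.dump(nb, dst, indent=1, ensure_ascii=False)
    dst.write("\n")


# Small in-memory demo; a real filter script would instead call
# filter_stream(sys.stdin, sys.stdout).
raw = json.dumps({
    "cells": [{"cell_type": "code", "source": ["1+1"], "metadata": {},
               "execution_count": 7,
               "outputs": [{"output_type": "stream", "name": "stdout",
                            "text": ["2\n"]}]}],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
})
cleaned = io.StringIO()
filter_stream(io.StringIO(raw), cleaned)
```

Hooking it up would mean registering the script as a `clean` filter via `git config filter.<name>.clean` and adding a `*.ipynb filter=<name>` line to .gitattributes; the filter name is whatever you choose.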


In JupyterLab, under the `Settings` menu, there is a setting you can toggle (Autosave Documents) if you prefer to turn off this behavior.



