
Python Data Science Handbook: Full Text in Jupyter Notebooks - TsukiZombina
https://github.com/jakevdp/PythonDataScienceHandbook
======
killjoywashere
An acquaintance once advised me to keep a context file: all the little "notes
to self", code snippets, key config elements, etc., in one file. I tried a few
times in Vim but only really got traction in Jupyter, thanks to a combination
of my org's massive Windows dependencies, which are definitely not the Jupyter
community's default (I needed to document lots of little idiosyncrasies), and
actually having interesting data in that world. What I really like about
Jupyter for this is that it's trivial to mix it all together: a link to a
handbook like this, how to decode and encode Windows environment variables,
tips on Vim, Python, pandas, plotting, etc.

And I was really struck by how a number of the headings in this handbook mapped
exactly to the headings in my context file. I suspect this will not be the
last time I click that link.

~~~
random_kris
Hmm, interesting idea. I've been trying all kinds of different tools for
documenting my work/code snippets/notes... but nothing stuck. Mind
explaining more on that? Would it be stupid to have a wiki-like system inside
Jupyter?

~~~
mark_l_watson
I really prefer small text files, preferably org-mode. But, I just made a
radical change that is so far working for me: I signed up for G Suite for my
personal domain, and manually copied over org-mode, Apple notes, etc. to Keep
Notes. I copied all purchased PDF eBooks, ACM Communications PDFs, and
important research papers to Google Drive, and copied over all old email. With
Cloud Search, I can find any of this stuff instantly.

As a programmer, there is no larger time saver than having notes for code
snippets, configuration file examples, etc.

I used to use Evernote, then I wrote a personal version of Evernote in Clojure
that worked really well for me, except everything was just on my primary
laptop. G Suite is not great from a privacy standpoint (but I can live with
it) but for me wins out for convenience - well worth $12/month.

EDIT: I used to keep JupyterLab running on a leased GPU server for machine
learning educational projects. If I still did that, as other people here have
pointed out, with the new file interface JupyterLab would be a good choice,
especially with some customization to implement a global search to find stuff
quickly in all notebooks.
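
A global search like that does not need much, since a notebook is just a JSON file with a list of cells. Below is a minimal sketch, not a JupyterLab extension: it assumes notebooks in the version 4 format (a top-level `cells` list whose entries have a `source` field), and the function name `search_notebooks` is made up for illustration.

```python
import json
from pathlib import Path


def search_notebooks(root, term):
    """Yield (path, cell_index, line) for each cell line containing term.

    Matching is case-insensitive. Files that cannot be read or parsed
    as JSON are skipped silently.
    """
    needle = term.lower()
    for path in sorted(Path(root).rglob("*.ipynb")):
        # Jupyter keeps autosave copies here; searching them gives duplicates.
        if ".ipynb_checkpoints" in path.parts:
            continue
        try:
            nb = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue
        if not isinstance(nb, dict):
            continue
        for index, cell in enumerate(nb.get("cells", [])):
            source = cell.get("source", "")
            # "source" may be stored as one string or as a list of lines.
            if isinstance(source, list):
                source = "".join(source)
            for line in source.splitlines():
                if needle in line.lower():
                    yield path, index, line.strip()


if __name__ == "__main__":
    # Hypothetical usage: search the current directory tree for "groupby".
    for path, index, line in search_notebooks(".", "groupby"):
        print(f"{path} [cell {index}]: {line}")
```

It searches markdown and code cells alike, which is probably what you want for a notes file; outputs are ignored.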

------
fulafel
Is there a way to run notebooks automatically, so you could regenerate
notebooks like this after some library code changes or dependency upgrades and
check that everything still works?

~~~
teej
Yep, Jupyter notebooks have an execution API; you can read more about it here -
[https://nbconvert.readthedocs.io/en/latest/execute_api.html](https://nbconvert.readthedocs.io/en/latest/execute_api.html).
Hosted notebooks as a service is a growing area of investment and those
services presumably use this API.
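
For the "check that everything still works" case, here is a rough sketch built on that API. It uses `nbformat.read` and nbconvert's `ExecutePreprocessor`; the helper names (`find_notebooks`, `run_notebook`, `check_all`), the 600-second timeout, and the `python3` kernel name are assumptions for illustration, so adjust them to your setup.

```python
from pathlib import Path


def find_notebooks(root):
    """Return notebooks under root, skipping Jupyter's autosave checkpoints."""
    return sorted(
        p for p in Path(root).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )


def run_notebook(path, timeout=600):
    """Execute one notebook top to bottom; return an error string or None."""
    # Imported here so that discovery above works without Jupyter installed.
    import nbformat
    from nbconvert.preprocessors import CellExecutionError, ExecutePreprocessor

    with open(path, encoding="utf-8") as f:
        nb = nbformat.read(f, as_version=4)
    # Assumption: a kernel named "python3" is installed.
    ep = ExecutePreprocessor(timeout=timeout, kernel_name="python3")
    try:
        # Run with the notebook's own folder as the working directory,
        # so relative data paths inside the notebook resolve.
        ep.preprocess(nb, {"metadata": {"path": str(Path(path).parent)}})
    except CellExecutionError as err:
        return str(err)
    return None


def check_all(root):
    """Map each failing notebook path to its error message."""
    failures = {}
    for path in find_notebooks(root):
        error = run_notebook(path)
        if error is not None:
            failures[path] = error
    return failures


if __name__ == "__main__":
    for path, error in check_all(".").items():
        print(f"FAILED {path}\n{error}\n")
```

Run it after a dependency upgrade and any notebook with a cell that raises shows up in the output. It only catches exceptions, not silently changed results.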

------
qd6pwu4
It seems to be a nice introduction to numpy, pandas, and matplotlib

------
quotz
Any reviews of this?

~~~
closed
I have a fair amount of experience with pandas, and find the notebooks very
helpful to refer to! I would say it's worth noting that his book is organized
by technology (e.g. numpy, then pandas, then plotting), which makes it feel
more like a technical reference than a walk-through of basic to advanced DS
activities.

It's also worth checking out the notebooks for Wes McKinney's data science
book. Daniel Chen doesn't have the code from his DS book on GitHub, but does
have some useful notebooks he uses for workshops.

[https://github.com/wesm/pydata-book](https://github.com/wesm/pydata-book)

[https://github.com/chendaniely/pandas_for_everyone/tree/mast...](https://github.com/chendaniely/pandas_for_everyone/tree/master/training)

~~~
FranzFerdiNaN
An incredibly critical review of McKinney's book can be found here:
[https://medium.com/dunder-data/python-for-data-analysis-a-cr...](https://medium.com/dunder-data/python-for-data-analysis-a-critical-line-by-line-review-5d5678a4c203)

~~~
closed
Ah, thanks for pointing that out. I mostly agree with his posts (and his
minimally sufficient pandas is a great one!), and it's definitely worth
reading. A common quirk of a lot of the Python DS books is that they read like
"reference manuals".

(I'm a little concerned with the aggressive way he's come at Wes McKinney in
posts and on twitter, considering Wes has given a lot of his time working on
open source contributions)

------
pooya13
When an open source book has 150 open PRs and the last commit is from 4 months
ago, I am discouraged from spending time on it.

~~~
gojomo
Or maybe you _should_ spend time on it, by creating a fork with all the good
PRs applied?

~~~
pooya13
I would probably do that if the time investment were worth it, for example if
it were something I used on a day-to-day basis, but not for
leisurely/exploratory reading.

~~~
contrast
Even for leisurely reading, you expect an author to still be updating a book
several years after it was published? What experience has led you to believe
that's a reasonable expectation?

~~~
pooya13
I don’t “expect” the authors to do anything. But I am not going to spend many
hours of my time reading a book when I see that the book is not maintained
because there are many great books on my backlog that ARE being maintained by
the authors/community.

~~~
altairiumblue
All of my books are in pdf and receive zero maintenance - many of them are
still incredibly useful.

~~~
pooya13
I did not make a blanket statement about the usefulness of outdated books. But
certain topics do get out of date pretty quickly, and an unmaintained book on
them is less valuable than a maintained one.

------
nicholast
I also recommend checking out the open source Automunge tool for automated
data wrangling at automunge.com

~~~
VvR-Ox
Is this an advertisement?

Some ideas / questions:

- The documentation on GH is unreadable like this
- On GH it says "Patent Pending" so is this not open source after that or is
  that phrase just a joke?
- How is it related to the mentioned Data Science Handbook?

