Instructors are writing their lectures as IPython notebooks, and distributing them to students, who then work through them in their JupyterHub environment.
Our most ambitious so far has been setting up each student in the course with a p2.xlarge machine with cuda and TensorFlow so they could do deep learning work for their final projects.
We supported 15 courses last year, and got deployment time for an implementation down to only 2-3 hours.
In conclusion, IPython good, JupyterHub good.
Edit: surfacing the link to the open source repo on GitHub https://github.com/harvard/cloudJHub
I've had many courses that were bogged down by software setup issues in college; I would rather that not be the case.
I want to say that, if it's pedagogically valuable, then it needs to be made into a small lab course (or part of the lab unit for an intro class), and taught once, in an organized manner.
And then stop letting professors hide behind this lame excuse so that they can get on with teaching the stuff that their course is actually about.
(I am actually leading a machine learning for high-schoolers camp in 2 weeks and we are using Jupyter notebooks so that all students, with heterogeneous backgrounds, will start in the same place and get to the fun stuff fast. Many will never have used Python and will not know or care about 2.7 vs 3, just to give the most high-level and basic example!)
An exception would be the example of the deep learning project work. In this case JupyterHub was utilized as an easy way to deploy a centrally managed, cost effective environment for a large class to use GPU resources without the risk of running up huge AWS costs for each student.
When the number of hours is limited, it's best to skip it entirely, and just provide a solid paper tutorial.
Uh, you know it's Harvard we're talking about, right?
It's almost a waste of time teaching people to setup a deep learning stack.
NVidia will break everything you do with every release and any instructions you write will be outdated in weeks.
For example, the TensorFlow/CUDA/CuDNN installation changes continually because you can't install the default releases of any of them and get a working system.
I do understand the dilemma. I work at a K-12 and the office next to mine is where they put together the science lab kits for students. It takes a fair bit of understanding to do that correctly sometimes, and that preparation work is some knowledge the students seem to miss out on in order to get to the subject matter. My coworker has mentioned on more than one occasion that with certain modules it feels like she does most of the work and the students just do the final step.
It is true that we had to deal with some issues that might not have occurred had students gone through the process of setting up the environment themselves, like having to rebuild the machine of the student who uninstalled CUDA.
Wow! How expensive was this? Do you do any sort of shutdown/startup work or use pre-empt instances?
The average cost over all of the other courses was something like $2-3 per month per student. The deep learning course ended up being closer to $20 per student. Thanks to Amazon Educate almost the entire cost was covered with credit.
I wrote a tool that does that with Keras but I'm not sure if it's actually useful for real-world use cases.
I do think that Jupyter notebooks are an amazing thing for CS Education. I wish more college level classes would utilize them. It adds a nice layer of interactive experimentation to any program/assignment/project.
What I ended up using was z2jh, which is working out great for right now!
We aren't yet allowing students to use GPUs or any libs that would require them, but we may look into that in the future.
Feel free to reach out to me if you would like more info.
"we had an issue with OAuth when we upgraded JupyterHub from version 0.7 to version 0.8. The instance spawner we wrote needed to be updated to fix the issue. The case that was opened about this in our repository* was fixed with the latest update of the instance spawner"
So it seems we could be using GitHub standard OAuth now. But 95% of our implementations utilize Canvas auth reconciling with our university AD.
I know there's been some work on instructions for deploying on AWS, and work on the k8s Helm charts to do so, if that can be of help. If any work could be consolidated to decrease the workload for you (and us), that would be good. Are any of you attending JupyterCon in August? In-person feedback is always welcome (Saturday August 25th is an open, free Community Day / hackathon / sprint / open studio, where the Jupyter team will be there)
Another great trick: anywhere you want to debug or play around in your scripts, run `import IPython` and then `IPython.embed()`, and your program at that point drops into an IPython session with all its locals, which is nice.
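As a tiny illustration of the trick (the function and numbers here are made up):

```python
# Sketch of the embed() trick; compute_stats and its inputs are just examples.
def compute_stats(values):
    total = sum(values)
    mean = total / len(values)
    # Uncommenting the next two lines pauses execution here and opens an
    # IPython session with `values`, `total`, and `mean` all in scope:
    # import IPython
    # IPython.embed()
    return mean

print(compute_stats([1, 2, 3, 4]))  # 2.5
```

Exiting the embedded session (Ctrl-D) resumes the program where it left off.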
The keyboard shortcuts it has are superficial. My number one shortcuts in emacs are compile, jump to next error, grep/occur, index, and magic. Just up and down? Obviously you use them a lot, but the arrow keys do work fine. Beginning of line and beginning of text are both huge. But really, the screwy tab behavior kills me in Jupyter.
The one other super handy shortcut is CTRL-R for shell like reverse search. That is pretty sweet.
The syntax is a little different depending on whether you're in Python 3 or 2.7, so just Google it, but the gist of it is: you replace `sys.excepthook` with this color traceback handler, and then when your script hits an uncaught exception, you'll be presented with an ipdb colorized traceback.
You can step up and down the stack frames, embedding at any point with `from IPython import embed` then `embed()` to get a fully functional IPython REPL. Game changer.
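For reference, a common recipe for the excepthook swap described above, essentially a config fragment to put at the top of a script (argument names have shifted across IPython releases, so treat this as a sketch for IPython versions of that era):

```python
import sys
from IPython.core import ultratb

# Install IPython's colorized verbose traceback as the global excepthook;
# call_pdb=True drops into the debugger when an uncaught exception occurs.
sys.excepthook = ultratb.FormattedTB(
    mode="Verbose", color_scheme="Linux", call_pdb=True
)
```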
< https://news.ycombinator.com/item?id=17202704 >
We found this out the hard way when we tried to productionize ML code in Jupyter. We had to export to .py and add boilerplate. This works fine unless there is back and forth iteration between modeling and prod, which there invariably is; our data scientists had to make changes to the notebook and we had to redo our boilerplate, so the notebook code and production code were constantly out of sync. This could have been alleviated with automation -- but such automation is bespoke and hard to generalize.
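Since .ipynb files are plain JSON, the export step itself is easy to script; a minimal stdlib sketch (the demo notebook is made up, and real tools like `jupyter nbconvert --to script` handle magics and edge cases this ignores):

```python
import json

# Pull a notebook's code cells into a single script body, the kind of
# export step we kept redoing by hand on every iteration.
def notebook_to_script(nb_json):
    nb = json.loads(nb_json)
    cells = [
        "".join(cell["source"])
        for cell in nb.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    return "\n\n".join(cells)

# A tiny hand-built notebook for demonstration:
demo = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# prose stays out\n"]},
        {"cell_type": "code", "source": ["x = 1\n", "print(x)\n"]},
    ]
})
print(notebook_to_script(demo))
```

The hard part, as noted, isn't this mechanical step but keeping the exported script and the still-evolving notebook in sync.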
PyCharm has a Scientific Mode (similar to RStudio's IDE approach, where you are actually writing code in a text file but are able to statefully/interactively run code by pressing Ctrl-Enter on code blocks). Spyder, Matlab and a bunch of other IDEs implement this idea too.
Unlike notebooks, this is, I feel, a good middle ground between interactive exploration and having production-ready code.
Additionally, I use the following settings in my ipython_config.py file to automatically reload modules:
c.InteractiveShellApp.extensions = ['autoreload']
c.InteractiveShellApp.exec_lines = ['%autoreload 2']
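What `%autoreload 2` automates is essentially an `importlib.reload` of every module whose source changed; a stdlib sketch of the underlying mechanism (the module name and contents are made up):

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale bytecode masking the edit

# Create a throwaway module, import it, "edit" it on disk, then reload.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "mymod.py").write_text("def answer():\n    return 1\n")
sys.path.insert(0, str(tmp))

import mymod
before = mymod.answer()

(tmp / "mymod.py").write_text("def answer():\n    return 42\n")
importlib.reload(mymod)  # what %autoreload 2 does for you on each cell run
after = mymod.answer()
print(before, after)
```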
For example, at least in some previous versions, Caffe and TensorFlow make incompatible assumptions about the ability to claim all available GPU memory. So there can be situations where you first import Caffe, then later import TensorFlow with restrictions on its GPU policy. If you naively re-import the Caffe code, it can evict TensorFlow from whatever GPUs it had reclaimed, and coming up with a group of settings that reliably prevent this, across the possibly different machines where the notebook will be run, is very tricky.
This once led to a huge time sink because someone on my team created a mistaken GitHub issue claiming our TensorFlow model had a bug (since the notebook was producing an error). We spent all this time trying to reproduce it and figure out why it wasn't working, and eventually realized it was because of this hidden auto-reload setting on his specific IPython setup that caused Caffe to evict TensorFlow just for his specific usage pattern, resulting in strange errors because the TensorFlow model was no longer loaded in GPU memory.
There can be other problems too, like auto-reloading modules that have large start-up times (say if they load a very large model into memory). Sometimes you want to re-run a cell without auto-reload, even if you still want selective auto-reload functionality in other parts.
For example, I might make two shell tabs in tmux, and make one a small rectangle towards the bottom of the screen (holds my running IPython session), and a large rectangle above it (holds my Emacs where I’m editing source code).
And I might have a third shell tab somewhere that detects any time source files are changed and re-runs unit tests.
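That watcher tab can be a tiny polling script; a sketch of the change-detection core (dedicated tools like entr do this better, and the helper name here is made up):

```python
import pathlib

def changed(paths, last_seen):
    """Return True if any file's mtime moved since the last check.

    last_seen is a dict mapping path -> mtime, updated in place.
    """
    dirty = False
    for p in paths:
        mtime = pathlib.Path(p).stat().st_mtime
        if last_seen.get(p) != mtime:
            last_seen[p] = mtime
            dirty = True
    return dirty

# The watch loop would then be roughly:
#   while True:
#       if changed(sources, seen):
#           subprocess.run(["pytest", "-q"])
#       time.sleep(1)
```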
I do the tmux/vim too, but for exploratory work the experience is less well-integrated than it could be with an Rstudio-like IDE.
Rodeo was an attempt at an IDE, but development died, and now that Yhat's been acquired, there's no sign of any further development. I wish the Jupyter folks would push more in this direction (and they are with JupyterLab), but I get the sense they are really invested in the notebook paradigm.
... and switch back-and-forth between notebook mode and text editor mode....
For me, at least, pedagogic and throw-away situations aren't a tiny subset. They're most of what I do. It's exploratory work, figuring out how the data behaves, if the data behaves, where it needs to be cleaned, churning through great heaps of experiments and iterations before hitting on the ultimate plan, and putting together a presentation to help explain what I finally settled on to colleagues and stakeholders.
Only after sinking a whole lot of sweat into that process do I go on to start building anything that we intend to keep. At which point, forget Jupyter notebooks, I'm typically not even working in Python anymore for that part of the job.
This is what is typically done out there but I suggest it breaks the feedback loop between the scientist roles and the developer roles. In rapidly changing environments those feedback loops could be crucial.
It's similar to what Wall Street folks did (still do?)--quants write models in Excel/VBA and pass them over to developers who rewrite them in Java for production. There's a natural impedance mismatch, and back-and-forths are difficult.
I think a better approach would be for data scientists to write somewhat production-ready code, send it to prod (with the help of devs), get feedback from the production environment as well as get a sense of what tricks are needed for prod, and then iterate on that code. It also helps to remove the insulation between data scientists and the real world.
For me it's really down to efficiency. Writing somewhat production-ready code is more expensive and time-consuming than blithely hacking. In the early stages of a new project, I know that almost everything I'm doing will get thrown away. For the most interesting projects, there's even a decent chance that it will be a complete failure and everything gets thrown away. So, at that stage in the game, I'm inclined to say that any extra effort spent on production readiness is just a waste of time and money. Fail fast, YAGNI, etc.
I do agree that notebooks are good for writing throwaway code, but of n failed notebooks, typically there's one that we'd like to bring to production. That's typically the one notebook we'd want to be production ready.
When I say production-readiness, I don't mean actually working in production boilerplate in the first iteration (maybe in later iterations...). I mean writing the code in a way that lends itself to easy productionization through observance of certain constraints, e.g. being cognizant of environment/scoping/global state/namespace conflicts, writing model code in modular units (functions or classes depending on the use case) rather than just imperative line-by-line code, etc. These tiny disciplines are almost effortless but can lower the friction of iterating between model and production.
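As a toy illustration of that discipline (the data and function name are made up), the same cleaning logic written as an importable unit instead of loose imperative cell lines:

```python
# Instead of a cell like:
#   rows = load(...)                                  # loose lines mutating globals
#   rows = [r for r in rows if r["value"] is not None]
# write the logic as a unit that both the notebook and prod code can import:
def drop_missing(rows, key):
    """Keep only rows whose `key` field is present."""
    return [r for r in rows if r.get(key) is not None]

sample = [{"value": 1}, {"value": None}, {"value": 3}]
print(drop_missing(sample, "value"))  # [{'value': 1}, {'value': 3}]
```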
In data science work, the real proof of the pudding is in production, not in unit tests. Most people don't want to admit this, but unit testing doesn't work as well in the mathematical modeling world as it does in the software development world -- much of the time our inputs aren't discrete/enumerable, and the state-space is large or infinite. So it's really important to be able to iterate between production and modeling. If I ever need to go back to my interactive environment to experiment and change the logic, there should be an easy path to flow that back into production. Right now notebook environments don't aid in that. I've observed OTOH that IDE environments do.
The thing is though, you should be involving code reviewers even at this stage, to review both the statistical methodology you intend for your experiments, and also the source code you believe implements that methodology. (Even when working alone, but absolutely when part of a team).
Instead of seeing the notebook as a big series of scratch-pad attempts to get something right, you should be using pull requests and code review as that scratch pad.
Additionally, the functions, classes and modules you create to do the work of exploring data fidelity, cleaning pre-treatments, or parameter sweeps through sets of experiment-specific parameter bundles — all that should be written like proper, testable, well-designed code, that lives in separate libraries or packages to facilitate re-using it without reinventing the wheel or copy/pasting from some old notebook, etc.
By that point, the notebook you’d use to explore data behavior or to invoke distributed training across a bunch of parameter values would be a tiny notebook that just imports everything it needs from properly maintained helper libraries you wrote.
And the value of the notebook over the same code just living in an easy-to-review script starts to be extremely questionable.
1. Having tooling support for source controlling a notebook without all the garbage generated json causing issues.
2. Having tooling support for testing
3. Having style and review guidelines for notebooked code
In other words, treat a notebook as a first class citizen, and as a legitimate part of the workflow, and it stops feeling like a crufty, unreproducible, hacky mess.
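Item 1 is the most tractable part today; since notebooks are JSON, a stdlib sketch of what strippers like nbstripout do before commit:

```python
import json

def strip_outputs(nb_json):
    """Clear outputs and execution counts so diffs show only source changes."""
    nb = json.loads(nb_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return json.dumps(nb, indent=1, sort_keys=True)

# A made-up notebook fragment for demonstration:
demo = json.dumps({"cells": [{"cell_type": "code", "source": ["1 + 1"],
                              "execution_count": 7,
                              "outputs": [{"data": "2"}]}]})
print(strip_outputs(demo))
```

Run as a git filter or pre-commit hook, this keeps the generated garbage out of review.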
For example, notebooks inherently intermix units of display logic with units of implementation logic, but obviously these are separate concerns (in the spirit of e.g. Model-View-Controller), and you shouldn’t be writing “a module” (the notebook) that intermixes them & requires wacky coupling of display concerns for implementation questions.
(This also extends to the formatting of non-code aspects of notebooks too, which should be reviewed as a separate concern... much like how in LaTeX, what content I write is a different concern than how it renders).
Imagine the added strain on e.g. a bunch of pytest code you’ve already written for an underlying implementation library if you now require it to be used for also automatically testing display logic too.
Finally, item 3. is also tricky because the point of a notebook is presentation, so the style guide might rightfully be different. Now if you have some implementation unit (some code block, function body, whatever) it’s suddenly a debate whether it should be styled for presentation or styled according to the team’s source code guidelines .. and you’d need more linting tools that can be surgically used on subsets of the notebook, which seems needlessly complicated in comparison to just factoring out units of implementation logic into a separate helper module in the first place.
For these reasons, it is actually quite hard to treat a notebook as a first-class citizen in any sense beyond mere convention, which is useless on the automated testing and review side of the issue.
So start with the assumption that you have magic tooling (which I know exists) that allows you to ignore diffs in generated output as far as code review and committing are concerned. The implementation doesn't really matter. Then you also have magical tooling that allows you to write unit tests at the cell level, either treating a cell as a function container and testing the single function within, or treating a cell as an "hermetic" function itself, where you configure globals, run the cell in the context of those globals, and see what the globals look like afterwards.
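That "hermetic cell" idea can be sketched in a few lines (the cell source here is made up):

```python
def run_cell(cell_source, initial_globals):
    """Execute a cell's source against a prepared namespace; return the result."""
    env = dict(initial_globals)
    exec(cell_source, env)  # the cell mutates env like a notebook namespace
    return env

cell = "y = x * 2\nz = y + 1"
result = run_cell(cell, {"x": 3})
print(result["y"], result["z"])  # 6 7
```

A test harness would then assert on the returned namespace rather than on function return values.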
That's why I'm saying you need dedicated tooling support. And you're still thinking in the realm of "oh I test this normally". No no, I mean truly dedicated libraries that hook into the notebook environment itself. You might structure your tests as pytests, but under the hood they're spinning up a notebook env and doing magic.
>units of implementation logic into a separate helper module in the first place.
Well sure, you need this anyway though. Any kind of shared infrastructure should be factored out into a module, not because of the notebook environment, but because that's good abstraction independent of how you're presenting your models.
Writing a tool that suppresses output cells, infers global parameter blocks, etc., is trivial. Writing a linter with enough configurability to account for notebook presentation styling might be harder, but still straightforward.
Creating the raw tools that can do it is the easy part.
The hard part is that writing code for re-use and testability and with reasonable low-effort best practices at separating concerns and having modularity — all that is antithetical to the whole purpose of the notebook.
So why bother contorting the testing apparatus to accommodate testing something that is created with throw-away design principles from the start?
As soon as you start using the design principles from the beginning of the first prototype or first data exploration plot, then the value of putting them in notebook format goes away, and you’re better off using testing tools that were meant for testing proper modules, than to shoe-horn notebooks into testing with notebook-specific testing tools.
I’d also argue that the benefits of starting out from a craftsmanship-first approach, even in exploratory data analysis, have compounding effects: you quickly reach a state where the extra craftsmanship leads to less time spent debugging, less backtracking to understand a plotting error or diagnostic bug, and faster convergence on successful output artifacts, whether that’s a report on model accuracy or production-ready code.
Assuredly, my point is that none of these are incompatible with a notebook-like environment. You can have well crafted, well designed, good code in a notebook, and get the advantages of both craftsmanship and presentation.
Good tooling allows you to focus on the craftsmanship.
>The hard part is that writing code for re-use and testability and with reasonable low-effort best practices at separating concerns and having modularity
These are all hard normally; it's just that we have tooling that makes them somewhat less difficult. To be reductive, you're saying "well crafted software is difficult", which I agree with, "and so since we don't have the tooling to make well crafted software as easy in a notebook environment as in the environments we've used for 20-50 years now, we should not use the notebook", which I disagree with, since you can also just say "and so we should create the tooling to mature the notebook environment".
Basically, to answer your question:
>So why bother contorting the testing apparatus to accommodate testing something that is created with throw-away design principles from the start?
Don't write notebooks with throw-away design principles from the start. Treat them like mature parts of a workflow and in all likelihood they'll perform like one. Use good tooling, good design, and good craftsmanship when writing your notebooks (much as you would with any other piece of code you wrote) and they won't be created with "throw-away design principles".
Yes, if you treat notebooks like an unstructured second class citizen you'll get bad results, but that's true of any tool. So don't do that.
Essentially when you say the phrase “treat notebooks like first-class citizens” you’re baking in all kinds of statements about the design-level thinking that should be used for good craftsmanship when coding in the notebook.
This still won’t address the intrinsic mixing of concerns (especially units of display), but overall it roughly means that “treating notebooks as first class citizens” translates to “treat the notebook like a thin execution environment / IDE, but develop code in exactly the way you would in more standard settings.”
To me this falls flat because that’s not why people want to use a notebook. Generally they want to use it because it’s superficially easier to jumble all concerns into a single context and not think about coupling or separation, and just disregard testing and other best practices.
The notebook is optimized for this way of working, and I’m trying to call into question the underlying claim that it’s ever worthwhile to write code that way if there’s even the slightest need for re-use or reproducibility.
Separately, a huge chunk of this sort of notebook usage is expressly for presenting the notebook, almost like interactive slides, to other people (in which case the goals are completely antithetical to good software practices, implying implementation units should be factored out if the priority is presentation).
Basically I’m saying you’re sweeping a bunch of stuff under the rug by lumping testing, linting, tooling, and software craftsmanship all under the term “first-class citizen.”
The other reality is that notebooks aren’t first-class units, at least in Python. You can’t import a notebook like a module, unless you do a lossy export to a .py file (in which case, why weren’t you just writing the .py file to begin with and only putting units of display in the notebook that imports the .py file?) — not to mention that you’d need custom tooling instead of mature tooling to apply linting, testing, packaging, etc., like we discussed above.
But this is not unique to notebooks. You can very easily do the exact same things with raw Python files; it's just that in the ecosystem you work in, raw text files are treated more maturely.
I find notebooks very useful as a form of main method. You don't generally go importing your main methods anyway.
You shouldn't go developing all your code in notebooks, much as you shouldn't go developing all of your code in main methods that don't have classes. Shared infrastructure should be factored out no matter what.
Your complaints appear to come down to "people can apply bad software development practices in notebooks, therefore notebooks shouldn't be used". And my point is that no, you can just not apply the bad software practices, and that solves the problem too.
This is a non sequitur in the whole discussion. You can write bad code in any tool. That has no bearing on this.
Instead we should ask, "what does it require to write good code in a given tool."
In plain source files, we know the answer, with lots of theory of design, decoupling, architectural rules, refactoring etc. As well as mature tools for code review, viewing diffs easily, automated testing.
In notebooks, the answer is that you have to jump through a lot of hoops to write things in a non-notebook-way -- that is, specifically in a way where you factor things out into the text files anyway -- if you want those good patterns.
For example, you mention:
> "I find notebooks very useful as a form of main method. You don't generally go importing your main methods anyway."
I totally agree. Viewed this way, the notebook-based "main" method is just a driver of other code. Meaning, you don't do much work in the notebook at all. You factor things out into other modules, etc., and then put as little as is needed to drive the code into the notebook.
Which is what I have been saying all along. It reveals the notebook to be an anti-pattern (because that driver code doesn't need or benefit from any aspects of the notebook environment that are expressly designed to act like a messy linear script of ad hoc implementation units mixed with ad hoc display units).
I would say your suggested way of using notebooks is exactly an example of what reveals that notebooks aren't very good for the intended use cases (like people defining global variables for experiment parameters at the top, and then "running an experiment" becomes changing those values and re-running the cells of the notebook).
This is among the most commonly advertised and praised ways of using a notebook, so it's not like some extremely rare situation that only arises in a place with bad software practices. It's practically the intended use of notebooks.
That's why I'm saying they are self-defeating. Once you take an approach where you factor things out and leave the notebook to be just a simplistic driver script, it's immediately clear that driver scripts can just be scripts, not notebooks, and don't benefit from all the intended ad-hoc-ery that is a first-class, intended workflow of the notebook design.
I guess I still consider this a "notebook-y" way. It's just good practice, instead of bad practice. But still notebooky.
You seem to be using notebooky to mean sloppy and disorganized, whereas I mean it as interactive and literate. Those need not be correlated, and I think that "driver scripts" still benefit a lot from the interactivity and literateness.
Not exactly. I'm using "notebooky" to mean whatever the prominent, advertised, praised and recommended usage patterns and workflows are for a large body of notebooks and from prominent presentations using the notebook as the lingua franca, especially in circles where the notebook is claimed to be central to "reproducible science" or where the notebook is described like a software equivalent of a "lab notebook."
The way that the notebook community, from academics and prominent leaders, to people who give talks this way, on down to data science practitioners, recommends using notebooks seems to inherently result in what you call "sloppy and disorganized" code. That code is the intended type of workflow, which is why I am trying to distill out principles for why it's arguably not a good idea. (Meaning why the intended way to use notebooks is self-defeating.)
ipython3 --InteractiveShellApp.exec_lines='["from math import *"]'
[TerminalIPythonApp] CRITICAL | The 'exec_lines' trait of a TerminalIPythonApp instance must be a list, but a value of class 'str' (i.e. '["from') was specified.
$ ipython3 --InteractiveShellApp.exec_lines='["from math import *"]'
Python 3.6.5 |Anaconda, Inc.| (default, Apr 26 2018, 08:42:37)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.
In : sin(5)
# This is a bit hacky: we use the -i flag to force the interpreter into
# interactive mode after the initial commands are executed.
# The "proper" way to do this is probably to set up a PYTHONSTARTUP file
# and put the initial commands in there.
exec ipython3 --no-banner --no-confirm-exit -i -c '
from math import *
import sys, os, platform
print("== IPython %s Calculator REPL ==" % platform.python_version())'
Your "stack" does nothing for solving/including ODEs, PDEs, DAEs, Fourier analysis, numerical integration, automatic differentiation, linear equation system solvers, preconditioners, nonlinear equation system solvers, the entire field of optimization, inverse problems, statistical methods, Monte Carlo simulations, molecular dynamics, PIC methods, geometric integration, lattice quantum field theory, ab initio methods, density functional theory, finite difference/volume/element methods, lattice Boltzmann methods, boundary integral methods, mesh generation methods, error estimation, uncertainty quantification...
Those are just off the top of my head, the list goes on and on.
Just for some context:
Counter-example: I use IPython in the terminal, outside of Jupyter.
IPython remains the project maintaining the Python-specific parts of that stack. It's not deprecated, but it has been limited in scope.
Edit: The link to the file: https://hpc.nih.gov/training/handouts/171121_python_in_hpc.p...
Edit: I can read it now.
We built an online service on top of Jupyter which takes away the effort of handling JupyterHub. It would be great if we could hear some feedback in the context of this conversation. We feel that DataCabinet is better in some ways because it provides:
a. Autoscaling according to number of users
b. Sharing full containers easily between people. You can install pip/conda binaries and share with students/users.
c. Shared storage so nbgrader works seamlessly.
Here is a full comparison: https://datacabinet.info/pricing.html
Please excuse our landing page; it was just created today and we are fixing it.