Instructors are writing their lectures as IPython notebooks, and distributing them to students, who then work through them in their JupyterHub environment.
Our most ambitious deployment so far has been setting up each student in the course with a p2.xlarge machine with CUDA and TensorFlow so they could do deep learning work for their final projects.
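The per-student provisioning just described could be sketched with boto3. Everything specific below (the AMI ID, the tag names, the roster) is a hypothetical placeholder, not a detail from the actual cloudJHub deployment:

```python
# Sketch: launch one GPU instance per enrolled student.
# AMI ID and tags are placeholders, not the real deployment's values.

def launch_params(student, ami="ami-00000000", instance_type="p2.xlarge"):
    """Build the keyword arguments for ec2.create_instances for one student."""
    return {
        "ImageId": ami,                # AMI pre-baked with CUDA + TensorFlow
        "InstanceType": instance_type, # GPU instance for deep learning work
        "MinCount": 1,
        "MaxCount": 1,
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": "student", "Value": student}],
        }],
    }

# In a real run this would be handed to boto3:
#   import boto3
#   ec2 = boto3.resource("ec2")
#   for s in roster:
#       ec2.create_instances(**launch_params(s))
print(launch_params("student01")["InstanceType"])  # p2.xlarge
```

Pre-baking the AMI (rather than installing CUDA at boot) is what keeps per-student spin-up fast and identical across the class.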
We supported 15 courses last year, and got deployment time for an implementation down to only 2-3 hours.
In conclusion, IPython good, JupyterHub good.
Edit: surfacing the link to the open source repo on GitHub https://github.com/harvard/cloudJHub
I've had many courses that were bogged down by software setup issues in college; I would rather that not be the case.
I want to say that, if it's pedagogically valuable, then it needs to be made into a small lab course (or part of the lab unit for an intro class), and taught once, in an organized manner.
And then stop letting professors hide behind this lame excuse so that they can get on with teaching the stuff that their course is actually about.
(I am actually leading a machine learning for high-schoolers camp in 2 weeks and we are using Jupyter notebooks so that all students, with heterogeneous backgrounds, will start in the same place and get to the fun stuff fast. Many will never have used Python and will not know or care about 2.7 vs 3, just to give the most high-level and basic example!)
An exception would be the example of the deep learning project work. In this case JupyterHub was utilized as an easy way to deploy a centrally managed, cost effective environment for a large class to use GPU resources without the risk of running up huge AWS costs for each student.
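One common way to keep a centrally managed setup like this cost-effective is to shut down single-user servers that sit idle, so GPU instances don't accrue charges overnight. A minimal sketch of a `jupyterhub_config.py` fragment using the jupyterhub-idle-culler service (the timeout value is an arbitrary example, and the `load_roles` section assumes a recent JupyterHub, not the 0.7/0.8-era versions mentioned elsewhere in this thread):

```python
# jupyterhub_config.py fragment (illustrative): cull servers idle > 1 hour.
c.JupyterHub.services = [
    {
        "name": "idle-culler",
        "command": [
            "python3", "-m", "jupyterhub_idle_culler",
            "--timeout=3600",  # seconds of inactivity before culling
        ],
    }
]
c.JupyterHub.load_roles = [
    {
        "name": "idle-culler",
        "scopes": [
            "list:users",
            "read:users:activity",
            "read:servers",
            "delete:servers",
        ],
        "services": ["idle-culler"],
    }
]
```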
When the number of hours is limited, it's best to skip it entirely and just provide a solid paper tutorial.
Uh, you know it's Harvard we're talking about, right?
It's almost a waste of time teaching people to setup a deep learning stack.
NVIDIA will break everything you do with every release, and any instructions you write will be outdated in weeks.
For example, the TensorFlow/CUDA/CuDNN installation changes continually because you can't install the default releases of any of them and get a working system.
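One cheap defense against that drift is to pin the one combination you have verified and fail fast if the machine doesn't match it. The version pairs below are illustrative placeholders, not an authoritative TensorFlow/CUDA/cuDNN compatibility matrix:

```python
# Illustrative pinned-combination check; the versions listed are examples
# only, not a real compatibility matrix.
KNOWN_GOOD = {
    # tensorflow: (cuda, cudnn) combinations verified by hand
    "1.4.0": ("8.0", "6"),
    "1.5.0": ("9.0", "7"),
}

def check_stack(tf_version, cuda_version, cudnn_version):
    """Return True only if this exact combination was verified."""
    return KNOWN_GOOD.get(tf_version) == (cuda_version, cudnn_version)

print(check_stack("1.5.0", "9.0", "7"))  # True: verified combination
print(check_stack("1.5.0", "8.0", "6"))  # False: wrong CUDA for this TF
```

Running a check like this at image-build time turns "mysteriously broken for a whole class" into a single failed build.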
I do understand the dilemma. I work at a K-12 and the office next to mine is where they put together the science lab kits for students. It takes a fair bit of understanding to do that correctly sometimes, and that preparation work is some knowledge the students seem to miss out on in order to get to the subject matter. My coworker has mentioned on more than one occasion that with certain modules it feels like she does most of the work and the students just do the final step.
It is true that we had to deal with some issues that might not have occurred had students gone through the process of setting up the environment themselves, like having to rebuild the machine of the student who uninstalled CUDA.
Wow! How expensive was this? Do you do any sort of shutdown/startup work, or use preemptible (spot) instances?
The average cost over all of the other courses was something like $2-3 per month per student. The deep learning course ended up being closer to $20 per student. Thanks to Amazon Educate almost the entire cost was covered with credit.
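For a rough sense of how the ~$20 figure hangs together: the hourly rate below is an assumption (roughly the on-demand p2.xlarge price at the time), not a number from the post, and actual pricing varies by region:

```python
# Back-of-the-envelope check on the ~$20/student/month figure.
# Assumes ~$0.90/hour on-demand for p2.xlarge (an approximation).
HOURLY_RATE = 0.90

def gpu_hours_for_budget(monthly_budget, hourly_rate=HOURLY_RATE):
    """Instance-hours per month that a per-student budget buys."""
    return monthly_budget / hourly_rate

hours = gpu_hours_for_budget(20.0)
print(round(hours, 1))  # 22.2 -> roughly 22 GPU-hours per student per month
```

That is consistent with students running instances only during active project work rather than around the clock, which is where the idle-shutdown question above matters.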
I wrote a tool that does that with Keras but I'm not sure if it's actually useful for real-world use cases.
I do think that Jupyter notebooks are an amazing thing for CS Education. I wish more college level classes would utilize them. It adds a nice layer of interactive experimentation to any program/assignment/project.
What I ended up using was z2jh, which is working out great so far!
We aren't yet allowing students to use GPUs or any libs that would require them, but we may look into that in the future.
Feel free to reach out to me if you would like more info.
"we had an issue with Oauth when we upgraded JupyterHub version 0.7 to version 0.8. The instance spawner we wrote needed to be updated to fix the issue. The case that opened about this in our repository* fixed with the latest update of the instance spawner"
So it seems we could be using GitHub standard OAuth now. But 95% of our implementations utilize Canvas auth reconciling with our university AD.
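Wiring JupyterHub to an LMS like Canvas is typically done through the generic OAuth2 authenticator from the oauthenticator package. A minimal sketch, assuming a Canvas host at a hypothetical `canvas.example.edu` (check your institution's Canvas API docs for the real endpoint paths and the right username field):

```python
# jupyterhub_config.py fragment (illustrative): Canvas via generic OAuth2.
# Hostname, credentials, and endpoint paths are placeholders.
from oauthenticator.generic import GenericOAuthenticator

c.JupyterHub.authenticator_class = GenericOAuthenticator
c.GenericOAuthenticator.client_id = "YOUR_CANVAS_CLIENT_ID"          # placeholder
c.GenericOAuthenticator.client_secret = "YOUR_CANVAS_CLIENT_SECRET"  # placeholder
c.GenericOAuthenticator.authorize_url = "https://canvas.example.edu/login/oauth2/auth"
c.GenericOAuthenticator.token_url = "https://canvas.example.edu/login/oauth2/token"
c.GenericOAuthenticator.userdata_url = "https://canvas.example.edu/api/v1/users/self"
# Map a Canvas user field onto the hub username so it can be reconciled
# with the university AD account; the field name here is an assumption.
c.GenericOAuthenticator.username_claim = "login_id"
```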
I know there's been some work on instructions for deploying on AWS, and on the k8s Helm charts to do so, if that can be of help. If any work could be consolidated to decrease the workload for both you and us, that would be good. Are any of you attending JupyterCon in August? In-person feedback is always welcome (Saturday, August 25th is the open, free Community Day / hackathon / sprint / open studio, and the Jupyter team will be there).