Does anyone know how to set up a multi-user notebook? Because of the nature of th...

yuvipanda · on April 4, 2017

Disclaimer: I work on the hub.

Sorry to hear you found the hub too complex. We're working on making easier-to-use hub setups that fit different use cases. Can you tell us a little more about what your use case was and (optionally) which parts of the hub setup you found too complex?

Thanks!

sandGorgon · on April 4, 2017

Thanks for replying.

So it's a bunch of different technologies: nodejs, etc. I'm kinda wondering if it can be built in Python itself. Make it part of a normal Jupyter install, so that just a "jupyter hub start" will work?

EDIT: adding to that, you have built a nodejs-based HTTP proxy. Can you not build it in Python (for uniformity) or on nginx (for performance as well as mind share)? Do you even need to mandate an HTTP proxy?

My second question: can it run in a multiprocess mode? I don't want to run it in interactive mode, but just straight top to bottom. Perhaps there are huge memory savings there.

carreau · on April 4, 2017

I think yuvi wrote the CHP (configurable-http-proxy) on nginx (https://github.com/yuvipanda/jupyterhub-nginx-chp). When we first wrote CHP, node was the only viable solution for a dynamic websocket proxy. Nowadays Go, or Python 3 with asyncio, may be potential contenders. It may be possible to rewrite it in Python, but time is limited. I'm unsure about your second question: run a notebook top to bottom? `jupyter nbconvert --to notebook --execute --inplace yournotebook.ipynb`?
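
A minimal sketch of scripting that top-to-bottom run from Python, assuming nbconvert is installed and invoked as `jupyter nbconvert` on `PATH`. The helper names `build_execute_command` and `execute_notebook` are hypothetical, made up for illustration; only the flags come from the command above.

```python
import subprocess
from typing import List


def build_execute_command(notebook_path: str) -> List[str]:
    """Return the argv for a non-interactive, top-to-bottom notebook run."""
    return [
        "jupyter", "nbconvert",  # assumption: the `jupyter` launcher is on PATH
        "--to", "notebook",      # keep the output as a notebook
        "--execute",             # run every cell in order
        "--inplace",             # overwrite the input file with the executed result
        notebook_path,
    ]


def execute_notebook(notebook_path: str) -> None:
    """Run the notebook; raises CalledProcessError if the command fails."""
    subprocess.run(build_execute_command(notebook_path), check=True)
```

Note that each such run still starts its own kernel for the duration of the execution, so this alone does not share one kernel between users.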

sandGorgon · on April 5, 2017

Well the second question was also related to multi-user deployment. From what I understand, jupyterhub will spawn multiple kernels every time someone logs in. But a lot of the time (most of the time?), you don't intend people logging into your jupyter notebook to be doing interactive stuff - maybe they just want to run the whole thing as a dashboard.

So it becomes a traditional webapp use case. Do you need all the proxy/websocket stuff to do this? Your nbconvert command still needs every user to spawn their own kernel, right?

About the first part - it would be great to have a simpler jupyterhub. One of the steps is to have everything in Python.

yuvipanda · on April 5, 2017

JupyterHub isn't really set up to do a 'dashboard' style web application; it is purely intended for interactive use. The design choices made reflect this.

There's ongoing work on formalizing the proxy better (https://github.com/jupyterhub/jupyterhub/issues/848) - someone will probably write a pure python proxy when that gets merged :)
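
To make "dynamic proxy" concrete, here is a rough sketch of the core idea: a routing table that maps URL path prefixes to backend servers and can be changed while the proxy is running. This is not CHP's actual implementation or API; the class `RoutingTable`, its methods, and the example paths and ports are all hypothetical.

```python
from typing import Dict


class RoutingTable:
    """Maps URL path prefixes to backend targets; routes can change at runtime."""

    def __init__(self, default_target: str) -> None:
        self._default = default_target      # used when no prefix matches
        self._routes: Dict[str, str] = {}

    def add_route(self, prefix: str, target: str) -> None:
        # Normalize so "/user/alice/" and "/user/alice" are the same route.
        self._routes[prefix.rstrip("/")] = target

    def remove_route(self, prefix: str) -> None:
        self._routes.pop(prefix.rstrip("/"), None)

    def resolve(self, path: str) -> str:
        """Return the target for the longest prefix that matches the path."""
        best = None
        for prefix in self._routes:
            # Match whole path segments only: "/user/al" must not match "/user/alice".
            if path == prefix or path.startswith(prefix + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        return self._default if best is None else self._routes[best]


# Hypothetical usage: a route is added when a user's server starts.
table = RoutingTable(default_target="http://127.0.0.1:8081")
table.add_route("/user/alice", "http://127.0.0.1:50001")
```

A real proxy would additionally forward HTTP and websocket traffic to the resolved target; this sketch only covers the lookup.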

sandGorgon · on April 5, 2017

I just wanted to make sure you guys were aware that it is a large component of the use case. The very typical cycle of prototyping in Jupyter and then rewriting in production code is shortened significantly by doing this.

All the tools already exist in jupyter - except one: lightweight multiuser. I would argue that building this is going to be a fairly trivial thing for you guys (as compared to other features you build), but the end user benefit is immense.

Jupyter becomes much more than an interactive scratchpad: it becomes a full-blown prototyping environment for data science and reporting. I would say you could even go up against Tableau in a lot of use cases.

Please do think about it. My company will be happy to contribute to a gofundme on this.

carreau · on April 5, 2017

Replying to this plus the comment 2 up. The hub spawns _servers_, not kernels. There are a lot of indirection layers, and indeed, being able to _view_ a notebook without starting a kernel is on the todo list. Multi-user collaboration is in progress; it's more complicated than it looks. One of the issues is that while this is a "solved" [with many quotes] problem for static documents, as soon as you have code execution it becomes really tricky. The kernel needs to run as someone, but who? The owner of the document? What permissions do you give, to whom, and how? There are some cases where there are possible answers, but they are really hard to tackle in a generic way across programming languages and various kinds of deployments. Ian has an already well-advanced JupyterLab prototype that you can connect to Google Drive for live editing.

If your company is interested in funding something like that, feel free to write to any of us privately (git log and grep to find emails), and we can likely set up a contract with NumFOCUS (the non-profit that handles our funds). The advantage is that it will be tax deductible for your company (unlike most gofundme campaigns).

sandGorgon · on April 5, 2017

I wish we had the kind of money to fund the full development. I mentioned gofundme because a small startup in India will not be able to do that, even though we want to. But I have a feeling that a lot of us will want to as well.

I'm talking specifically about the use case of code execution (especially dashboards).

Here's a small point from me: perhaps you are overcomplicating the use case for 90% of us. Give us a proper SSL/TLS + bcrypt-hashed password setup and two roles: Editor and User.

I don't think you need to worry here about people wanting to run a full-on SageMathCloud kind of thing.

If you can give me a low-resource way of letting 100 "Users" on a dashboard form and one "Editor" (who can actually edit the underlying notebook), I'm golden. And I'm willing to bet that so will 90% of your audience.
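
A rough sketch of what that two-role model could look like, purely as an illustration and not an existing Jupyter feature. The names `Role`, `PERMISSIONS`, `hash_password`, `verify_password`, and `is_allowed` are hypothetical, and so is the choice of which actions each role gets. To stay within the standard library it uses PBKDF2 from `hashlib` as a stand-in for bcrypt, which would need a third-party package.

```python
import hashlib
import hmac
import os
from enum import Enum
from typing import Optional, Tuple


class Role(Enum):
    EDITOR = "editor"  # may change the underlying notebook
    USER = "user"      # may only view and run the dashboard


# Assumed permission sets; the thread only names the two roles.
PERMISSIONS = {
    Role.EDITOR: {"view", "run", "edit"},
    Role.USER: {"view", "run"},
}


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). PBKDF2 is used here instead of bcrypt."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


def is_allowed(role: Role, action: str) -> bool:
    """Check whether a role may perform an action such as "edit"."""
    return action in PERMISSIONS[role]
```

Transport security (SSL/TLS) is deliberately left out; it would be handled by the web server in front of this logic.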

acosmism · on April 5, 2017

Check out http://gryd.us; it looks somewhat like what you're describing.