
Jupyter Receives the ACM Software System Award - williamstein
https://blog.jupyter.org/jupyter-receives-the-acm-software-system-award-d433b0dfe3a2
======
carljv
The Jupyter team deserves every accolade they get and more. The console,
notebook, and now JupyterLab are some of the key reasons why Python's data
ecosystem thrives.

I think Jupyter notebooks are quite useful as "rich display" shells. I often
use them to set up simple interactive demos or tutorials to show folks or keep
notes or scratch for myself.

That being said, I do think the "reproducibility" aspect of the notebook is
overblown for the reasons other comments cite. Notebooks are hard to version
control and diff, and are easy to "corrupt." I often see Jupyter notebooks
described as "literate programs," and I really don't think that's an apt
description. The notebook is basically the IPython shell exposed to the
browser where you can display rich output.

This is where I think the R ecosystem's approach to the problem is better (a
bit like org-mode & org-babel). For them, there is a literate program in plain
text. Code blocks can be executed interactively and results displayed inline
by a "viewer" on the document (like that provided by RStudio), but executing
code doesn't change the source code of the program, and diffs/versions are
only created by editing the source. At any point, the file can be "compiled"
or processed into a static output document like HTML or PDF.

This is essentially literate programming but with an intermediate
"interactive" feature facilitated by an external program. RMarkdown source
doesn't know it's being interacted with or executed, and you can edit it like
any other literate program.

Interaction, reproducibility, and publication have fundamental tensions with
each other. Jupyter notebooks are trying to do all three in the same
software/format, and my sense is that they're starting to strain against those
tensions.

~~~
your-nanny
I agree, 120%.

I like the r approach so much more.

~~~
carljv
I mean, as a medium for interactive exploration where you might want graphs
and widgets or other rich/dynamic output, I still think the notebook is
superior. But as a medium for developing complete, share-able, reproducible
data analyses, I do think R has the upper hand.

~~~
creddit
Graphs, widgets and other rich/dynamic output are also possible with the R
approach.

[https://rmarkdown.rstudio.com/](https://rmarkdown.rstudio.com/)

Additionally, RStudio is an incredibly powerful IDE for data analysis.

EDIT: Interestingly, however, I still use ESS
[https://ess.r-project.org/](https://ess.r-project.org/) but that's because I
love Emacs too much :D

~~~
carljv
I understand. I believe I pointed all that out in my comment above. I wasn't
saying that I find notebooks superior because they allow for rich & dynamic
output, but that I find them superior to RStudio when _all you want_ is a
quick exploratory REPL capable of rich/dynamic output. I simply find it easier
to fire up a notebook and start noodling around than to write an RMarkdown
notebook. That really only holds if I'm not overly concerned with keeping or
sharing the notebook. Otherwise, I believe RMarkdown is the better option.

I also tend to gravitate towards ESS, and probably split my R development time
between Emacs and RStudio. I've even written a very kludgy Rmd notebook mode
that uses overlays to show evaluation results from code chunks. But RStudio is
very well-designed and ESS just doesn't compare feature-wise, sadly.

------
angrygoat
So well deserved. Jupyter is critical infrastructure, helping the scientific
community address the reproducibility crisis. It's so great that it's free and
open source, and can be shared, used and contributed to by scientists all
around the world.

~~~
scaryclam
Not only that, it's great for tutorial material as well!

------
msravi
Trivia: Ju-py-ter = Julia Python Terminal

Although it's gone far beyond just Julia and Python now.

Edit: Ahurmazda is right.

...the core programming languages supported by Jupyter are Julia, Python and
R. While the name Jupyter is not a direct acronym for these languages, it nods
its head in those directions. In particular, the "y" in the middle of Jupyter
was chosen to honor our Python heritage.

[https://github.com/jupyter/design/wiki/Jupyter-Logo](https://github.com/jupyter/design/wiki/Jupyter-Logo)

~~~
mynewtb
Is it pronounced jupeeter or jupieter?

~~~
TallGuyShort
And is NumPy "Numb Pie" or does it rhyme with "lumpy"? (My apologies to
everyone who will picture something lumpy next time they're doing a bunch of
math).

~~~
divideby0829
I think it's pronounced numpy...

That said, I've always rhymed it with lumpy, haha

~~~
mkl
In conference talks, most people (including NumPy's creators) pronounce it
"numb pie".

------
sytse
Jupyter is awesome, well deserved award. Our CTO is working to integrate
JupyterLab and JupyterHub directly into GitLab.

~~~
pletnes
Getting a sensible git-diff would be great!

~~~
carreau
nbdime can help! NoteBook DIff and MErge.
[https://github.com/jupyter/nbdime](https://github.com/jupyter/nbdime)

------
kelvin0
I started using Jupyter + Python recently, can't say enough good things about
the project.

Sometimes you want to present data, graphics and have a bit of interactivity.
The notebooks make it easy to share your code/data/graphics. And it beats a
PowerPoint any day (for this use case anyway).

Thanks Jupyter Team!

------
gbrown
Am I the only person in the universe who doesn't like Jupyter? I much prefer
tools like Rmarkdown and Sweave.

~~~
jasongrout
We would love to hear what you like about Rmarkdown and Sweave. Jupyter
tooling is always improving, and we are very interested in engaging with users
about their needs, and helping grow the ecosystem to be able to address those.

~~~
nerdponx
My biggest frustrations with Jupyter are (see #4 for comments on Sweave etc):

1\. The default front-end is a weak platform for getting work done.

It's a JavaScript code editor. It will never be as good as my personal text
editor configuration. It will never be as good as an IDE like RStudio, Spyder,
or PyCharm. It's good that there are keyboard shortcuts for doing things like
adding cells, and extensions for things like folding cells and adding a table
of contents. But it still isn't terribly comfortable to use all day. Also I
personally hate doing everything in a browser. Apart from some useful notebook
extensions, there are no viable alternative front ends yet.

2\. Running a remote kernel is a pain in the ass (cat a config file then
manually tunnel 4 ports over SSH), and I can't seem to get it to work on
Windows at all.

This is an issue at my company because we do a lot of work on remote servers
that can be accessed only through SSH or JupyterHub. Individual users do not
have control over the latter, so we are stuck with the inadequate default
experience I just described above.

3\. No kernel other than IPython is mature.

IRkernel is getting there. Everything else is at best a beta-quality product.

4\. Notebooks are not a plain text file format.

Hand editing a notebook is messy. They do not play well with version control
systems and diff tools. RMarkdown and Knitr/Sweave are just preprocessors for
established plain text formats (Markdown and LaTeX with some extra syntax).
With those formats you can take advantage of a wealth of existing tooling, as
well as having the freedom to edit the file in a normal text editor without
having to rely on a special front end. Ironically having everything formatted
as JSON should make it easier to write those special front ends, but I have
not seen any good ones yet.
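
To make the version-control complaint concrete, here is a minimal sketch of one common workaround: clearing outputs and execution counts before committing, so that diffs only show changes to the source. It assumes the nbformat 4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`); the function name `strip_outputs` and the sample notebook are made up for illustration.

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with outputs and execution counts cleared.

    Assumes the nbformat 4 layout: notebook["cells"] is a list of cell dicts.
    """
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results
            cell["execution_count"] = None  # drop the In[n] counter
    return cleaned


# A tiny hand-written notebook, only for demonstration.
sample = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

# Stable key order and indentation keep the text diff small.
print(json.dumps(strip_outputs(sample), indent=1, sort_keys=True))
```

This only tames the diff noise; the file is still JSON rather than a plain text document you would want to edit by hand.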

~~~
wenc
1\. Jupyter Lab (note: NOT Jupyter Notebook) is an attempt to make the
interface more IDE-like. It's still not RStudio due to Jupyter's notebook
nature, but it's close enough for me.

I do prefer RStudio's REPL approach of being able to run code by line or by
blocks (likely inspired by MATLAB's IDE), rather than Jupyter's approach of
executing code by cell (which was inspired by Mathematica). They both let you
try stuff out easily while maintaining state, but the former is far easier to
productionize.

2\. Remote kernels over SSH aren't that hard -- I do this all the time via SSH
tunnels. I start Jupyter Lab in an SSH console (usually on a cloud-based VM),
and create a tunnel to port 8888 (the default) using my Windows SSH app
(Bitvise). 1 port. That's it.

3\. No comment - I only use the Python kernel.

4\. Correct. Notebooks do present challenges for version control.

~~~
nerdponx
_Remote kernels over SSH aren't that hard -- I do this all the time via SSH
tunnels. I start Jupyter Lab in an SSH console (usually on a cloud-based VM),
and create a tunnel to port 8888 (the default) using my Windows SSH app
(Bitvise). 1 port. That's it._

I want the opposite. I want to use a remote kernel with a local client.

~~~
wenc
Umm, yes, in my case, the kernel is running remotely on a cloud VM. My client
is a local browser (Chrome) which connects to localhost:8888, which is a
tunnel set up to connect to the remote machine on port 8888.

This lets me run computationally heavy Jupyter calculations on a beefy remote
backend in the cloud. My local browser merely talks to that backend via a
tunnel.

Here's something on the web that describes this [1] -- except with Bitvise on
Windows, you don't have to enter any SSH commands. The tunnel setup etc. is
all done via a GUI. This is a pretty standard SSH tunnel technique. You can
use this for more than just Jupyter.

[1] [http://www.vickyfu.com/2017/04/using-jupyter-notebook-remote...](http://www.vickyfu.com/2017/04/using-jupyter-notebook-remotely-in-azure-vm/)

~~~
nerdponx
Again, that's not what I mean. I want to run Jupyter (or some other front-end)
on my laptop and have it talk to a kernel running on a server. You're
describing running both Jupyter and the kernel on the server.

~~~
wenc
Oh I see now. You want to run the raw kernel with no front-end on the remote
machine and communicate with it via the 0MQ/JSON transport layer. I'm curious,
what is the advantage of doing this vs. simply running an instance of Jupyter
on a remote machine?

~~~
nerdponx
I don't necessarily want to use Jupyter as the front end. This way lets me use
e.g. PyCharm with the kernel running in a console.

BTW I managed to get it to work. I think I had missed a port the first time I
tried.
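
For anyone attempting the same setup, a rough sketch of the port-forwarding step follows. It assumes the kernel's connection file is JSON and, as I understand the format, names five ports (`shell_port`, `iopub_port`, `stdin_port`, `control_port`, `hb_port`), which would explain how one can be missed. The helper name, the port numbers and `user@server` are placeholders, not values from this thread.

```python
PORT_KEYS = ("shell_port", "iopub_port", "stdin_port", "control_port", "hb_port")


def ssh_tunnel_command(connection_info, remote):
    """Build an ssh command that forwards every kernel port to localhost.

    connection_info: dict parsed from the kernel's connection file.
    remote: ssh destination such as "user@server" (placeholder).
    """
    cmd = ["ssh", "-N"]  # -N: forward ports only, run no remote command
    for key in PORT_KEYS:
        port = connection_info[key]
        cmd += ["-L", "{0}:127.0.0.1:{0}".format(port)]
    cmd.append(remote)
    return cmd


# Made-up connection info; a real file also holds "key", "ip", "transport", etc.
example_info = {
    "shell_port": 50001,
    "iopub_port": 50002,
    "stdin_port": 50003,
    "control_port": 50004,
    "hb_port": 50005,
}

print(" ".join(ssh_tunnel_command(example_info, "user@server")))
```

With the tunnel up, a local front end can be pointed at a copy of the connection file, since the forwarded ports are the same on both ends.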

------
cosmic_ape
IPython's interactive shell is way more convenient than using the default
python command line when trying things out. But I just don't get why one would
need the whole Jupyter thing on top of that, except for maybe making
presentations.

One thing that I think is missing from IPython though is the ability to save a
given state of the interpreter, with all the variables in it, so that one
could perform a time-consuming data loading/parsing step once, and restart from
that point if some variables get messed up. Jupyter can't do that either,
afaik.

~~~
mkl
Anything graphical or interactive: maths, science, data, web, machine
learning, etc. Interactive includes interactive widgets, but also almost any
kind of exploratory programming.

Even stuff that's technically plain text is easier when you can display tables
and other formatted text. E.g. I have a tiny little notebook that generates
LaTeX code for a normal distribution table; it's a notebook because then I can
display an HTML preview in a few lines of code.

Jupyter can't save interpreter state - that seems essentially impossible
without adding state-saving and -loading code to every single library and
dependency.
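
As an illustration of the LaTeX-table-with-HTML-preview idea above, here is a small sketch (not the actual notebook). It relies on the `_repr_html_` method that the notebook's rich display looks for on the last expression of a cell; the class name `NormalTable` and the table layout are invented for this example.

```python
import math


def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


class NormalTable:
    """A small table of Phi(z) values with LaTeX and HTML renderings."""

    def __init__(self, z_values):
        self.rows = [(z, phi(z)) for z in z_values]

    def to_latex(self):
        lines = [r"\begin{tabular}{rr}", r"$z$ & $\Phi(z)$ \\", r"\hline"]
        lines += ["{:.1f} & {:.4f} \\\\".format(z, p) for z, p in self.rows]
        lines.append(r"\end{tabular}")
        return "\n".join(lines)

    def _repr_html_(self):
        # Called by the notebook front end to render the object as HTML.
        body = "".join(
            "<tr><td>{:.1f}</td><td>{:.4f}</td></tr>".format(z, p)
            for z, p in self.rows
        )
        return "<table><tr><th>z</th><th>&Phi;(z)</th></tr>" + body + "</table>"


table = NormalTable([0.0, 0.5, 1.0, 1.5, 2.0])
print(table.to_latex())
```

In a notebook, ending a cell with `table` shows the HTML preview, while `to_latex()` gives the text to paste into a document.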

~~~
cosmic_ape
>>Anything graphical or interactive

but IPython is already well integrated with matplotlib in the --pylab mode,
and is pretty interactive. That's how I use it.

re: saving state - it's technically a nightmare, I agree. I thought it might be
possible at the OS level - just dumping the whole process. There would still be
issues of what to do with open files, network connections, etc., but somehow it
seems that in many use cases that would be enough.

~~~
mkl
Yeah, I used matplotlib like that for years (at times all day every day), but
the notebook interface makes it much easier to keep track of lots of figures
and where they actually came from, so I switched to that 4-5 years ago. The
figures are also much more lasting, as they save in the notebook right next to
the code, so it's a lot easier to go back and make sense of old work,
especially with markdown cells documenting things right there too.

I think even saving state by dumping the whole process is unfeasible. What
happens if some dependency gets upgraded, e.g. for a critical security hole?
The problems seem unavoidable, so I think we're stuck.
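
A partial workaround does exist for the simpler cases: pickle whichever variables can be pickled and report the rest. The sketch below uses only the standard `pickle` module; `snapshot` and `restore` are hypothetical helper names, and it does nothing for open files, sockets, generators or library-internal state, which is exactly the hard part discussed above.

```python
import pickle


def snapshot(namespace):
    """Pickle every public, picklable variable; return (blob, skipped_names)."""
    saved, skipped = {}, []
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # ignore private/internal names
        try:
            pickle.dumps(value)
        except Exception:
            skipped.append(name)  # e.g. generators, open files, sockets
        else:
            saved[name] = value
    return pickle.dumps(saved), skipped


def restore(blob):
    """Load a snapshot back into a plain dict of variables."""
    return pickle.loads(blob)


# Demonstration with a made-up namespace.
ns = {"rows": [1, 2, 3], "stream": (x for x in range(3)), "_tmp": 0}
blob, skipped = snapshot(ns)
print(skipped)        # names that could not be saved
print(restore(blob))  # variables that survived the round trip
```

In an interactive session one could pass `globals()` as the namespace, with the caveat that a snapshot may stop loading once the libraries that define the pickled objects change.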

------
nmca
Thoroughly deserved - Jupyter is helping push science forwards.

------
smortaz
A huge congrats from Azure Notebooks which is entirely built around Jupyter.
We actually started way before that by connecting the visual studio REPL to
Jupyter and the whole experience (technically and people) has been delightful.

------
zitterbewegung
Wow, this is awesome! It got the award quickly, too; I think Jupyter hasn't
been around as long as other recipients.

~~~
carreau
As we say in the blog post, the Jupyter name has been around only since 2014,
but the work started in 2001. So 17 years is a good chunk of time!

~~~
filmor
Though the part that people mostly identify with the project, the Jupyter
Notebook, was introduced only in 2011. I remember that vividly as I was writing
my theses at that time and spending quite some time playing around with it
:)

~~~
carreau
Though the current notebook is the 6th prototype, so there was quite some
work before it was actually made "public". Hope the notebook didn't distract
you too much from your PhD!

~~~
po84
If open source didn't distract from PhDs, we wouldn't have had IPython! ;)

------
ecesena
I'm excited for this award, it's certainly well deserved.

In my heart, I wish the "data community" would show more care for security
(and, related, privacy), with deeper focus on features that simplify access
control, and guidelines on how to enforce "reasonable defaults".

I fear that Jupyter in many companies is becoming the next Jenkins, with
unconstrained access to all data vs all infra, and this will lead to more and
more incidents and leaks.

I very much hope that recognitions like this one will foster not only better
tools and support, but also best practice and security considerations.

But, back to the focus of the post, congrats on this success!

~~~
carreau
Thanks for your comment. One of our next focus areas, where we are looking for
funding and help, is making sure the right restrictions and permissions are in
place.

We will probably put that in the context of GDPR/HIPAA/FERPA and follow these
guidelines to make Jupyter "ready" for these frameworks. We can't say that
Jupyter is itself compliant, as that depends on the context in which it is
deployed, but we want to make it as easy as possible for a team of researchers
on a low budget, or a company with 1000+ users, to deploy a secure, auditable
and safe Jupyter environment.

------
Karishma1234
Well deserved. There are few pieces of software that totally amazed me when I
first used them. (I used Jupyter for the first time last week).

I was like "Did they actually achieve this?"

~~~
gnulinux
I've been using Jupyter for some time now since it was strongly
recommended/almost-required in my school's classes. One thing I think the
Jupyter team achieved is the reliability of a common interface. Like,
regardless of what my data is, white noise, music, image, matrix, human
face... I know I can easily output it and get some sane representation.

------
robohamburger
I really like jupyter and I am looking forward to jupyter lab and where it
takes computing.

It is really great for solving one-off problems or learning.

The Jupyter code itself, while verbose, is also pretty extensible. I just put
together something that lets me connect to my Spark Kubernetes pod.

I think being able to customize jupyter and add new kernels (languages) is
where it becomes really powerful and awesome.

~~~
dswalter
JupyterLab is here. I've been using it basically full-time since August, but
it recently went to beta, so it's ready for use.

The major downside is that it disables JS by default within notebooks, so if
you're using Bokeh you'll have to install the JupyterLab extension, but the
innovations around files, downloads, views on notebooks, etc. are worth the
price of admission.

------
cozzyd
Tools like Jupyter (and Mathematica) don't really match my mental model. I'm
fine using them as a REPL (in which case they're like a bloated ipython or
ROOT prompt), but as soon as I go back and change something, I get confused
about the internal state.

~~~
carreau
Internal state is often a problem. Look at projects like Stitch Fix's Nodebook
and dataflow kernels; they make things a bit easier by re-executing cells.

One of the issues is that you always have internal state as soon as you
interact with a data source or sink. If you read/write from an API, then rest
is stateful. Your file system is stateful... etc.

It's an interesting but hard problem, one we'd be happy to have more help with.

[https://github.com/dataflownb/dfkernel](https://github.com/dataflownb/dfkernel)
[https://multithreaded.stitchfix.com/blog/2017/07/26/nodebook...](https://multithreaded.stitchfix.com/blog/2017/07/26/nodebook/)

~~~
gnulinux
Problem with that is, sometimes some cell in the middle is computationally
intensive and I don't wanna run it again. Just going back and changing one
function shouldn't run the whole computation.
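
One way to get both properties is to re-run a cell only when its own source or something upstream of it has changed. The toy sketch below is not how dfkernel or Nodebook are implemented; `MiniNotebook` and its methods are invented for illustration, cells are single expressions with declared dependencies, and external state (files, APIs) is ignored entirely.

```python
import hashlib


class MiniNotebook:
    """Toy dependency-aware notebook: each cell is one Python expression."""

    def __init__(self):
        self.sources = {}   # cell name -> expression source
        self.deps = {}      # cell name -> names of upstream cells
        self.cache = {}     # cell name -> (signature, value)
        self.executed = []  # log of cells that were actually run

    def set_cell(self, name, source, deps=()):
        self.sources[name] = source
        self.deps[name] = tuple(deps)

    def _signature(self, name):
        # Hash the cell's source together with the signatures of its inputs.
        digest = hashlib.sha256(self.sources[name].encode())
        for dep in self.deps[name]:
            digest.update(self._signature(dep).encode())
        return digest.hexdigest()

    def value(self, name):
        sig = self._signature(name)
        cached = self.cache.get(name)
        if cached is not None and cached[0] == sig:
            return cached[1]  # nothing relevant changed: reuse the result
        inputs = {dep: self.value(dep) for dep in self.deps[name]}
        result = eval(self.sources[name], {}, inputs)
        self.executed.append(name)
        self.cache[name] = (sig, result)
        return result


nb = MiniNotebook()
nb.set_cell("load", "list(range(5))")            # stand-in for a slow step
nb.set_cell("total", "sum(load)", deps=["load"])
print(nb.value("total"))
nb.set_cell("total", "sum(load) * 2", deps=["load"])  # edit a downstream cell
print(nb.value("total"), nb.executed)
```

Editing `total` reuses the cached `load` result, while editing `load` would invalidate both cells.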

------
oh-kumudo
Congratulations! Jupyter is such a gem. It has become such a critical piece in
the whole data analytics/ML landscape that I can't imagine living without it.

------
vienno
This is great. Without Jupyter, I doubt I would have started learning data
science a few years ago.

------
criddell
Do you have a favorite tutorial or simple use case that helped you grok it?

------
wlll
I just started using Jupyter a couple of days ago as I'm starting to learn ML.
I'm really impressed by it. Would love something similar for Ruby.

~~~
carreau
Good news ! It exists ! It's called Jupyter ! Just install one of the non-
python kernels[1], for example the Ruby one [2], and create a new Ruby
Notebook !

1: [https://github.com/jupyter/jupyter/wiki/Jupyter-kernels](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels)
2: [https://github.com/SciRuby/iruby](https://github.com/SciRuby/iruby)

------
uptownfunk
Wish they would add variable inspection to JupyterLab!

------
monkeydust
I am a non-programmer business guy who 'excels' a lot. Been teaching myself
pandas through notebooks. Awesome combination.

------
rishabhparikh
Cannot imagine liking data science nearly as much as I do if Jupyter didn't
make it so easy to quickly test new ideas. Well-deserved.

------
pilchardbreath
The other "innovations" that ACM lists alongside were real innovations ("Unix,
TeX, S (R’s predecessor), the Web, Mosaic, Java, INGRES"); now they are handing
out awards for "copy commercial software but make it free" projects. It's funny
how copying is a bad thing in an essay but applauded in software (as long as
it has the right license).

