
Netflix open-sources Polynote, an IDE-inspired polyglot notebook - type_enthusiast
https://medium.com/netflix-techblog/open-sourcing-polynote-an-ide-inspired-polyglot-notebook-7f929d3f447
======
neves
I only find reproducible notebooks on the internet. It is really rare to get
them from coworkers. If they aren't trained as developers, it is almost
impossible. Their solution to this problem looks really efficient, and it is
simple and brilliant:

> Writing Polynote’s code interpretation from scratch allowed us to do away
> with this global, mutable state. By keeping track of the variables defined
> in each cell, Polynote constructs the input state for a given cell based on
> the cells that have run above it. Making the position of a cell important in
> its execution semantics enforces the principle of least surprise, allowing
> users to read the notebook from top to bottom. It ensures reproducibility by
> making it far more likely that running the notebook sequentially will work.

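A minimal sketch of the idea in the quoted passage, not Polynote's actual implementation: each cell's input state is built only from what the cells above it defined. The names `run_cell` and `run_notebook` are made up for illustration, and cells are assumed to be plain Python source strings.

```python
def run_cell(source, input_state):
    """Run one cell against a copy of its input state.

    Returns only the names this cell defined or rebound.
    """
    namespace = dict(input_state)        # shallow copy of the incoming state
    exec(source, namespace)
    namespace.pop("__builtins__", None)  # exec() adds this entry; drop it
    return {
        name: value
        for name, value in namespace.items()
        if name not in input_state or input_state[name] is not value
    }


def run_notebook(cells):
    """Run cells top to bottom; return the names each cell defined."""
    outputs = []
    for source in cells:
        # Input state = everything defined by the cells above, in order.
        input_state = {}
        for earlier in outputs:
            input_state.update(earlier)
        outputs.append(run_cell(source, input_state))
    return outputs


cells = ["x = 1", "y = x + 1", "z = x + y"]
print(run_notebook(cells))  # [{'x': 1}, {'y': 2}, {'z': 3}]
```

In this sketch, deleting the middle cell makes the last one fail with a `NameError`, because `y` no longer exists anywhere in the state built from the cells above.
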
~~~
type_enthusiast
Thanks for the kind feedback. It's a young project to be honest, but I'm
pretty proud of what we've done with only two contributors so far. With
community participation I think we could support many more languages pretty
quickly!

~~~
yomritoyj
I really have always wished for reproducibility. Thanks for taking up this
feature. How do you handle aliasing and references inside objects? Suppose I
have

    
    
    # Cell 1
    a = [1, 2, 3]
    b = (a, True)

    # Cell 2
    b[0][0] = 5

    # Cell 3
    print(sum(a))
    

Now if I change Cell 2 to

    
    
    # Cell 2'
    b[0][0] = 4
    

and execute, Cell 3's result becomes stale. Do you track such dependencies?
Would really love to read more about the underlying implementation.

~~~
type_enthusiast
If you mutate an object itself, we can't really track that. There's no magic
going on; you can break the state if you use mutable objects. It's less of an
issue in Scala where immutable data structures are the norm, but I can imagine
it would be disappointing in Python.

Currently it takes a shallow copy of the state output by each cell, meaning
every value is going to be a primitive value or a reference. If it's a
reference to mutable state, you're kind of on your own with respect to keeping
reproducibility. I felt like this was a good compromise between strictly
enforced reproducibility and practicality; if it turns out to be confusing we
could consider deep copying the state, or having an option to do that (I could
imagine it being pretty bad for efficiency in a lot of ML use cases, though).
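
To make the trade-off concrete, here is a rough illustration in plain Python using the standard `copy` module and the example from the question above. It is only a sketch of shallow versus deep snapshots of a cell's output state; the function name `snapshot_demo` and the dict-based state are assumptions, not how Polynote stores anything.

```python
import copy


def snapshot_demo():
    # Cell 1
    a = [1, 2, 3]
    b = (a, True)
    state_after_cell_1 = {"a": a, "b": b}

    # Two ways to snapshot the state that Cell 1 produced.
    shallow = copy.copy(state_after_cell_1)   # new dict, same list object
    deep = copy.deepcopy(state_after_cell_1)  # new dict and a new list

    # Cell 2 mutates the list through the tuple.
    b[0][0] = 5

    return (
        sum(shallow["a"]),         # the shallow snapshot sees the mutation
        sum(deep["a"]),            # the deep snapshot does not
        deep["b"][0] is deep["a"], # deepcopy keeps the a/b aliasing intact
    )


print(snapshot_demo())  # (10, 6, True)
```

The deep snapshot stays reproducible, but it copies every reachable object on every cell run, which is where the cost for large ML data would come from.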

~~~
cm2187
I am not familiar with those notebooks. What would be wrong with re-executing
all the cells below the one that changed?

~~~
steve19
That is usually a feature. The reason it's not the default every time you
change a cell is that cells can contain long-running calculations.
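
A hypothetical sketch of that trade-off: editing a cell only marks it and everything below it as stale, and the re-run happens when the user asks for it. The `Notebook` class and its methods are invented for illustration and are not taken from Jupyter or Polynote.

```python
class Notebook:
    def __init__(self, sources):
        self.sources = list(sources)
        # Namespace after each cell has run; None means "stale / not run".
        self.states = [None] * len(self.sources)

    def edit(self, index, source):
        """Change a cell and mark it and all cells below it as stale."""
        self.sources[index] = source
        for i in range(index, len(self.sources)):
            self.states[i] = None

    def run_stale(self):
        """Re-run only the stale cells, top to bottom; return their indices."""
        ran = []
        for i, source in enumerate(self.sources):
            if self.states[i] is not None:
                continue  # cached result is still valid
            # Start from a shallow copy of the state left by the cell above.
            namespace = dict(self.states[i - 1]) if i > 0 else {}
            exec(source, namespace)
            namespace.pop("__builtins__", None)  # exec() adds this entry
            self.states[i] = namespace
            ran.append(i)
        return ran


nb = Notebook(["x = 1", "y = x + 1", "z = y * 10"])
print(nb.run_stale())      # [0, 1, 2]
nb.edit(1, "y = x + 5")
print(nb.run_stale())      # [1, 2]
print(nb.states[2]["z"])   # 60
```

Cell 0 is not re-run after the edit, which is the point when an early cell holds a long calculation.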

------
aargh_aargh
According to the article, the most interesting feature compared to Jupyter is
no hidden state - if you delete a cell, the variables it set are gone. Also,
you can mix languages - you'll be able to access variables filled by previously
executed cells in another language.

Personally, I'm looking forward to trying out the SQL support. I haven't seen
an elegant solution for SQL notebooks in Jupyter, it was always second-class
via Python or some such. Or have I missed something?

~~~
capableweb
> Also, you can mix languages

Interesting. Judging by the fact that it seems to be implemented in a JVM
language and that a screenshot shows "Scala" as a supported language, I'm
guessing at least all the JVM languages are supported (personally hoping for
Clojure), but I can't seem to find a list of supported languages anywhere in
the post or on the website.

What languages are supported by Polynote?

~~~
lalaithion
Looks like just Scala and Python right now:

[https://github.com/polynote/polynote/blob/08f0751138e2991cf7...](https://github.com/polynote/polynote/blob/08f0751138e2991cf77e711344c8edcd8a976929/polynote-kernel/src/main/scala/polynote/kernel/interpreter/CoreInterpreters.scala)

~~~
rhizome
"Duoglot?"

------
vilos1611
If anyone would like a docker image, I created one today:
[https://hub.docker.com/r/greglinscheid/polynote](https://hub.docker.com/r/greglinscheid/polynote)
[https://github.com/Vilos92/polynote](https://github.com/Vilos92/polynote)

------
airstrike
I like this as a concept, but the JDK / jep requirements are a bit of a turn
off, personally... I understand they want it to speak Spark but that's not
exactly how I would imagine it worked from the name or the "polyglot notebook"
description

------
zmmmmm
While the reproducibility problem is definitely an issue, I'm not sure it's
such a big issue that I'd switch to a whole different notebook solution for
it. For most notebook scenarios, running from scratch works fine to ensure it
reproduces. Apart from this one feature, BeakerX does all the same things and
fits a lot better into the existing jupyter ecosystem.

~~~
type_enthusiast
To be clear, we're not out to supplant Jupyter. Anybody who's happy with their
Jupyter setup will likely find little value in Polynote. But it has plugged
some gaps we've had in our Scala ML research team at Netflix, so we thought
others might see some value as well.

~~~
type_enthusiast
And there are lots of teams at Netflix that are investing in Jupyter as well!
Plenty of room for both options.

------
airstrike
Somewhat off-topic, but what's with the lambda replacing the "n" letter? I'm
no expert in Greek but I thought lambda was the equivalent to the letter
"l"...

~~~
type_enthusiast
The logo was hastily designed by an amateur (me). I figured most people would
figure it out, pedantic people would complain, and we'd all have a good time
:)

We've had some better options contributed in the past couple of weeks, but as
long as we're going to change it I didn't want to rush that. So we stuck with
my questionable typographic treatment for the blog post.

(Edit: autocorrect typo)

~~~
nsgf
Atm it reads 'polilote' in Greek. You might want to replace 'λ' with 'ν'.

------
gen3
It looks like the editor this uses is Monaco, the editor in vscode, that’s
pretty cool.

~~~
type_enthusiast
It does! Monaco is one of the many awesome open source libraries that made
Polynote possible. We'll be discussing that at Scale by the Bay; check out our
talk if you're going!

------
prestonh
It seems like the tool was mainly invented to deal with the issue of hidden
state in notebooks, but I don't honestly see what the big deal is. Jupyter
notebook is a tool with hidden state being a gotcha that you can learn how to
deal with extremely quickly. I've been a Jupyter notebook user for several years so
haven't had this problem often in recent memory, but I've led workshops where
we teach users how to use the notebook. Inevitably hidden state issues come
up, but students very quickly learn that restarting the kernel is a necessary
part of the workflow and figure out when they need to do it.

------
eob
If anyone on the Polynote team is lurking: curious if this is a successor to
the great work done by NTeract, which you funded (thanks!)

That project experimented with a lot of interesting themes I see echoed here.

~~~
type_enthusiast
It's not a successor; nteract is a separate project (part of the jupyter
ecosystem) and is alive and well. Polynote was started mainly to support use
cases of our Scala-based ML engineering teams. It's a little bit apples and
oranges.

------
wodenokoto
I have a love-hate relationship with how RStudio deals with hidden state in
notebooks. If you want to export an .rmd file to pdf, you have to run the
whole notebook from start to finish in order, sorta proving that the thing is
reproducible before export (maybe there is some technical reason as well)

It's nice to know that your report actually worked, and is not showing something
odd because of a hidden state, but sometimes you just want to print the darn
thing now!

------
jonhohle
I need to resist the urge to package this as a standalone app. I don't really
like the idea of running a separate server and having an editor tied to a
browser, but wrapping everything in an app bundle with WebKit views seems like
a nice side project.

~~~
type_enthusiast
Why resist? That would be awesome!

The server use case is real, though. Users typically like to run it on a beefy
cloud machine with access to a Spark cluster.

------
xvilka
I wish someone would make a Jupyter alternative, but native, without the need
to run a web browser, CSS and a bunch of JavaScript just for a simple rendering
task. Something based on Qt, GTK, or anything native and cross-platform.

~~~
chrisjc
Excuse me if I'm not understanding your problem, but is this not doable in
Visual Studio Code and JetBrains?

[https://www.jetbrains.com/help/pycharm/jupyter-notebook-supp...](https://www.jetbrains.com/help/pycharm/jupyter-notebook-support.html)
[https://www.jetbrains.com/help/idea/jupyter-notebook-support...](https://www.jetbrains.com/help/idea/jupyter-notebook-support.html)
[https://code.visualstudio.com/docs/python/jupyter-support](https://code.visualstudio.com/docs/python/jupyter-support)

~~~
xvilka
These are not native.

------
bodhibyte
Gave this a try and it looks very promising. It would be great if GraalVM was
integrated to extend the polyglot support to JavaScript, Ruby, R, in addition
to Java, Groovy, Kotlin, and Clojure.

~~~
type_enthusiast
We didn't make graal a dependency, but we are absolutely planning to support
graal languages in a plug-in. It's early days, though.

------
kccqzy
Maybe I'm not the right audience but why would notebooks need to have no
hidden global state, and be reproducible? I personally use notebooks as a way
to jot down things I would've tried in a REPL. Notebooks aren't meant to hold
pieces of software; they are a dump of my explorations. I have a hard time
understanding some of the requirements that went into the design of Polynote.

~~~
kovrik
Because the principle of least surprise is a good thing for almost any
application?

------
whoisnnamdi
So interesting to see something like this right after vscode added jupyter
notebook support this past month, which I was excited to see given how poor
the editing experience is in standard notebooks, especially around intelligent
autocomplete.

~~~
type_enthusiast
There are lots of cool developments in IDE notebook support - IntelliJ just
dropped a plug-in for it as well. To be honest I'd be thrilled if an IDE
solution could fill all the gaps that Polynote's targeting (our work would be
done!) so I'm looking forward to seeing what develops.

------
sv123
Liking that built-in data visualization editor... That would be super cool in
any SQL IDE.

------
airstrike
Another somewhat off-topic comment: Someone please do this for stock prices /
SEC data / financial modeling, with the ability to output into PDF (or PPTX)
and you will conquer the world.

~~~
mkl
What do you mean? Jupyter notebooks can already work with any kind of data
(including fetched over the internet), and can already output to PDF.

~~~
airstrike
They output to a PDF with very limited formatting...

I mean something like these:

[https://www.sec.gov/Archives/edgar/data/826083/0001193125131...](https://www.sec.gov/Archives/edgar/data/826083/000119312513134593/d505474dex99c4.htm)

[https://www.sec.gov/Archives/edgar/data/826083/0001193125131...](https://www.sec.gov/Archives/edgar/data/826083/000119312513134621/d505474dex99c24.htm)

[https://www.sec.gov/Archives/edgar/data/826083/0001193125131...](https://www.sec.gov/Archives/edgar/data/826083/000119312513134593/d505474dex99c5.htm)

~~~
iamwil
Are these the outputs that you'd have to generate because your client has to
submit reports to the SEC? Or am I misunderstanding?

~~~
airstrike
No, these are outputs that the client pays us to generate for them. These
specific instances that I linked were from an example that was made public and
then filed with the SEC, but usually these decks remain private.

------
armagon
I was really intrigued to see a tool designed to help people learn languages.
I was thinking spoken languages, though, not coding languages.

------
wiradikusuma
Does it have some sort of version control? Or maybe the internal
representation can be cleanly stored in Git?

~~~
type_enthusiast
The notebooks are just .ipynb files (Jupyter's format, though apparently it
doesn't like our notebooks very much...). So you can certainly store them in
git. We don't have integration yet, but it's on our roadmap.

------
Endy
Good idea here. If only Netflix would move to a 100% F/OS stack without
proprietary WebDRM.

------
pela
This looks interesting

