
Streamlit: Turn a Python script into an interactive data analysis tool - danicgross
https://towardsdatascience.com/coding-ml-tools-like-you-code-ml-models-ddba3357eace
======
adenverd
This looks really slick, can't wait to try it out!

If anyone is curious about other tools in the same space, our data scientists
use Dash[1] and plotly to build interactive exploration and visualization
apps. We set up a Git repo that deploys their apps internally with every merge
to master, so they're actually building and updating tools that our
operations, marketing, etc teams use every day.

[1] [https://plot.ly/dash/](https://plot.ly/dash/)

~~~
amrrs
Dash is awesome. I've been using Shiny in R for a similar purpose. Do you have
a blog post or some more details about the deployment process and your use
case for Dash?

------
random42
Interesting project, but why does an open source developer tool need browser
telemetry?

You should ask for telemetry permission _before_ the process starts up (as
you do for the email address), and keep the default as "No", instead of
silently sending the data unless the user takes non-user-friendly steps to
opt out.

------
iandanforth
This is a declarative programming model similar to React. I'm surprised the
analogy isn't drawn in the article.

"Streamlit assigns each variable an up-to-date value given widget states."

This line is interesting because it implies distributed state in each
component (widget). Alternatively this could be framed in centralized state
manager terminology.

"Each widget is provided with the current state of the application, and that
state is also available to your script."

If you adopt this mindset you can separate the concerns of state and
presentation. At first glance it appears that you need to extract state from
widgets at the same point as they are added to the page.

(Please correct me if I'm wrong.)

I might not want to have a widget added to a page until much later in the
script, but I want to have access to its state at the top of the script.

The value of the top level `props` parameter to a react component is it gives
you access to all state wherever you need it, and disentangles this state from
the arrangement of the page.
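
A toy sketch of the centralized-state framing above, in plain Python. Everything here (`App`, `value`, `slider`, `run`) is a hypothetical name made up for illustration and is not Streamlit's API; it only shows how a script could read a widget's value at the top and place the widget on the page much later, with the whole script rerun against the current state.

```python
# Toy model only: these names are hypothetical and are not Streamlit's API.

class App:
    """Holds widget state centrally and collects what the script renders."""

    def __init__(self, state=None):
        self.state = dict(state or {})   # widget key -> current value
        self.page = []                   # rendered elements, in layout order

    def value(self, key, default):
        # Read a widget's value anywhere in the script, before it is placed.
        return self.state.setdefault(key, default)

    def slider(self, key, default):
        # Place the widget on the page; its value comes from the same state.
        current = self.value(key, default)
        self.page.append(f"slider[{key}]={current}")
        return current

    def write(self, text):
        self.page.append(str(text))


def script(app):
    # State is available at the top of the script...
    n = app.value("n", 3)
    app.write(f"squares: {[i * i for i in range(n)]}")
    # ...while the widget itself is laid out much later.
    app.slider("n", 3)


def run(state=None):
    """Rerun the whole script from the top against the given widget state."""
    app = App(state)
    script(app)
    return app.page
```

Calling `run()` and then `run({"n": 2})` models two reruns: the layout order stays the same, only the state fed into the script changes.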

~~~
adrien-treuille
(co-founder of Streamlit here)

Ian:

Thanks for that comment. You're exactly right: Streamlit adapts a React-like
model. In fact, the connection goes deeper than the post describes. For
example, to make it efficient to run the same script repeatedly, Streamlit
does packet-level deduplication. If you generate a lot of data and send it to
the browser, only small deltas need be sent to update the UI.

We have a list of future blog posts we hope to write and one of them is
(cheekily) called "Streamlit is React for Python." ;) (Not quite true, more of
an imperfect analogy!)

So it made me really happy to see someone drawing that analogy already. Thank
you. :)
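
For anyone wondering what "only small deltas need be sent" could look like, here is a minimal sketch in plain Python. It is an assumption-laden illustration, not Streamlit's actual wire protocol: the page is modelled as a list of JSON-serialisable elements, and `fingerprint`, `compute_deltas` and `apply_deltas` are hypothetical helpers.

```python
import hashlib
import json


def fingerprint(element):
    """Stable hash of one rendered element (any JSON-serialisable value)."""
    payload = json.dumps(element, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def compute_deltas(previous, current):
    """Return the (index, element) pairs that changed, plus the new page length."""
    deltas = []
    for index, element in enumerate(current):
        unchanged = (
            index < len(previous)
            and fingerprint(previous[index]) == fingerprint(element)
        )
        if not unchanged:
            deltas.append((index, element))
    return deltas, len(current)


def apply_deltas(previous, deltas, new_length):
    """What the receiving side would do: patch its own copy of the page."""
    page = list(previous[:new_length])
    page.extend([None] * (new_length - len(page)))
    for index, element in deltas:
        page[index] = element
    return page
```

With a page made of a title, a large table and a one-line text element, rerunning the script with only the text changed yields a single delta; the large table is never resent.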

~~~
iandanforth
FWIW I did some things in this area a while back and ended up needing to
stream canvas state back to my Python scripts for image processing.

[https://gist.github.com/iandanforth/0ed987bfddf8205b8a23](https://gist.github.com/iandanforth/0ed987bfddf8205b8a23)

I hope that could be a part of this framework in the future! (If it isn't
already)

~~~
adrien-treuille
This is really cool. We've been thinking about these issues to create an image
masking widget in Streamlit. Would be interesting to connect when we start
down that path. If you're interested, please feel free to connect over
discuss.streamlit.io. :-)

------
simonw
I'm impressed.

Installing on my Mac to test this out was very straightforward:

    
    
        cd /tmp
        mkdir streamlit
        cd streamlit
        pipenv shell
        pip install streamlit
    

Then I could play with the built-in demos by running:

    
    
        streamlit hello
    

So that was a slick intro - next step was I followed this tutorial:
[https://streamlit.io/docs/tutorial/create_a_data_explorer_ap...](https://streamlit.io/docs/tutorial/create_a_data_explorer_app.html)

And a few minutes later I had an interactive notebook-style interface for
playing with Uber pickup data in New York.

This is a really interesting product.

~~~
edmondo1984
How would you compare this with the experience you get with Jupyter Notebook?

------
ttul
I am a terrible data scientist. I look forward to this tool making me look
like I know what I’m doing.

------
yardshop
I gave it a try on Windows and ran into an issue, but found a workaround.

I'm using WinPython 3.6 on Windows 7. I did "pip install streamlit" and then
"streamlit hello", and had to allow it through the firewall, then got a 404
page.

The workaround is very simple, just use the provided http address and add
"index.html":

    
    
        http://localhost:8501/index.html
    

This link has more info:
[https://github.com/streamlit/streamlit/issues/244](https://github.com/streamlit/streamlit/issues/244)

~~~
adrien-treuille
For all those wondering, this bug has been fixed
[https://github.com/streamlit/streamlit/pull/331](https://github.com/streamlit/streamlit/pull/331)

------
77ko
What's the easiest way to publish a small Streamlit-powered analysis on the
internet?

I saw there is a Streamlit for teams in the future (sounds expensive) and on
the forums they recommended to make a docker container and host it anywhere,
which is doable, but I'd love a way to be able to just put something up on the
internet for a short period of time, sort of how now.sh[1] works.

[1]: [https://zeit.co/home](https://zeit.co/home)

~~~
pokepim
For now I managed to host it for a short while using EC2 and continuous script
deployment, which I described in detail here:
[https://medium.com/@pokepim/deploying-streamlit-app-to-ec2-i...](https://medium.com/@pokepim/deploying-streamlit-app-to-ec2-instance-7a7edeffbb54)

------
diskmuncher
I am not sure "rerunning the script from top to bottom" is a necessary
condition, except for making the code layout reflect the expected behavior.

How would people compare this to Observable [1]? 1. JavaScript vs. Python 2.
Client-only vs. server-required?

Does the market already give advantage to Python and server-required because
the data sets are too large and live on the server, and the users (data
scientists) prefer Python and the existing libraries there?

[1]: [https://observablehq.com/](https://observablehq.com/)

------
xiaodai
It's like embedding Shiny apps into RMarkdown

[https://bookdown.org/yihui/rmarkdown/shiny-embedded.html](https://bookdown.org/yihui/rmarkdown/shiny-embedded.html)

------
kfk
I agree on the premise: yes eventually every analysis needs to become an app.
Also let me add: no, dashboards are not going to cut it, they don't offer
enough interactivity. I also love that the app here is a script and hence can
be version controlled with git. However, there is no description anywhere of
what happens when you need to scale with this. If you have to go from a couple
of testers to 100 internal users, as very often happens in analytics, how
does this react?

Also, caching is a great idea, but I would expect a lot of this logic to be
managed on the server side. Or am I missing something, and ML is different
here? I would expect to pipe as little data as possible back to the
application because I want the user to wait 3-4 seconds at most for the app
to load at start.

~~~
Doxin
As far as I can tell the caching _is_ happening server-side. Pretty much all
the frontend seems to be doing is poking the back-end to re-run the script (or
get results from cache) and then getting back the diff to apply to the UI.
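
A rough sketch of why server-side caching makes "rerun the whole script" affordable. It uses the standard library's `functools.lru_cache` as a stand-in, not Streamlit's own cache decorator; `load_data`, `script` and the `calls` counter are hypothetical names for illustration.

```python
import functools

# Counts how often the expensive function body actually executes.
calls = {"load": 0}


@functools.lru_cache(maxsize=None)
def load_data(n_rows):
    """Stand-in for an expensive load; the result is cached per argument."""
    calls["load"] += 1
    return tuple(i * i for i in range(n_rows))


def script(n_rows, threshold):
    """The 'app': rerun in full whenever any widget value changes."""
    data = load_data(n_rows)                    # cache hit if n_rows is unchanged
    return [x for x in data if x >= threshold]  # cheap part, always recomputed
```

Changing only `threshold` between reruns reuses the cached data; changing `n_rows` triggers one new load.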

------
jonjlee
Apart from ML, this could be very useful for creating dashboards in the
healthcare setting. Dashboards are surprisingly hard to make, deploy, and
maintain in the hospital. I've created many one-off webapps for different
research groups and have been dreaming of a tool that consolidates all the
front end work, allowing me to concentrate on how to process the data.
Streamlit looks incredible for this!

~~~
dvdhsu
Hello! I’m working on Retool ([https://retool.com](https://retool.com)), and
it does exactly what you said. Our focus is more on building internal
applications + dashboards, and we have a HIPAA-compliant version you can
deploy on-prem with no telemetry. I’ll try to reach out to you (but if I can’t
find your email, mine is david@). Thanks!

~~~
mellosouls
Fwiw your links at the bottom to Community, Privacy etc don't work on Chrome
or Firefox on my (quite old) tablet. It's like the whole menu is an image
rather than an actual menu.

At first I was suspicious but looking at the page source the links are there,
so you may want to revisit that if it's an issue for others.

~~~
dvdhsu
Thanks! Just fixed it. Looks like the problem was with one of our animations
being too big on mobile, and it covered up the footer. I really appreciate the
bug report!

------
ignoramus23
This is awesome! Is there a way to generate a standalone binary, e.g. as an
electron app? I'm looking for ways to ship small custom python/pandas data
analysis apps including data to non-technical users - but as a local
application.

~~~
zippie
You may want to consider building Streamlit into a standalone binary using
Static-X or pex. I use pex for standalone binary distribution of a fairly
popular python app [1].

[1] [https://github.com/johnj/salt-pex](https://github.com/johnj/salt-pex)

------
sandGorgon
This is spectacular. I have written about this many times on HN itself.

Jupyter -> internal tool/API is pretty much the holy grail of bridging data
scientists, business teams and engineering.

I hope this project doesn't die out. A lot of people would pay for this.

~~~
zapita
They have plenty of funding, so if it solves a real problem for people, I
expect they are here to stay.

------
pj_mukh
This is pretty neat! As this is an offshoot of some autonomous car project,
what would the support be for 3D data? What if I wanted to see some LIDAR or
point cloud data in the browser? Especially if it's a tf output.

~~~
krebby
Check out deck.gl, which is packaged with streamlit [0]. There is a
PointCloudLayer [1] and deck can read gltf.

Also take a look at Streetscape.gl [2] which is designed for visualizing AV
data

Disclaimer: I work on Uber's data vis team but not on AVs.

[0] [https://streamlit.io/docs/api.html#streamlit.deck_gl_chart](https://streamlit.io/docs/api.html#streamlit.deck_gl_chart)

[1] [https://deck.gl/#/documentation/deckgl-api-reference/layers/...](https://deck.gl/#/documentation/deckgl-api-reference/layers/point-cloud-layer)

[2] [https://avs.auto/](https://avs.auto/)

------
stekern
This looks really promising! I recently wanted to make an interactive GUI to
control the inputs to a GAN in order to generate images and visualize how
different inputs affect certain properties of the output images.

I ended up converting my Python models to TensorFlow.js and creating an ad-hoc
Vue.js app [0], but Streamlit could have been very beneficial here, especially
if you can just put nginx in front of it and serve it to the masses.

[0] [https://thispicturedoesnotexist.com](https://thispicturedoesnotexist.com)

------
mxwsn
Does anyone have any hands-on experience with this? It looks impressive. I'm
interested in contrasting this with dash plotly for python

~~~
bobosha
My thoughts exactly, this looks very similar to plotly Dash. Perhaps the
creators could share their thoughts on how this compares.

------
asimjalis
Can this be used from within the Jupyter notebook?

~~~
tvst
Not at the moment. There are a few reasons for this, but perhaps the most
fundamental one is that Streamlit starts a blocking server -- so even if you
could run it inside Jupyter it would pause your Jupyter session until you
killed Streamlit.

(Co-founder of Streamlit here)
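
To make "blocking server" concrete, here is a small standard-library sketch. It says nothing about whether Streamlit itself can be run this way; it only shows the general pattern: `serve_forever()` never returns on the calling thread, so an interactive session stays usable only if the server runs on another thread. `start_in_background` and `stop` are hypothetical helper names.

```python
import http.server
import threading


def start_in_background(port=0):
    """Start a local HTTP server on a daemon thread; return (server, thread).

    Port 0 asks the OS for any free port. Calling server.serve_forever()
    directly would instead block the caller until the server is shut down.
    """
    server = http.server.ThreadingHTTPServer(
        ("127.0.0.1", port), http.server.SimpleHTTPRequestHandler
    )
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


def stop(server, thread):
    """Shut the server down and wait for its thread to finish."""
    server.shutdown()       # asks serve_forever() to return
    server.server_close()   # releases the listening socket
    thread.join(timeout=5)
```

After `start_in_background()` returns, the caller keeps control while the server keeps serving, which is exactly what a blocking server started in the foreground does not allow.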

~~~
julienfr112
Please, do not do that : not being compatible with jupyter is a feature ;-)

------
danielvf
I use Jupyter notebooks all the time for acquiring, cleaning, and exploring
datasets. Occasionally these notebooks mutate into tools for more than just
one-off exploring. It's always felt a little awkward to use them for more
day-to-day data tasks. Streamlit looks amazing for these cases!

------
ariskk
This is truly lovely. I went from ‘pip install’ to reproducing one of our
internal dashboards in ~1 hour. One issue: Auth and ACLs seem to be part of
the paid/hosted version so it needs extra work to become viable for most
people

------
ppod
This looks excellent. I'm an avid RShiny user and can't wait to try this and
Dash. Is there an example for how to host (e.g. on aws or google cloud) and
make an app available online?

~~~
marmaduke
You could probably build a Docker image and set the command to run streamlit,
then run it like anything else.

~~~
ppod
Maybe a suggestion for the authors (if not already in the pipeline). Shiny
offers a button in RStudio that deploys an app instantly and for free to a
domain at shinyapps.io. They then charge for apps that require more data
hosting or more concurrent users. Extremely convenient, and pretty profitable
I'd imagine.

~~~
marmaduke
They've a Teams thing that is probably for this.

------
j0e1
This looks super interesting. I relate to the motivation of building something
like this- the endless cycle of creating a Jupyter notebook which becomes a
Flask app. I really liked the quick feedback loop for the visual components
and the fact that it is all in Python.

I haven't checked yet but a question that comes to mind is how extensible is
this framework. I can easily see how I'd want to make custom widgets.

~~~
tvst
Hi J0e1, I'm one of the founders of Streamlit.

Regarding extensibility, we totally agree: over time, many people are going to
want to write their own custom widgets. Which is why we're actually in the
early phases of designing a plugin system for Streamlit.

So stay tuned!

------
polm23
This is cool but it's already been on the top page twice this week...

[https://news.ycombinator.com/item?id=21127528](https://news.ycombinator.com/item?id=21127528)
[https://news.ycombinator.com/item?id=21126477](https://news.ycombinator.com/item?id=21126477)

~~~
yardshop
Maybe redundant, but I'm glad it was posted again because I didn't see it
those other times.

~~~
steve_adams_86
Likewise, and the timing couldn't be better for me. This is a very exciting
discovery.

------
pythonwutang
Looks great! Thanks for the hard work Streamlit team. Our team started using
Dash recently for an ML project and quickly got lost in callback hell and
switched back to a notebook. This approach is so pythonic and elegant. Looks
like it would handle our use case with much less code and fewer
callback-related headaches. I’m excited to share it with our team!

------
fleur-de-lotus
The demo is broken on macOS: "Streamlit failed to hash an object of type
<class 'code'>.

More information: to prevent unexpected behavior, Streamlit tries to detect
mutations in cached objects so it can alert the user if needed. However,
something went wrong while performing this check.

Please file a bug... "

------
bobosha
can this be used for building web apps at production scale?

~~~
tvst
Hi bobosha

It depends on how you would define "production scale".

If you're talking about hosting a publicly accessible Streamlit app on the
internet, it's definitely possible but will require you to set up an
appropriate infrastructure around it: sticky load balancer, replication,
orchestration, etc.

If you're talking about hosting something for internal use by your company,
very often just a simple machine serving your Streamlit app is more than
enough.

That said, we're currently working on Streamlit For Teams, which is a paid
offering that will make it trivial to deploy Streamlit apps for these use
cases. If you're interested, you can sign up here:
[https://streamlit.io/forteams/](https://streamlit.io/forteams/)

(Co-founder of Streamlit here)
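
On the "sticky load balancer" point: presumably each browser session's state lives in one server process, so every request from that session has to land on the same replica. A minimal sketch of that routing rule, with a hypothetical `pick_backend` helper and made-up backend addresses:

```python
import hashlib


def pick_backend(session_id, backends):
    """Map a session id to one backend, the same one on every call."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

This simple modulo scheme reshuffles sessions whenever the backend list changes; real load balancers usually pin sessions with a cookie or use consistent hashing instead.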

------
westurner
Cool!

requests_cache caches HTTP requests into one SQLite database. [1]
pandas-datareader can cache external data requests with requests-cache. [2]

dask.cache can do opportunistic caching (of 2GB of data). [3]

How does streamlit compare to jupyter voila dashboards (with widgets and
callbacks)? They just launched a new separate github org for the project. [4]
There's a gallery of voila dashboard examples. [5]

> _Voila serves live Jupyter notebooks including Jupyter interactive widgets._

> _Unlike the usual HTML-converted notebooks, each user connecting to the
> Voila tornado application gets a dedicated Jupyter kernel which can execute
> the callbacks to changes in Jupyter interactive widgets._

> _\- By default, voila disallows execute requests from the front-end,
> preventing execution of arbitrary code._

[1] [https://github.com/reclosedev/requests-cache](https://github.com/reclosedev/requests-cache)

[2] [https://pandas-datareader.readthedocs.io/en/latest/cache.htm...](https://pandas-datareader.readthedocs.io/en/latest/cache.html)

[3] [https://docs.dask.org/en/latest/caching.html](https://docs.dask.org/en/latest/caching.html)

[4] [https://github.com/voila-dashboards/voila](https://github.com/voila-dashboards/voila)

[5] [https://blog.jupyter.org/a-gallery-of-voil%C3%A0-examples-a2...](https://blog.jupyter.org/a-gallery-of-voil%C3%A0-examples-a2ce7ef99130)

Access control and resource exhaustion are challenges with building any {Flask,
framework_x,} app [from Jupyter notebooks]. First it's "HTTP Digest
authentication should be enough for now"; then it's "let's use SSO and LDAP"
(and review every release); then it's "why is it so sloww?". JupyterHub has
authentication backends, spawners, and per-user-container/vm resource limits.

> _Each user on your JupyterHub gets a slice of memory and CPU to use. There
> are two ways to specify how much users get to use: resource guarantees and
> resource limits._ [6]

[6] [https://zero-to-jupyterhub.readthedocs.io/en/latest/user-res...](https://zero-to-jupyterhub.readthedocs.io/en/latest/user-resources.html)

Some notes re: voila and JupyterHub:

> _The reason for having a single instance running voila only is to allow non
> JupyterHub users to have access to the dashboards. So without going through
> the Hub auth flow._

> _What are the requirements in your case? Voila can be installed in the
> single user Docker image, so that each user can also use it on their own
> server (as a server extension for example)._ [7]

[7] [https://github.com/voila-dashboards/voila/issues/112](https://github.com/voila-dashboards/voila/issues/112)

------
floki999
This looks very interesting and addresses a very common use-case - thanks for
showing.

------
LukeB42
This looks excellent. Thank you.

Can we use asyncio to update multiple charts simultaneously / at arbitrary
intervals?

Wouldn't it be better if Jupyter absorbed this API for its dashboards?

------
sriram_malhar
Beautiful, beautiful. I look forward to playing with it.

------
rosstex
Python is fun again!

------
neximo64
Does anyone else get a 404 on running streamlit hello?

~~~
Doxin
It's a known issue it seems. Try going to
[http://localhost:8501/index.html](http://localhost:8501/index.html) instead
of just [http://localhost:8501](http://localhost:8501)

------
uptownfunk
Wow it’s great to see the Pythonians finally realizing they need a Rshiny for
python.. /s

~~~
stOneskull
i like little ribbings like this. like with friends and football teams. it's
part of the fun.

------
x775
This is super awesome!

------
stOneskull
this is great. thank you very much.

------
villgax
Dash anyone?

