Hacker News
Bokeh – a Python interactive visualization library (pydata.org)
295 points by sonabinu on Feb 10, 2016 | 66 comments



Since I have never really liked Matplotlib I would really like to learn Bokeh, but unfortunately its inability to export its visualizations in SVG or similar formats makes it kind of useless for me as a scientist wanting to publish my results.


Try my Veusz plotting app and library - http://home.gna.org/veusz - it has a nice interactive GUI, is python scriptable and extensible via plugins. It's based around Python, Qt and numpy. It can export to PDF, EPS, SVG and bitmap formats.


I would suggest Plot.ly which is now open source and can export to html (with a lot of JavaScript which generates the SVG)


Plot.ly looks better to me as well. Especially the 3D stuff.


But hey, you can save your work in an xkcd style:

    to_bokeh(fig=None, use_pandas=True, xkcd=False)
Taken from: http://bokeh.pydata.org/en/latest/docs/user_guide/compat.htm...


But hey, you can already do that with Matplotlib ;)

http://matplotlib.org/xkcd/examples/showcase/xkcd.html


If you're not wedded to Python, have a gander at Gnuplot. Despite its quirks, it's my favorite plotting tool.


+1 for gnuplot. Amazingly powerful and flexible, and exports to png, pdf, eps, latex, etc. Possibly outdated in some aspects but still unbeatable AFAIK.

That being said, this sounds nice for adding interactivity (sliders to vary parameters, that sort of stuff). Useful in presentations and the sort.


I severely dislike Gnuplot, but then again, I am wedded to Python. I'm not sure how versatile it is, at least from the slightly acquainted position I am.


FYI it's really dead simple to write new "bokeh" command line tools as Bokeh apps now. We already have several ("bokeh html" and "bokeh json") but we are definitely interested in making a gnuplot-like bokeh command line tool that you can just point at a CSV or log file and get a visualization right out. E.g. something like "bokeh graph foo.log -x time -y connections --tail" would create an automatically updating visualization of a growing log file. If anyone is interested in this functionality and wants to help speed along the development, please let us know!
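
For anyone curious what the argument handling for such a tool could look like, here is a minimal sketch using only argparse. The "graph" subcommand and the -x, -y and --tail flags are taken from the hypothetical example above; none of this is an existing Bokeh command, and the plotting step is left out entirely.

```python
import argparse


def build_parser():
    """Parser for a hypothetical `bokeh graph` subcommand (not a real Bokeh CLI)."""
    parser = argparse.ArgumentParser(prog="bokeh")
    subcommands = parser.add_subparsers(dest="command")
    graph = subcommands.add_parser("graph", help="plot columns of a CSV or log file")
    graph.add_argument("path", help="CSV or log file to read")
    graph.add_argument("-x", dest="x", required=True, help="column for the x axis")
    graph.add_argument("-y", dest="y", required=True, help="column for the y axis")
    graph.add_argument("--tail", action="store_true",
                       help="keep following the file as it grows")
    return parser


# Parse the example invocation from the comment instead of sys.argv.
args = build_parser().parse_args(
    ["graph", "foo.log", "-x", "time", "-y", "connections", "--tail"]
)
print(args.command, args.path, args.x, args.y, args.tail)
```

A real tool would then read the named columns from the file and, with --tail set, keep polling it for new lines.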


I definitely empathize. It's a technically challenging problem, there are lots of competing important priorities, and we are a small team that can only humanly accomplish so much work at once. Our vision for Bokeh was to be able to create and share interactive data applications with a minimum of python and little-to-no JS and "web tech" required. We are closing in on that vision, so it is my fervent hope (and intention) that we'll be able to turn to this important priority soon as well.


They're working on it, probably in a month or two you'll be able to do it: https://groups.google.com/a/continuum.io/forum/#!searchin/bo...

At the current time you can export to PNG, or maybe export to SVG in a roundabout manner, by turning the HTML page into a PDF, then turning the PDF into SVG, etc.


A png isn't a vector format, so the important property of the SVG (or EPS, PDF) is lost. A vectorized bitmap is still a bitmap.


What resolution are you looking for in print? If you export your raster at 300 dpi, and enable lossless PNG compression, shouldn't it suffice for most print purposes?
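
As a rough check on what 300 dpi means in pixels, here is a small sketch of the arithmetic. The figure sizes (3.5 in and 7 in wide) are assumed example values, not any journal's requirement.

```python
def pixels_for_print(width_in, height_in, dpi=300):
    """Pixel dimensions needed to print a figure at the given size and dpi."""
    return round(width_in * dpi), round(height_in * dpi)


# Assumed example sizes: a 3.5 in wide figure and a 7 in wide figure.
single = pixels_for_print(3.5, 2.5)
double = pixels_for_print(7.0, 5.0)
print(single)  # (1050, 750)
print(double)  # (2100, 1500)
```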


Often, it's not about print. People sometimes take vector images of plots and squeeze a bunch of them on a single page, with the idea that whoever views the paper online can zoom in if they're interested enough.


It still has limited resolution and a much higher storage footprint. Plotting charts is the vector graphics use case par excellence. Having svg support is a no brainer.


Of course you're right that we do need to support SVG and vector formats, but it's also not entirely true that "vector >> raster" in all cases.

Raster does have limited resolution, but so does your screen or most output devices (yes, even scientific plotters).

It's also not necessarily true that it has a higher storage footprint; it depends on the complexity and number of glyphs, the fonts you need to embed for the LaTeX formulae, etc. etc. In large-data cases, raster can be a much more viable visualization transfer medium.


That's true, in practical uses a sufficiently high res raster is indistinguishable from a vector image. Still, it would be nice to have that option.


Working on SVG? What about EPS instead? For scientific publishing EPS is much better unless your visualisation is pretty much a high-entropy one.


Please explain why EPS is better.


Most imaging for offset printing is done in PostScript, which can render EPS directly, so there's no format conversion. For web publishing that's not the case, and either is probably fine depending on the workflow.


I've never submitted to a journal that took SVG, nearly everyone I submit to takes EPS.


Reason is quite simple. EPS is based on PostScript. PDF specification contains a subset of PostScript. Embedding EPS into PDF is trivial and yields high quality results. For a printing publication workflow having EPS files at hand saves a lot of time and problems.

SVG is nice too, it just needs a few more intermediate steps.


This is just a shot in the dark, I haven't tried it. But Bokeh renders to Canvas, and there is already a Canvas-to-SVG JavaScript library: https://github.com/gliffy/canvas2svg


"Bokeh is a Python interactive visualization library that targets modern web browsers for presentation."

A description of how one gets from Python to a web browser display would be nice. Is there a translation from Python to Javascript somewhere? Is there a Python web server backend? Are there dynamic visual updates or does this thing just generate a static output like a .png file?


There is a full-featured javascript runtime "BokehJS" which is designed from the ground up to be driven by remote (aka server-side) models, which are sent over the wire as JSON. The server-side models are generated programmatically via Python, R, Scala, etc. The cool thing is that a lot of the interactivity is completely native in BokehJS, so you can build interactive Javascript visualizations from Python, and have an entirely static HTML document that embeds a lot of rich interactivity.

See, for instance, Sarah Bird's excellent GapMinder example: http://nbviewer.jupyter.org/github/bokeh/bokeh-notebooks/blo...

When you move the slider, it generates JS events which drive model updates completely in the browser, which then update the objects that comprise the plot. All of that is built straight from Python, but there is no Python kernel running in the background.

Bokeh also lets you write your event handlers in Python, and reside on the server, and get called back automatically when the user interacts with the plot in some way. Check out this app for example: http://demo.bokehplots.com/apps/selection_histogram You can use the lasso tool to select some points, and that computes a new sub-histogram. Here is the entirety of the code for it: https://github.com/bokeh/bokeh/blob/master/examples/app/sele...

For more deep-dive on the architecture, you can start at this slide and walk through: http://www.slideshare.net/misterwang/bokeh-tutorial-pydata-s...

Or you can watch this webinar recording (start at the 19 minute mark): https://continuum-analytics.wistia.com/medias/f6wp9dam91


Are the visualizations downloaded at page load only, or can they be readily fashioned with websockets, etc to turn them into an interactive dashboard?

I've been considering running our lab experiments off of a webapp, but haven't found an easy enough solution.


Apart from rendering static html plots or plots with client-side JS callbacks, you could look into using the new bokeh server: http://bokeh.pydata.org/en/latest/docs/user_guide/server.htm...

It allows for building streaming visualizations or plots using websockets (implemented using tornado).


you might be interested in http://demo.bokehplots.com/ which are all examples of bokeh apps that use websockets to allow you to run python functions based on user interactions with plots. the code for all the examples is linked from that page also.


> A description of how one gets from Python to a web browser display would be nice

http://birdsarah.github.io/europython-2015-bokeh/static/slid...

python (or r or scala) spits out json that is consumed by bokehjs
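
To make the "spits out JSON" step concrete, here is a toy sketch using only the standard library. The field names ("type", "id", "attributes") and the make_line_plot helper are made up for illustration; the real BokehJS wire format is different and more detailed.

```python
import json


def make_line_plot(xs, ys, title):
    """Describe a plot as plain data; a browser-side runtime would render it."""
    return {
        "type": "Plot",
        "id": "plot-1",
        "attributes": {
            "title": title,
            "glyphs": [{"type": "Line", "x": list(xs), "y": list(ys)}],
        },
    }


payload = json.dumps(make_line_plot([1, 2, 3], [4, 6, 5], "demo"))
# The receiving side only needs the JSON text, not the Python that produced it.
decoded = json.loads(payload)
print(decoded["attributes"]["glyphs"][0]["type"])
```

Because the payload is just JSON, any language that can build the same structure could in principle drive the same front end.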


From the docs, under "Getting started":

> When you execute this script, you will see that a new output file "lines.html" is created, and that a browser automatically opens a new tab to display it. (For presentation purposes we have included the plot output directly inline in this document.)

This is based on an example calling an output_file() function, but the docs also mention output_notebook(), and I'm sure you can output to HTML as a string as well. I didn't spend too much time digging yet.


Yes - there's the ability to generate the raw html of a plot or a JS script and div that can be embedded in an HTML doc.

Source: http://bokeh.pydata.org/en/latest/docs/user_guide/embed.html


> A description of how one gets from Python to a web browser display would be nice.

Check out Jupyter notebooks.

> Is there a Python web server backend?

That's possible for dynamic charts, but for most usage, Bokeh saves all the chart's data inside the .html file (yielding sometimes huge files). See http://bokeh.pydata.org/en/latest/docs/gallery.html for examples.


my question exactly. What does this do that I can't do with highcharts?


i would say bokeh is more similar to d3 than highcharts (although my understanding of highcharts may be off)

1) you can create any viz you want to - you have access to very low-level elements and can build up anything from them 2) it's liberally open source licensed

why i initially got sucked into bokeh over d3 is that: a) I prefer python b) I find it more intuitive (d3 always messes with my head)


In the Dev Guide docs, there is a section about why we built our own Javascript charting library from the ground up:

http://bokeh.pydata.org/en/latest/docs/dev_guide/bokehjs.htm...


I forgot the really important difference between bokeh and highcharts - you can throw 100k points at it and have it be fully interactive without your browser blowing up!


Highcharts has dynamically loadable data. I have a database that has 120M points. I just keep some pre-aggregated data to populate the wide periods. As you zoom in, it selects smaller aggregation sets. (Think: 1 month for 10 years, 2 weeks for 1 year, etc.) If you have the data, you only need about 20 lines of JS to support this with JSON. That said, Bokeh requires little to no JavaScript, and that's a huge advantage if you need to get multiple visualizations. It does, pretty much stock, what you can do with Highcharts without needing as much full-stack expertise.
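
The pre-aggregation scheme described above can be sketched on the server side roughly like this. It is only an illustration under assumptions: the two levels mirror the "1 month for 10 years, 2 weeks for 1 year" example, and the function names and the averaging rule are hypothetical, not taken from Highcharts or Bokeh.

```python
from datetime import timedelta

# Assumed levels: (maximum visible span, bucket size), finest first.
LEVELS = [
    (timedelta(days=365), timedelta(weeks=2)),
    (timedelta(days=3650), timedelta(days=30)),
]


def bucket_for_span(span):
    """Pick the bucket size of the first level whose span covers the view."""
    for max_span, bucket in LEVELS:
        if span <= max_span:
            return bucket
    return LEVELS[-1][1]  # wider than every level: fall back to the coarsest


def aggregate(points, bucket_seconds):
    """Average (timestamp_seconds, value) pairs into fixed-width buckets."""
    sums = {}
    for ts, value in points:
        key = int(ts // bucket_seconds) * bucket_seconds
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + value, count + 1)
    return [(key, total / count) for key, (total, count) in sorted(sums.items())]


print(bucket_for_span(timedelta(days=200)))
print(aggregate([(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0)], 60))
```

In practice the aggregates would be computed once and stored, so a zoom event only has to pick a level and fetch the matching rows.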


The idea behind bokeh is that you don't need to know JavaScript and can use an api that's familiar to a lot of people coming from matplotlib


Great library, especially useful for doing streaming visualizations. I gave a talk last year at PyData Singapore about it, slides are available here - http://asankhaya.github.io/professional.html#PyDataSG


That's super cool.


Strange name, it refers to the way a camera lens blurs out-of-focus objects (especially highlights) into a disc shape [1]. I was expecting an image processing library.

Bokeh is widely discussed by photography equipment aficionados as one of the main distinguishing traits of different lens designs besides sharpness, distortion, etc.

edit: maybe it is an allusion to bubble plots?

[1] https://www.google.com/search?q=bokeh&tbm=isch


In the "Technical Vision" part of our docs, I speak to this a little bit: http://bokehplots.com/pages/technical-vision.html

A very large part of the vision is that we want to support accurate and useful visualization on large datasets. This is now encapsulated in the Datashader library (although it plays well with Bokeh itself): https://github.com/bokeh/datashader

If you are interested in this, we just did a webinar this week on Datashader, and demonstrate how to easily visualize billions of points through the browser, in a few seconds, on a single machine: https://continuum-analytics.wistia.com/medias/8zu9idwoym?mkt...


i believe the idea behind the name is that bokeh (python) is to data viz what bokeh is to photography - it's a tool that brings the important data into focus.....


bokeh by definition is talking about the out of focus stuff, which is arguably the less important part of a shot.


Which brings the in-focus elements into sharper attention.


And a photo that is entirely bokeh?

http://i.imgur.com/siLB9C3.jpg


I've been playing around with this library and really like it. Python really is that much easier to use than R, isn't it? Too bad most of the libraries are still in R. It's strange that there isn't a language written on top of R that is less troublesome to use or learn.


There are a lot of smart people that like Python. There are a lot of smart people that like R too. Bokeh is designed to be useful from both R and Python (and other languages as well).

Here is a recent presentation showing Bokeh (from Python) along with data-shading to visualize billions of points quickly: http://www.slideshare.net/continuumio/visualizing-billions-o...

Here is another presentation showing integration of Bokeh with R to visualize geographical data from another group: http://ryanhafen.com/blog/rbokeh-gmap

Bokeh releases 0.11 and above contain a server that provides a very nice application model that can be synchronized between browser and server, allowing data scientists to build interactive plotting-based applications in the browser. Here are some simple examples of that: http://demo.bokehplots.com/


Call the libraries with Rpy2, or build models using a DAG of functions and distributions with pymc 3


Thanks, I'll check that out.


Fortran programmers find it much "easier" than other languages. You are arguing about a subjective matter here. The fact that there are so many R packages and users shows it is useful to a large enough subset of users in statistics/data science domain.


Speakers of English think it is easier than Swahili, but we know from language acquisition studies that Swahili is easier to learn in the large.

Your use of "subjective" is imprecise and overly relativistic. Just because something has to do with the human mind, or human experience, doesn't make that thing a priori subjective.


Ruby is so behind in this... :/ The closest thing I could find, which is only a proposed enhancement in a github project, is a suggestion to add plotly.js to Flammarion (a new Ruby GUI toolkit using the browser as working window), or maybe Ruby Processing to some extent?


Actually, we'd LOVE to get a bokeh.rb project going! Most of the power of Bokeh is in the BokehJS runtime, and in our M-VM-V-C architecture for visualization. The Python code is just generating a JSON representation of the plot, but it's relatively straightforward to do that from any language. You can see front-end implementations in other languages:

* R: https://github.com/bokeh/rbokeh
* Scala: https://github.com/bokeh/bokeh-scala
* Julia: https://github.com/bokeh/Bokeh.jl

The Facebook data science folks even implemented a small Bokeh wrapper in Lua, as part of their iTorch package: https://github.com/facebook/iTorch/blob/master/Plot.lua

So, you can see that you can get started with a basic wrapper, and then build up from there. The Lua wrapper is only ~850 lines. If you want to pop over to the mailing list or on our Github, we'd love to help you out.


Any chance of getting access to the old tutorials that consisted of scripts with the boilerplate written, but missing the key functionality? I was really enjoying working through them and digging into the docs, then all of a sudden they were taken offline and I was directed to the notebooks.


Bokeh is cool, but I've found that its method for customizing appearance of graphs is a little unintuitive and clunky.

For example, have a look at this cursory analysis of my Reddit comments that I did late last year. Getting the graphs to look nice felt like pulling teeth at the time.

http://nbviewer.jupyter.org/github/Niksko/redditCommentData/...

That being said, I'm reasonably happy with the results. And that pannable, zoomable line graph is pretty fancy and is easy to set up.


"To use the Bokeh server with python 2.7, you also need to install Futures package."

Does this statement mean it supports Py3 and if you want to use it with 2.7 you need Futures? Or is it only Py2.X? [0]

[0] http://bokeh.pydata.org/en/latest/docs/installation.html


Bokeh can be used with both Python 2.7 and Python 3. Additional dependencies are needed to work with Python 2.7. In fact, the "write call-back functions in Python" capability uses a Python to JS compiler called Flexx -- http://flexx.readthedocs.org/en/latest/flexjs. Flexx is currently Python 3 only (though 2.7 support is coming for it).


thx @travisoliphant the requirements are not clear in the docs. This explanation makes more sense.


Thanks for the feedback. We make a continuous and frequent effort to make the documentation clearer and more straightforward. Keeping it clear is a constant, endless effort, and feedback is very important.

To clarify, Futures is required for the bokeh server as an extra dependency on Python 2.7, since it isn't included in the standard library there the way it is on Python 3.2 and later. If you conda install Bokeh, it should already be installed for you. You can find more info about the bokeh server features here: http://bokeh.pydata.org/en/latest/docs/user_guide/server.htm... and here: http://bokeh.pydata.org/en/latest/docs/user_guide/cli.html#m...

Flexx, on the other hand, is not required by the bokeh server itself. It is needed to define Python functions that handle interactivity and run in the browser (no bokeh server needed). Flexx is used to convert Python to JS. You can find more info here: http://bokeh.pydata.org/en/latest/docs/user_guide/interactio...


thx @fpilger, fantastic to see another Py3 ready application. It's clear now. One way to avoid this is to write up an installation guide [0] that explicitly mentions the Python version and dependencies.

This is one of the first questions I ask, "is this code Python3 ready?", so I look at the install requirements first. Maybe I'm an edge case. A short cut might be just to say in big words, Python3 ready.

[0] For example this PIL fork, Pillow (python3) https://pillow.readthedocs.org/en/latest/installation.html


If wanting to know up front if this is designed for the future or the past is an edge case, it's a pretty crowded edge. Python 3 was released eight years ago, and I will ALWAYS choose a competing technology (if it means a language other than Python, that's fine with me) before I'll let Python 2 play any role in a new project I architect.

The first thing I looked for on the Bokeh website was some sort of clear statement on the front page about it being Python 3. I don't have an immediate need for it, but if it's Python 3, I'll keep it in mind. If it's just that euphemistic "Python", it may as well be Cobol for all I'd care.

I thought I'd skim over these comments just in case someone asked the obvious question, and someone did. (Thank you.) And, yes, it is apparently Python 3 enough that we wouldn't need to use any Python 2 if we adopted it.

Very nice. Now I wish I DID need it, but maybe I'll do a side project of my own with it, just for fun. I'll be keeping it in mind in any case. I'll bet "Python 3" is mentioned somewhere on the site, but I didn't spot it right away. It might be worth making it a bit more prominent, since it is an important feature for those designing for the future.


"If wanting to know up front if this is designed for the future or the past is an edge case, it's a pretty crowded edge. Python 3 was released eight years ago"

good point: a quick search reveals that Py2.x is still in a lot of legacy code (dependencies -- the big one), so I always check first:

- https://python3wos.appspot.com/

- http://stackoverflow.com/questions/30751668/python-2-vs-pyth...

- http://hiltmon.com/blog/2014/01/04/python-its-a-trap/

- https://blog.newrelic.com/2014/01/21/python-3-adoption-web-a...

- http://www.randalolson.com/2015/01/30/python-usage-survey-20...


I wrote most of the docs, and my default day-to-day python is Python 3, so it's really just a matter of forgetting that "Python 3 ready" is something that still needs to be mentioned at all. In fact we even have a python 3 only feature at the moment (python->JS compilation). But yes, just to clarify, our CI tests run on python 2.7, 3.4 and 3.5.


Also forgot to mention that Flexx support for Python 2.7 should be available soon.. :)



