Bokeh – Interactive web visualization library in Python

hhuuggoo · on Nov 21, 2013

I wanted to point out one important distinction between Bokeh and anything that's currently out there - we have a full-blown python/js object bridge that synchronizes client-side models with objects you can interact with in python. The significance of this is that someone can select points on a scatter plot, and then you can retrieve the indexes of those points on the python side and use them to dive further into your data.

It's worth mentioning that the IPython guys are implementing a similar json/python bridge to support the new interactive tools in the IPython notebook. Once that is up and running, we'll probably just piggyback off of that bridge when you're running in the notebook.
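To make the selection workflow concrete, here is a minimal sketch of what you can do once the selected indexes reach the python side. This is not Bokeh's actual API: `rows_for_selection` and `selected_indices` are hypothetical names standing in for whatever the synchronized selection model exposes.

```python
# Hypothetical data backing a scatter plot: parallel columns, one entry per point.
data = {
    "x": [1.0, 2.5, 3.1, 4.8, 5.2],
    "y": [0.3, 1.9, 2.2, 3.7, 4.1],
    "label": ["a", "b", "c", "d", "e"],
}


def rows_for_selection(columns, selected_indices):
    """Return one dict per selected point, looked up by index in each column."""
    names = list(columns)
    return [{name: columns[name][i] for name in names} for i in selected_indices]


# Assumption: the user selected the 2nd and 4th points in the browser and the
# bridge delivered their indexes to python as a plain list.
selected_indices = [1, 3]
selected_rows = rows_for_selection(data, selected_indices)
```

From there the selected rows can be fed into whatever further analysis you like.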

kermatt · on Nov 20, 2013

As a "database guy" who uses Python for most things not bash, is this an approach for viz apps that would eliminate (most) of the need to muck about in JavaScript?

D3 and its children produce some awesome visualizations, but the bandwidth does not exist for me to begin developing apps in a language I don't have much experience in.

If something like Bokeh allows me to live mostly in Python, it becomes even more interesting.

paddy_m · on Nov 20, 2013

That is exactly the point of bokeh. It allows you to write python code and get browser-based plots. Currently we have a python interface, but we intend to build interfaces in other languages. Look at the examples gallery, which includes the code needed to generate the plots. http://bokeh.pydata.org/gallery.html

A relatively simple plot (scroll down to see the code): http://bokeh.pydata.org/plot_gallery/correlation.html

The simplest example plot in the repo: https://github.com/ContinuumIO/bokeh/blob/master/examples/pl...

Dartanion7 · on Nov 20, 2013

This is really cool, but what about adding interactivity with the charts? My first thought when I saw the headline was that this python library just generates D3 code, but it seems to be generating some sort of a static SVG object.

hhuuggoo · on Nov 20, 2013

We're actually using canvas.

http://bokeh.pydata.org/plot_gallery/iris.html

That has a selection tool you can play around with.

bigreddot · on Nov 20, 2013

There are already some tools available, pan and wheel zoom, plot resizing, selection (on some plot types). Many more tool types are planned.

As for the architecture, BokehJS is built entirely on top of HTML canvas. The python bokeh library sends data and plot specifications to the browser, which uses BokehJS to render the plot and handle interactive tools, etc.

Dartanion7 · on Nov 21, 2013

Gotcha, thanks.

paddy_m · on Nov 20, 2013

Here is a linked brushing example in the IPython notebook. http://nbviewer.ipython.org/urls/raw.github.com/ContinuumIO/...

primelens · on Nov 21, 2013

For some reason I can't get any of the interactivity buttons to work.

EDIT: Ok, everything but the zoom works. How do you zoom?

bigreddot · on Nov 21, 2013

The current zoom tool is a scroll wheel zoom, not a box selection zoom. Guessing that might be the issue since others have run into the ambiguous description as well. It will be labeled more precisely in master in a few days and in the next release as well.

If that is not the issue, please file a ticket on GitHub!

primelens · on Nov 21, 2013

On a Macbook, the scroll wheel equivalent is the two-fingered scroll, and that doesn't seem to work for me (it scrolls the page instead).

bigreddot · on Nov 21, 2013

Do you have the tool selected in the toolbar above? If so, please file a ticket on GH about this. Most of the Bokeh devs are on OSX so it would surprise me if it does not work on OSX, but if there is a bug we want to fix it!

est · on Nov 21, 2013

Hope there's some kind of python server side push technology to update browser graphs in realtime.

pwang · on Nov 21, 2013

Yes: http://continuum.io/blog/painless_streaming_plots_w_bokeh

paddy_m · on Nov 20, 2013

To be more clear, there are two main parts of bokeh. First there is the js portion which uses canvas to create the plots. There is an object model that allows plots to be composed of multiple components (glyphs, data sources, axes, data ranges).

The python side produces json that represents the objects to be plotted. Python only writes a small amount of js to start the js running. For the most part python just produces json objects that the js side reads.
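A rough sketch of that split, assuming made-up object types and field names (these are illustrative only and are not the schema BokehJS actually reads): python builds plain dictionaries for the components and serializes them, and the js side would take it from there.

```python
import json


def make_plot_spec(x, y):
    """Build a JSON-serializable description of a scatter plot.

    The "type" values and field names below are hypothetical.
    """
    return [
        {"type": "DataSource", "id": "source-1",
         "data": {"x": list(x), "y": list(y)}},
        {"type": "DataRange", "id": "xrange-1", "source": "source-1", "column": "x"},
        {"type": "DataRange", "id": "yrange-1", "source": "source-1", "column": "y"},
        {"type": "Glyph", "id": "glyph-1", "kind": "circle",
         "source": "source-1", "x": "x", "y": "y"},
        {"type": "Plot", "id": "plot-1", "renderers": ["glyph-1"],
         "x_range": "xrange-1", "y_range": "yrange-1"},
    ]


# The string that would be handed to the browser for rendering.
payload = json.dumps(make_plot_spec([1, 2, 3], [4, 5, 6]))
```

Because components refer to each other by id, another language could emit the same kind of JSON and reuse the js rendering logic unchanged.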

There could be alternate implementations of the python side that still use the same js rendering logic. You could even write a nicer higher-level js api that wraps the low-level component construction.

I talked about this at PyData NYC. Here is my notebook (which I am in the process of updating for bokeh 0.3) http://nbviewer.ipython.org/urls/raw.github.com/paddymul/bok...

gammarator · on Nov 21, 2013

Take a look at Vincent, which may give you what you're looking for (Python -> Vega, a d3 wrapper): https://vincent.readthedocs.org/en/latest/

jparmer · on Nov 21, 2013

If you're looking to make scientific D3 graphs with Python, never touching anything close to javascript, you can also check out the Plotly Python API:

https://plot.ly/api/python

It was designed foremost to make graphs pertinent for scientific and engineering applications: https://plot.ly/~alex/76/

(Disclosure: I'm a dev @Plotly)

beat · on Nov 20, 2013

As a photographer, I gotta say I really love the name. :)

For those who aren't photography nerds, "bokeh" is a Japanese word that means the out-of-focus areas in a photograph. Different lenses have different kinds of bokeh, and beautiful or ugly bokeh is an important dividing line between good and bad lenses.

slantyyz · on Nov 20, 2013

As someone who likes photography and analytics, I disagree about the name.

As you say, bokeh is about out-of-focus blur. That's sort of the opposite impression you want to present in a tool that's intended to give you "clarity" via its visualizations.

pwang · on Nov 20, 2013

Actually, bokeh is about the quality of the blur. Yes, blur can have quality. If you simply remove everything that is not in focus, you lose context and texture about the subject. If you use a pinhole camera and present everything in sharp focus, you lose the insight.

This is actually mentioned in the documentation: http://bokeh.pydata.org/#technical-vision

""" Photographers use the Japanese word “bokeh” to describe the blurring of the out-of-focus parts of an image. Its aesthetic quality can greatly enhance a photograph, and photographers artfully use focus to draw attention to subjects of interest. “Good bokeh” contributes visual interest to a photograph and places its subjects in context.

In this vein of focusing on high-impact subjects while always maintaining a relationship to the data background, the Bokeh project attempts to address fundamental challenges of large dataset visualization... """

slantyyz · on Nov 20, 2013

>> Actually, bokeh is about the quality of the blur.

Yes, you're right. But most people tend to treat out-of-focus blur synonymously with bokeh (the quality of the blur) -- they're related but not the same. In this particular library's case, I think they're talking about out-of-focus blur, not bokeh.

beat · on Nov 20, 2013

Yeah, I kinda glossed over that.

vph · on Nov 20, 2013

As someone who likes photography, I have to disagree about your reasoning. The clarity of your subject is as much about how much it is in focus as how much irrelevant things are out of focus. Portrait photos are often beautiful when the lens has good bokeh characteristics.

slantyyz · on Nov 20, 2013

>> The clarity of your subject is as much about how much it is in focus as how much irrelevant things are out of focus.

If you actually look at the sample output of the library, there's nothing out of focus, at least from the perspective of depth of field. In my opinion (and it is just an opinion), calling all de-emphasized data bokeh is a stretch at best. Blurring and de-emphasis using color and size are two different things.

>> Portrait photos are often beautiful when the lens has good bokeh characteristics.

Let's be clear -- while bokeh can enhance the beauty of a portrait, it doesn't make a portrait beautiful. Most people don't know the difference between good bokeh and bad bokeh (pwang's definition of bokeh in his response to me is very good), but they can usually identify a blurred background vs. a sharp background.

Many people tend to prefer a sharp subject against a blurred background, and that's usually enough for most people to consider a portrait beautiful even if the bokeh is quite ugly. Without getting into a long drawn out discussion of bokeh, you have to remember that there's also more to a beautiful portrait than the novelty of a blurred background.

pwang · on Nov 20, 2013

> If you actually look at the sample output of the library, there's nothing out of focus,

We are working on the semantic downsampling and perceptual integration aspects of visualizing large data. This currently lives in its own repo: https://github.com/JosephCottam/AbstractRendering

> calling all de-emphasized data bokeh is a stretch at best

It's really just meant to be an evocative metaphor... :-)

clebio · on Nov 20, 2013

If you're thinking in terms of the _spirit_ of bokeh, and understand 'semantic downsampling', you've got my vote. As someone who's tried to use D3, I like this a lot. I'll be experimenting with Bokeh soon.

pwang · on Nov 20, 2013

We have a paper that we'll be presenting at SPIE VDA 2014 in February: http://spie.org/EI/conferencedetails/visualization-data-anal...

With the 0.3 release out now, I'm focusing on hooking up the abstract rendering backend for the plot server, so just keep an eye out.

clebio · on Nov 20, 2013

Do you guys all work for Continuum? Need more hands? ;)

pwang · on Nov 20, 2013

Could always use more hands, especially in certain areas. Shoot an email to jobs@continuum.io and reference this post.

slantyyz · on Nov 20, 2013

>> It's really just meant to be an evocative metaphor... :-)

Gotcha - me not liking a name is just my own personal opinion. The library itself is interesting.

You can't please everyone all of the time. ;)

Osmium · on Nov 20, 2013

I didn't realise "bokeh" was a Japanese word. That's really cool. Seems like it uses this kanji 暈 [1] meaning "corona" or "halo" (in turn made up of the "sun" kanji, "car" kanji and "crown" radical–I'm not sure if there's a historical reason for this but it makes it easy to remember at least).

[1] http://jisho.org/kanji/details/暈

cvb386 · on Nov 21, 2013

Boke (meaning blur) is from a combination of kanji 暈 (bo) and hiragana け (ke). You'll notice that "bokeh" doesn't look like usual romaji (Japanese romanization) spelling, since it would normally be transliterated as boke. It was spelled bokeh with the extra 'h' to avoid accidental pronunciation like poke.

[1] http://www.luminous-landscape.com/essays/bokeh.shtml

Osmium · on Nov 23, 2013

Thanks for the info. I just realised too (thanks to a friend) that the kanji might originate because "corona" sounds similar to "kuruma" (meaning "car"), and that's why the "bo" kanji has the "car" kanji in it.

mkr-hn · on Nov 20, 2013

I thought bokeh was the aesthetic quality of the out of focus area, not the area itself.

Demiurge · on Nov 20, 2013

It's nice; I heard of it a while ago. But I just had a crazy thought of combining this with UTFGrid for interacting with data points, but that's probably silly :)

hogu · on Nov 20, 2013

No - that's not crazy at all.

In fact we are working on (and open sourcing) similar ideas

http://www.youtube.com/watch?v=b0-4xtFeaT8

kermatt · on Nov 20, 2013

Are mouse hover interactions (displaying the value of the selected point) on the timeline? I don't see any references to them, but otherwise this is a very interesting project.

bigreddot · on Nov 20, 2013

[Another bokeh dev chiming in] Additional tools like crosshair, data and color inspectors, box zoom, more types of selections (point, lasso, etc), and measurement tools are all planned.

jparmer · on Nov 21, 2013

Check out the Plotly APIs for hover - Here's an example: https://plot.ly/~alex/75/ https://plot.ly/api/python (Disclosure: I'm a dev at @Plotly)

rafeed · on Nov 20, 2013

This is badass. I wish there were something like this or Seaborn [1] for Matlab. Anyone know of anything similar that can make the ugly default Matlab plots turn into beauties like these?

[1] http://stanford.edu/~mwaskom/software/seaborn/index.html

hhuuggoo · on Nov 20, 2013

This doesn't directly answer your question, but in regards to seaborn, our goal is to support enough of the mpl API so we can get seaborn to work with bokeh. Either that or we would write our own ggplot interface, whichever approach ends up being easier.

ngoldbaum · on Nov 21, 2013

Yes! It would be crazy awesome if I could do 'from bokeh import pyplot' or something similar.

jparmer · on Nov 21, 2013

The Plotly MATLAB API does exactly that: https://plot.ly/api/matlab

3327 · on Nov 20, 2013

I don't know if anyone else noticed but we owe DARPA's XDATA program a thank you note too for funding this project.

pwang · on Nov 20, 2013

They've been great supporters of this effort, as well as Blaze[1] and Numba[2].

[1] http://blaze.pydata.org

[2] http://numba.pydata.org

Keyframe · on Nov 20, 2013

What does 'large datasets' mean here? We are building a visualization service to abstract users from fiddling with d3 and other libraries. We want users to be able to use all of the viz libraries out there by just providing data input and tweaking settings, so this looks interesting.

hhuuggoo · on Nov 20, 2013

Part of the 0.4 release is to incorporate the concept of abstract rendering - which means you render on the server, and then send the necessary information over to the client on demand. For example, if someone tries to scatter a billion points, instead of just drawing a useless point cloud, you would figure out where all the points fit inside your 512x512 canvas (or whatever size you have), figure out how all the points stack up, compute an alpha that is meaningful for that number of points, and then send the heatmap to the client.
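A toy version of that pipeline, to show the shape of the idea rather than the actual abstract rendering code: bin the points into a grid the size of the canvas, count how many land in each cell, then turn the counts into an alpha. The function names and the log scaling are assumptions made for this sketch.

```python
import math


def render_density(points, width, height, x_range, y_range):
    """Bin (x, y) points into a height x width grid of per-cell counts."""
    (x0, x1), (y0, y1) = x_range, y_range
    counts = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # outside the visible range, so it is not drawn
        # Clamp so points exactly on the upper edge fall in the last cell.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        counts[row][col] += 1
    return counts


def counts_to_alpha(counts):
    """Map counts to opacity in [0, 1]; log scaling is one possible choice."""
    peak = max(max(row) for row in counts)
    if peak == 0:
        return [[0.0] * len(row) for row in counts]
    scale = math.log1p(peak)
    return [[math.log1p(c) / scale for c in row] for row in counts]


# Tiny example: a 2x2 "canvas" over the unit square.
points = [(0.1, 0.1), (0.1, 0.2), (0.9, 0.9), (2.0, 2.0)]
counts = render_density(points, 2, 2, (0.0, 1.0), (0.0, 1.0))
alpha = counts_to_alpha(counts)
```

Only the grid goes over the wire, so the payload size depends on the canvas size and not on the number of points.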

You can easily imagine a similar approach for line plots that selectively downsamples data points in order to preserve interesting features in the plot.
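One simple way to do that for line plots, offered as an assumption and not as what will ship: split the series into buckets and keep only the minimum and maximum of each bucket, so spikes and dips survive. `downsample_minmax` is a hypothetical helper.

```python
def downsample_minmax(ys, n_buckets):
    """Keep each bucket's lowest and highest sample.

    Returns (index, value) pairs in their original order, at most
    2 * n_buckets of them when the series is longer than that.
    """
    if n_buckets <= 0:
        raise ValueError("n_buckets must be positive")
    n = len(ys)
    if n <= 2 * n_buckets:
        return list(enumerate(ys))  # already small enough, keep everything
    kept = []
    for b in range(n_buckets):
        # Integer arithmetic gives contiguous, non-empty buckets.
        start = b * n // n_buckets
        end = (b + 1) * n // n_buckets
        bucket = range(start, end)
        lo = min(bucket, key=lambda i: ys[i])
        hi = max(bucket, key=lambda i: ys[i])
        for i in sorted({lo, hi}):
            kept.append((i, ys[i]))
    return kept


# A flat series with one spike and one dip.
series = [0, 0, 0, 9, 0, 0, 0, 0, -5, 0, 0, 0]
reduced = downsample_minmax(series, 2)
```

Plain every-Nth-point decimation would be cheaper but could skip the spike and the dip entirely, which is exactly the kind of feature you want to preserve.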

And then we'll build interactors on top of that, so you can actually treat it like a scatter plot, even though it's a heatmap that's being sent to your browser.

So the answer is: 'large datasets' means as large as our abstract rendering algorithm can handle on your hardware, so those data sets should be pretty big.

Keyframe · on Nov 20, 2013

Interesting, this is for our second phase then (we're launching soon, you'll know about it). We'll definitely look into whether we can provide an interface for bokeh as well. Currently we're transforming user-provided sheets (csv etc.) into json and tying them into viz on the client side. Thanks for the answer.
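For anyone curious what that kind of csv-to-json step can look like, here is a minimal sketch using only the Python standard library. It is not our production code; `csv_to_json_columns` is a made-up name, and the column-oriented layout and the "try float, else keep the string" rule are assumptions.

```python
import csv
import io
import json


def csv_to_json_columns(csv_text):
    """Convert CSV text with a header row into a JSON object of columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            value = row[name]
            try:
                columns[name].append(float(value))
            except (TypeError, ValueError):
                # Not numeric (or missing): keep the raw value.
                columns[name].append(value)
    return json.dumps(columns)


json_text = csv_to_json_columns("name,score\nalice,3.5\nbob,4\n")
```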

rcfox · on Nov 21, 2013

I have a project involving multi-gigabyte datasets of line plot data. With your 0.4 release, will it be possible to show down-sampled subsets of these plots, with the ability to pan/zoom around and get more data on demand without having it all held in memory?

hhuuggoo · on Nov 21, 2013

Well, we'll be able to do that without sending the data to the client. I'm not sure our implementation right now will work without loading the data into memory, though long term that is definitely the plan (we will leverage http://blaze.pydata.org/).
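The server-side half of that can be sketched as follows, under loud assumptions: the series is sorted by x and, as noted, held in memory on the server; `visible_window` is a hypothetical name; and the thinning here is plain stride decimation, not the feature-preserving downsampling discussed elsewhere in the thread. Each pan or zoom would trigger a new request with a new x range.

```python
from bisect import bisect_left, bisect_right


def visible_window(xs, ys, x_min, x_max, max_points):
    """Return at most max_points (x, y) pairs with x_min <= x <= x_max.

    Assumes xs is sorted in ascending order and ys is aligned with it.
    """
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    start = bisect_left(xs, x_min)
    end = bisect_right(xs, x_max)
    count = end - start
    if count <= 0:
        return []
    step = max(1, -(-count // max_points))  # ceiling division
    return [(xs[i], ys[i]) for i in range(start, end, step)]


xs = list(range(100))
ys = [x * x for x in xs]
zoomed_out = visible_window(xs, ys, 10, 29, 5)  # 20 samples thinned to 5
zoomed_in = visible_window(xs, ys, 10, 12, 5)   # few samples, sent as-is
```

Zooming in shrinks the window, so the same point budget shows progressively finer detail.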

If you want to discuss further, please email bokeh@continuum.io

JosephCottam · on Nov 21, 2013

The Python version of Abstract Rendering would currently load it all into memory. The Java version, which is based on the same algorithms, would not. It routinely handles multi-gigabyte files and lets us know that the core algorithm can scale. We're working on getting the Python implementation to scale as well.

MasterScrat · on Nov 20, 2013

When is the 0.4 release planned? I would be really interested in this, having to visualize terabytes of data in the browser.

hhuuggoo · on Nov 20, 2013

January - but probably only with abstract rendering support for scatter plots and line plots. We'll have to roll it out incrementally, but the good thing is 90% of plots are scatters and lines =)

MasterScrat · on Nov 20, 2013

Perfect, thanks.

polskibus · on Nov 20, 2013

Has anyone used it in production? I would be very interested to hear about interoperability potential with other platforms - some json-based protocol perhaps? In D3 I can just point it to csv and do whatever I need to. Is it just as easy in BokehJS?

paddy_m · on Nov 20, 2013

We will be adding stream datasources (which allow pulling from 3rd party jsonp feeds) for bokeh 0.4. We expect to release bokeh 0.4 in early 2014.

It is incredibly easy to use bokeh from python. The burtin example in the gallery reads from CSV http://bokeh.pydata.org/plot_gallery/burtin_example.html . Scroll down a bit, and you can see the code.

I am a bokeh dev at Continuum Analytics.

taeric · on Nov 21, 2013

I realize this is nitpicky, but it seems obnoxious to call these[1] "candlesticks" when it would make much more sense to call the page box plots[2]. Wouldn't it?

[1] http://bokeh.pydata.org/plot_gallery/candlestick.html

[2] http://en.wikipedia.org/wiki/Box_plot

T-A · on Nov 21, 2013

Nah: http://en.wikipedia.org/wiki/Candlestick_chart

taeric · on Nov 21, 2013

Awesome! I fell for the superficial resemblance, clearly. :) (I even searched for the term candlestick on the wikipedia page for box plots... )

liyanage · on Nov 20, 2013

This is really awesome. I just added it to my IPython Notebook Mac app: https://github.com/liyanage/ipython-notebook/wiki

pwang · on Nov 20, 2013

Cool! Thanks!

Let us know if you ever have any problems with it.

bigreddot · on Nov 20, 2013

Just for reference, here is the actual 0.3 announcement: http://continuum.io/blog/bokeh03

asselinpaul · on Nov 20, 2013

Can someone comment on how this compares with matplotlib?

pwang · on Nov 20, 2013

In the FAQ: http://bokeh.pydata.org/faq.html

""" Q: Why did you start writing a new plotting library, instead of just extending e.g. Matplotlib?

A: There are a number of reasons why we wrote a new Python library, but they all hinge on maximizing flexibility for exploring new design spaces for achieving our long-term visualization goals. (Please see Technical Vision[1] for details about those.) """

[1] http://bokeh.pydata.org/index.html#technicalvision