

Ggplot2 for python - martingoodson
https://github.com/yhat/ggplot/

======
Osmium
This is really exciting! I've always liked ggplot2 but sadly struggled with R
syntax (the only time I ever used R was for ggplot2, and I found it really
confusing).

For anyone reading this, I don't think this is intended to replace e.g.
matplotlib which I suspect is more powerful overall. It's more for creating
good looking, publication-quality graphs. A lot of thought has gone into
ggplot2's default theme. A lot of influence from Tufte et al. if I remember
rightly. Though, sadly, it looks like this port doesn't replicate ggplot2's
graphs pixel-for-pixel, which is a shame.

Edit: Incidentally if anyone has a recommendation for a good graphing app for
OS X, I'd be all ears. I still haven't found a good one. I usually use Plot
([http://plot.micw.eu](http://plot.micw.eu)) which is great (scriptable,
produces good looking graphs) but has a bit of a clunky interface. I
personally find DataGraph's interface horrendous, even though its graphs are
good, and Excel takes forever to make anything remotely decent. matplotlib and
its ilk are fine but require custom scripting. I've been using xmgrace
recently simply because it launches quickly, but it's (understandably) not
retina and not native and just a pain really. What I'd give for a nice
graphing app...

~~~
mbq
Please don't blame R for the ggplot syntax -- it is bizarre on its own. Also,
ggplot requires a very specific form of input, and thus many scripts must be
heavily polluted with nontrivial preprocessing routines.

~~~
hadley
Your pollution is my tidy-ness:
[http://vita.had.co.nz/papers/tidy-data.html](http://vita.had.co.nz/papers/tidy-data.html).
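
To make the disagreement concrete, here is a minimal sketch of the kind of preprocessing being discussed: reshaping a "wide" table (one column per measurement) into the "long", tidy form (one row per observation). The `melt` helper and the temperature data are made up for illustration; pandas ships a `melt` function that does the same job on data frames.

```python
def melt(rows, id_vars, var_name="variable", value_name="value"):
    """Reshape wide rows (a list of dicts) into long, tidy rows."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_vars:
                continue  # identifier columns are copied, not melted
            tidy = {key: row[key] for key in id_vars}
            tidy[var_name] = column
            tidy[value_name] = value
            long_rows.append(tidy)
    return long_rows


# Hypothetical wide data: one column per month.
wide = [
    {"city": "Oslo", "jan": -4, "jul": 17},
    {"city": "Lima", "jan": 23, "jul": 17},
]

# Tidy data: one row per (city, month) observation.
long = melt(wide, id_vars=["city"], var_name="month", value_name="temp_c")
for row in long:
    print(row)
```

In the long form, "month" is an ordinary column, so it can be mapped to an axis, a color, or a facet like any other variable.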

~~~
joebo
hadley, I couldn't pass up an opportunity to say thank you for ggplot2 and
plyr. Both have made R infinitely more useful to me and have been used
thousands of times by me over the past 3 years.

~~~
hadley
You're very welcome!

------
rcarmo
This is amazing.

FYI, this works out of the box with Mac OS X Lion's system Python and the
gfortran compiler from homebrew - that's the only thing I need to install
separately (Xcode and brew are required to install that, of course, but most
people doing any sort of coding seem to have them anyway).

Also, a tip for those of you wanting to do the same:

I'm running IPython notebook, ggplot and all required packages under my user
account without any hassles whatsoever thanks to this:

    $ cat ~/.pydistutils.cfg
    [install]
    install_lib = ~/Library/Python/$py_version_short/site-packages
    install_scripts = ~/Library/Python/$py_version_short/bin
    install_data = ~/Library/Python/$py_version_short/share

Just set $PATH to include the install_scripts directory, run pip, and you're all set.
Everything just works, without the need to install another Python interpreter
or mess about with system components.

(edited for formatting)
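
For anyone unsure which directory has to go on $PATH, the following is a small sketch that expands the three paths from the config above for the running interpreter. It assumes distutils substitutes `$py_version_short` with the major.minor version (for example "2.7"); the `user_install_dirs` helper is a hypothetical name, not part of any library.

```python
import os
import sys


def user_install_dirs(home="~", version_info=sys.version_info):
    """Expand the three per-user paths from the config for one interpreter."""
    # Assumption: $py_version_short means "major.minor", e.g. "2.7".
    py_version_short = "{}.{}".format(version_info[0], version_info[1])
    base = os.path.join(
        os.path.expanduser(home), "Library", "Python", py_version_short
    )
    return {
        "install_lib": os.path.join(base, "site-packages"),
        "install_scripts": os.path.join(base, "bin"),
        "install_data": os.path.join(base, "share"),
    }


dirs = user_install_dirs()
path_entries = os.environ.get("PATH", "").split(os.pathsep)
on_path = dirs["install_scripts"] in path_entries
print(dirs["install_scripts"], "(on PATH)" if on_path else "(not on PATH)")
```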

------
hatmatrix
> I've tried other libraries like Bockah and d3py but what I really want is
> ggplot2.

I wonder if that's meant to say "bokeh"... and I would be interested to know
the major differences. This library might be better than bokeh if there are no
dependencies on CoffeeScript and such. But I wonder in what way this is
unpythonic, except perhaps in its imitation of R syntax. But ggplot2 is also not
very much like the rest of R.

~~~
paddy_m
I'm a Bokeh developer. We have plans to make the CoffeeScript situation a lot
easier to deal with; this fix should be coming soon.

------
dkroy
Is there a reason that the dependencies are not installed via the pip call for
ggplot? I ask this out of ignorance, not trying to be facetious.

~~~
gh02t
At times numpy/scipy/matplotlib have proven pretty troublesome to install with
pip. I think they have it working now, but it's still not 100% reliable.

~~~
jofer
The main problem is that they have system library dependencies that pip
doesn't handle.

~~~
andrewryno
I think all you need is build-essential and python-dev.

~~~
jofer
Matplotlib depends on freetype, libpng, and libjpeg as well. Numpy can be
built without any external dependencies, but it will be a very slow
installation. You're better off with an accelerated BLAS library (e.g.
ATLAS/MKL/etc) and some sort of LAPACK library. Some of numpy's functionality
also needs a fortran compiler to build. At any rate, there are other system
dependencies beyond a basic compiler and the header files for python.
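
A rough way to see which of these system libraries are visible on a given machine is sketched below, using only the standard library. The library names in `CANDIDATES` are assumptions and can differ between platforms. Finding the shared library also does not prove that the headers needed for a source build are installed, so treat the result as a hint rather than a guarantee.

```python
from ctypes.util import find_library

# Assumed short names for the libraries mentioned above; adjust per platform.
CANDIDATES = ["freetype", "png", "jpeg", "blas", "lapack"]


def locate_libraries(names=CANDIDATES):
    """Map each library name to what the system linker finds, or None."""
    return {name: find_library(name) for name in names}


for name, location in locate_libraries().items():
    print("{:10s} {}".format(name, location or "not found"))
```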

~~~
andrewryno
Ah, I think the reason I didn't realize I needed BLAS/LAPACK is that I
install R on the server first, which installs those as shared libraries.

------
marrone12
I'm really excited for this. I might finally switch to python now!

~~~
dlib
Indeed, there's been some incredible work put into the Python data analysis
toolkit. Pandas is very impressive. However, the lack of GGplot was holding me
back from switching, but this and improvements to other graphing libraries
really make me want to use Python in my next projects.

~~~
rprospero
As someone who does all my data analysis in Python, I'm curious what it is
about GGplot that made its absence a deal breaker for you? It's obviously
quite powerful, considering all the excitement this port has generated, but I
haven't seen anything explain what it provides over other plotting solutions.

~~~
has2k1
You need to understand what ggplot is all about, i.e., a structured/formulaic
way to think about and build graphs. Explaining it cannot possibly do it
justice. You have to dive into it, even a little bit, and the way you see
graphs and plots totally changes; you immediately buy into it and would never
wish to go back.

I haven't really said anything to give you a sense of comparison. It's kind of
like PG's essay
([http://www.paulgraham.com/avg.html](http://www.paulgraham.com/avg.html))
about how the merits of higher level programming (specifically in Lisp) can
hardly be explained to those who have not been exposed to it.

~~~
rprospero
The merits of higher level programming are hard to explain to a Blub
programmer, but the differences are trivial to explain. I can tell a C
programmer that Lisp has automatic memory management, higher order functions,
macros, and atoms. She may not know why those matter, but I can tell her they
exist. I can tell a Python programmer that Haskell has pure functions, Monads,
Arrows, and an advanced type system. He'll think those will make his life
harder, but he'll know they're there.

What makes GGplot different? I know that I won't understand why this
difference matters until I've played with it, but it will at least tell me
where to start playing.

~~~
peatmoss
Here's another stab at why ggplot2 is a Good Thing. In the easiest plotting
case, there is a function that builds exactly the visualization that you're
looking for. This is great as long as you don't need to do anything that
deviates from the normal set of bar plots, histograms, line graphs, pie charts
(shudder), etc.

But let's say you need something a little different. Maybe you want to add
additional dimensions to that dot plot. Maybe you want the dots' size or color
to be mapped to different aspects of your data. In a world without ggplot2 (or
more generally the grammar of graphics), you're pretty much stuck going to a
very primitive drawing system in which you're specifying the virtual path of a
pen, or working with basic geometric figures.

The grammar of graphics and ggplot2 occupy a sweet middle ground between being
able to simply pick an off-the-shelf visualization, and needing to draw the
whole damn works manually. And because the grammar really is consistent, you
can also play with different facets of your data and build completely
different charts to see which is better at presenting your thesis.
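
That middle ground can be sketched with a toy model. This is not ggplot2's API or the Python port's: `Plot`, `Layer`, and `marks` are hypothetical names, and the rows are illustrative. The point is only that a plot is data plus a mapping from aesthetics (x, y, color, size) to columns, with geometric layers added on top.

```python
class Layer:
    """A geometric object (points, lines, ...) used to draw the mapped data."""

    def __init__(self, geom):
        self.geom = geom


class Plot:
    """Data plus a mapping from aesthetics (x, y, color, size) to columns."""

    def __init__(self, data, **aes):
        self.data = data
        self.aes = aes
        self.layers = []

    def __add__(self, layer):
        # Adding a layer returns a new plot, so a partial plot can be reused.
        new = Plot(self.data, **self.aes)
        new.layers = self.layers + [layer]
        return new

    def marks(self):
        """Resolve every row into the visual properties each layer would draw."""
        return [
            dict(geom=layer.geom,
                 **{aes: row[col] for aes, col in self.aes.items()})
            for layer in self.layers
            for row in self.data
        ]


# Illustrative rows: weight, fuel economy, cylinders, horsepower.
cars = [
    {"wt": 2.6, "mpg": 21.0, "cyl": 6, "hp": 110},
    {"wt": 2.3, "mpg": 22.8, "cyl": 4, "hp": 93},
]

# Map four columns to four aesthetics, then choose how to draw them.
base = Plot(cars, x="wt", y="mpg", color="cyl", size="hp")
scatter = base + Layer("point")
both = scatter + Layer("line")

print(scatter.marks()[0])
```

Swapping `Layer("point")` for another geom, or mapping `color` to a different column, changes the chart without touching the data or the rest of the specification.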

In short, ggplot2 rocks, Hadley/Leland rock, and a port of ggplot2 to Python
is nothing but good news for the Python community.

------
Blahah
ggplot is a beautiful thing - excellent job porting it to python. I only wish
something similar were available for ruby.

------
Fomite
Having written all my simulation code in Python, and having done all the
visualization for said simulations in R mainly for ggplot2 graphs, this is
_very_ exciting.

------
marcosscriven
This is fantastic - I remember the pain of sorting out R issues for some
finance analysts who were wedded to ggplot. So much better to be able to do it
in Python.

------
washedup
Great stuff. Thanks a ton!

------
sonabinu
Awesome!

