For anyone reading this, I don't think this is intended to replace e.g. matplotlib, which I suspect is more powerful overall. It's more for creating good-looking, publication-quality graphs. A lot of thought has gone into ggplot2's default theme, with a lot of influence from Tufte et al. if I remember rightly. Sadly, it looks like this port doesn't replicate ggplot2's graphs pixel-for-pixel.
Edit: Incidentally if anyone has a recommendation for a good graphing app for OS X, I'd be all ears. I still haven't found a good one. I usually use Plot (http://plot.micw.eu) which is great (scriptable, produces good looking graphs) but has a bit of a clunky interface. I personally find DataGraph's interface horrendous, even though its graphs are good, and Excel takes forever to make anything remotely decent. matplotlib and its ilk are fine but require custom scripting. I've been using xmgrace recently simply because it launches quickly, but it's (understandably) not retina and not native and just a pain really. What I'd give for a nice graphing app...
The grammar of graphics approach attempts to identify the components of a graphic so we can specify them in a high level, abstract manner. This is also why the syntax is so unusual compared to most plotting libraries. (Grammar of graphics also deeply influenced protovis and D3.js.)
I've found learning the basics of ggplot2 to be a great investment in my productivity at data analysis. Basically you get to say "make a plot; x-axis is page views and y-axis is time and make it a bar plot with bars grouped and colored by user age" and you get something that looks great.
(edited to fix minor autocomplete typo.)
I started with the idea that graphs should be built in an object-oriented fashion out of widgets. The widgets are arranged in a tree: a page holds a graph or a grid of graphs, a graph has axes and plotting widgets, and so on. Each widget has a set of properties and formatting settings which can be changed. It has since gained more data manipulation abilities. Data is stored in named datasets, which can be linked to external files, entered manually, or even captured from sockets or external programs. You can easily write a plugin to load your own file format. There is a set of data manipulation plugins for filtering and so on (again, you can write your own). When plotting, you can just enter expressions to investigate your data.
Most of Veusz is written in Python with PyQt. There's a bit of C++ code to handle the inner loops, but it's pretty responsive. The next release should support Python 3, too.
One nice thing is that the saved files are simply python commands to regenerate the plot, so it's easy for the user to automate making plots. You can also use it as a python module for plotting, using the same command interface as the saved plots use.
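To make the widget-tree idea concrete, here is a minimal sketch in plain Python. It is not Veusz's code and does not use its API: the `Widget` class, the `to_commands` method and the `Add`/`To`/`Set` command strings are all assumptions made up for illustration. It only shows how a tree of widgets with settings can be flattened into a replayable list of commands, which is the idea behind a saved file being a script.

```python
from dataclasses import dataclass, field


@dataclass
class Widget:
    """One node in the plot tree: a page, graph, axis or plotter (hypothetical)."""
    kind: str
    name: str
    settings: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def add(self, kind, name, **settings):
        """Create a child widget, attach it, and return it."""
        child = Widget(kind, name, settings)
        self.children.append(child)
        return child

    def to_commands(self, parent="/"):
        """Yield text commands that would rebuild this subtree under `parent`."""
        yield f"To({parent!r})"
        yield f"Add({self.kind!r}, name={self.name!r})"
        here = parent.rstrip("/") + "/" + self.name
        for key, value in self.settings.items():
            setting_path = f"{here}/{key}"
            yield f"Set({setting_path!r}, {value!r})"
        for child in self.children:
            yield from child.to_commands(here)


# Build a small tree: one page, one graph, two axes and one x-y plotter.
page = Widget("page", "page1")
graph = page.add("graph", "graph1")
graph.add("axis", "x", label="time")
graph.add("axis", "y", label="counts", direction="vertical")
graph.add("xy", "data1", xData="t", yData="n")

script = list(page.to_commands())
print("\n".join(script))
```

Because the output is just a list of commands, the same interface can serve both as a file format and as a scripting interface, which is the design point made above.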
At that time QtiPlot was very actively developed by one guy who was living off license fees for his GPL'ed program. I was trying a similar thing then (with a program called fityk) and emailed him to ask how he was doing. In my case, it was an interesting experience, but after a few months and depleted savings I had to find a day job. He survived much longer with a more popular program, but I suppose he also has a day job now.
Quite a nice project, though!
Obviously the situation is the opposite when one considers, let's say, "analysts", i.e. people who expect to be able to completely ignore the form of the data and focus purely on the content -- for them your stack of tools is pure gold, even with its trade-offs.
Some clarification about what you mean regarding efficiency would be useful - do you prefer arrays?
Maybe I'm alone in that, but I believe the power of R lies in its resilience and the fact that it allows me to write code that is as obvious in what it does as in how it does it. The philosophies of one-true-data-structure and natural-language-like expressions don't play well with that, though I admit they are very useful for people accustomed to declarative programming.
EDIT: oops, posted to the wrong person
Also, that seems like a lot of dependencies for ggplot2. I'm wondering if this is really worth switching away from matplotlib for. Sounds like all it's really doing is changing the method that you call?
It is my second choice language these days after python, and the only one to use when you need a certain library.
FYI, this works out of the box with Mac OS X Lion's system Python and the gfortran compiler from homebrew - that's the only thing I need to install separately (Xcode and brew are required to install that, of course, but most people doing any sort of coding seem to have them anyway).
Also, a tip for those of you wanting to do the same:
I'm running IPython notebook, ggplot and all required packages under my user account without any hassles whatsoever thanks to this:
$ cat ~/.pydistutils.cfg
[install]
install_lib = ~/Library/Python/$py_version_short/site-packages
install_scripts = ~/Library/Python/$py_version_short/bin
install_data = ~/Library/Python/$py_version_short/share
Just set $PATH to include the install_scripts, run pip, and you're all set. Everything just works, without the need to install another Python interpreter or mess about with system components.
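If it helps to see what those settings resolve to, here is a small sketch that only builds the strings and touches nothing on disk. The helper names `user_install_dirs` and `path_with_scripts` and the home directory `/Users/me` are made up for illustration; the one thing assumed about distutils is that `$py_version_short` expands to the major.minor version (e.g. `2.7`).

```python
import posixpath
import sys


def user_install_dirs(home, version_info=None):
    """Return the per-user directories the config above would point at.

    Assumption: $py_version_short expands to "MAJOR.MINOR".
    """
    major, minor = (version_info or sys.version_info)[:2]
    base = posixpath.join(home, "Library", "Python", f"{major}.{minor}")
    return {
        "install_lib": posixpath.join(base, "site-packages"),
        "install_scripts": posixpath.join(base, "bin"),
        "install_data": posixpath.join(base, "share"),
    }


def path_with_scripts(current_path, scripts_dir):
    """Prepend scripts_dir to a colon-separated PATH unless already listed."""
    parts = [p for p in current_path.split(":") if p]
    if scripts_dir in parts:
        return ":".join(parts)
    return ":".join([scripts_dir] + parts)


# Hypothetical home directory and the Python version shipped with Lion.
dirs = user_install_dirs("/Users/me", version_info=(2, 7))
new_path = path_with_scripts("/usr/bin:/bin", dirs["install_scripts"])
print(new_path)
```

The value of `new_path` is what you would export as `$PATH` in your shell profile so that scripts installed by pip are found.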
(edited for formatting)
I wonder if that's meant to say "bokeh"... and I would be interested to know the major differences. This library might be better than bokeh if there are no dependencies on CoffeeScript and such. But I wonder in what way this is un-pythonic, except perhaps in its imitation of R syntax. Then again, ggplot2 is not very much like the rest of R either.
I haven't really said anything to give you a sense of comparison. It's kind of like PG's essay (http://www.paulgraham.com/avg.html) about how the merits of higher-level programming (specifically in Lisp) can hardly be explained to those who have not been exposed to it.
What makes ggplot different? I know that I won't understand why this difference matters until I've played with it, but an answer would at least tell me where to start playing.
But let's say you need something a little different. Maybe you want to add additional dimensions to that dot plot. Maybe you want the dots' size or color to be mapped to different aspects of your data. In a world without ggplot2 (or more generally the grammar of graphics), you're pretty much stuck going to a very primitive drawing system in which you're specifying the virtual path of a pen, or working with basic geometric figures.
The grammar of graphics and ggplot2 occupy a sweet middle ground between being able to simply pick an off-the-shelf visualization, and needing to draw the whole damn works manually. And because the grammar really is consistent, you can also play with different facets of your data and build completely different charts to see which is better at presenting your thesis.
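As a rough illustration of what "mapping dot size or color to aspects of your data" means, here is a toy sketch in plain Python. It is not ggplot2's implementation or the port's API; `linear_scale`, `discrete_scale`, the sample rows and the palette are all made up. The point is only that you declare which column drives which visual property, and a scale turns data values into sizes or colors for you.

```python
def linear_scale(values, out_min, out_max):
    """Map numbers linearly onto [out_min, out_max] (e.g. point sizes)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: use the middle of the output range.
        return [(out_min + out_max) / 2 for _ in values]
    span = hi - lo
    return [out_min + (v - lo) / span * (out_max - out_min) for v in values]


def discrete_scale(values, palette):
    """Assign each distinct category a color, cycling through the palette."""
    levels = sorted(set(values))
    lookup = {level: palette[i % len(palette)] for i, level in enumerate(levels)}
    return [lookup[v] for v in values]


# Made-up data: each row is one dot.
rows = [
    {"x": 1, "y": 2.0, "weight": 10, "group": "a"},
    {"x": 2, "y": 3.5, "weight": 30, "group": "b"},
    {"x": 3, "y": 1.0, "weight": 20, "group": "a"},
]

# The declarative part: which column drives which visual property.
mapping = {"size": "weight", "color": "group"}

sizes = linear_scale([r[mapping["size"]] for r in rows], 2.0, 10.0)
colors = discrete_scale([r[mapping["color"]] for r in rows], ["red", "blue"])
print(sizes, colors)
```

Switching the size to another column means changing one entry in `mapping`; nothing about the drawing has to be rewritten.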
In short, ggplot2 rocks, Hadley/Leland rock, and a port of ggplot2 to Python is nothing but good news for the Python community.
"a structured/formulaic way to think about and build graphs"
This is what plotting using the "grammar" provides. Each plot is like a sentence: a sentence can be composed from some of the following - verb, noun, adverb, adjective, interjection, pronoun, preposition, conjunction. Similarly, ggplot2 allows you to think of a graph in terms of layers, and each layer can have different components that describe the "geometry" or "statistics" of the plot. Plus, there is a lot more.
You could probably describe your favourite plotting package as structured and formulaic, but an experience with ggplot2 would convince you otherwise.
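For readers who want the sentence analogy spelled out, here is a toy sketch of a plot as a base object plus layers joined with `+`. None of this is ggplot2's or the Python port's actual API: the `Plot` and `Layer` classes, their fields and the sample data are assumptions for illustration, and nothing is drawn. It only shows the structure: shared aesthetic mappings, then layers that each name a geometry and a statistic.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Layer:
    """One layer: what to draw (geom) and how to summarise the data (stat)."""
    geom: str
    stat: str = "identity"


@dataclass(frozen=True)
class Plot:
    """Data, aesthetic mappings shared by all layers, and the layers so far."""
    data: tuple
    aes: dict = field(default_factory=dict)
    layers: tuple = ()

    def __add__(self, layer):
        if not isinstance(layer, Layer):
            return NotImplemented
        # Return a new plot so the base can be reused for other charts.
        return replace(self, layers=self.layers + (layer,))

    def describe(self):
        mapping = ", ".join(f"{k}={v}" for k, v in self.aes.items())
        parts = [f"plot({mapping})"]
        parts += [f"{layer.geom}[stat={layer.stat}]" for layer in self.layers]
        return " + ".join(parts)


# Made-up data: page views, time on page and user age group.
rows = (
    {"views": 120, "time": 3.2, "age": "18-25"},
    {"views": 80, "time": 4.1, "age": "26-40"},
)
base = Plot(rows, aes={"x": "views", "y": "time", "fill": "age"})

bars = base + Layer("bar")
trend = base + Layer("point") + Layer("line", stat="smooth")
print(bars.describe())
print(trend.describe())
```

The same `base` yields two quite different charts by swapping layers, which is roughly what the "consistent grammar" buys you.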