
Effectively Using Matplotlib (2017) - jjwiseman
https://pbpython.com/effective-matplotlib.html
======
smabie
Matplotlib belongs to the worst category of software: very powerful and very
awful. Nothing makes any sense and it’s so profoundly unintuitive it almost
feels like I’m being pranked. But, of course, use it I must.

Pandas also comes off as an unintuitive joke, but my displeasure with it has
_mostly_ worn off. Matplotlib, however, makes me feel angry pretty much
every day.

~~~
EForEndeavour
What data-analysis package do you prefer over pandas? I've used pandas since
circa 2015, and lately have been trying to become fluent in R and the
Tidyverse, but it's been hard so far to unlearn the deeply ingrained
python/pandas patterns.

~~~
wodenokoto
Start with R for Data Science (book by Hadley Wickham, available online for
free).

The Tidyverse assumes tidy data. If you are not working with tidy data, it is
unlikely to be a big help. Most data can probably be thought of as tidy.

Remember that any and every operation on a data frame returns a data frame, so
unlike chaining in Pandas, you never have to worry if a method you want to use
belongs to a series or a data frame, or if your method is returning a series
or a data frame.

select() selects columns, filter() selects rows. This never changes, unlike
[] in pandas, which means different things depending on whether it is used on
a data frame (which you are not guaranteed to get back after calling a method
on a data frame in pandas!), on a series, or through .loc or .iloc.
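
To make the pandas side of that comparison concrete, here is a minimal sketch
of how [] and a reduction change the returned type. The frame and its column
names are made up for illustration.

```python
import pandas as pd

# Toy frame; the column names "a" and "b" are placeholders.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

col = df["a"]            # single label       -> Series (one column)
sub = df[["a"]]          # list of labels     -> DataFrame (columns)
rows = df[df["a"] > 1]   # boolean mask       -> DataFrame (rows)
sliced = df[0:2]         # slice              -> DataFrame (rows, by position)
total = df.sum()         # column-wise reduce -> Series, not a DataFrame

for name, obj in [("col", col), ("sub", sub), ("rows", rows),
                  ("sliced", sliced), ("total", total)]:
    print(name, type(obj).__name__)
```

The same bracket syntax selects columns in the first two lines and rows in the
next two, which is the inconsistency being described.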

There is no index; instead you just filter on rows.

Pandas comes with a ton of built-in utilities which the tidyverse doesn’t,
mostly because R is already full of functions you can easily apply across
columns.

But pandas' date handling functions in particular are really cool.

~~~
EForEndeavour
That's reassuring: I'm (slowly) working through Hadley Wickham's r4ds online
book, which is just fantastic. Thanks for the tidyverse tips!

------
urschrei
Matplotlib is verbose, and has had an inconsistent API in the past (v3 has
improved this a lot), but if you need to produce publication-quality figures
using Python, taking the time to get comfortable with it pays off. I've been
using it to produce maps and data visualisations for years – when I finally
figure out how to make something look good, I put the notebook on Github, both
for my own reference, and for others:
[https://github.com/urschrei/Geopython](https://github.com/urschrei/Geopython)

------
paultopia
This post is great---even just explaining the difference between figure and
axis, and the multiple systems (and the wise recommendation to use the OO
system), and all the rest is gold---that stuff took me days of beating my head
against the wall and searching through the matplotlib documentation to sort
out.
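
For anyone still at the head-against-wall stage, a minimal sketch of the
figure/axes distinction using the OO system: one Figure is the whole canvas,
and each Axes is an individual plot inside it. The data and titles are
placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# One Figure (the canvas) holding two Axes (the individual plots).
fig, (ax_left, ax_right) = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))

# Plot-level things are methods on each Axes...
ax_left.plot([0, 1, 2], [0, 1, 4])
ax_left.set_title("left")
ax_right.bar(["a", "b"], [3, 5])
ax_right.set_title("right")

# ...while canvas-level things are methods on the Figure.
fig.suptitle("One figure, two axes")
```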

Honestly, for 99% of uses Seaborn is great, so long as you remember to use the
latest version---for some reason, a lot of people seem to have 0.8.0
installed, and the api changed with 0.9.0.

For uses beyond what Seaborn can do, I think that the best strategy is just to
figure out a personal plotting language and then wrap that up into a personal
library so you never have to think about that again. That's kind of what I've
done: I threw together a library to produce some basic figures that are
suitable for printing,[1] and now I never have to think about those figures
again.
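
As a rough sketch of the "personal library" idea (this is not the API of the
library linked below; the function name and the defaults are hypothetical),
such a wrapper can be a single function that bakes in your preferred styling
and hands back the figure and axes for any further tweaks:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt


def print_line_plot(x, y, xlabel="", ylabel="", title=""):
    """Hypothetical helper: a plain black line plot with fixed defaults."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(x, y, color="black", linewidth=1.5)
    # Personal styling decisions live here, once, instead of in every script.
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    return fig, ax


fig, ax = print_line_plot([0, 1, 2], [1, 3, 2], xlabel="t", ylabel="value")
```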

[1]
[https://github.com/paultopia/plottyprint](https://github.com/paultopia/plottyprint)

------
blub
Matplotlib is one of the most user-unfriendly libraries I've had the
displeasure to use. The most effective thing to do is to not use it at all.

If you can get away with it use pandas' plot, seaborn, altair, etc.

~~~
nmca
Everyone I know at $big_research_lab uses it and hates it. But what are the
alternatives really? Altair plots are larger than your dataset, plotnine has
totally unusable docs and _insane_ behaviour in some places (look up what
plotnine's gg.ylims does; I'd bet it's caused more than one peer-reviewed
error), and basic operations like drawing a vline also seem to slow down with
the size of your data. Plotly is commercial, raw d3 is inconvenient from
Python. The situation is deeply unfortunate.

~~~
SJetKaran
what is wrong with plotnine's ylim?

~~~
nmca
It clips your data, changing the behaviour of smoothers etc. IIRC,
gg.cartesian(ylim=...) is the thing that people usually want.
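
The difference is easy to sketch without plotnine at all. Below, plain NumPy
stands in for the two behaviours: dropping out-of-range points before fitting
(what a clipping limit amounts to) versus fitting everything and only
restricting the view afterwards. The data, the limit of 25 and the straight-
line "smoother" are all made up for illustration.

```python
import numpy as np

x = np.arange(10, dtype=float)
y = x ** 2                     # values run from 0 to 81
y_limit = 25.0                 # hypothetical upper y limit

# Clipping limit: points above the limit are discarded *before* the fit.
keep = y <= y_limit
slope_clipped, _ = np.polyfit(x[keep], y[keep], 1)

# Zoom-only limit: the fit sees all the data; the limit affects the view only.
slope_all, _ = np.polyfit(x, y, 1)

print(f"slope with clipped data: {slope_clipped:.2f}")
print(f"slope with all data:     {slope_all:.2f}")
```

The two fitted slopes differ, even though a reader of the final plot would see
the same y range in both cases.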

------
jammygit
I agree with the basic premise: matplotlib is sort of lousy to use, and
annoying to learn, but it works and does everything you might need. There’s
something to be said for software that solves a problem.

~~~
rrosen326
I'd state the premise a bit differently: matplotlib is tricky to learn, but
once you figure out how it works it's not bad.

I just went through this process myself
([http://kachess.k2company.com](http://kachess.k2company.com)) and this
outline would have been SO helpful. I learned these points the slow, hard way.

(And while it doesn't suck, it's certainly not fun and intuitive.)

------
knolan
I’m a long-term Matlab user and I’ve been using Matplotlib more and more
recently. This is partially out of frustration with recent changes to Matlab
graphics and also a desire to use more open source tools.

Matlab plotting is extremely powerful and versatile. Sometimes the output
could be nicer but the interactive figure hierarchy is great. Matplotlib on
the other hand is, at least to me, a lot more clunky to work with. But it gets
the job done and the output often looks nicer and solves my gripes with
Matlab.

~~~
sgillen
I'm in a similar boat. I actually like Matlab's plotting a lot more than
Matplotlib, but I use python for everything these days. So for me I use
Matplotlib for really quick visualizations (usually just plt.plot()), and if I
want to do anything more fancy I'll dump my data into Matlab.

------
jackbrookes
What is the state of the art in Python data visualisation compared to ggplot2
in R? Over the last few years I have gone exclusively with ggplot2 because it
seems so intuitive and customisable.

~~~
claytonjy
Altair is getting pretty good. It's a bit like ggplot in that it's
declarative, though I wouldn't suggest dropping R for this any time soon.

[https://altair-viz.github.io/](https://altair-viz.github.io/)

~~~
koningrobot
Another option is to use plotnine, which is intended to be a ggplot2 clone. It
uses matplotlib under the hood, so if something isn't right you can tweak it.
That was the main drawback I found with Altair: your declarative code is
almost literally dumped to a json file and then rendered by a process external
to Python, so good luck tweaking your plot.

[https://github.com/has2k1/plotnine](https://github.com/has2k1/plotnine)

------
kzrdude
Another crucial tip, if you do a lot of custom drawing, is to use collections
instead of calling draw functions per object. This radically speeds up
drawing. For example, use PolyCollection to draw a big bunch of polygons, and
likewise LineCollection, EllipseCollection etc.
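
A minimal sketch of the idea with LineCollection, assuming random segments as
stand-in data: all 1000 segments become a single artist rather than 1000
separate plot calls.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

rng = np.random.default_rng(0)
# 1000 segments, each a pair of (x, y) endpoints: shape (1000, 2, 2).
segments = rng.random((1000, 2, 2))

fig, ax = plt.subplots()
# One artist for every segment, instead of calling ax.plot() in a loop.
lines = LineCollection(segments, linewidths=0.5, colors="tab:blue")
ax.add_collection(lines)
ax.autoscale_view()  # make sure the view limits cover the collection
```

PolyCollection and EllipseCollection follow the same pattern: build the
collection once, then add it with ax.add_collection().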

------
jbay808
I've learned to love matplotlib and its OO interface.

I just wish that its documentation examples would consistently provide the OO
interface version of how to achieve each example, at least alongside the
state-machine version.

It's always frustrating to see an example image that shows exactly what I want
to achieve, and then click on the code for it and it's using the other
interface, and I have to try to guess the equivalent OO commands. Which are
always slightly different, like set_ylabel instead of ylabel...
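
For reference, a small side-by-side sketch of the two interfaces drawing the
same plot, with placeholder data. The pyplot functions act on the "current"
axes, while the OO methods mostly take a set_ prefix.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

x, y = [0, 1, 2], [0, 1, 4]

# State-machine (pyplot) style: every call targets the current axes.
plt.figure()
plt.plot(x, y)
plt.xlabel("x")
plt.ylabel("y")
plt.title("pyplot style")
plt.xlim(0, 2)
pyplot_ax = plt.gca()  # grab the implicit axes for comparison

# OO style: the same plot with explicit objects and set_* method names.
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("OO style")
ax.set_xlim(0, 2)
```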

------
thewhitetulip
Isn't Seaborn famous for this very reason? matplotlib is a bit difficult to
write code in but seaborn makes it easy.

~~~
denhaus
Seaborn is nice for making “standard” plots, but if you need to customize your
graphics somewhat, you’ll find yourself needing to use MPL in addition to
Seaborn anyway.

------
throwawayhhakdl
I don’t mind matplotlib but I highly recommend trying to use seaborn over it
for anything.

Specifically seaborn’s catplot (for categorical), lmplot, swarmplot, pairgrid,
and facetgrid.

The seaborn gallery really has an extra level of expressiveness that you
might not have considered as an amateur visualizer, and you can make some very
nice things.

Matplotlib runs underneath it, so you’ll need to learn all of the adjustment
functions: lim, figsize, ticks, etc. But I think it’s fine overall.

Charts are hard because there’s more depth than people realize and if the
library wasn’t deep you’d be unable to express that depth.

