
Comprehensive Guide on Data Visualization with Pandas - min2bro
https://kanoki.org/2019/09/16/dataframe-visualization-with-pandas-plot/
======
lawlorino
I really am not sure how "comprehensive" I would call this. After glancing
over it, it looks like one of the million existing basic pandas
plotting guides.

~~~
abhishekjha
What should one do if after following those million guides it still doesn’t
stick? I always end up googling what I want and hit SO. Is there something
wrong with me or do numpy and pandas seem more difficult than they should
be?

~~~
lawlorino
I really wouldn't worry about it - I've been using Python for data work for
the last 5 years or so and I have to look stuff up all the time. Eventually the
basic stuff sticks, but it's like any kind of coding: I don't think anyone ever
hits the point where they hardly ever have to Google stuff.

One useful tip I can suggest is to create a repo for useful code snippets, so
if you ever find yourself doing something new that you think you might need
again, just spend a bit of time commenting and describing it and add it to the
repo. That way instead of having to spend time searching you'll hopefully
remember doing it before and be able to find it easily.

~~~
Foivos
After googling the same things every other week, I started using a snippets
application (in my case it is SnippetsLab) where I wrote a nice description
and keywords for every snippet. Life is so much easier that way.

------
whitehouse3
Lately, `pip install pandas` is my first step after making a new virtual
environment. Its read_sql and read_csv methods are magic. The resulting
DataFrames are just like DataTables in C#. And for complex joins and
aggregations, I can DataFrame.to_sql into an in-memory SQLite database.
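
For anyone who hasn't tried that round trip, here is a minimal sketch of it. The CSV contents, the `orders` table name, and the column names are made up for illustration; only `read_csv`, `DataFrame.to_sql`, and `read_sql` come from the workflow described above.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
csv_text = """region,product,units
north,widget,10
north,gadget,4
south,widget,7
south,gadget,12
"""

# read_csv accepts any file-like object, not just a path.
orders = pd.read_csv(io.StringIO(csv_text))

# Load the DataFrame into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
orders.to_sql("orders", conn, index=False)

# Do the aggregation in SQL and read the result back as a DataFrame.
totals = pd.read_sql(
    "SELECT region, SUM(units) AS total_units "
    "FROM orders GROUP BY region ORDER BY region",
    conn,
)
print(totals)
conn.close()
```

The same aggregation could be written with `groupby`, so this is mostly useful when the join or aggregation is easier to express in SQL.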

Pandas feels like the wrong tool for this job. I don't use multi-indexes or
any statistical methods. I don't chart anything.

But it's so darn convenient. If the time comes to optimize I can `import csv`
directly and improve performance. But nothing beats it for prototyping.

Are there better options in this space?

~~~
maest
Genuine question - would you be willing to spend money for a better version of
pandas?

Better in some generic sense of lighter, faster, better API.

I share your implied concern that pandas can be quite large and I personally
disagree with a lot of the design decisions when it comes to the pandas API,
but building an alternative tool would be a full time job. Unfortunately,
there is no mechanism to support Python library developers and the expectation
is for Python libraries to be free.

I'm curious how many people would be ok paying for a Python library.

~~~
_coveredInBees
I think that would be an uphill battle and very hard to succeed financially. I
agree with you regarding the API being a mess, but pandas is so heavily
entrenched in the data science space (in Python land) that it is almost
impossible for a free replacement to take over, let alone a paid library.

~~~
whitehouse3
And pandas is valuable particularly because it has so many users. I'd worry
that a paid product would stagnate over time, or change to meet the needs of
its largest customers, leaving me behind.

I go out of my way to support open source projects. Closed source would be a
much harder sell.

------
Dowwie
Other graphing libs for jupyter notebooks include:

    
    
      - Bokeh
      - Plotly
      - Seaborn
    

These libraries were built to improve upon matplotlib or each other, weren't
they? Yet, people continue to reach for "the original" ¯\_(ツ)_/¯

~~~
lazzlazzlazz
Bokeh and Seaborn are very, very thin skins over matplotlib, and honestly
don't improve the user experience at all. Only Altair has changed things for
me.

~~~
wodenokoto
Bokeh's rendering backend is BokehJS, not matplotlib:

[https://bokeh.pydata.org/en/latest/docs/dev_guide/setup.html](https://bokeh.pydata.org/en/latest/docs/dev_guide/setup.html)

------
psv1
The matlab plotting syntax got transferred to Python through matplotlib and
got very deeply ingrained - was first, got popular, built into pandas and
statsmodels, foundation of seaborn etc. Recently I saw a snippet of Julia code
that uses "pyplot", I assume because people find it familiar and convenient.
That API just refuses to die.
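
To make the MATLAB lineage concrete, here is a rough sketch of the same plot written twice: once in the pyplot "current figure" style and once in matplotlib's object-oriented style. The data and labels are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

# MATLAB-style: every call acts on an implicit "current" figure and axes.
plt.figure()
plt.plot(x, y, "r--")  # MATLAB-like format string: red dashed line
plt.xlabel("x")
plt.ylabel("x squared")
plt.title("pyplot style")

# Object-oriented style: the figure and axes are explicit objects.
fig, ax = plt.subplots()
ax.plot(x, y, "r--")
ax.set_xlabel("x")
ax.set_ylabel("x squared")
ax.set_title("object-oriented style")
```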

------
psv1
The pandas wrappers around matplotlib are convenient but for anything that
needs customising, you'll need to reach for the full matplotlib API anyway.

~~~
min2bro
Most things like ticks and limits are covered, which goes beyond the basics. But
if you are looking for annotations or animations, then you do need to code
against matplotlib.
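
A small sketch of where that line falls, assuming a made-up `year`/`score` dataset: ticks and limits go in as keyword arguments to `DataFrame.plot`, while the annotation is added through the matplotlib `Axes` object that the call returns.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display

import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "year": [2015, 2016, 2017, 2018, 2019],
    "score": [5.1, 5.4, 5.2, 5.9, 6.3],
})

# Ticks and limits can be passed straight to DataFrame.plot ...
ax = df.plot(x="year", y="score", xticks=df["year"].tolist(),
             ylim=(4, 7), legend=False)

# ... but annotations go through the matplotlib Axes it returns.
peak = df.loc[df["score"].idxmax()]
ax.annotate("peak", xy=(peak["year"], peak["score"]),
            xytext=(2016.5, 6.5), arrowprops={"arrowstyle": "->"})
ax.set_ylabel("score")
```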

------
mayankkaizen
Does anybody here think that the Pandas API design is ugly and inconsistent? It
feels like hack after hack.

~~~
WilliamEdward
I really only use Pandas for DataFrame structures. Doesn't really bother me if
the rest of it is bad.

------
lazzlazzlazz
I strongly recommend Altair ([https://altair-viz.github.io/](https://altair-viz.github.io/))
as an extremely Pandas-friendly alternative approach to data
visualization. It's the first library that has successfully "hidden" the ugly,
gnarled matplotlib layer underneath for me. It also looks killer.

~~~
prepend
I like Altair, but it still has some annoying missing features like the
inability to caption and footnote charts. Or the inability to format filters
like sliders.

It makes nice-looking charts in HTML/D3, but it is a hassle to save a real image
because that requires Chrome or Firefox, which happens not to work in my CI
environment.

So at least matplotlib can save png without needing a bunch of stuff.

~~~
piccolbo
Add the lack of support for polar coordinates. But every lib started immature, and
Altair is quite new. I think they could use a few good PRs though.

------
Tycho
I recommend reading the official docs. I think they improved the plotting
interface recently, and I learned a lot from reading through the guide:

[https://pandas.pydata.org/pandas-docs/stable/user_guide/visu...](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html)

------
teodorlu
Here's a dataset link that doesn't require registering with Kaggle:

    
    
        https://www.teodorheggelund.com/static/world-happiness-report-2019.zip

------
scarby2
I had no idea what Pandas was (other than the plural of the cute fluffy
creature). I was really hoping this was going to talk about how to use panda
images to visualize data!

------
tempodox
The name “Pandas” is really misleading, especially in the title of this
article. It's missing the phrase, “No animals were harmed in the making of
these graphs”.

------
zer0faith
It shows how to generate the different charts but it doesn't show you how to
save them as an image.

~~~
psv1
Use this -
[https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.s...](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.savefig.html)
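
In case a concrete example helps, here is a minimal sketch of saving a pandas plot to PNG. The data, the `chart.png` file name, and the `dpi` value are arbitrary choices for the example. It calls `savefig` on the figure object, which does the same job as `pyplot.savefig` on the current figure.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, no display or browser needed

import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [1, 4, 9, 16]})

# DataFrame.plot returns a matplotlib Axes; the figure hangs off it.
ax = df.plot(x="x", y="y", kind="line")

# Write the chart to a temporary directory as a PNG.
out_path = os.path.join(tempfile.mkdtemp(), "chart.png")
ax.get_figure().savefig(out_path, dpi=150, bbox_inches="tight")
print(out_path)
```

The output format is inferred from the file extension, so a `.pdf` or `.svg` name should work the same way.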

------
j88439h84
Pandas should listen to the unix philosophy a bit and remove its plotting API.

------
codeulike
I would have thought if you focused on bamboo leaves as the main visual motif
then that would keep their attention.

------
kyberias
Leave those peaceful furry creatures alone, please.

~~~
min2bro
What do you mean exactly here?

~~~
lmkg
I assume it's a pun about pandas.

