

Let's make Data Visualization accessible to anyone - cyrillevincey
http://insights.qunb.com/data-visualization-101

======
rm999
>use a pie chart or a donut. We know it's very controversial (data
visualization mullahs tend to yell after anyone using a pie chart) but it's
actually pretty useful in a business environment.

Translation: it's for people who (you think) want pretty plots instead of an
actual understanding of the data. The article's repudiation of the "mullah" (?!)
opinion is a little anti-intellectual IMO - these people tend to have a lot of
experience and therefore understand common pitfalls better than a layperson's
intuition does.

__Pie charts do not work as intended in general__, but may work in specific
cases like the one the article shows. Pie charts have low perceived precision
because people are bad at estimating radial distances. If in the article's
example potatoes were 60 and rice 58, the pie chart would not convey that rice
is lower than potatoes - they would look the same. A bar chart or stacked
chart does not lose this information because humans perceive location with
very high precision. There are several other pie chart failure scenarios, like
too many categories, or very imbalanced categories, or very balanced
categories. Overall, it's a very non-robust visualization method. It may look
good in your presentation now, but when you get updated numbers you may find
it becomes a mess.
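
To put rough numbers on the 60-vs-58 case, here is a minimal Python sketch.
Only the potatoes and rice values come from the example above; the third
category ("pasta") and the helper names `pie_angles` and `text_bars` are made
up for illustration.

```python
def pie_angles(values):
    """Return each category's wedge angle in degrees (all wedges sum to 360)."""
    total = sum(values.values())
    return {name: 360.0 * value / total for name, value in values.items()}


def text_bars(values):
    """Render one left-aligned text bar per category, one '#' per unit."""
    width = max(len(name) for name in values)
    return [f"{name.ljust(width)} {'#' * value} {value}" for name, value in values.items()]


# Hypothetical data: potatoes=60 and rice=58 are from the comment, pasta is invented.
foods = {"potatoes": 60, "rice": 58, "pasta": 82}

angles = pie_angles(foods)
gap_degrees = angles["potatoes"] - angles["rice"]
print(f"wedge gap: {gap_degrees:.1f} degrees out of 360")

for line in text_bars(foods):
    print(line)
```

With these made-up totals the two wedges differ by about 3.6 degrees, while
the two bars end at visibly different positions on a shared baseline.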

~~~
cyrillevincey
Maybe pie charts are not robust, but as I say in the post, they are actually
the best - if not the only - way to represent the contribution of the Top N
contributors to a global measure. And only that. They are not meant to let the
reader compare the relative measures of different items. If you wish to do so,
I fully agree with your comment: use a bar chart instead.

~~~
scrollbar
I default to using treemaps for viewing composition of a whole
[http://en.wikipedia.org/wiki/Treemapping](http://en.wikipedia.org/wiki/Treemapping)

~~~
cyrillevincey
Treemaps are very powerful during the data crunching phase, but they are not
ok for a final deliverable, as they ask way too much of the reader's brain to
be understood.

~~~
scrollbar
Makes sense. I'm curious though, do you mean that even with the same number of
categories for each it's too difficult? I.e., instead of using hierarchical
groupings and all categories no matter how small, use only the top ~10 and
bucket the rest into "Other", similarly to a pie chart or bar chart.
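
For what it's worth, the "top N plus Other" bucketing could be sketched like
this in Python. The function name `top_n_with_other`, its parameters, and the
sample numbers are all hypothetical, and the sketch assumes no real category
is already called "Other".

```python
def top_n_with_other(totals, n=10, other_label="Other"):
    """Keep the n largest categories and sum everything else into one bucket.

    Assumes `other_label` does not collide with an existing category name.
    """
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    head, tail = ranked[:n], ranked[n:]
    result = dict(head)
    if tail:
        # Only add the bucket when something was actually left over.
        result[other_label] = sum(value for _, value in tail)
    return result


# Made-up category totals, purely for illustration.
sales = {"a": 50, "b": 30, "c": 10, "d": 6, "e": 4}
print(top_n_with_other(sales, n=2))
```

The grand total is preserved, so the result can feed a treemap, pie, or bar
chart with the same flat set of categories.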

------
couchand
Interesting stuff, but definitely a 101-level course. A more in-depth
treatment would discuss the trade-off between non-data ink and embellishment,
which is one of the most difficult problems in good data viz.

Some more specific thoughts:

ALWAYS represent more than one data series if possible: otherwise you're
wasting space. But make sure your thesis is still sensible. It's very helpful,
for instance, to plot a series of interest against a backdrop of the greater
population.

A histogram is NOT the best way to show the average value. A box and whiskers
plot shows the average as well as the bulk of the population in a one-
dimensional form that can be extended simply for comparison across categories
or time.
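
As a rough illustration, the numbers behind a box and whiskers plot can be
computed with Python's standard library. This is only a sketch: the name
`box_plot_stats` and the sample data are invented, the whiskers follow the
common 1.5 x IQR convention (an assumption, other conventions exist), and
the center line of the box is the median, with the mean reported separately.

```python
import statistics


def box_plot_stats(values, whisker=1.5):
    """Summarize a sample the way a box plot draws it (needs >= 2 values)."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme points that are still inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [x for x in data if x < low_fence or x > high_fence],
        "mean": statistics.fmean(data),
    }


# Made-up sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]
stats = box_plot_stats(sample)
print(stats)
```

Running the same function once per category or time bucket gives the
side-by-side comparison described above.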

Also always remember that the data is king. Line smoothing is only a good idea
if you're not using it to give the impression that you have more data points
than you really do.

All in all their visualizations provide great examples but they could do more
to share the nuance that went into designing them.

~~~
cyrillevincey
Indeed it's a 101. Thanks for your comments. The only one I would really
disagree with is "always represent more than one data series". It's really
tough to represent multiple series on one single chart and manage to keep it
immediately readable.

~~~
couchand
Certainly a chart should be readable, but I think it's okay to ask for a few
moments of careful study. I'm much more interested in a chart that provides
some context or basis for comparison, if not with several series then with
small multiples or a representation of the general distribution.

------
polskibus
Great intro to data-vis! Does anyone have more related links to share?

Personally, I found this very useful:

[http://complexdiagrams.com/properties](http://complexdiagrams.com/properties)

* this is not a shameless plug!

~~~
cyrillevincey
Thanks for this comment. For additional links I can suggest this other post I
wrote: "Good old Excel as the ultimate data visualization tool"
[http://insights.qunb.com/good-ol-excel-is-the-ultimate-data-...](http://insights.qunb.com/good-ol-excel-is-the-ultimate-data-visualization-tool-in-most-cases)

