
Three charts are all I need - mh_
http://37signals.com/svn/posts/3388-three-charts-is-all-i-need
======
JumpCrisscross
I think I see a Pareto law of data visualisation: most of the data
visualisation needs of a _field_ can be satisfied by a small number
(hypothesis: N = 3) of visualisations.

Lines for time series, histograms for frequencies, and panels for everything
else is a good heuristic for _web analytics_. Other fields have their own
set: in quantitative finance, for example, a scatter plot is my go-to
visualisation.

------
saraid216
Paying this forward: <http://www.flickr.com/photos/amit-agarwal/3196386402/>

~~~
sesqu
For what it's worth, I disagree with that chart. It breaks conventions, makes
assumptions, and has an eclectic selection, so use it only for suggestions.

------
swanson
Food for thought, here is a visualization I've made in an app for tracking
team mood: <http://i.imgur.com/PUUYGKI.png>

It's pretty easy to spot what days were good, bad, and in between. You can
start to see patterns (Every other Tuesday seems to be better, why is that?
Oh, that's when we had donuts!).

If you were to apply the Three Chart rule, this would be a line chart since it
shows change over time. I find the calendar visualization much easier to
interpret than a line chart in this case.
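
A minimal sketch of how such a calendar view can be laid out, assuming daily
mood scores keyed by date. The `mood_by_day` data and the `calendar_grid`
helper are made up for illustration and are not taken from the app:

```python
from datetime import date, timedelta

def calendar_grid(scores):
    """Arrange {date: score} into rows of ISO weeks, columns Mon..Sun."""
    grid = {}
    for day, score in scores.items():
        iso_year, iso_week, iso_weekday = day.isocalendar()
        row = grid.setdefault((iso_year, iso_week), [None] * 7)
        row[iso_weekday - 1] = score  # ISO weekday: Monday=1 ... Sunday=7
    return [grid[key] for key in sorted(grid)]

# Hypothetical data: two weeks of scores (1 = bad, 3 = good), weekdays only.
start = date(2013, 1, 7)  # a Monday
raw = [2, 3, 2, 1, 2, None, None,
       2, 2, 3, 2, 1, None, None]
mood_by_day = {start + timedelta(days=i): s
               for i, s in enumerate(raw) if s is not None}

# One text row per week; "." marks a day with no entry.
for week in calendar_grid(mood_by_day):
    print(" ".join("." if s is None else str(s) for s in week))
```

Each row is a week and each column a weekday, so a recurring weekday effect
shows up as a vertical stripe rather than as a wiggle in a line.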

~~~
danso
No, I think this would fall under the Table category, as it, and most table
charts, exhibit the quality of "small multiples"

------
bengillies
Those three may be fine (and indeed all you really need for representing most
data) in some fields. In other fields though, they're entirely unsuitable.

A lot of people have mentioned scatter plots.

I'd quite like to know how you'd represent relationships between things using
only a histogram or line chart.

I realise that it's covered somewhat by the "95% of all cases" statistic, but
really, as with most things, if all you're doing is visualising relationships,
then that statistic is likely way off.

The general point is a good one - use a visualisation that's appropriate for
representing your data, not one that's appropriate only for looking nice - but
I think the message is lost somewhere in amongst the rhetoric.

------
mshron
Assuming that you're comfortable with 2d histograms (hexagonal binning is the
standard) I completely agree. Otherwise _scatterplots_ are pretty crucial.

~~~
yummyfajitas
Unless you are comfortable with tweaking opacity, jitter and point size until
you get a graph that is representative, don't use scatterplots. A 2d histogram
with a decent color scheme is far more reliable and easy.

For dense data, scatterplots can obscure more than they show:

[http://www.chrisstucchio.com/blog/2012/dont_use_scatterplots...](http://www.chrisstucchio.com/blog/2012/dont_use_scatterplots.html)

Even an author who was well aware of the problem made the same mistake:

[http://garyrubinstein.teachforus.org/2013/01/09/the-50-milli...](http://garyrubinstein.teachforus.org/2013/01/09/the-50-million-dollar-lie/)
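
To make the overplotting point concrete, here is a rough sketch in plain
Python. The data and the `histogram_2d` helper are hypothetical, and counting
distinct positions is only a stand-in for what an opaque scatterplot shows:

```python
from collections import Counter

def histogram_2d(points, bin_size):
    """Count points per square bin; keys are (x_bin, y_bin) indices."""
    return Counter((int(x // bin_size), int(y // bin_size)) for x, y in points)

# Hypothetical data: 1000 points piled on one spot, plus 5 isolated ones.
dense = [(1.5, 1.5)] * 1000
sparse = [(0.5, 0.5), (3.5, 0.5), (0.5, 3.5), (3.5, 3.5), (2.5, 0.5)]
points = dense + sparse

# An opaque scatterplot draws one marker per distinct position, so the pile
# of 1000 looks the same as any single point.
distinct_markers = len(set(points))

# The 2D histogram keeps the count per bin, which a color scale can show.
counts = histogram_2d(points, bin_size=1.0)

print(distinct_markers)  # 6
print(counts[(1, 1)])    # 1000
```

With sparse data the opposite holds: every bin contains 0 or 1 points, and
binning only discards the exact positions.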

~~~
tmoertel
Using a 2D histogram when there isn't overplotting is just throwing away good
data. Why do it unless you need to?

~~~
yummyfajitas
Overplotting is common, and "you" (generic internet guy who posted a graph on
his blog, not you personally) probably don't even know how to spot it.

Secondly, the data you throw away is usually just sampling noise. Most of the
time the interesting object is the underlying probability distribution -
individual points are only useful to infer that.

In the cases where individual data points are actually of interest (e.g.
<http://cl.ly/GvnM> ), go ahead and use them. But they are a terrible default
choice.

~~~
tmoertel
Scatterplots are only a "terrible" choice when the underlying data set is
concentrated; in all other cases they are superior. Do you not realize that
using a 2D histogram on a sparse data set is every bit as foolish as using a
scatterplot on a concentrated one?

Whether scatterplots make a "terrible default choice," then, depends on what
you believe about the distribution of data sets out there. Based on your
advice, you must believe that most people have data sets that are dense.

For a _lot_ of people, however, that's not true. For them, "plot density, not
points" is the terrible default choice. But you're telling them to make it
their default anyway.

------
kunalb
I disagree with this post: there's another use case for visualizations --
consuming data as fast as possible to take action (e.g. firefighting). I work
on tools for engineers, and I've found that a rich, non-standard visualization
that people can learn to parse quickly after a tiny bit of initial
acclimatization, showing them exactly what they need, can help a lot.

~~~
Chris_Newton
_a rich, non-standard visualization that people can learn to parse quickly ...
showing them exactly what they need_

I do a lot of UI work, also mostly for technical users, so I’m interested in
both effective visualisations and efficient interactions built around them.

I’ve found that customised and/or contextual visualisations and interactions
can be very effective if, but _only_ if, they fit how the user thinks about
the situation better than any of the standard alternatives.

Put another way, if you’re dealing with a solved problem, using the solution
that everyone already knows usually works best. But if you’re dealing with
something new and different, and you can’t build what you need cleanly using
existing tools, then creating a new kind of tool often gets better results
than cobbling something together with the wrong tools for the job.

The hard part is that creating more appropriate tools generally requires
understanding your users’ mental model(s) of the situation and the actions
they need to take, and even a seemingly small mismatch between what a user
expects and what you actually give them can really hurt when the user doesn’t
have familiar conventions to fall back on.

In practice, I’ve had some success building UIs around a small number of
specialised visualisations and controls (typically making up a single main
screen/page) while using only mainstream presentation like tables and
histograms for supporting features, though every project is different.

~~~
kunalb
True -- any visualization that doesn't get the message across is just about
useless.

Luckily we have very tight feedback loops -- and I can push new versions every
couple of hours so I can customize it really fast to suit their needs/mental
models.

------
SatvikBeri
I mostly agree with the article, but disagree with the idea that unusual
visualizations are harder to understand. Tuning your chart to your data often
makes it easier for your audience to understand. For example, if you want to
communicate the detailed difference between how well two models predict the
performance of teams in the NFL, an ROC chart showing both models is usually
the clearest way to present[0]. If you want to model different scenarios, a
segmentation chart may be better. If you just want average conversion rates,
bar charts may be more helpful.

That said, I certainly agree with using the same chart for things like weekly
updates, etc. Creativity for creativity's sake is pointless; the point of using
different types of charts is to communicate a complex message as simply as
possible.

[0]: [http://blog.optimalbi.com/wp-content/uploads/2012/12/ROC-cha...](http://blog.optimalbi.com/wp-content/uploads/2012/12/ROC-chart1.png)

------
ChuckMcM
Interesting thesis, I would like to see a cage match between Noah and Edward
Tufte :-) I tend to favor Tufte's thesis that humans can extract more
knowledge per millimeter of paper space out of a chart than they can out of
text. The trick of course is constructing that visualization. Most folks are
facile with words, while fewer can work as effectively with shapes.

I like the notion that if you cannot see at least four things in a chart then
the chart isn't doing its job. It makes me ask the question, "What is the
context in which this chart is expressing information? Can I show that?"

~~~
Q6T46nT668w6i3m
I think Noah is echoing the work of W.S. Cleveland and Edward Tufte—both of
whom advocate for using a small number of well-understood methods (e.g.
tables, histograms, line plots, scatter plots, and so on).

------
trjordan
I'm a fan of histograms above nearly all else. When I need to show change over
time, I actually prefer heatmaps:

[http://deliveryimages.acm.org/10.1145/1810000/1809426/gregg3...](http://deliveryimages.acm.org/10.1145/1810000/1809426/gregg3.png)
<http://queue.acm.org/detail.cfm?id=1809426>

It's like a scatterplot, but a bit better at showing collapsed data (i.e.,
there's a lot of data at that point on the graph -- is it more or less than
that other jumble of Xs?).

~~~
tmoertel
Histograms are great for visualizing relative density, but they have problems
when items don't sort neatly into bins. They also make it hard for viewers to
accurately estimate the portion of the distribution that falls within
arbitrary ranges, say the percentage of smokers aged 24 thru 35. Plotting the
empirical cumulative distribution function solves these problems.

For a numeric random variable _X_ , its CDF _F_ ( _x_ ) gives P( _X_ <= _x_ ).
So if _X_ gives the age of smokers, the answer to our earlier question is just
F(35) less F(23). Plotting _F_ over all values of _X_ , then, lets us not only
"see" the shape of the distribution but also answer range questions: just
look up two points on the graph and subtract.

Some examples:

<http://docs.ggplot2.org/0.9.2.1/stat_ecdf.html>
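
The range lookup can also be sketched in a few lines of Python. The `ages`
sample and the `ecdf` helper are made up for illustration:

```python
from bisect import bisect_right

def ecdf(sample):
    """Return F where F(x) is the fraction of the sample that is <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

# Hypothetical ages of ten smokers.
ages = [19, 22, 23, 24, 27, 30, 35, 35, 41, 58]
F = ecdf(ages)

# Share aged 24 through 35. Ages are whole numbers here, so "24 or older"
# is the same as "older than 23", hence F(35) - F(23).
share = F(35) - F(23)
print(share)  # about half of this sample
```

Because `F` is a step function built from the sorted sample, no bin width has
to be chosen, which is the advantage over a histogram for this kind of
question.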

~~~
sesqu
The problem with the ECDF is that it's harder to get a feel for, being
cumulative. I spent a few days once playing with N-bin histograms, and still
feel those should be the default (with non-compact ends, maybe gamma
distributed) over smoothed hacks.

------
kfcm
And completely missing, the old reliable pie chart. Used when showing
attributes (percentage, counts, etc) of a whole.

And then there's the scatterplot. Very useful in regression models for best
fit.

Charts and plots are tools. To limit yourself to just three is using a
screwdriver for a hammer, or C# for any programming task.

Understand the available tools and use the right one for the job at hand.

~~~
defrost
> the old reliable pie chart.

? Weird. In the past 35 years working in numerical analysis of multitudes of
data sources from toilet flushes in a city of a million people to stock
movements to multichannel radiometrics, to cloud data from LIDAR, to cot death
incidences, etc. I've _never_ once used a pie chart (or, in fact, worked with
anyone that's used them).

I understand they are popular when lying with statistics and in PowerPoint
displays to non-technical suits, but they have pretty limited use in
understanding data or presenting layered attributes.

Scatter plots are somewhat useful, when combined with a means of indicating
densities, such as heatmapping; box and whisker plots that show central
densities, means, medians and extents of ranges are useful.
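
For reference, a small sketch of the numbers a box and whisker plot is
usually drawn from, using Python's standard `statistics` module. The
`readings` data and the `box_plot_summary` helper are hypothetical, and
whisker conventions vary between tools, so this simply reports the min and
max:

```python
import statistics

def box_plot_summary(values):
    """Summary statistics a box-and-whisker plot typically draws."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median, "q3": q3,
            "max": max(values), "mean": statistics.mean(values)}

# Hypothetical measurements with one large outlier.
readings = [3, 4, 4, 5, 5, 5, 6, 6, 7, 40]
summary = box_plot_summary(readings)
print(summary)
```

In this sample the single outlier pulls the mean (8.5) well above the median
(5), which is the kind of thing the plot makes visible at a glance.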

But Pie Charts? Professionally they're the joke setting in Excel . . .

------
showerst
Slightly more advanced guide that I like -

[http://giveupinternet.com/2009/01/16/chart-of-the-charts-cha...](http://giveupinternet.com/2009/01/16/chart-of-the-charts-chart-suggestions-a-thought-starter-pic/)

They hit the most important thing about chart making right on the head: What
are you trying to communicate?

------
nfm
I really hope that the term "Infauxgraphics" gets some more traction.

------
lubos
He forgot the pie chart. I've found that no matter how rational one can be
about visualization, a single pie chart can make a world of difference to a
client.

edit: thanks for the downvotes, I really appreciate how people can't recognize
sarcasm. I'm very big on Tufte, but it's easier to just give the client a damn
pie chart than to re-educate him on all aspects of data visualization.

~~~
sputknick
Agreed. A pie chart is best when the total set will not change in size, such
as something that will always total 100%.

~~~
saraid216
A pie chart is best when you're trying to convey a minimal amount of
information, such as extremely inexact proportions like "roughly a third" or
"about half", within an extremely small dataset, capping at around ten items.

------
antidaily
boring.

~~~
MSM
What's wrong with being boring?

If I can completely understand it without giving a hint of effort, that's
perfection.

