
Why Nobody Understands Your Visualization - alanthonyc
http://petewarden.typepad.com/searchbrowser/2010/05/why-nobody-understands-your-visualization.html
======
corruption
Back in 2003 I tried introducing many of these techniques to our customers so
they could look at their data in new ways. No one, not even the PhDs among
them, had a clue how to use the visualizations.

We investigated and found that very few people understood even our basic
charts (time series, etc.). An even bigger issue was that only about 1 in 100
customers could take the numbers from a chart and use them to change the way
they did things. In other words, few people could understand the charts, and
even fewer could turn that understanding into actionable information. To
reiterate, these were basic time series charts.

We ended up writing a system to interpret the charts for them in plain
language (making action easy), and heavily annotating the charts with colors
and indicators so that they could be interpreted without any cognitive load.
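
For illustration only, here is a minimal sketch of what "interpreting a chart
in plain language" could look like for a basic time series. This is not the
system described above; the function name `describe_trend` and the
`flat_threshold` cutoff are hypothetical, and a real system would presumably
look at far more than the first and last points.

```python
def describe_trend(label, values, flat_threshold=0.05):
    """Return a one-sentence, plain-language summary of a time series.

    flat_threshold is a hypothetical cutoff: relative changes smaller than
    this are reported as "roughly flat" instead of a rise or a fall.
    """
    if len(values) < 2:
        raise ValueError("need at least two data points")
    first, last = values[0], values[-1]
    if first == 0:
        raise ValueError("first value must be non-zero to compute a relative change")

    # Relative change between the first and the last observation.
    change = (last - first) / abs(first)

    if abs(change) < flat_threshold:
        return f"{label} stayed roughly flat ({change:+.0%})."
    direction = "rose" if change > 0 else "fell"
    return f"{label} {direction} {abs(change):.0%}, from {first} to {last}."


print(describe_trend("Weekly signups", [200, 210, 260]))
```

The point of the sketch is that the reader gets a sentence they can act on
without having to read the chart at all.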

Given that experience, I've moved away from advanced visualization tools
entirely, and before introducing them again I would need strong evidence that
they actually work for our customers.

------
barrkel
This reminds me of abstract reasoning questions from IQ tests. You know the
kind of thing: you're often given the first three diagrams in a sequence, and
you have to infer the logic behind the sequence and choose the correct fourth
diagram from a multiple choice; or perhaps you need to find the odd one out.

But such questions are usually not really abstract at all; rather, they rely
on learned cultural ontologies for classifying shapes and transformations, a
language for describing circles, ovals, squares, rectangles, rotations through
right angles, mirroring and symmetry. To my mind, these tests largely measure
conformance to a cultural perspective on the world.

~~~
holygoat
Phrased another way: IQ tests are for testing how well you do at IQ tests :)

------
lotharbot
Ultimately, a visualization is an attempt to communicate some pattern(s) in
the data. Visualizations can suffer from the same issues as other forms of
communication:

\- _inadequate information_. Readers can't draw conclusions from information
you don't include. Commonly missing information: labels, scale, definitions,
context for the data.

\- _information overload_. Readers can't figure out which specific conclusions
you're trying to communicate if there are many possible conclusions. Highlight
the things about the data that you want to bring out using words, colors,
arrows, or visualizations of subsets of the data.

\- _muddled presentation_. A blurry graphic is as difficult to understand as
writing filled with misspellings and poor grammar. A busy graphic is as
difficult to understand as a long, babbling diatribe filled with unnecessarily
complex vocabulary. An ambiguously labeled graphic is as difficult to
understand as writing filled with ambiguous pronouns.

\- _inability to dig deeper_. Readers can't figure out if your conclusions are
sound if they can't look at the details. Interactive visualizations (based on
the complete data set) allow users to see how each bit of data fits into the
overall picture, and to test your conclusions on subsets of the data. If all
you present is a static graphic based on a restricted data set, it's harder to
check if you've cherry-picked data.

\- _unfamiliarity with the language or conventions_. Someone unfamiliar with
programming will have a hard time understanding certain HN posts, even if
they're well written. Someone unfamiliar with bar graphs will have a hard time
understanding a visualization based around them, even if it's a good
visualization.

\- _boring subject matter_. You can make the greatest visualization in the
world, but if the audience doesn't care enough to look at it long enough to
understand, your point won't get communicated.

------
hamilton
The author misses a large point, and it sort of confounds his other arguments.
While form and representation are absolutely important, so is context. People
cared about his facebook visualization because the subject matter was of
immediate interest to us, given the current news cycle. A visualization has to
be part of a compelling story.

Not all visualizations are great, and not all well-made visualizations express
a compelling narrative. And not all visualizations, regardless of how
well-made and compelling, are right for a target audience. Relevance is
different than form; let's not confuse them.

~~~
petewarden
I agree with you that your underlying data has to be on something people care
about. It's necessary but not sufficient, since the same visualization without
labels and coloring sank without a trace when I released it a week before.

------
blogimus
That car data plot is an example of a multidimensional parallel graph. Too bad
he didn't go into more depth on this, since he showcased this visual
representation in his blog post. Some more info to get your feet wet is here:
<http://filer.case.edu/~dbh10/eecs466/report.html>

Parallel graphs were a starting point for multidimensional visualization.
They are useful only up to a certain point of complexity; beyond that they are
counterproductive, as it takes more time to analyze the graphs than to view a
number of simpler graphs side by side or in series. Newer techniques involve
radial visualization, such as

<http://www.infovis-wiki.net/index.php?title=Radial_Hierarchical_Visualization>

Something to consider with your animations is the cognitive load of animations
of complex visualizations. See the following paper:

<http://geoanalytics.net/GeoVisualAnalytics08/a15.pdf>

~~~
ivanzhao
Its proper name is "Parallel Coordinates", popularized in the 90s.

Pro:

\- Space saving. Cartesian space is expensive, as the X/Y axes can quickly take
up the entire 2D plane. Turning the axes parallel to each other saves a lot of
space, hence the possibility of multi-dimensionality.

Con:

\- Messy. A normal dot in an X/Y scatter plot is "stretched" into a line, so
there are a lot of crossings/overlaps.

\- Order of the axes matters.

Interactivity can help reduce the visual complexity, through axes reordering
or highlighting (also called "brushing"). Also, after a little bit of
training, people can easily develop the ability to recognize patterns in
Parallel Coordinates. Just think how you learned to read the regression lines
in scatter plots : )

~~~
VMG
The actual plot in question is interactive. You can select regions on each
axis.

------
teaspoon
I've been on the lookout for a blog chronicling bad visualizations. Something
like Regretsy for charts and infographics, but perhaps with more constructive
critiques. Surely something like that must exist?

------
ivanzhao
there's a world to steal from the cartography and typography traditions, not
only because these are already established "visual languages", but also for
their subtlety and respect for the viewer.

it is a shame that today's infographic designs are mostly going after
3D-rainbow-eye-candies, screaming too much of the designer him/herself.

------
btilly
A great book to help you come up with the appropriate visual representation
for what you're trying to say for your given audience is _The Back of the
Napkin_ by Dan Roam.

It ends with a somewhat unfortunately chosen example, but the actual advice is
quite good.

------
chipsy
I love that automobile visualization featured at the top, because it seems so
promising, only to become increasingly disappointing as you try harder and
harder to make sense of it. The low performance only makes matters worse.

~~~
Herring
The biggest problem is it's best read right to left.

~~~
jleader
There's also an odd reversal near the middle; number of cylinders is roughly
correlated w/ displacement, which is roughly correlated w/ weight, which is
roughly correlated w/ horsepower, which is roughly... oops, inversely
correlated w/ 0-60 acceleration time. Hence the lines looking like they got
twisted in a bunch between the horsepower axis and the acceleration axis.

I think interaction is a very under-utilized aspect of visualization; if that
car visualization hadn't had such abysmal performance, being able to play with
it might have been very informative. I think interaction is more powerful than
time (but probably harder to design effectively).

As it was, I noticed several interesting new (to me) points:

\- Quantization of number of cylinders is obvious, but I didn't expect
displacement to turn out to be so quantized

\- The highest acceleration cars didn't have the worst fuel efficiency

\- The oldest cars (1970) appeared to include the ones with the worst fuel
efficiency.

I've been trying to figure out a better way of laying out the
weight/horsepower/acceleration axes, given that weight*acceleration=horsepower
(with the addition of some noise due to different measurement techniques,
etc.)

~~~
Herring
Well you're gonna have that twist anyway, unless you flip a couple columns. I
sort of prefer having it in the middle. But then i didn't really mind the
visualization so maybe I'm the wrong person to ask.
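
To make the "twist" concrete, here is a rough sketch that counts how many
lines cross between two adjacent parallel-coordinates axes, before and after
flipping one of them. The numbers are made up for illustration (they are not
from the cars data set), and the helper names `normalize`, `count_crossings`
and `flip` are hypothetical.

```python
from itertools import combinations


def normalize(column):
    """Scale a column to [0, 1], the way each parallel axis is scaled."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.5 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def count_crossings(left, right):
    """Count pairs of rows whose segments cross between two adjacent axes.

    Two segments cross when the rows are ordered one way on the left axis
    and the opposite way on the right axis.
    """
    crossings = 0
    for (l1, r1), (l2, r2) in combinations(zip(left, right), 2):
        if (l1 - l2) * (r1 - r2) < 0:
            crossings += 1
    return crossings


def flip(column):
    """Invert a normalized axis so that high values are drawn at the bottom."""
    return [1.0 - v for v in column]


# Made-up illustrative values: more horsepower, shorter 0-60 time.
horsepower = normalize([70, 95, 130, 165, 220])
zero_to_sixty = normalize([19.0, 16.5, 13.0, 11.5, 9.0])  # seconds

twisted = count_crossings(horsepower, zero_to_sixty)
untwisted = count_crossings(horsepower, flip(zero_to_sixty))
print(twisted, untwisted)
```

With perfectly inverse data like this, every pair of lines crosses until the
axis is flipped. Real data is noisier, so flipping would reduce the crossings
rather than eliminate them, and the flipped axis then reads "upside down",
which is its own cost.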

------
haupt
So far it seems that every state I click on has "Megan Fox" in the interests
list. I wonder if he cherry-picked his data. :P

~~~
lotharbot
She apparently _is_ that popular, at least on Facebook. She has about 6.7
million likes.

For comparison, the Bible is at 2.2 million, Kobe Bryant is at 2.4 million,
Starbucks is at 7.2 million, and Obama is at 8.4 million.

~~~
haupt
According to the Facebook statistics page [0], there are over 400 million
users on the site. The same page states that about 70% of those users are from
outside the US. That suggests that 30% of Facebook users (about 120 million)
reside in the US. Let's assume that the entirety of her fan base is within the
US. That means her fans only represent (at the _most_) approximately 5.6% of
US Facebook users. I wish I had data per state so I could pin this down more
thoroughly (with respect to ratios of Megan Fox fans to state populations), but
going on just what I have seen so far I am admittedly suspicious of the
gentleman's data.

[0] <http://www.facebook.com/press/info.php?statistics>
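
The back-of-the-envelope estimate above can be reproduced in a few lines. The
inputs are the rounded figures quoted in this thread, and the variable names
are just for illustration; the result is an upper bound only under the stated
assumption that all of her fans are in the US.

```python
total_users = 400_000_000    # "over 400 million" users, per the statistics page
us_share = 0.30              # about 70% of users are outside the US
megan_fox_likes = 6_700_000  # "about 6.7 million likes", quoted upthread

# Estimated number of US users, then the largest share of them that
# could be fans if every single fan were in the US.
us_users = total_users * us_share
upper_bound = megan_fox_likes / us_users

print(f"Estimated US users: {us_users:,.0f}")
print(f"Upper bound on US fan share: {upper_bound:.2%}")
```

Since every input is rounded, the result is only good to about one decimal
place, roughly 5.6%.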

~~~
petewarden
The gentleman himself here!

You're right to be suspicious, there is a bias towards the more popular pages,
since they show up more frequently on people's crawlable public profile pages.
Only 20 liked pages are shown for each person, and FB apparently pick the most
popular.

------
ahoyhere
This can be summed up as being because "you want to make infographics only
because they're trendy, but have no idea what you're trying to say, or why,
or how to say it effectively."

Fixing labels, simplicity, interactivity, animation, etc., won't solve the
problem with the whole conception being wrong, useless, or boring.

