I hate stacked area charts (2011) (leancrew.com)
369 points by dmitrig01 on June 11, 2014 | 64 comments

I agree with the problem but disagree with the solution. I'd much prefer removing the stack completely and using overlapping plots, like this:


In a chart like this no data is lost in presentation. You can easily answer questions like "when did Android overtake iOS in marketshare?" and "is Windows Phone marketshare growing or shrinking?"

Exactly. Stacked charts in general are misleading because they imply volume as a metric to be evaluated, when in any sort of number-over-time chart, volume is the last thing you need people concentrating on.

Unless it is a mekko chart in which case the volume conclusion is totally apt.

I wasn't aware of Marimekko charts[1] (well, not by that name anyway) before reading your comment -- but are they commonly used with a time axis? In the examples in [1] they seem like a pretty straightforward visualization of volume?

Off the top of my head I can't think of an example where a time axis wouldn't normally be used to visualize variation over time...? Perhaps quarter earnings as contributing to yearly earnings?

[1] http://www.fusioncharts.com/chart-primers/marimekko-chart/?M...

I've seen Mekko charts, or modified "mekko-ized" waterfall bar charts used with time axes. Blame management consulting.

One problem with overlapping plots here is that it doesn't convey that we are dealing with parts of a whole. Both the stacked area chart and stacked column chart naturally relate the growth of one component to the shrinking of another.

Relative comparison between slices of the stacked area or column chart is a difficult cognitive function, however, and is imprecise. I'd have to disagree with the article's author on the value of the stacked bar chart, as, while the step changes are more discrete, point-to-point comparison is still encumbered. The even more un-sexy clustered bar chart is yet more appropriate.

A solution, when the designer needs or wants to reflect total volume of that whole, is to underlay the lines (in this case) with an area plot representing the total volume over the same time interval. In this way, the two pieces of information (individual performance and total size) are encoded in distinct and easily understandable ways.

This pattern is good only when you are effectively comparing 2 variables; you easily see that when one shrinks, the other grows, and when it happens. When you have more variables, it automatically becomes too complex.

I don't like percentage plots. If one thing goes up by 1 point, then the rest must necessarily go down by 1 point (distributed between them). I think that's a bit confusing. I'd much prefer absolute values.

[PDF] http://sohcahtoa.org.uk/pages/files/fredas-tea-stall-three-c...

4 small groups of students (pairs or triples) repeated enough times to cover whole class. Each group gets one of these three charts or the table. Plenary at end: the ones who got the % bar chart often have totally different sentences from the other three groups.

I tend to give the % bar chart to the more confident students.

Personally, show me the numbers. Wins every time.

Isn't that exactly how it's supposed to work?

Yes, but human beings seem to struggle with the concept, regularly assuming that a decreasing percentage necessarily implies a decreasing absolute number as well.

If the aim is to communicate a true representation of reality then you need to take that into account.

Consider that (unless the absolute value varies by more than an order of magnitude) you can see relative values very easily on an absolute graph, but you can never see absolute values on a percentage graph. It would be interesting to know, for example, whether the loser is still growing in absolute terms while getting a smaller portion of an increasing market.
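
To make that concrete, here is a minimal sketch with made-up numbers (`market` and `loser_units` are hypothetical, not taken from any real dataset). It shows a platform whose share falls in every period while its unit sales keep rising, which is the case a percentage chart cannot reveal.

```python
# Hypothetical figures for illustration only: total market size and one
# platform's unit sales over four periods (millions of units).
market = [100, 150, 225, 340]
loser_units = [40, 48, 56, 64]

# Share of the market in percent for each period.
shares = [100 * u / m for u, m in zip(loser_units, market)]


def is_increasing(xs):
    """True if every value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(xs, xs[1:]))


def is_decreasing(xs):
    """True if every value is strictly smaller than the one before it."""
    return all(a > b for a, b in zip(xs, xs[1:]))


units_growing = is_increasing(loser_units)
share_shrinking = is_decreasing(shares)

for period, (u, s) in enumerate(zip(loser_units, shares), start=1):
    print(f"period {period}: {u}M units, {s:.1f}% share")
```

An absolute chart of `loser_units` would slope upward; a percentage chart of `shares` would slope downward. Both are correct, but only the first tells you the "loser" is still growing.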

There are situations like survey research where percentages matter a lot and absolute values not so much.

Definitely agree with this. I don't design a lot of charts but I look at a lot of them and traditional line / bar graph charts are just so much easier to read and understand.

A lot of these new 3d type charts just come off looking like a flat 2d chart with layers and end up confusing the heck out of me.

I agree. The presented solution mitigates one of the problems, but does nothing to help you compare one of the sandwiched stacks over time.

Stacked charts are not very good at presenting information where there are more than two entities. Overlapping line charts do a much better job of presenting trends over time.

Another problem is that when you have more than 5 lines, it becomes really hard to track the different lines. I have found that a possible workaround is to show the path data on mouseover:

http://muyueh.com/see/nuk/efficiency.html (Better turn down the sound)

Using the button group on the left, you can zoom to different scales.

I can still see an advantage in stacked charts when a key piece of information is a sum of components. In the example on site, the data seems to be a proportion, and so there's really no purpose to stacking the lines because the sum is constant.

One could still depict totals as separate lines on a stacked chart, but for subset sums, a stacked area chart helps show how the sum is composed.

Unless you are color blind.

The chart in the linked example uses circles for all the markers. Giving each series a different marker shape (circles, squares, triangles, etc.) might help differentiate between them. Also, aren't there choices of colors that maximize readability for color blind people (as in the chart shown here[1])?

[1] https://en.wikipedia.org/wiki/Color_blindness#Design_implica...

I was going to leave this as a comment on his blog but apparently comments are closed now.

Stacked bar charts are better, but only a little better.

Line charts overlaid on top of each other are most clear, to me.

Like this: http://i.imgur.com/kyvpiax.png

Line charts overlaid are fine, but they struggle when you have many series or many time intervals. Not that they are much better, but they are better.

To me they are far and away the worst, probably because I'm color deficient.

I'm also color deficient, but disagree.

Usually, I'm looking for rate of change of one item at a time, not all of them at the same time. After that, I'll compare them.

Stacked bar charts, IMO, seem skewed in who gained the most over a period, but not the individual gains. I have no idea what green really ended up with by the end of it because it just kept getting pushed towards the top. I have to constantly go to the left and compare ranges with the previous stack.

Can you recommend an Excel addon/color chooser or website so that persons with normal vision can test their charts and color choices for the four known kinds of color deficients?

It would of course help if the chart lines had different line patterns (e.g. dotted lines).

http://colorbrewer2.org has some good palettes for data presentation which they describe as "colorblind safe". I assume this means for all kinds of deficiency.

> four known kinds of color deficients

I forgot to link to this picture on Wikipedia: http://en.wikipedia.org/wiki/Color_blind#mediaviewer/File:Sa...

Is that not a problem with stacked charts?

No, because they are stacked. Overlapping is a problem because you have a bunch of colors that very nearly look the same crossing over each other.

With stacks it's much easier to track along based on contrast alone.

While I agree with some of the points made in the post, I don't think the author provided a better alternative. Stacked bars are not any better for seeing how the percentage changes for the green one.


Here is my solution:

* It retains the line aspect of it, which is essential as we are talking about a trend here.

* It easily allows you to see it both in stacked and overlapped lines. Each one of them is good at communicating clearly a certain point about the data.

* It also mitigates the problem in nostromo's comment, where a whole bunch of lines could overlap without a clear view on a single point of interest.

That doesn't translate to print (a problem with any dynamic content). Or even, say, to static screenshots.

Who's still printing these days? :) It's safe to say that iPads and smartphones have made it easy to consume content anywhere needed. Interactive documents are definitely where the puck is headed.

It happens more often than you'd think, and as I noted, even static screenshots or images won't reflect the dynamic nature of that graphic.

Those are both issues you could address with small multiples, a Tufte method.

The stacked bar chart is no better. Yes, it doesn't have the same distortion of making green look smaller towards the right, but it is very hard to see the slow growth.

Just use a normal line chart.

It's not perfect, but it's definitely better.

Yes, it's better, but using "better" stacked bar charts is not the advice I want people taking away from an otherwise valid complaint about stacked area charts. Volume is a [colorful] distraction.

Here's a recent paper about how to compensate for this very illusion, by Heike Hofmann and Marie Vendettuoli: http://users.soe.ucsc.edu/~pang/visweek/2013/infovis/papers/...

Rabble rabble rabble.

The solution described is incorrect. Area charts and column charts are used to display different types of information. Area charts (and line charts, similarly) are used to display continuous data - data that must pass through point B to get from point A to point C. Column charts are used for non-continuous data.

Some examples:

You should use an area chart (or just a line chart) when tracking your weight. If you weigh yourself on Tuesday and you weigh 150 lbs, and then weigh yourself on Thursday and you weigh 155 lbs, because of how weight works, you can assume that between Tuesday and Thursday your weight traveled through all the points required to get from 150 lbs to 155 lbs.

If you're tracking the number of hours you sleep every night, you should use a column chart. Just because you get 6 hours of sleep on Tuesday and 8 hours of sleep on Thursday doesn't mean you got 7 hours of sleep on Wednesday. The data isn't continuous.

For market share, that data is continuous - you can't get from 10% market share to 15% market share without passing through all the percentages in-between. Therefore, a column chart is very much the wrong way to display that information.

Area charts differ from line charts, though, in that they emphasize the cumulative integral under the line. In your weight example, in general the total of your weights on each day multiplied by the number of days is not a meaningful statistic (except perhaps to the operator of the elevator in your building who is tracking cumulative strain on the support cables). An area chart makes sense if the product of the x-axis and y-axis units makes sense.

Data continuity is a secondary concern; there's no reason not to have a stepped area chart, which is the union of a set of adjacent bars representing discrete measurements. And sleep hours per day do accumulate nicely over time, so an area-emphasizing plot does make sense (sleep hours per day * days = sleep hours). It lets you look at two parts of the chart and assess 'did I get more sleep in this period or that period'.
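
A quick sketch of that arithmetic, assuming a made-up two-week sleep log (the `sleep_hours` values are hypothetical). The area of a stepped chart is just bar height times bar width, so with one-day bars the area of a region is the total hours slept in it.

```python
# Hypothetical sleep log: hours slept on each of 14 consecutive nights.
sleep_hours = [6, 8, 7, 5, 6, 9, 8, 7, 7, 6, 5, 8, 6, 7]
DAY_WIDTH = 1  # each bar in a stepped area chart is one day wide


def area(hours, width=DAY_WIDTH):
    """Area of a stepped area chart: sum of bar height times bar width."""
    return sum(h * width for h in hours)


# (hours / day) * days = hours, so the areas are directly comparable.
week1 = area(sleep_hours[:7])
week2 = area(sleep_hours[7:])

print(f"week 1: {week1} h, week 2: {week2} h")
print("more sleep in week 1" if week1 > week2 else "more sleep in week 2 (or equal)")
```

Doing the same with daily weights would give "lbs * days", which is the elevator operator's statistic and not much use to anyone else.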

One of the things I dislike about stacked area charts (or stacked bar charts) is that they are in many cases used to show percentage breakdown over time.

The problem with representing percentage breakdown over time this way is that it visually eliminates the size of the sample, be it size of market, number of users, page visits, etc. It visually implies that the sample size has stayed the same over time.

Take these two charts representing the breakdown of smartphone shipments by manufacturer: http://s831.us/1kmFK28 and http://s831.us/1kmFrEz

The first displays the percentage of market by manufacturer over time. In this chart, Apple's performance looks mediocre.

Then look at the second graph that displays the number of units shipped by manufacturer. Here the amazing growth of the smartphone market is visually captured along with the breakdown of which manufacturers are driving or benefiting from the growth.

There are things stacked charts can show you much more clearly than other things. For a particular process I'm working on, I have a timing breakdown rendered as an area chart (not normalized, ordered by total time) that showed something I'm not sure would have been visible any other way.

My ascii art skills are failing me, and this is going to be hard without visual aids, but I'll try...

Overall, there was a significant and stable jump from best case to worse case (not worst case). What was interesting was that a chunk of time the size of that delta always fell in a single region but it was not always the same region but always roughly the same amount of time from the start. Since the process is small scale and kicked off at varying times, this means it was something asynchronous but triggered by our activity (or activity we were responding to).

I've always felt like I had some deficiency when it came to viewing stacked area charts. I rarely liked them but couldn't put my finger on why (not that I really gave it much thought). This article's explanation of our eyes' tendency to see thickness rather than the vertical distance is spot on. I find the stacked columns way easier to read, although if you had a lot of data points I agree with several of the other posters that a boring old line chart is going to be the easiest to read.

Gauges and 3d area charts are other examples of misleading visualizations that take up a lot of space without deriving much meaning. Why do we keep seeing them? They're flashy and make my dashboards look like the one my competitor is using! That's just the world we live in.

After studying data visualization, it's surprising how most popular dashboard widgets/visualizations are relatively bad at encoding data versus simple things like sorted tables.

We use them on internal dashboards and they can be extremely useful. One use case is keeping track of bad orders: we work on the dropship model, so the top line is our total bad order percentage and each area is the bad orders from a certain supplier. When the goal is to lower the top line, we focus all our efforts on the largest areas.

Hahaha. The solution described works if you have t=6. Try having t>>25.

The perfect use case for stacked area charts (but not pinned to 100%) is when there are many categories, but one is very predominant, and you're interested in (a) the sum of all quantities and (b) the participation of the principal category.

Example: a country is mainly reliant on hydropower for domestic electricity; there are years where it doesn't need any other source (it's still cheapest....). So we want to track (1) the evolution of domestic consumption, (2) the share of hydropower, and (3) which years it hasn't been enough.
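
For what it's worth, the three quantities can be sketched from the two stacked series like this. The figures are invented, and I'm assuming consumption is simply hydropower plus everything else (`hydro`, `other`, and the years are all hypothetical).

```python
# Hypothetical yearly figures in TWh; not real data for any country.
years = [2008, 2009, 2010, 2011, 2012]
hydro = [50, 55, 52, 60, 58]
other = [0, 0, 6, 0, 9]  # everything that is not hydropower

# (1) Domestic consumption: the top edge of the stacked area chart.
total = [h + o for h, o in zip(hydro, other)]

# (2) Share of hydropower: the bottom band relative to the top edge.
hydro_share = [h / t for h, t in zip(hydro, total)]

# (3) Years where hydropower was not enough: a second band appears.
shortfall_years = [y for y, o in zip(years, other) if o > 0]

for y, t, s in zip(years, total, hydro_share):
    print(f"{y}: {t} TWh consumed, {s:.0%} hydro")
print("hydro not enough in:", shortfall_years)
```

With the predominant category at the bottom, all three are readable straight off the chart: the top edge, the thickness of the bottom band, and wherever a sliver of another color shows up.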

I think this more illustrates a problem with trying to generalize solutions. The final stacked area chart with green at the bottom works perfectly for this data set. Even the author's proposed solution fails to emphasize the growth of the green section.

To be fair, it's misuse that is the problem here. Like any tool, the right one should be selected for the job. There are people out there who generalize this same logic to "charts" without qualification (they prefer tables).

Not meta enough. When was the last time you designed the chart for a book or other static media?

I'd like to think that in this age there's no need to force a static, unexplorable view of the data to the user at all.

Print still exists, you know.

Talking about fancy visualisations, you should definitely have a look at horizon graphs.


Nice, but I think it would take some time to get used to the "overplotting" thing. Sometimes it's better to just take up a bit more space for clarity's sake.

Making a stacked graph interactive can help like this one I prepared earlier:


You don't have to stack it like that:


I hate stacked charts, period. I would much prefer a line graph - and with absolute values rather than percentages.

Stacked charts are no better when trying to compare categories other than the one closest to the x-axis against each other.

Color me dumb, but that chart/graph made perfect sense to me as it was.

Yeah, that's a fair point. These charts are misleading. Then again, if you generally trust charts for an easy interpretation of data, you're probably going to be misled anyway. Perhaps I express an unpopular opinion, but the only way to really get a handle on the data is to read the data. (I find it to be a rare occurrence that a chart exposes --- rather than oversimplifies --- relationships in the data. They're mostly okay for a cursory glance, but not for more. In this example, they're not even okay for a cursory glance.)

How would I determine if the pattern was linear or logarithmic, whether an observation was an outlier, or whether two series were related?

A tiny bit ad hominem, but I take it you are not a statistician. The first thing we do is to graphically illustrate the data, because human brains are so good at recognising patterns.

It is important to experiment with several variations of display, also including and excluding outliers and different series, to see if there are biases that mislead you.

Only then do we come up with a mathematical model to fit.

Actually reading the data line by line can be extremely misleading - because you can only compare a few numbers at a time, which fails to give you an insight into the overall variation.

I am in fact a statistician. I work in statistics, mathematics, and computer science. I am often severely frustrated by the oversimplifying and misleading natures of conventional charts.

I have seen good charts. Yes, they exist, and I actually see them quite often. It just so happens that the vast majority of charts and graphical illustrations I see are omitting details at best and dangerously misleading at worst.

This is very true.

Reading Tufte carefully, one realizes that (a) he's read more Derrida than he's willing to admit (there are entire unattributed quotations, probably by accident) and (b) great charting is an art form, perilously predicated on someone being both quantitatively educated and visually gifted.

There is no silver bullet but the table, and even then, Simpson's paradox is always ready to bite you.
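
For anyone who hasn't run into Simpson's paradox, here is a small sketch with invented counts (the groups, options, and numbers in `data` are hypothetical and chosen only to show the effect). Option A wins inside every group, yet B wins once the groups are pooled into a single table.

```python
# Hypothetical (successes, trials) counts, chosen only to show the effect.
data = {
    "group X": {"A": (9, 10), "B": (80, 100)},
    "group Y": {"A": (30, 100), "B": (2, 10)},
}


def rate(successes, trials):
    """Success rate as a fraction between 0 and 1."""
    return successes / trials


# Within every group, option A has the higher success rate.
a_wins_each_group = all(
    rate(*g["A"]) > rate(*g["B"]) for g in data.values()
)


def pooled(option):
    """Sum successes and trials for one option across all groups."""
    s = sum(g[option][0] for g in data.values())
    t = sum(g[option][1] for g in data.values())
    return s, t


overall_a = rate(*pooled("A"))  # 39 / 110
overall_b = rate(*pooled("B"))  # 82 / 110
b_wins_overall = overall_b > overall_a

print("A wins in each group:", a_wins_each_group)
print(f"pooled: A {overall_a:.1%}, B {overall_b:.1%}")
```

The reversal comes from the unequal group sizes: most of A's trials land in the hard group and most of B's in the easy one, so the pooled table mostly compares the groups rather than the options.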

Interesting. Me too, but I have been more into actuarial work for the last five years. One thing I have noticed is that they love tables of numbers. Maybe I need to reconsider.

How do you feel about waterfall charts? (presumably with the table attached)

Reading numerical data has its own drawbacks. E.g. your mind may perceive too much significance between 389 and 412. Even worse if any kind of error bars are involved.

> only way to really get a handle on the data is to read the data.

But charts are way easier to grasp than numbers. If you want to see the difference between the pace of adoption between two systems, it's much more telling to see it graphically than through numbers.
