Edward Tufte's "Slopegraphs" (charliepark.org)
235 points by charliepark 2201 days ago | 38 comments

Long lost in that they're not quite as useful, not as simple to discern info from, and don't function in the same ultra-compact way that sparklines do.

But Charlie, damn fine post on the subject, seriously. I enjoyed the formatting and style as much as the content.

I don't understand the comparison. It seems like slope graphs and sparklines serve very different purposes. Both of the Tufte graphs in the piece (the "receipts of government" graph and the cancer survival rates graph) would not have conveyed nearly as much information if they were sparklines. The cancer graph in particular is magnificent.

The first set of bullets following the taxes graph clearly outlines what this type of graph excels at. I don't think those reasons overlap very neatly with sparklines, which are better at showing trends in time series data in a very small space. To me the key point about these slope graphs is that they show the relative rates between subjects very clearly. Sparklines do not do that as well; indeed, they are often pretty compressed along the y-axis.

I don't think the parent poster was trying to say that someone should use a sparkline instead of a slopechart. I believe they were saying that they weren't as impressed with slopecharts as they were with sparklines. If that is their sentiment, I agree with it fully. I found most of the examples non-intuitive; I had to really study them to understand what each chart was showing. That said, before I would say yea or nay, I'd like to try them with a dataset that I understand really well to see what they look like.

I thought the two examples that were actually created by Tufte were great: they made perfect sense and expressed the data well, while the other examples were kind of a mess. Partly this is because they weren't clear analogues to what Tufte made; they tried to correlate two different pieces of data instead of showing how one piece of data changed over time.

Thank you! I really appreciate that a lot.

You know, I'm somewhat wary of self-posts but I'm very glad you posted this. An interesting Edward Tufte article and gorgeous blog design - that's just great.

Nothing to be wary about when people post their own things. If you spend time writing something and want to share it, more power to you. Votes will determine whether it's of interest to other people.

If you don't like the jaggies in your HTML Canvas pictures, you can make the canvas size be twice as big and use CSS to set its size on the screen back to what you want.

Then you get nicely antialiased graphics.

Don't go crazy on the expansion though or iPhone users will hate you.
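For anyone wondering what that looks like in practice, here is a rough TypeScript sketch. The helper name `supersample` and the two small interfaces are my own illustration (they only exist so the snippet type-checks without a browser); a real canvas element and its 2D context have the same members. The idea is simply that the `width`/`height` properties set the number of backing pixels, while the CSS size sets how big the element appears on screen.

```typescript
// Minimal structural types so the sketch type-checks without DOM typings.
// A real HTMLCanvasElement and CanvasRenderingContext2D satisfy them.
interface CanvasLike {
  width: number;  // backing-store width in pixels
  height: number; // backing-store height in pixels
  style: { width: string; height: string };
}

interface ScalableContext {
  scale(x: number, y: number): void;
}

// Hypothetical helper: allocate `factor` times more backing pixels per axis
// while CSS keeps the on-screen size at cssWidth x cssHeight.
function supersample(
  canvas: CanvasLike,
  ctx: ScalableContext,
  cssWidth: number,
  cssHeight: number,
  factor: number = 2,
): void {
  // Resizing the backing store resets the context state, so scale afterwards.
  canvas.width = Math.round(cssWidth * factor);
  canvas.height = Math.round(cssHeight * factor);
  canvas.style.width = cssWidth + "px";
  canvas.style.height = cssHeight + "px";
  // Scale the context so drawing code can keep using CSS-pixel coordinates.
  ctx.scale(factor, factor);
}
```

In a page you would call something like `supersample(canvas, canvas.getContext("2d")!, 400, 300)` before drawing. Keeping `factor` small is the "don't go crazy" part: memory use grows with the square of the factor.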

You should also be careful to drop your endpoints of horizontal and vertical lines on the xxxx.5 boundaries (in screen space) to keep nice sharp lines, otherwise you get a two pixel wide smear.
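A small sketch of the half-pixel trick, again in TypeScript. The names `snapToHalfPixel` and `crispHorizontalLine` are made up for illustration, and it assumes a 1-pixel line width with no extra scaling applied to the context; with other line widths or transforms the right offset may differ.

```typescript
// Move a coordinate to the centre of the pixel it falls in (n + 0.5),
// so a 1-pixel-wide stroke covers exactly one row or column of pixels.
function snapToHalfPixel(coord: number): number {
  return Math.floor(coord) + 0.5;
}

// Structural subset of CanvasRenderingContext2D used below.
interface LineContext {
  beginPath(): void;
  moveTo(x: number, y: number): void;
  lineTo(x: number, y: number): void;
  stroke(): void;
}

// Hypothetical helper: draw a horizontal line whose y is snapped to n + 0.5.
function crispHorizontalLine(
  ctx: LineContext,
  x1: number,
  x2: number,
  y: number,
): void {
  const ySnapped = snapToHalfPixel(y);
  ctx.beginPath();
  ctx.moveTo(x1, ySnapped);
  ctx.lineTo(x2, ySnapped);
  ctx.stroke();
}
```

A vertical line works the same way with the x coordinate snapped instead.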

Awesome! Thanks for that tip. It hadn't occurred to me that that was possible, but, yeah, that makes sense.

I might pick my Canvas experiments back up, then, as R and the other packages are outside the scope of my day-to-day work, but I could see a JS implementation being useful. Thanks again.

I recommend you try playing with Protovis. Pretty nice for use cases like this.

Or D3, the child of Protovis and SVG.

To be pedantic, Protovis did also use SVG; it just had a layer of abstraction that somewhat obscured this, whereas D3's lower-level syntax makes it feel like you're manipulating SVG directly (making it a bit more intuitive to learn if you're already familiar with SVG).

I'll +1 the recommendation though. D3 is (unsurprisingly, coming from Mike Bostock: http://bost.ocks.org/mike/) brilliant.


How exactly can you do this? I'm doing some WebGL demos and I've taken to multisampling at the shader level because I couldn't figure out a way to cause the canvas to become larger without actually taking up more screen space.

Is that similar to this hack for faster canvas rendering on iPhone http://29a.ch/2011/5/27/fast-html5-canvas-on-iphone-mobile-s... or is it completely different?

Wow. That is a near perfect example of the concept of a clever hack. Thanks for sharing.

The cancer survival rates graph is magnificent. Super easy to understand, very information dense, and shows some very interesting trends. I never really thought about how different the slope of some of those diseases could be. Some cancers give you just about the same chance of being dead in five years as in 20, and others give drastically different odds.

I agree, although it would be nice to compare a version where the baseline risk of death is partialled out. It would be really nice to see when the line for a particular cancer flattened out, indicating a person's prospective risk after n years was no greater than baseline.

I thought the most enlightening part of this article wasn't about the slopegraph itself, but on visualization in general. To me, this was a very effective message about the care necessary to avoid imposing an interpretation on the data.

By selecting the two scales used, the designer of the graph -- whether intentionally or not -- is introducing meaning where there might not actually be any.

For example, should the right-side data points have been spread out so that the highest and lowest points were as high and low as the Switzerland and Mexico labels (the highest and lowest figures, apart from the US) on the left? Should the scale have been adjusted so that the Switzerland and/or Mexico lines ran horizontally? Each of those options would have affected the layout of the chart. I’m not saying that Uberti should have done that -- just that a designer needs to tread very carefully when using two different scales on the same axis.
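To make the scale point concrete, here is a rough TypeScript sketch with made-up numbers (nothing here comes from the chart in the article). `linearScale` and `lineRise` are hypothetical helper names, and for simplicity a larger pixel value means "higher on the page", which is the opposite of real canvas coordinates.

```typescript
type Scale = (value: number) => number;

// Map values in [domainMin, domainMax] linearly onto [pixelMin, pixelMax].
function linearScale(
  domainMin: number,
  domainMax: number,
  pixelMin: number,
  pixelMax: number,
): Scale {
  const span = domainMax - domainMin;
  return (value: number): number =>
    pixelMin + ((value - domainMin) / span) * (pixelMax - pixelMin);
}

// Vertical rise, in pixels, of the line joining a left-hand value to a
// right-hand value when each side has its own scale.
function lineRise(
  leftValue: number,
  rightValue: number,
  leftScale: Scale,
  rightScale: Scale,
): number {
  return rightScale(rightValue) - leftScale(leftValue);
}

// Same pair of (invented) data values, two different right-hand scales.
const leftAxis = linearScale(0, 100, 0, 400);
const rightAxisTight = linearScale(0, 50, 0, 400);
const rightAxisLoose = linearScale(0, 100, 0, 400);

const riseTight = lineRise(50, 25, leftAxis, rightAxisTight); // 0: horizontal line
const riseLoose = lineRise(50, 25, leftAxis, rightAxisLoose); // -100: line falls
```

The data never changed between the two calls; only the designer's choice of right-hand scale did, and that alone decides whether the reader sees "flat" or "falling".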

Not in a web device/timeframe combination to give this a proper look, but definitely like what I briefly had a chance to read. Already saved to revisit later.

For anyone else that enjoyed this and is not aware of Parallel Sets: http://eagereyes.org/parallel-sets

Personally, I've found few good real-world use cases for PSets. But I've found it to be helpful for experimenting, even though I rarely use it for any end-product charts.

[edit: One last thing before I'm out of range. I'm sure most folks interested in this are familiar, but if not... BumpCharts: http://junkcharts.typepad.com/junk_charts/2005/07/in_praise_... ]

One minor point in an otherwise excellent post:

  > Another difference I should note: This type of forced-rank chart
  > doesn’t have any obvious allowance for ties. [...] In Fry’s case,
  > he uses the team with the lower salary as the “winner” of the tie.
  > But this isn’t obvious to the reader.
I wonder if this isn't random. It appears to be the case for the 91-71, 90-72, and the 69-93 ties. In the 80-81 and 80-82 ties the higher salary team is on top.
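For what it's worth, this kind of check is easy to script. A minimal TypeScript sketch, assuming a made-up `Team` shape and hypothetical helper names; I have not run it against Fry's actual data.

```typescript
interface Team {
  name: string;
  wins: number;
  salary: number;
}

// Forced ranking: more wins first; on equal wins, the lower salary goes on top.
function rankTeams(teams: Team[]): Team[] {
  return teams
    .slice()
    .sort((a, b) => b.wins - a.wins || a.salary - b.salary);
}

// Given the order a chart actually displays (top to bottom), list adjacent
// pairs that are tied on wins but have the higher-salaried team on top.
function tieViolations(displayed: Team[]): Array<[Team, Team]> {
  const violations: Array<[Team, Team]> = [];
  for (let i = 0; i + 1 < displayed.length; i++) {
    const upper = displayed[i];
    const lower = displayed[i + 1];
    if (upper.wins === lower.wins && upper.salary > lower.salary) {
      violations.push([upper, lower]);
    }
  }
  return violations;
}
```

An empty result from `tieViolations` would mean the displayed order is consistent with the "lower salary wins the tie" rule; any pairs returned are the exceptions.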

Good catch. In the Visualizing Data book, Fry notes that he's using the lower-salaried team as the tiebreaker, and I took that as fact and didn't check the data (sigh; rookie mistake). I'll point that out in the article when I update it.

Great article---I once used the "slopegraph" in a paper to show how the types of gov't contracts (e.g., fixed-price versus cost-plus) differ at each step of the litigation process:


One of the most enjoyable posts I've read in a long time. Great topic, analysis and beautiful site. Thank you Charlie.

Thank you! I'm so glad you enjoyed it.

A nice interactive feature would be this: for the line graph that a slope graph is a "zoom" of (according to Charlie's explanation), let the reader slide or drag over the line graph and have the slope graph for the currently focused period appear, either over or next to the line graph depending on available space.
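The data side of that feature is small. A minimal TypeScript sketch, assuming each series is a list of yearly points; the `Series` and `SlopeRow` shapes and the function names are invented for illustration, and the drag handling and drawing are left out.

```typescript
interface Point {
  year: number;
  value: number;
}

interface Series {
  label: string;
  points: Point[];
}

interface SlopeRow {
  label: string;
  start: number;
  end: number;
}

// Look up the value a series has for a given year, if any.
function valueAt(series: Series, year: number): number | undefined {
  for (const point of series.points) {
    if (point.year === year) return point.value;
  }
  return undefined;
}

// Reduce each full time series to the two endpoints of the focused period.
// Series with no data at either endpoint are skipped.
function slopeRowsForPeriod(
  allSeries: Series[],
  startYear: number,
  endYear: number,
): SlopeRow[] {
  const rows: SlopeRow[] = [];
  for (const series of allSeries) {
    const start = valueAt(series, startYear);
    const end = valueAt(series, endYear);
    if (start !== undefined && end !== undefined) {
      rows.push({ label: series.label, start: start, end: end });
    }
  }
  return rows;
}
```

A drag handler would then only need to translate the selected pixel range into a start and end year and redraw the slope graph from the returned rows.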

Back in 2004, Edward Tufte defined and developed the concept of a “sparkline”. Odds are good that — if you’re reading this — you’re familiar with them and how popular they’ve become.

Really? I've definitely read about sparklines, but I've never seen one used naturally, as it were.

Me neither, does anyone have evidence of how popular they've become?

They're used extensively in Google Analytics and WebTrends


(Google analytics product pages seem to be down for me)


No links at hand, but I've seen many sparklines in financial website articles next to GOOG or AAPL.

They are built into Excel -- you can show them in a single cell.

Interesting post. I think that two-column comparison tables are similar to the slopegraph and (if designed well) can convey the information almost as well, particularly if they have different coloured rise and fall arrows (with number of places). The bonus is that it's simple formatting: since nothing breaks the line, it can be done with text and inline symbols instead of as an image chart. The loss is that the connection between elements is not as strongly emphasized, but it's fairly easy to tell anyway.

I don't have examples off-hand, so I hope I've explained what I mean well enough.
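In lieu of a real example, here is a rough TypeScript sketch of the kind of text-only table I mean. The `RankedItem` shape, the helper names, and the arrow symbols are all just one possible choice; colour would be applied by whatever renders the text.

```typescript
interface RankedItem {
  name: string;
  before: number; // rank in the first column, 1 = top
  after: number;  // rank in the second column, 1 = top
}

// Arrow plus number of places moved; "=" when the rank is unchanged.
function changeMarker(item: RankedItem): string {
  const placesUp = item.before - item.after;
  if (placesUp > 0) return "▲" + placesUp;
  if (placesUp < 0) return "▼" + -placesUp;
  return "=";
}

// Pad with trailing spaces so the second column lines up in monospace text.
function pad(text: string, width: number): string {
  let out = text;
  while (out.length < width) out += " ";
  return out;
}

// One text line per rank: the "before" ordering on the left, the "after"
// ordering with rise/fall markers on the right.
function comparisonTable(items: RankedItem[]): string[] {
  const byBefore = items.slice().sort((a, b) => a.before - b.before);
  const byAfter = items.slice().sort((a, b) => a.after - b.after);
  let width = 0;
  for (const item of items) width = Math.max(width, item.name.length);
  const lines: string[] = [];
  for (let i = 0; i < items.length; i++) {
    lines.push(
      pad(byBefore[i].name, width) + " | " +
        byAfter[i].name + " " + changeMarker(byAfter[i]),
    );
  }
  return lines;
}
```

The markers carry what the connecting lines would otherwise show, which is why the link between the two columns is weaker but still recoverable.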

The post says "There’s absolutely zero non-data ink."

The names of the countries are printed twice. As far as data is concerned, one set contains none. The second printing may help readability, but readability isn't data.

You can tell one set isn't necessary because the cancer graph at the bottom leaves them out of the middle columns.

The bullet point at the bottom "Include both the names of the items and their values on both the left-hand and right-hand axes" could add "If your goal is zero non-data ink or your space is tight, you can leave out one set of names."

Alternatively, you could keep the double printing of the names and leave out the lines connecting them; they too are just for readability. Of course, you can't remove both the lines and the second printing of names.

Well, if you remove the lines, you give up any information besides start-end rankings, in which case you might as well just have two columns of names next to each other. E.g., look at Tufte's cancer survival rate graph, where it's not just a straight line from start to end but multiple segments, each endpoint being a fresh dataset, with the datasets staggered over time.

I meant erase the lines but keep the placement of the text in the columns, not just the order of the labels, but the actual positions of them, so that you could connect them with lines again if you were so inclined.

Good point. Thanks for noting that. I'll add that in.

What is that bold font called? I keep seeing it everywhere lately.
