Mind you, this is a good statistics lesson. My teenage students might get quite interested in a 'discussion' of the relative merits of various kinds of phone, before representing the data and then comparing it to the UK.
US Mobile phone market share by OS and maker
iOS (Apple) 34%
Windows Mobile 3%
Windows 7 1.3%
I find the rounding of the larger market shares to the nearest percent a bit worrying given that Palm is being listed as a discrete total. The 'rounding error' on Apple's % could be bigger than HTC/Windows 7, for instance.
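To put a rough number on that worry: a value rounded to the nearest whole percent could sit anywhere within half a point of the printed figure. A minimal sketch, assuming plain round-to-nearest (the helper name `rounding_interval` is mine, and how Nielsen actually rounded isn't stated anywhere):

```python
def rounding_interval(reported, step=1.0):
    """Range of true values that would round to `reported`.

    Assumes plain round-to-nearest with granularity `step`. That is an
    assumption; the source does not say how the figures were rounded.
    """
    half = step / 2
    return (reported - half, reported + half)


# Apple's share is listed as 34%, rounded to the nearest percent.
low, high = rounding_interval(34)
uncertainty = high - low  # full width of the interval, in percentage points

print(f"Apple: somewhere between {low}% and {high}%")
print(f"Interval width: {uncertainty} points, versus 1.3% listed for Windows 7")
```

Under that assumption the interval is a full point wide, which is the same order of magnitude as the entire Windows 7 line.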
In any case, I think the correctly drawn rectangular chart conveys the significance in a single holistic glance, while the table above requires quite a few "memory registers" for comparing the sets and their parts.
I take your point about the correctly drawn/direct proportion version of the graphic being more quickly assimilated. I just think that with such a 'dynamic range' of data (0.1% compared with 0.2% in one category compared with 17% vs 14% in another) any kind of graphic will require very high resolution to convey the overall picture. I suspect that may have been the reason for the non-proportionality in the original Nielsen graphic.
For instance, I never knew Palm Inc. made Windows Mobile devices. I had to DuckDuckGo it; I've never even seen anyone use one. And apparently Palm stopped building them in 2009.
Also, it's shocking (to me) that Windows Mobile still has double the installed base of Windows Phone. It's not like Windows Mobile was a great smartphone platform, given that its Pocket Internet Explorer was comparable to IE5-6.
The 9% installed base for Blackberry doesn't surprise me, and I imagine that long after RIM doesn't make phones anymore, there will be users who are still tied to its services. I foresee it going the way of IBM.
Again, this version might be proportionally more accurate, but it drastically reduces the amount of information. Nielsen's is the best so far, by far.
If the problem is not showing all the information, then find a totally different type of graph. Or put them into multiple graphs, one showing the 4 biggest players, then an "other" segment. Then show another graph of the "Other" segment and show the proportions in that graph.
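The two-graph idea is easy to prototype before anyone draws anything. A rough sketch, assuming the shares are available as a simple name-to-percent mapping; `split_top_n` is a made-up helper, and the numbers are just the ones quoted in this thread, which don't cover the whole table:

```python
def split_top_n(shares, n):
    """Split shares into (main chart data, detail chart data).

    main:   the n largest entries plus an "Other" bucket for the rest.
    detail: the remaining entries, rescaled so they sum to 100% of "Other".
    """
    ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    top, rest = ranked[:n], ranked[n:]
    other_total = sum(value for _, value in rest)

    main = dict(top)
    if rest:
        main["Other"] = other_total

    if other_total:
        detail = {name: 100 * value / other_total for name, value in rest}
    else:
        detail = {}
    return main, detail


# Percentages quoted in this thread. They do not add up to 100 because
# not every row of the original table is reproduced here.
shares = {
    "Android": 51,
    "iOS": 34,
    "BlackBerry": 9,
    "Windows Mobile": 3,
    "Windows 7": 1.3,
}

main, detail = split_top_n(shares, 3)
print(main)    # data for the first graph: big players plus "Other"
print(detail)  # data for the second graph: breakdown of "Other"
```

Each dictionary can then be handed to whatever plotting tool you like, and both graphs stay proportional within themselves.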
If they pull the line that it was a mistake on the artistic front to get the text to fit into the boxes, I'll file that under criminal neglect.
> This looks suspiciously like logarithmic scaling which,
> as XKCD readers will attest, is a pretty typical
> technique for representing statistics which have
> differences measured in orders-of-magnitude.
> it's not as if Nielsen only bumped Nokia, RIM and
> Microsoft's column widths.
I've now updated it so that you can click through to a larger version of the image.
Also, since Livejournal seems to be having a go-slow, here's a link to the two images:
And here is 9to5Mac with an even better one:
Update: any chance you could get two graphs made up? The first showing only the biggest six players plus an "Other" that is the rest of the market share, and then another which takes the "Other" results and plots the market share again in a totally separate graph?
The main problem was being able to still display readable text, which should lead to using another way of visualising the data instead of leading to using an incorrect graphic.
I also looked into using a stacked bar chart to display the information, more information (and code for generating plot, though I didn't put too much time into making it look pretty) here:
1) Everyone underestimates the number of stupid individuals in circulation.
2) The probability that a certain person be stupid is independent of any other characteristic of that person.
3) A stupid person is a person who caused losses to another person or to a group of persons while himself deriving no gain and even possibly incurring losses.
4) Non-stupid people always underestimate the damaging power of stupid individuals.
5) A stupid person is the most dangerous type of person.
The whole point of these diagrams is that one can look at them and immediately get a feel for the proportions dictated by the data. If it looks like a third, it should be a third. If you have to change that, then it's the wrong diagram. Or a deliberate attempt to misrepresent.
I'm betting that someone wanted to use the new funky type of diagram because it's new and funky, instead of an appropriate one.
> Update 7/16: The original graphic in this post included a chart depicting U.S. smartphone manufacturer share which did not scale proportionately. While all data points in the original post and graphic remain accurate, the post has since been updated with a correctly scaled image.
To compound that by appearing to have proportional sizes, but being completely wrong, is humorously incompetent. Reminds me of those joke maps showing NYC as dominating the US.
Research I've seen suggests the exact opposite -- that people are great at estimating area of rectangles, and terrible at estimating area of circular or triangular shapes. They know which circle is bigger, but don't have any idea by how much bigger.
That said, pie chart slices work for simple data not because of area, but because all you have to compare is the simple length of the circumference segments. That's a single dimension, and easy to compare.
See "Pizzas: π or Square? Psychophysical Biases in Area Comparisons", http://groups.haas.berkeley.edu/marketing/PAPERS/PRIYA/p5.pd... for how people lean on a single dimension for size or area comparisons.
Unfortunately, as this cell phone OS chart is trying to support comparison of multiple sets and subsets, circumference segments alone are inadequate to convey the relative sizes of the sets.
The more complex the information, the more the usefulness of the nested rectangles versus pie chart slices becomes clear. Imagine for example a visual representation of drive space usage by directory and subdirectory.
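For anyone who hasn't met the nested-rectangle layout, the simplest variant (usually called slice-and-dice) can be sketched in a few lines. This is only an illustration under assumptions: the directory names and sizes are invented, sizes are assumed positive, and real disk-usage apps may well use a different layout algorithm.

```python
def total_size(node):
    """Size of a file (a number) or the summed size of a directory (a dict)."""
    if isinstance(node, dict):
        return sum(total_size(child) for child in node.values())
    return node


def slice_and_dice(node, x, y, w, h, path="", horizontal=True):
    """Return {path: (x, y, width, height)} for every leaf under `node`.

    Each directory's rectangle is cut into strips proportional to its
    children's sizes, alternating the cut direction at each nesting level.
    """
    if not isinstance(node, dict):
        return {path: (x, y, w, h)}

    rects = {}
    total = total_size(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for name, child in node.items():
        frac = total_size(child) / total
        child_path = f"{path}/{name}"
        if horizontal:
            rects.update(slice_and_dice(
                child, x + offset * w, y, frac * w, h, child_path, False))
        else:
            rects.update(slice_and_dice(
                child, x, y + offset * h, w, frac * h, child_path, True))
        offset += frac
    return rects


# Hypothetical drive contents, sizes in GB (invented for illustration).
tree = {"home": {"photos": 40, "music": 20}, "usr": 30, "var": 10}

rects = slice_and_dice(tree, 0, 0, 100, 100)
for leaf, (rx, ry, rw, rh) in rects.items():
    print(f"{leaf}: area {rw * rh:.0f}")
```

Every leaf's area comes out proportional to its size, and the subdirectories sit inside their parent's rectangle, which is what gives the single-view overview.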
Here is a drive space chart using nested rectangles:
For comparison, here is an attempt to do the same using pie charts:
In fact, if you use both apps, you'll see DaisyDisk is not able to give you the "grand perspective" in a single view: it requires clicking to zoom in on a slice, which it then expands to a full pie to drill down.
I'm not suggesting circles of different sizes instead of rectangles, which is what this study is about. While interesting, it's not relevant to my statement.
> That said, pie chart slices work for simple data not because of area, but because all you have to compare is the simple length of the circumference segments. That's a single dimension, and easy to compare.
Big whatever to that. It's easier to compare a pie chart than rectangles of varying orientations, which is my point. I don't really care if it's because it's a linear measurement vs an area derived from a linear measurement. Let's be pedantic, shall we?
With these rectangle/tree map things, I never know what to think: well this one is wider, but this other one is taller; you have to do multiplication just to compare 2 market shares.
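A toy case of the "wider vs taller" problem, with made-up dimensions (arbitrary units, not measured from the Nielsen chart):

```python
def area(width, height):
    """Area of a rectangle; this is the quantity a treemap asks you to compare."""
    return width * height


# Hypothetical blocks: one is clearly wider, the other clearly taller.
wide = (8, 3)
tall = (5, 5)

print("wide block:", area(*wide))
print("tall block:", area(*tall))
print("wide block is bigger:", area(*wide) > area(*tall))
```

Here the block that looks dominant because of its width actually represents the smaller share, and you only find that out by multiplying.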
Why do the titles on HN keep getting neutered?
I wasn't praising the headline because it was 'clever' though. I think the question it raised highlighted a bigger issue that still hasn't really been examined. People are focusing on the importance (or not) of accurate charts. I'm more interested in understanding why a company that exists to provide a 'complete understanding of what consumers watch and buy' is misrepresenting data.
For example, try actually making sense out of the second diagram on that site. It's far harder to read than the first. You can actually only make sense of it by cross-referencing with the first diagram.
What kind of logic is that?
The question IMO is not whether it is visually misleading but whether it conveys information in a usable form, and whether the information it conveys in that form is misleading, and I don't see it.
If we adopt a visually perfect line doesn't that mean all graphs must be linear (no plotting on non-linear scales, even though this is useful in some contexts), start with an origin of (0,0) and so forth? What percentage of graphs do you see that obey these rules?
The Nielsen graph clearly represents 51% Android as about 33%, Apple's 34% as one quarter and the Blackberry as one-fifth. If you prefer readability of the labels over more-or-less-correctness of a graph which suggests it covers a full 100% and its subdivisions, you should maybe just use a table. As in Excel. A 30-31 x-scale is misleading, but showing 10% as 20% is more than sloppy because you do show the "origin". This has got nothing to do with visually perfect.
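Taking those eyeballed figures at face value, the size of the distortion is simple division. A quick sketch; the "drawn" numbers are the rough estimates from the comment above, not measurements of the image, and `distortion` is just a name picked for this example:

```python
def distortion(actual_percent, drawn_percent):
    """How large the drawn share is relative to the true share (1.0 = honest)."""
    return drawn_percent / actual_percent


# (actual %, approximate drawn %) as estimated in the comment above.
estimates = {
    "Android": (51, 33),
    "Apple": (34, 25),
    "BlackBerry": (10, 20),
}

for name, (actual, drawn) in estimates.items():
    print(f"{name}: drawn at {distortion(actual, drawn):.2f}x its true share")
```

On these estimates the two leaders are shrunk to roughly two-thirds and three-quarters of their true size while BlackBerry is doubled, so the distortion isn't even in a consistent direction.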
While making for bad UX, it doesn't change the truth of the information at all, and it would be easy to fix these issues.
Edit: Looks like someone already has: http://9to5mac.com/2012/07/13/nielsen-needs-to-work-on-their...
One in three smartphones is an iPhone.
That’s obviously the wrong tradeoff. Never outright manipulate your data to make it more readable! If you can’t make your data readable enough without manipulating it you just can’t present your data that way. Period. Find a better way.
(I do not think there is any malice involved, though. Just pure stupidity. I mean, look at the number of people around here arguing for readability over correctness. If they are out here, some are also working for Nielsen.)
Part of my job involves electrical drafting, and though I'm positively anal about my drawings, if I manipulate the design to make the drawing more clear and presentable, it's a snowball's chance I'd ever be excused.
In my case, the risk is starting a fire in a power supply; in theirs, manipulating the market death of a product, however many dollars that might entail. Professionals have higher standards in their field, precisely because we trust them with expert information and give more weight to their decisions. Stupidity is no excuse in this case.
Since 50% isn’t really a very important threshold (though that could be argued) I would very much argue against using a pie chart. (I prefer areas or lengths to angles.) Plus, a pie chart wouldn’t make it easy to add platform subdivisions.
[edit: If you wish to say a log scale is inherently misleading in this context ... go for it. That's different than saying the data is manipulated.]
[Edit2: The area is not meaningful. The widths are meaningful if there is a total ordering and the scale is labeled. ... I do however agree I am probably way overestimating how obvious a [log(1+cumulative percentile),log(1)] mapped to the xaxis is. Also the chart does scream compare areas, and those are strictly meaningless unless comparing within the same OS.]
If the X axis is a log scale of market share, what would happen if Apple and Android both had 40% market share? Both bars would overlap.
If X is cumulative market share, the bar width would depend on order and the two hypothetical 40% companies would have different widths.
If each bar has area proportional to the log, how would that work? The logs of market share are negative unless there is an arbitrary constant in there. Also the vertical breakdown doesn't make any sense in that case, because the areas of the vertical blocks don't add up to the OS total. Also small market shares would have negative width?
So can someone explain how log scale could even theoretically work here?
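The objections above can be checked numerically. This is a sketch of two possible readings of "log scale" here, both of which are my assumptions rather than anything Nielsen has described: bar width as the log of the share plus an arbitrary constant, and bar edges placed at log(1 + cumulative share).

```python
import math


def log_width(share_fraction, offset=0.0):
    """Reading 1: width = log10(share) + constant. Negative without an offset."""
    return math.log10(share_fraction) + offset


def cumulative_log_widths(shares_percent):
    """Reading 2: bar edges sit at log(1 + cumulative share / 100)."""
    widths = []
    cumulative = 0.0
    for share in shares_percent:
        left = math.log(1 + cumulative / 100)
        cumulative += share
        right = math.log(1 + cumulative / 100)
        widths.append(right - left)
    return widths


# Reading 1: every share below 100% has a negative log.
print(log_width(0.51), log_width(0.001))

# Adding a constant fixes the sign, but the visual ratio between two bars
# then depends entirely on which constant was picked.
for offset in (4, 5):
    ratio = log_width(0.51, offset) / log_width(0.001, offset)
    print(f"offset {offset}: big bar is {ratio:.2f}x the small one")

# Reading 2: two hypothetical companies at 40% each get different widths
# purely because of their position in the ordering.
print(cumulative_log_widths([40, 40]))
```

Neither reading gives a width that means the same thing for every bar, which is the point of the question.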
At least since the appearance of data sets that lent themselves towards such visualization. For example, how better to compare the growth of various platforms, from TRS-80 days to the iPad while the entire industry grows exponentially?
Edit: But I guess that's really market magnitude over time, not quite what you asked.
1. The numbers are right there for all to see.
2. Nielsen's business is not to make academic-grade charts. Their version is far, far better for their customers' needs.
The latter is just a bare assertion. Got proof? I'd bet not. Which is why you had to say "far, far", hoping that people would just go along with you.
Personally, I'd think that Nielsen's business is to make sure their customers know what's going on. That's not an argument for running the second graph; it's an argument for making a third graph that conveys the correct intuition. Or just to publish a table of numbers.
In my experience, that is not the case. The information they provide is generally used by middle managers in large companies in internal powerpoint decks with the intent of waging intra-company warfare. The use of the info is highly political and opinionated, not rational and academic.
So, yeah, it would be preferable to put out an immaculate chart with perfect proportions, good design, and clear text. But often it's just easier to cram the words in and make it fit. The bottom line is that the intended target of these charts just does not care about these details. They have an agenda of their own, and will use the Nielsen data to advance it. For Nielsen to spend time and money obsessing over this sort of thing would go largely unappreciated.
Is it great? no. Even good? no. Does it meet their customers' standards and needs? Yes.
This would be extremely short sighted thinking.
I'm a big believer in the art of not doing work that's unnecessary, but the art is in knowing when it matters. When not doing the work directly and publicly contradicts your brand's supposed strengths, that's a problem.
Nielsen's brand is built upon a reputation of high quality and detailed demographic data. This is the basis on which customers buy data from Nielsen and what gives that hypothetical middle manager's powerpoint slide some weight. "This is from Nielsen so we can trust that it's good data not some up-and-to-the-right chart I tortured out of our data."
Events like this damage that brand. The damage may not manifest itself directly in sales up front, but long term, if the weight of "this is from Nielsen..." is gone, then even in the cynical case where all the customers are clueless, Nielsen will lose out to another data provider that has the right reputation.
This is apparently a very offensive line of reason here, judging by the downvotes I've acquired for pointing this out. I find that to be interesting in its own right.
If the case is that they know the data is incorrect/misleading but they use it anyway to advance a cause, that sounds like the kind of internal politicking that cripples many larger businesses' ability to make good decisions. So IMO that's worse than just being clueless, and I don't think it's very ethical.
Now maybe you're saying that it's possible to get the right idea from a misleading chart and get some value from it. That's true, but it's also true that you can get the wrong idea from it and make a poor decision.
ps. didn't downvote you btw.
Look what happened here: if either RIM or Microsoft were relying on the data to be under-analysed (not saying they were - just an example!) then it would have blown up in their face fairly spectacularly. And Nielsen is a reputable firm!
Try looking up the Mindcraft Windows NT vs. Linux benchmarks also.
I also think your "nobody cares about the data" argument is weak. But I suspect you know that already, and were just trying to argue your way out of a hole so I won't belabor it.
A statistics and data-driven company like Nielsen should be ashamed, doubly so if they haven't published a correction.