On a related note, we could take another simple step toward improving the way people communicate about statistics: ban point estimates in news reports. Using a point estimate, I can accurately claim that there's a 50% chance heaven and hell exist. This is technically true because my confidence ranges from 0% to 100%.
I get that people often need so-called "killer facts" — quick statistics that make an impression in press releases — but confidence intervals are more honest and they don't take much longer to communicate. Instead of allowing people to say that "x% of users prefer...", let's say that "x-y% of users prefer."
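To make the "x-y%" idea concrete, here's a rough sketch (poll numbers are made up) of the normal-approximation interval for a sample proportion, which is what most polling error margins are based on:

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion
    (z=1.96 gives roughly 95% coverage)."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

# Hypothetical poll: 520 of 1,000 users prefer X.
lo, hi = proportion_ci(520, 1000)
print(f"{lo:.0%}-{hi:.0%} of users prefer X")  # prints "49%-55% of users prefer X"
```

The point is that "52% of users prefer X" and "49-55% of users prefer X" take about the same number of characters, but only the second one tells you the sample was small enough that the race is effectively a toss-up.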
I get the point you are making, but confidence intervals are completely the wrong solution. They do NOT mean what you probably think they mean (where "you" is a generalized "you the reader").
"What is going on here is that the commentators are assuming we live in a noise-free world. They imagine that everything is explicable, you just have to find the explanation."
They know full well that there isn't anything behind it. The problem isn't the news organization, it's the media consumers. People want stories so they can understand the world in easy, bite-sized pieces. Unfortunately, what affects us doesn't come with a simple story line; it's just one damned thing after another. But no one would sell any news with "Stuff happens, we have no idea why. More stuff to happen later."
>The problem isn't the news organization, it's the media consumers.
I've always been somewhat disheartened by comments like this. People will blame consumers for how media is but then say this about technology:
"If I had asked people what they wanted, they would have said faster horses."
At some point someone sold the idea of sound-bite consumable news to customers and convinced them that it was good. The way we know that is that news had never looked that way until CNN and networks like it started.
Consumers either know what they want or they don't know what they want... but I hardly think that issue is cut and dried.
>At some point someone sold the idea of sound bite consumable news to customers and convinced them that it was good.
What makes you say this? This seems pretty false in my opinion.
It seems much more likely that sound-bite consumables are the outcome of testing/feedback and iteration. It seems entirely possible to me (if not probable) that consumers (for the most part) don't know what they truly 'want' in terms of news.
We live in a world of (largely) instant gratification; the fact that 'news' would have to adapt to fit that lifestyle makes perfect sense to me.
> People will blame consumers for how media is but then say this about technology: "If I had asked people what they wanted, they would have said faster horses."
What point are you trying to make here? I'm having difficulty parsing your meaning. Humans, when they aren't versed in the underlying operations of a system, WILL make broad assumptions (about said operations) and WILL have irrational expectations and 'wants' based on those assumptions.
What would my grandmother (or even mother, for that matter) ask me to improve about her spyware-ridden current-gen laptop? "Make it faster." If I asked her how to improve the news, on the other hand, I guarantee "make it faster" would not enter the conversation.
Soooo, what's your point here? That consumers do want 'faster' news? And why would a consumer's desire for more succinct news imply that they are okay with less accurate news...?
Obviously the issue isn't cut and dried; everyone is a consumer of the news and we all have our own opinions. That said, I have a hard time disagreeing with the parent comment. Have you watched the news recently? 'News' stories are NOT given time based on their importance to the world as a whole, and for good reason: when a media company (whose purpose is to provide fiscal return) is met with positive feedback for stories pertaining to popular culture, or pandering to popular opinion, it will invariably continue to publish in that vertical. The result is a 'news' which consists mainly of stories that are either racially/politically charged or major events (read: actual news) strung along for weeks (natural disasters, terrorist actions, etc.).
I guess I just don't like the way people talk about consumers as a single, decision-making entity. It ends up being a whipping boy that distracts from core problems. Example:
"Our technology succeeded; that's because consumers are idiots, but we sure showed them a better way!"
or
"Our technology produced an offensive cultural phenomenon, that's because consumers are vultures who force us to bend to their will"
One of these statements can be true but not the other... but frankly both of them are false, as markets are infinitely more complex than any mass generalization about consumption allows...
Error bars are usually 68% (1-sigma), unless stated otherwise. In physics, confidence intervals are usually 95% (2-sigma), though 90% is becoming increasingly common.
I was talking about error bounds expressed as a confidence interval, e.g., in political polling. Not error bars used in various plot types (e.g., box or whisker plots), which are based on standard deviation.
As I noted, it's a convention, and really, the CI should be explicitly stated.
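For anyone wanting to check the convention, the 68%/95% coverage figures for 1-sigma and 2-sigma fall straight out of the normal CDF; a quick sketch:

```python
from math import erf, sqrt

def coverage(k_sigma):
    """Probability that a normally distributed value lands within
    k standard deviations of the mean."""
    return erf(k_sigma / sqrt(2))

print(round(coverage(1), 4))  # 0.6827 -> 1-sigma error bars
print(round(coverage(2), 4))  # 0.9545 -> 2-sigma confidence intervals
```

Which is exactly why stating the CI explicitly matters: "±1 unit" means very different things at 1-sigma and 2-sigma.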
That world seems pretty pointless to me. What should the public do when tomorrow's high is forecast at 30±3 degrees but it's actually 35 degrees? Why was the hedge an improvement?
If the temperature range is, say, close to freezing, or some maximum safe temperature for an activity or crop, then the error bound tells you how certain the forecast is, and whether or not an adverse condition might be experienced, for which you might want to take precautions (cover plants to protect them from freezing, ensure that they're watered for heat, say).
That's not true at all. Look at the example I asked about -- "what should we do when the forecast says the high temperature will be 27-33, but the high temperature is, contrary to forecast, 35?"
A minor amount of imagination can transform this question into "what should we do when the forecast says the low temperature overnight will be 2-6 degrees, but it's actually -1?" If you think the forecast will tell you "whether or not an adverse condition might be experienced", you're likely to do something stupid. A more sensible approach would be "if it's frost season, cover the plants".
Weather forecasts already treat the odds differently depending on (a) what they are, and (b) what events are being predicted. A 70% chance of rain tomorrow might be reported as a 70% chance of rain, but a 10% chance of rain is more likely to be "overreported", say as a 40% chance. They do that because people don't want accurate numbers -- instead, they get mad when the forecast suggests that they don't need to prepare for unfavorable weather, and unfavorable weather inconsiderately happens anyway.
>A more sensible approach would be "if it's frost season, cover the plants".
If you're a homeowner and you've got a small garden or some potted plants, covering them in straw or bringing them indoors isn't a major concern.
If you're a farmer with a crop that may need to be harvested, smudged, or sprayed with water (ice coating protects some crops), with a nontrivial investment of time and materials, the distinction is meaningful.
If you're a homeowner with a small garden, losing the plants to frost isn't a major concern.
If you're a farmer with a crop that may need to be harvested, smudged, or sprayed with water, losing everything to frost is the kind of thing that might drive you to bankruptcy. Say there are 30 days over the course of winter where the overnight low temperature is forecast near but above zero. By your assumption, the forecast is using 95% confidence intervals. What are the odds of your crops dying if you rely on the weather forecast?
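To put a number on that rhetorical question, a back-of-the-envelope sketch using the hypothetical figures above (a two-sided 95% interval, so roughly a 2.5% chance per night that the true low undershoots it, with nights treated as independent; both are assumptions, not facts about any real forecast):

```python
# 30 borderline nights; each night there is a ~2.5% chance the true low
# falls below the lower bound of a two-sided 95% forecast interval.
p_undershoot = 0.025
nights = 30

# Probability of at least one undershoot across the season.
p_at_least_one = 1 - (1 - p_undershoot) ** nights
print(round(p_at_least_one, 3))  # prints 0.532
```

Under those assumptions, relying on "the lower bound is above zero" works out to roughly a coin flip over the season, which is the parent's point: the interval alone doesn't tell you whether to protect the crop.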
Again, just because the forecast says 3±1 doesn't mean the low temperature will be at least 2 degrees. If your crops will be killed by frost, and the forecast says 3±1, you need to protect your crops. It's really not that different from 3±6; you'll do the same thing given either forecast.
In the US, the QCEW (Quarterly Census of Employment and Wages) is a data set that is primarily aimed at economists and modelers. The startling thing is how these data sets became mainstream. I work in economics, and for years no one was particularly conscious of when these figures are released (non-quantitative people, I mean), but now I am often surprised how many regular people know that the first Friday of the month is unemployment-data day. The danger here is that modelers planning to work with the data are well aware of the signal-to-noise ratio and take it into account in everything they do. Unemployment data in the US at a granular level is also heavily redacted for a variety of reasons, which makes this problem even worse.
The second issue that I would have loved to see this article elaborate on is what our core metrics as a society should be. This is something that we still haven't nailed down yet.
I have friends who work at the government institutions that produce this data, and they can tell you that most of this work is still done with a very mid-20th-century mindset.
On the other hand, I know people who work with large business data sets, such as supply-chain data from Amazon, Walmart, and others. This data is far more robust and could be used to give a much truer picture of how the economy is doing, but it is currently locked up under the veil of corporate secrecy. If gov't statistical agencies could figure out how to work with this data, we could do significantly better.
This is particularly evident during presidential election season, when the mainstream political news will do things like show a real-time opinion poll accompanying speeches or debates. The article hits the nail on the head at the end: this phenomenon isn't going anywhere unless people start getting a whole lot less gullible.
While I think the article's main point is reasonable, the choice of stocks as an example is a poor one, because prices react in large part to news. To the degree that the news presents something material and unexpected, it influences people's trading. Of course, good news doesn't always lead a stock to increase, because sometimes most investors were expecting even better news. Certainly there is a lot of randomness, but for heavily traded stocks the impact of news is real and substantial. Unexpected good news most often leads to an increase in stock price, and unexpected bad news most often leads to a decrease.
Google's share price changes the same day as an announcement, and the BBC attributes the change to the news as though the two were mechanically linked. But who is to say that the news didn't stop the share price from falling twice as far?
Also, every trade requires two parties, so one side thought they were getting a good deal.
I'm also amused that the share traders are called "investors" rather than "gamblers".
An example: a phenomenon that I think has come into play gradually over the past 3-5 years is national news broadcasts whose lead stories are based on the weather.
"3rd highest temperature seen in 5 yrs"
"4th coldest winter since 2001"
"2nd highest rainfall in March in the past 12 years"
etc etc
PS, on a secondary note: shame on the news services for leading with weather so often. I guess it's laziness; no need to spend time, resources, or expertise investigating actual news.
Does anyone remember the unemployment rate dropping just before the 2012 Presidential election (and then rising again immediately afterwards)? These numbers can't be trusted, and are nothing more than a tool for party control of government.
Which statistic are you interpreting this from? The Bureau of Labor Statistics' publications? Glancing at their unemployment rate estimates, it looks like the overall trend throughout 2012 and continuing into 2014 is negative, with some noise in between. Though the methods used to calculate unemployment statistics are frequently debated (such as the inclusion or exclusion of workers who give up searching for a job), it seems like you are mistaking the noise for some mischievous activity (ironically, in a way similar to what this post discusses), for which you provide no evidence.
October spiked month over month, only to be reversed. And the first-release numbers show an "uptrend" for Aug, Sep, and Oct, while the final numbers show a distinct "downtrend" for the same months.
That is simply untrue. If you spent any time trying to understand the data and the way it is produced, you would understand why your comment is so outlandish. The BLS, which produces these figures, is basically a cult of numbers: they spend a huge amount of time correcting and reworking out of fear of exactly this type of contamination, and rigorously work to keep those numbers accurate.