

A Wave of P.R. Data - samclemens
http://www.niemanlab.org/2014/12/a-wave-of-p-r-data/

======
minimaxir
This is an issue I've been having with the dataisbeautiful subreddit on
Reddit. I submit a lot of original charts and data analyses there
(http://www.reddit.com/user/minimaxir/submitted/),
but the submissions that trend on the subreddit are usually either political
or reminiscent of the TIL subreddit but with pretty pictures. Which is
annoying, but eh.

~~~
bduerst
At least that community is pretty good at calling out the B.S. in the
comments. I can't say that they prevent it from rising to the front page of
reddit, but they usually point out the flaws in the assumptions being made.

Sometimes they even go so far as to be pedantic about graph or color choice,
but I digress.

------
dworin
In the PR industry, this is called "mediagenic research" and it's been around
for decades. The idea is that if you want earned media coverage (what people
in marketing/communications call the news), you need to have a story. For
times when you don't have a great story, finding some interesting data gives
you an excuse to make one up.

Part of the reason it's successful is that people are genuinely interested in
it, and will read the articles. They don't particularly care if the
methodology used to determine which city has the best sex, or which
presidential candidate would perform best in an alien invasion (both real
examples) is sound. They just like the story.

Before stories were going viral on the internet, marketers did the same thing
through local newspapers and radio. That's why so many studies rank cities or
metro areas - because local papers want to report a local story.

Most of this is pretty harmless. Nobody is making public policy decisions
based on the data, the people reading it think it's funny or interesting, and
it gives time-starved media companies simple content to push out.

------
IndianAstronaut
I used to work on a team which would collect data for PR purposes for the
company. The data is often heavily skewed because it covers only what is
available to the company, which is far from a proper sample of the population.
For example, instead of '24% of people do x', the claim should be '24% of
people that use our site do x'.

Zero statistical methodology is ever used to make sure the data is accurate
for the broader population.
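
To make the gap concrete, here is a rough sketch with made-up numbers. Every
figure and variable name below is hypothetical and chosen only for
illustration; none of it comes from a real dataset. It assumes site users are
a small slice of the population and do x far more often than everyone else.

```python
# All numbers are hypothetical, for illustration only.
population = 1_000_000        # everyone the headline claims to describe
site_users = 50_000           # people the company actually has data on
site_users_doing_x = 12_000   # assumed: 24% of site users do x
others_doing_x = 47_500       # assumed: 5% of the 950,000 non-users do x


def share(part, whole):
    """Return part / whole as a percentage."""
    return 100.0 * part / whole


# What the company can honestly measure: the rate among its own users.
site_rate = share(site_users_doing_x, site_users)

# What the headline implies: the rate across the whole population.
population_rate = share(site_users_doing_x + others_doing_x, population)

print(f"Site users who do x: {site_rate:.1f}%")
print(f"Whole population who do x: {population_rate:.2f}%")
```

Under these assumptions the press release number is about four times the
population figure, and nothing in the company's own data would reveal that.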

Also, just because some data is associated with someone with a PhD doesn't
mean it is any more accurate. Many PhDs and professors are put on the part-time
payrolls of these companies so as to give the BS data an air of accuracy and
authority.

------
aresant
"There are three kinds of lies: lies, damned lies, and statistics."

I love that quote, and its attribution to a 19th-century British prime minister
(1) illustrates how long this "trend" has been around.

So when the author of the article states

"Nobody can say exactly when the trend first started, but in 2014 we saw the
first major outbreaks of bogus data distributed by private companies just so
it would go viral online"

I think that's a bit hyperbolic.

Old hat, fun to write about though.

~~~
qooleot
Sure, statistics in marketing has been around for hundreds of years. I think
his key point was alluded to with "just so it would go viral online" as being
the new and growing theme.

The trick is getting the public at large to spread the message, and the goal
is not the message itself but rather a higher PageRank and more traffic. Before,
the statistic itself had to be meaningful, i.e. "users of our weight loss
supplement lost 43 pounds in the first month". Now, the statistic just has to
be "shocking" but not actually sell anything.

------
porker
I hoped the post was going to reveal - tada! - an AI algorithm that would
detect and filter out these stories.

That would make reading the web a better experience.

