

Big Data Is Booming but Big Results Are Lacking - dvfb
http://allthingsd.com/20130520/in-media-big-data-is-booming-but-big-results-are-lacking/

======
dizzystar
I think the article hit the nail on the head: data phobia.

It is a very tough sell to tell people that their intuitions are wrong,
especially when certain practices and beliefs are long entrenched.

It is hard for me to describe to anyone the resistance to anything you present
if you haven't been there. The expectations for accuracy are far beyond what
can be realistically accomplished. It is humiliating and frustrating to see
people who have no business analyzing your work comb over every last number
and if ONE number is wrong, the whole plan tumbles.

In my experience, the communication issue is teaching sales, management, etc.,
that the point is not to play whack-a-mole and "fix" everything, and that the
data is not meant to be used as a hammer (IMO), but that it is simply there to
point the company in a direction, or at least, show where things could be
improved and encourage good directions that already exist.

I believe that the expectations do not align with reality at this point.
Everyone is looking for some mythical Fountain of XYZ, and it simply does not
exist.

"Lies, Damn Lies, and Statistics" is so ingrained in our consciousness that
the expected reaction to Big Data is a knee-jerk mistrust of whatever is
presented. It is a serious issue, and those of us who have to analyze data
have to be cognizant of the fact that we are pushing back against a century
(centuries?) of idiots who have used statistics to lie, justify false
information, and push agendas.

~~~
JPKab
This reaction can vary heavily depending on the traits of the executives, and
unfortunately, the kind of personality that gets to the top in embedded
corporate hierarchies tends to be people who are excellent at the image of
productivity rather than true productivity. To these people, there is nothing
more threatening than objective facts. They can manipulate and hide from
reality with the subjective. The objective is a new world where they are
exposed for the frauds that they are.

The U.S. Federal Government is packed with these types.

------
kyllo
In the corporate world, this is because the purpose of enterprise data is less
to drive decision-making, and more to justify decisions that the executives
were going to make anyway, which will usually just be whatever has the
greatest benefit to their own careers.

In the public world, though, it seems to me like big data has had pretty huge
results. The most commonly cited example probably being Nate Silver predicting
the 2012 US presidential election results correctly for all 50 states using
big data sources and techniques--this degree of predictive statistical
analysis was previously unheard of in politics.

~~~
buzzwordjunkie
The Dilbert strip in the article was hilarious.

<http://dilbert.com/strips/comic/2012-07-29/>

Is there some kind of gene which predisposes people to throwing around
buzzwords? How is it that human behaviour gave rise to "big data" and "the
cloud"?

~~~
JPKab
The "gene" you speak of is a desperation to prove relevance when a person has
a sneaking suspicion in their heart that they don't add value.

Ever notice how the people who throw around the most buzzwords are those who
think of, and refer to, themselves as "big picture" thinkers? They have
conveniently tricked themselves into thinking that focusing on and mastering
any true skill is a distraction from understanding the broader landscape in
which they operate.

------
fpp
Fears and phobias have never helped anyone cope with reality.

The stream of data that we and our environments create(d) has been a reality
since the moment we made it digital (vs. analog) and started processing it
with computers. As soon as you are able to process the data, you are often
just one switch / flag away from also storing it.

When telephone systems and backbones started to become digital on a broader
scale in the 1990s, the surveillance and data collection that already existed
at that time expanded massively, because data became accessible and usable on
a large scale. The first huge processing farms were built for the FBI, NSA,
etc., and Congress appropriated hundreds of millions for them. For the next
version of those server farms, the U.S. Congress already provided tens of
billions 4-5 years ago. Follow the money and you can find out how long this
has been going on.

With every step our world takes toward becoming more digital, these "virtual
images" of us will become more complete and we will become more dependent on
them. Ask yourself how often you still use a paper map today vs. 10 years ago.
Soon there will be a generation of people who only navigate with digital maps
and GPS, leaving a trace of their every movement - for them it will be the
norm and they will not know another way. If you prefer a more positive image,
think about people living with heart disease and how many of them will
survive because of 24/7 surveillance.

One key element of all that data that is often overlooked is that once we
depend on it in many areas of our lives, falsification of "data elements" or
blocking access to "data services" has a substantial impact on the person
herself. Think about finding a new job so that you can pay your rent - almost
every step of that is already digital. And faking email conversations, phone
interviews, etc. today requires only limited resources for "someone" who has
mandatory access to the mail servers and internet infrastructure. Putting
incriminating material on the computers of your business / political / life
"opponents" might already be a drag-and-click activity for some of those.
Falsifying / sabotaging financial transactions has been reported by various
political activists for years, and has long been part of Hollywood folklore /
films. In short - soon "some" will be able to completely change our real
lives by "working the data" - whether we are economically successful, whom we
meet, what we think about others / products / politics, whether we die from
diseases or not...

It would be too easy to say that what happens with all the data about us, our
activities and interactions is a matter of what society we are living in,
whether it's a true democracy with civil rights or an oppressive state. This
per se is an illusion. It denies how the majority of people live their lives:
they want to be part of a community, be safe, have no problems, enjoy
incentives (or things you can buy). And it denies how governments work,
because the mere existence of such a feedback / control / surveillance /
oppression mechanism changes society itself and the way we are governed.

Think along these lines - what can be done will be done. And if the powers of
"some" in our societies continue to be expanded day after day - it certainly
will.

------
ronaldx
The example given of Big Data use is Netflix allegedly checking its user data
to decide whether to buy House of Cards.

It's not evident that such Big Data was useful in the way described. Anyway,
the data described could easily be collected - likely at the same cost and
accuracy - in an old-fashioned focus group type of way.

Deciding to buy House of Cards required insight - outside of the data and
statistics - to ask the right questions and come to a conclusion.

Has using 'Big Data' here led to a more accurate or better-value result? Is
this even really what 'Big Data' is? Absolutely not clear.

'Data phobia' might be an issue but is not the major issue. The issue is using
phrases like 'Big Data' which don't actually mean anything: this phrase alone
doesn't create value for business.

------
shin_lao
Big corporations that are known to have actual big data are hardly known for
being early adopters... It will take much more time than expected to see
data-driven companies, and I actually think the finance industry will be the
first to do real, sensible big data analysis.

It will also take something more than Hadoop and the like to do something
real... Don't like to be that guy, but please, stop rewriting yet another
query language and try to write efficient engines instead. ;)

~~~
nostrademons
I worked at a financial software startup from 2005-2007, and the financial
industry was _already using_ big data pervasively. Hell, LTCM failed in 1998
because they were doing big data analyses of small spreads and forgot about
some of the holes in their analyses, namely what would happen if a "black
swan" event moved currency prices out of their historical norms. Quant hedge
funds like DE Shaw, Citadel, and RenTech have been around since the 80s.

Heck, back in high school one of the math competitions I won was sponsored by
INFORMS (the Institute For Operations Research and the Management Sciences),
and I asked my dad "What's operations research?" and he said that it's where
people with math Ph.D.s go to make big bucks. Companies like FedEx, Safeway,
and WalMart have relied for decades on their massive amounts of operations
data to do things like minimize transportation costs or ensure that they're
stocking items with maximum demand.

The thing is, everybody who gains a competitive advantage from big data has
reason to keep that fact secret, so that their competitors don't start doing
the same thing. What's changed now is that the media itself is facing
competitors that use big data themselves to make themselves more relevant than
traditional media, and so suddenly it's a Story Of Consequence. It's not that
big data is the next big thing - it's that big data was the _last_ big thing
that you are only hearing about now, and suddenly a bunch of folks that had
never heard of it are now trying to play catch-up.

------
tribeofone
For me this is the key: "when you ask your data the right questions you can
find hugely valuable insights"

Many people treat data as the Oracle of Delphi, hoping to just make an
offering (investment) and wait for knowledge to be dropped. This idea that you
can just sift through endless data and pull out insights is just plain wrong.
First start with the decisions you can or have to make, then look at the data
to see if it gives you insight into the best choice.

------
unono
They should be outsourcing it to Kaggle. Problem solved.

