
Dangerous New Chris Anderson Theory: We No Longer Need Logic - nickb
http://whydoeseverythingsuck.com/2008/06/dangerous-new-chris-anderson-theory-we.html
======
mechanical_fish
This is a big exaggeration. _Dangerous?_ Because it might lead to an outbreak
of bad science and poor conclusions? What's one more drop of water in that
ocean?

If, in fact, Anderson's idea turns out to be half silly and half banal, the
world won't even notice. Folks are too busy squaring the circle, constructing
their perpetual motion machines, and printing out new labels to paste on their
creationist tracts to take much notice of yet another half-baked idea.

~~~
j2d2
I'm not sure what to make of this. I think Hank is taking the idea that
correlation is good enough _too_ seriously. Since I've been playing with
correlations at home and learning how people find correlations in huge data
sets, I've found that a correlation actually _is_ good enough for most
purposes. Think about what the correlation is used for. It makes suggestions
on things to buy or things you should explore. It's making suggestions. A
movie suggestion is exactly that, a suggestion. It's not an exact science and
you can't have a 100% guarantee that a consumer will be into something
_enough_ to buy it. It's the same thing as trying to predict the market. The
correlations can provide a useful service for getting information to people
about things they might like. Hopefully the service makes it easy to buy the
product and _then_ the picture seems complete.

------
pchristensen
Kevin Kelly wrote a less critical, more constructive response to the Petabyte
theory here:

<http://www.kk.org/thetechnium/archives/2008/06/the_google_way.php>

Gist: "My guess is that this emerging method will be one additional tool in
the evolution of the scientific method. It will not replace any current
methods (sorry, no end of science!) but will complement established
theory-driven science. Let's call this data intensive approach to problem solving
Correlative Analytics ... In the coming world of cloud computing perfectly
good answers will become a commodity. The real value of the rest of science
then becomes asking good questions."

~~~
apathy
_The real value of the rest of science then becomes asking good questions._

It were ever thus. Oddly enough Picasso appears to have been one of the first
to recognize the relevance of this in the computer age -- " _Computers are
useless. They can only give you answers_ " -- long around 1968.

I fire up Maxima (Macsyma) all the time to do mindless derivatives, integrals,
sums, and series, so that I can get frustrated as hell working on less
tractable problems. The symbolic rearrangements (the ones that can be done
without reference to the more complicated reality at hand) are just
bookkeeping, and computers are awesome for that.

If raw data (and lots of it) were sufficient, things like dynamic programming,
heuristic decomposition, and approximation algorithms (complete with
correctness bounds) would be pointless. It's not and they aren't.

nb. Don't take any of the above to mean that the province of "asking
interesting questions" is somehow restricted to scientists, or artists, or
anyone else. But it _is_ probably the highest hurdle towards discovering
something worth pursuing. The subsequent execution thereof is rarely easy,
either. Tools help -- a lot -- but by themselves they cannot somehow pull the
future into the present. For the time being, at least, intelligence
amplification still trumps artificial intelligence. (I hope it stays that way
for a while, being human and all)

~~~
serhei
"Computers are useless. They can only give you answers."

Such as 42.

------
mstoehr
It may be "dangerous", but basing causation just on correlation (or
statistical data in general) is actually a pretty prominent idea:
<http://plato.stanford.edu/entries/causation-probabilistic/>. Chris Anderson
obviously glosses over many tricky details, but Judea Pearl, among others, has
written extensively on it. Ultimately, all these ideas can probably be traced
back to David Hume (at least here in the West).

Hank's argument that Chris is suggesting that we don't try to find spurious
correlations is rather simplistic since a probabilistic theory of causation
can account for "spurious" and "real" correlations quite well. Indeed, the
only reason why we know that "spurious" correlations exist is because they
eventually show up in the data (e.g. the economist finds out that the stock
market does not predict sunspots). Thus, a statistical definition of
causation (more or less one that only uses a notion of correlation) can
actually be quite robust.

------
bayleo
I enjoyed the article, Hank. As someone working on developing data-driven
machine learning systems for marketing purposes, it made me think twice about
some of the conclusions I am able to jump to based only on correlation.
Certainly, these methods are quite useful in my field and the other less
precise social sciences. In the hard sciences, however, advancing theory
without a model seems preposterous save perhaps for a supportive role in
special cases.

EDIT: almost feels like a republic (model) vs. democracy (data) political
discussion

------
aswanson
Next Hank will tell me the sun doesn't revolve around the earth, as is 100
percent consistent with observation.

------
giles_bowkett
This is NOT Chris Anderson. This is Berkeley researchers from the mid-EIGHTIES
and a lawyer in 1913, for the love of God.

<http://en.wikipedia.org/wiki/Bayesian_Networks#History>

This is like if somebody finds P Diddy's version of "Every Breath You Take"
and goes, "wow, I didn't know Diddy wrote that song."

If you haven't been watching Bayes, you don't know what powers the biggest
success stories on the Web. Bayes _is_ the tech innovation story of the Web.

Ironic that a site run by Paul Graham would have this problem.

~~~
icey
Ironic that a site run by Paul Graham would have which problem?

The title of the submission is just the title from Hank's blog. So wouldn't
that be a problem with his blog, and not news.YC?

