

The Google Way of Science - The Growing Importance of Data - nickb
http://www.kk.org/thetechnium/archives/2008/06/the_google_way.php

======
sah
This whole idea strikes me as extremely confused. Data mining is just
observation. Any extrapolation from data is a theory. Anderson and this author
seem to be surprised that you can get so far with an incomplete theory that
doesn't explain everything. That seems obvious to me -- remember classical
mechanics?

Here's a particularly bad example:

 _"When you misspell a word when googling, Google suggests the proper
spelling. How does it know this? How does it predict the correctly spelled
word? It is not because it has a theory of good spelling, or has mastered
spelling rules. In fact Google knows nothing about spelling rules at all."_

"Spelling rules" are a heuristic extrapolation just like the one Google is
making, but are probably less accurate! Why is a set of rules designed to be
memorable and useful to a human more of a theory than rules that you have to
write down, and need a computer to use?
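
To make that concrete, here is a minimal sketch of correction by word counts alone. It is an illustration only and not Google's actual method; the toy corpus and the names `edits1` and `suggest` are made up for this example. The whole "rule" is a frequency table plus a search over strings one edit away:

```python
from collections import Counter

# Toy corpus standing in for observed text; the words are made up for illustration.
CORPUS = "the spelling of a word is learned from the data the data is the theory".split()
COUNTS = Counter(CORPUS)
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def suggest(word):
    """Return the most frequent known word within one edit, or word itself."""
    if word in COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in COUNTS]
    return max(candidates, key=COUNTS.get) if candidates else word
```

With this toy corpus, `suggest("speling")` gives `spelling` and `suggest("teh")` gives `the`, and nobody wrote down a spelling rule to get there.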

~~~
ntoshev
The author defines a theory as something simple enough to be understood by
humans.

If you drop this requirement, machine-learned theories are a big deal
nonetheless.

~~~
sah
Sure, they are. But I think it's confusing to characterize machine learning as
an alternative to the scientific method; it's an _example_ of the scientific
method.

------
lutorm
I also doubt that we will be able to do without theory. While you can analyze huge
datasets and look for correlations, you probably need a theory to give you an
idea of _what_ data you should collect. Imagine trying to build the
Large Hadron Collider without any particle physics theory to guide you in the
design. How do you decide how to build it and what you should look for? If you
build it on a hunch, you might stumble onto some really interesting things,
but in most cases it will probably be an expensive collection of nothing
special...

------
tyn
"There was no theory of Chinese, no understanding. Just data. (If anyone ever
wanted a disproof of Searle's riddle of the Chinese Room, here it is.)"

Shouldn't this be a 'proof' instead of 'disproof'?

~~~
hugh
Well,

(a) You can't prove or disprove a riddle

(b) The Chinese Room isn't a riddle, it's a thought experiment

(c) You can't prove or disprove a thought experiment

(d) You might be able to prove or disprove the point that Searle was trying to
make by using the Chinese Room, except his point was about consciousness (or
even Consciousness), not about machine translation, so ultimately the
parenthetical statement seems to be pretty meaningless no matter whether you
make it "proof" or "disproof".

~~~
tyn
Searle expands on consciousness in his Chinese Room paper, but the main point
of the thought experiment is that you can have a conversation that sounds
intelligent without having any understanding at all of what you say and what
you are being told (in other words: passing the Turing test does not imply
intelligence).

~~~
hugh
You're probably right, it's probably more about intelligence than
consciousness (I haven't read it for a long time).

Anyway, it's certainly not about machine translation.

------
pixcavator
The author should have applied the "Google's spell checker" to his own
article: "this emerging method... will compliment established theory-driven
science".

