

Statistics vs. Machine Learning, fight - zjj
http://anyall.org/blog/2008/12/statistics-vs-machine-learning-fight/

======
tel
Statistics has a pretty interesting stigma in science. _It's something you're
not allowed to question._ Generally.

You can question if someone's models are right. You can question if their
controls narrowed the experiment well enough. You can question if their
interpretation makes any sense. What you can't question is if something like a
t-test, LLS, or ANOVA is the best way to pull meaningful parameters out of the
data. Just look at how much resistance Bayesian methods face in publication.

This is a fundamental friction that statistics has to overcome as long as it's
still called statistics. I think of ML as the parts of statistical research
that escaped through the window opened by Shannon back when he invented
information theory. It's a bird now, free to invent its words for the world
and try crazy stuff that the religion of "Statistics" could never accept.

This isn't to say that Statistics isn't growing. The article itself does a
good job pointing out just how similar recent Stat has been to ML. However, if
you see a research paper in some of the more core, less data-intensive
sciences that dares to drop "SVM", "Ridge Regression", "Clustering", "Bayes",
or god forbid "Machine Learning" itself, you see scowls: isn't your data
_normal enough_? Why do you need to do something fancy when I can work out the
z-score of that result right here on my pocket calculator/slide rule?

( _Let's go ahead and concede that ML certainly has a lot of broken yet
overhyped parts, which help form the nucleus of an argument not to infect
scientific knowledge with some untested infrastructure. Growing pains._ )

It's a classic fight between tradition and innovation with all the usual
arguments available, but what makes this different is that such a huge
community of people who thrive off the image of really, _really_ knowing
things pretty much takes the frequentist methodologies as an unimpeachable gold
standard. Things can get dicey when you start to ask what the actual meaning
of a p-value is, how we really know anything about "estimators", or why people
work so hard not to use computers. _It works! Stop asking questions._

------
jibiki
"What differs most is the teaching style. CS has far better lecture notes. Of
course, the stats people wrote a very good book; but better lecture notes win
because I can access them later and send them to people for free."

So very true. When I was a freshman in high school, we had mandatory "study
hall" periods where we just had to sit in a room and be quiet. I spent most of
them doing crosswords and sleeping, but I also spent a lot of time looking at
printouts of these notes:

<http://rutherglen.ics.mq.edu.au/wchen/lnentfolder/lnent.html>

It's really important to realize that not everyone who is interested in a
field has access to a library full of relevant publications.

------
patio11
And AI is the red-headed stepchild caught in the crossfire, where any approach
that actually produces worthwhile results is promptly excommunicated from the
field.

"If I create a program which successfully predicts which humans are
trustworthy and which are untrustworthy, would that be AI?"

"Are you kidding?! That would revolutionize the field! That's harder than
winning the Turing Prize! That's so far ahead of anything ever done it is hard
to even imagine what it would look like!"

"The algorithm exists. Its output is called a FICO score."

"Bah, that isn't AI."

~~~
JulianMorrison
Better examples: the face recognition in a camera, the route finding in a GPS
map, and Google.

