

How deep learning on GPUs wins datamining contest without feature engineering - doobwa
http://blog.kaggle.com/2012/11/01/deep-learning-how-i-did-it-merck-1st-place-interview/

======
etrain
The author points out something about these Machine Learning contests and
Machine Learning in general that I've noticed for a while - feature selection
tends to dominate learning algorithm selection. It's good to see that there
are modern academic methods for feature discovery that seem to be on par with
(or better than) a domain expert manually selecting features.

~~~
karpathy
Yes, but just as with normal feature engineering, don't make the mistake of
thinking that these methods are fully automatic or work by magic. There is no
free lunch.

A common criticism with these methods is that they merely shift engineering
from features to parameters that specify the architecture. There are many
choices to be made: The exact number of layers, number of neurons per layer,
the connectivity, sparsity parameters, non-linearities, sizes of receptive
fields, learning rates, weight decays, pre-training schedule etc etc etc.
Perhaps even worse, while you can use intuition to design features, it is far
less obvious whether you should be using sigmoid, tanh, or rectified linear
units (+associated parameters for each) in the 3rd layer of the network. And
maybe worst of all, these parameters can have quite a strong effect on the
final performance.
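A toy illustration of how quickly these choices multiply. Every name and value
below is made up for the sake of the example, not taken from any particular
framework or from the winning entry:

```python
from itertools import product

# Hypothetical (and far from exhaustive) search space for a small
# feed-forward net -- all names and candidate values are illustrative.
search_space = {
    "n_layers":        [2, 3, 4],
    "units_per_layer": [128, 256, 512, 1024],
    "nonlinearity":    ["sigmoid", "tanh", "relu"],
    "learning_rate":   [1e-1, 1e-2, 1e-3, 1e-4],
    "weight_decay":    [0.0, 1e-4, 1e-3],
    "pretraining":     [True, False],
}

# Cartesian product of all settings: 3 * 4 * 3 * 4 * 3 * 2 combinations.
configs = list(product(*search_space.values()))
print(len(configs))  # 864 configurations from just six coarse knobs
```

And that is before per-layer choices, connectivity, receptive field sizes, or
learning-rate schedules, each of which multiplies the space further.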

These are still powerful models and we are learning a lot about what works and
what doesn't (and I'm optimistic) but don't make the mistake of thinking they
are automatic. For now, you need to know what you're doing.

~~~
doobwa
I agree these methods still require a fair amount of expert knowledge and
intuition in order to make the various choices you mention. On the other hand,
Bayesian optimization can prove useful for exploring such a space. A recent
paper (<http://arxiv.org/pdf/1206.2944.pdf>) used Bayesian optimization with
GPs to find hyperparameter settings for a deep convolutional network. The
resulting hyperparameters gave state-of-the-art performance, beating those
chosen by an expert (Krizhevsky, the researcher who recently won ImageNet).
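For the flavor of the idea, here is a minimal pure-NumPy sketch of GP-based
Bayesian optimization, nothing like the scale of the paper's setup: a GP with
an RBF kernel models the objective, and an expected-improvement acquisition
picks where to evaluate next. The "hyperparameter" and objective here are
made up for illustration:

```python
import math
import numpy as np

def rbf_kernel(A, B, length=0.15):
    """Squared-exponential kernel between two sets of 1-D points."""
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, y, Xs, length=0.15, noise=1e-6):
    """GP regression: posterior mean and std at query points Xs."""
    K = rbf_kernel(X, X, length) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs, length)
    alpha = np.linalg.solve(K, y)
    mu = Ks.T @ alpha
    v = np.linalg.solve(K, Ks)
    var = 1.0 - np.sum(Ks * v, axis=0)  # RBF prior variance is 1 on the diagonal
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    """EI for minimization: expected amount by which we beat `best`."""
    z = (best - mu) / sigma
    Phi = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    phi = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * Phi + sigma * phi

def objective(x):
    # Hypothetical stand-in for "validation error as a function of one
    # hyperparameter" (e.g. a learning rate rescaled to [0, 1]).
    return (x - 0.3) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=3)      # a few random initial evaluations
y = objective(X)
grid = np.linspace(0.0, 1.0, 200)      # candidate hyperparameter settings

for _ in range(10):
    mu, sigma = gp_posterior(X, y, grid)
    ei = expected_improvement(mu, sigma, y.min())
    x_next = grid[np.argmax(ei)]       # evaluate where EI is largest
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))

print(X[np.argmin(y)], y.min())        # best setting found so far
```

The real appeal is that each "evaluation" can be an expensive training run, so
spending a GP model and an acquisition function to choose the next run pays
for itself quickly.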

~~~
danger
Agreed. Here's a previous HN discussion on that topic:
<http://news.ycombinator.com/item?id=4281630>

------
username3
[http://webcache.googleusercontent.com/search?q=cache:http://...](http://webcache.googleusercontent.com/search?q=cache:http://blog.kaggle.com/2012/11/01/deep-learning-how-i-did-it-merck-1st-place-interview/&hl=en&prmd=imvns&strip=1)

------
doobwa
Given that pharma is a massive industry and that drug discovery often costs
around 1 billion dollars, the top prize of $22,000 seems awfully low. Will we
start to see larger prizes, or will startups take this technology and monetize
better than academia currently does?

~~~
jklio
With Geoffrey Hinton involved as a supervisor I expect they were on the
bleeding edge for other reasons anyway and just decided to scoop up some extra
cash as well. I've not looked closely but Kaggle does seem to be a little like
99designs though.

~~~
FrojoS
The Heritage Health Prize is $3 million! Not exactly the 99designs regime. [1]
<https://www.heritagehealthprize.com/c/hhp>

~~~
dbecker
The Heritage prize is $3 million if a certain threshold score is met.

I'm on the team currently in 1st place, and I don't think there is any chance
that any team will meet this threshold.

So, the final prize will be $500,000. Still, not 99designs.

~~~
FrojoS
Interesting. Thanks for the clarification!

