
Usually you throw everything at the problem and see what sticks. There are some obvious trends: if you're doing image classification, a convnet will beat just about anything else. There are problems that are particularly well suited to, e.g., random forests, and cases where you might want a simpler model that runs faster at the expense of a bit of accuracy.

In terms of neural nets, people have more or less done the hard work for you. Typically you'll want to take an empirically well-performing network like VGG, ResNet, Inception, etc., and re-train the top few layers. There is ongoing work in the field on learning the structure of the network as well.
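
A minimal sketch of that re-training idea, assuming Keras and a hypothetical two-class problem: freeze a pre-trained VGG16 base and train only a small head on top.

    # Transfer learning sketch: keep VGG16's pre-trained convolutional
    # features fixed and re-train only a new classification head.
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras import layers, models

    base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # don't update the pre-trained layers

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # e.g. iceberg vs. ship
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    # model.fit(train_images, train_labels, epochs=5, validation_split=0.2)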

If you look at Kaggle, the vast majority of winners win using ensembles. In a recent example - the Statoil iceberg-or-ship challenge - the winning team used a bit of feature engineering and apparently over 100 ensembled CNNs:

> We started with a CNN pipeline of 100+ diverse models which included architectures such as customized CNNs, VGG, mini-GoogLeNet, several different Densenet variations, and others. Some of these included inc_angle after the fully connected layers and others didn’t.

https://www.kaggle.com/c/statoil-iceberg-classifier-challeng...

It's an ugly kitchen sink approach, but it works.
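
The blending itself is usually nothing more exotic than averaging predicted probabilities across models. A rough sketch, assuming each already-trained model exposes a scikit-learn-style predict_proba:

    # Simplest possible ensemble: average class-1 probabilities over a
    # list of fitted models (model names below are hypothetical).
    import numpy as np

    def ensemble_predict(models, X):
        probs = np.stack([m.predict_proba(X)[:, 1] for m in models])
        return probs.mean(axis=0)

    # blended = ensemble_predict([cnn_a, vgg_variant, densenet_variant], X_test)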




> Usually you throw everything and see what sticks.

Most practitioners start with the simplest possible learner, then gradually, and thoughtfully, increase model complexity while paying attention to bias/variance. This is far from a "kitchen sink" approach.
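
One way to make that concrete (a sketch, not a prescription): ramp up model capacity step by step and watch the gap between training and validation scores, e.g. with scikit-learn's validation_curve.

    # Increase complexity (tree depth) and compare train vs. validation
    # accuracy: a widening gap signals variance taking over.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import validation_curve

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    depths = [1, 2, 4, 8, 16]
    train_scores, val_scores = validation_curve(
        RandomForestClassifier(n_estimators=100, random_state=0),
        X, y, param_name="max_depth", param_range=depths, cv=5)

    for d, tr, va in zip(depths, train_scores.mean(axis=1), val_scores.mean(axis=1)):
        print(f"max_depth={d}: train={tr:.3f}  val={va:.3f}")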


Certainly that's what most sensible practitioners do. I somewhat doubt that most people follow this to the letter every time.

It's a nice theory, and it works intuitively with models where you can ramp up complexity easily (like neural nets). It's less obvious if you have a "simple" problem that might be solved with a number of techniques. In that situation I don't see why you would be criticised for trying, say, an SVM, a random forest, logistic regression, or naive Bayes and comparing them. Pretty much the only way you can categorically say that your method is better than the others is by trying those other methods.
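
For instance, a cross-validated bake-off of exactly those models is only a few lines of scikit-learn (a sketch on a toy dataset, not the parent's setup):

    # Compare several standard classifiers with 5-fold cross-validation.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)
    candidates = {
        "logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "SVM (RBF)": make_pipeline(StandardScaler(), SVC()),
        "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "naive Bayes": GaussianNB(),
    }
    for name, clf in candidates.items():
        scores = cross_val_score(clf, X, y, cv=5)
        print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")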

The simple approach actually came up in the iceberg challenge. The winning team won because they bothered to plot the data: it turned out that the satellite incidence angle alone was sufficient to separate out a large number of icebergs with basically 100% reliability. So they simply thresholded that angle range and trained a bunch of models to handle the more complicated cases where there might be a ship.
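
In code the idea is about as simple as it sounds. A sketch with a made-up cut-off (the real angle range came from their plot), assuming the fallback model has a predict_proba method:

    # Decide the "easy" cases by incidence angle alone; only the rest
    # go through the trained model. The 35-degree threshold is invented
    # purely for illustration.
    import numpy as np

    def predict_iceberg(inc_angle, features, model):
        preds = np.empty(len(features))
        easy = inc_angle > 35.0            # hypothetical all-iceberg range
        preds[easy] = 1.0                  # labelled directly, no model needed
        preds[~easy] = model.predict_proba(features[~easy])[:, 1]
        return preds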


This was one of the things Andrew Ng really hammers on in his Coursera course. That, alongside separating the training set from the cross-validation set used for tuning parameters, went a long way towards dispelling some of the "magic" in how you iterate towards a sensible model.
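
With current tooling that separation is a couple of lines; a sketch of a train/validation/test split where only the validation part is used for tuning:

    # Hold out a validation set for tuning and a test set that is only
    # touched once at the very end.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)
    # Tune hyperparameters against (X_val, y_val), then report once on (X_test, y_test).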



