Hacker News

So, this is a good post and everyone should be aware of these points. But kernels ("non-linear" SVMs) are not a panacea. If you use kernels, your model will need to remember some non-trivial number of the data points (the "support vectors" that define the decision surface). SVMs are nice in that they typically need a relatively small number of points compared to the size of the dataset. Still, classification is now linear in the number of points you kept around, and training is potentially quadratic in the dataset size. If you use a straightforward linear classifier, you only need a single weight vector.
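To make the model-size point concrete, here's a minimal sketch using scikit-learn (my choice of library, not something the comment specifies): an RBF-kernel SVM must store a subset of the training points, while a linear classifier stores only one weight vector.

```python
# Sketch (assumes scikit-learn): compare what each fitted model must keep around.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

svm = SVC(kernel="rbf").fit(X, y)
logit = LogisticRegression(max_iter=1000).fit(X, y)

# The kernel SVM carries actual training points (the support vectors):
print(svm.support_vectors_.shape)  # (n_support_vectors, 20)
# The linear model carries a single weight vector:
print(logit.coef_.shape)           # (1, 20)
```

Classifying a new point with the SVM means evaluating the kernel against every stored support vector, which is where the "linear in the number of points you kept around" cost comes from.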

Second, SVMs---especially with a non-linear kernel---are much trickier to tune than logistic regression. SVMs are very sensitive to the choice of hyper-parameters (user-specified tuning values), which means repeatedly retraining. And if you use too powerful a kernel, you will overfit your data very easily.
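The "repeatedly retraining" part can be sketched with a standard grid search (again using scikit-learn as an assumed library; the grid values are illustrative, not recommendations):

```python
# Sketch (assumes scikit-learn): tuning an RBF SVM means retraining across a
# grid of C (regularization strength) and gamma (kernel width).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
    cv=3,
)
# 9 hyper-parameter settings x 3 CV folds = 27 trainings, plus a final refit.
grid.fit(X, y)
print(grid.best_params_)
```

With a potentially quadratic training cost per fit, that multiplier is exactly why tuning a kernel SVM hurts on larger datasets.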

For these reasons, my advice is to start with logistic regression, and then, if you're not satisfied, switch to SVMs.

(As an aside, it's totally possible to kernelize logistic regression. Without suitable regularization, though, it'll be even worse with regard to the number of data points you need.)
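One simple way to kernelize logistic regression is to feed the kernel matrix itself in as features; this is a sketch of that idea (scikit-learn assumed), and it makes the aside's drawback visible: the fitted model ends up with one weight per training point, so you have to keep the whole training set around at prediction time.

```python
# Sketch (assumes scikit-learn): kernel logistic regression via an explicit
# RBF kernel matrix used as the feature representation.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import rbf_kernel

X, y = make_classification(n_samples=200, random_state=0)

K = rbf_kernel(X, X, gamma=0.1)  # (200, 200) kernel matrix
klogit = LogisticRegression(max_iter=1000).fit(K, y)

print(klogit.coef_.shape)  # (1, 200): one weight per training example
# Classifying a new point x requires rbf_kernel([x], X) -- i.e. all of X.
```

Unlike an SVM, nothing here forces most of those 200 weights to zero, which is the "even worse with regard to the number of data points" problem.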
