
So, this is a good post, and everyone should be aware of these. But kernels ("non-linear" SVMs) are not a panacea. First, if you use kernels, your model has to remember a non-trivial number of the data points (the "support vectors" that define the decision surface). SVMs are nice in that they typically need relatively few points compared to the size of the dataset. Even so, classification cost is now linear in the number of points you kept around, and training is potentially quadratic in the size of the training set. If you use a straightforward linear classifier, you only need a single weight vector.
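To make the cost difference concrete, here is a minimal sketch of the two prediction rules in plain Python. The function names, the RBF kernel choice, and all the numbers are made up for illustration; real libraries store and evaluate these differently.

```python
import math

def linear_predict(w, b, x):
    """Linear classifier: a single dot product with one weight vector."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def rbf(u, v, gamma):
    """RBF kernel value between two points (assumed kernel for this sketch)."""
    return math.exp(-gamma * sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

def kernel_predict(support_vectors, coefs, b, x, gamma):
    """Kernel classifier: one kernel evaluation per stored support vector,
    so the work grows linearly with the number of points kept around."""
    score = sum(c * rbf(sv, x, gamma) for sv, c in zip(support_vectors, coefs)) + b
    return 1 if score >= 0 else -1

# Hypothetical, hand-picked values; nothing here was learned from data.
w, b = [1.0, -1.0], 0.0
support_vectors = [[1.0, 0.0], [0.0, 1.0]]
coefs = [1.0, -1.0]  # signed dual coefficients, one per support vector
x = [2.0, 0.5]

linear_label = linear_predict(w, b, x)
kernel_label = kernel_predict(support_vectors, coefs, b, x, gamma=0.5)
```

The linear model ships only `w` and `b`; the kernel model has to ship every support vector along with its coefficient.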

Second, SVMs, especially with a non-linear kernel, are much trickier to tune than logistic regression. SVMs are very sensitive to the choice of hyperparameters (user-specified tuning weights), which means retraining repeatedly to find good values. And if you use too powerful a kernel, you will overfit your data very easily.
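One way to see the "too powerful a kernel" problem, assuming an RBF kernel with width parameter `gamma` (my choice for the example, not the only possibility): as `gamma` grows, every training point stops looking similar to anything but itself, so the model can simply memorize the training set. A rough sketch on made-up toy points:

```python
import math

def rbf_gram(points, gamma):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in points] for p in points]

points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # hypothetical toy data

smooth = rbf_gram(points, gamma=0.1)   # neighbours still look similar
spiky = rbf_gram(points, gamma=50.0)   # off-diagonal entries collapse toward 0

for name, K in (("gamma=0.1", smooth), ("gamma=50", spiky)):
    print(name)
    for row in K:
        print("  ", ["%.3g" % v for v in row])
```

With the large `gamma`, the Gram matrix is nearly the identity, which is the regime where fitting the training data perfectly is easy and generalizing is not. That is why this parameter, together with the regularization strength, usually has to be searched over with held-out data.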

For these reasons, my advice is to start with logistic regression and, if you're not satisfied, then switch to SVMs.

(As an aside, it's totally possible to kernelize logistic regression. Without suitable regularization, though, it'll be even worse with regard to the number of data points you need to keep.)
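For the curious, here is a rough sketch of what kernelized logistic regression can look like: the score is a weighted sum of kernel values against the training points, fit by plain gradient descent on the logistic loss with an L2-style penalty. The RBF kernel, the XOR toy data, and every hyperparameter value below are assumptions for illustration only, not a recommended recipe.

```python
import math

def rbf(u, v, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_kernel_logreg(X, y, gamma=2.0, lam=0.01, lr=1.0, steps=500):
    """Gradient descent on mean logistic loss + (lam / 2) * alpha^T K alpha."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha, b = [0.0] * n, 0.0
    for _ in range(steps):
        # Scores f_i = sum_j K[i][j] * alpha_j + b
        f = [sum(K[i][j] * alpha[j] for j in range(n)) + b for i in range(n)]
        err = [sigmoid(f[i]) - y[i] for i in range(n)]
        k_alpha = [fi - b for fi in f]  # K @ alpha, reused for the penalty term
        grad = [sum(K[i][j] * err[j] for j in range(n)) / n + lam * k_alpha[i]
                for i in range(n)]
        alpha = [a - lr * g for a, g in zip(alpha, grad)]
        b -= lr * sum(err) / n
    return alpha, b

def predict_proba(X_train, alpha, b, x, gamma=2.0):
    """Probability of class 1; note it needs every training point."""
    return sigmoid(sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, X_train)) + b)

# XOR: not linearly separable, but easy for an RBF kernel.
X = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
y = [0, 0, 1, 1]
alpha, b = fit_kernel_logreg(X, y)
probs = [predict_proba(X, alpha, b, x) for x in X]
```

In this sketch every entry of `alpha` ends up nonzero, so the fitted model has to keep all of the training points around, whereas an SVM would typically keep only a subset.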



