

Introduction to Support Vector Machines - rvlasveld
http://rvlasveld.github.io/blog/2013/07/12/introduction-to-one-class-support-vector-machines/

======
joshuaellinger
My ML PhD friends say that SVMs are popular in academic circles in part
because you can prove things about them. They themselves are in the NN camp
when it comes to solving problems.

I like the basic idea. It feels like a nonlinear relative of PCA (Principal
Component Analysis). I also haven't seen NNs used for outlier detection.

~~~
tel
SVMs come out of a theory built around the Vapnik–Chervonenkis dimension,
which talks about ways to balance best-case learner performance against the
likelihood of actually achieving that theoretical optimum. It's a very nice
theory.

The nonlinear stuff is called the "kernel trick" and is essentially the
recognition that an SVM's operation depends on the data exclusively through
the n^2 pairwise measurements of how similar the n data points are to one
another. Once you realize that, you can twist the definitions around so as
to insert whatever nonlinear notion of "similarity" you like. Many ML
algorithms can be augmented by the kernel trick.
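
To make the "data only through pairwise measurements" point concrete, here
is a minimal sketch (assuming numpy and scikit-learn; the rbf_gram helper is
mine) that hands the SVM a precomputed Gram matrix of RBF similarities
instead of the raw points:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    # Toy two-class problem that is not linearly separable:
    # label points by whether they fall inside or outside a circle.
    X = rng.normal(size=(200, 2))
    y = (np.linalg.norm(X, axis=1) > 1.0).astype(int)

    def rbf_gram(A, B, gamma=1.0):
        # Pairwise RBF similarities -- the only view of the data the SVM gets.
        sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq_dists)

    # kernel='precomputed' means we supply the n x n similarity matrix
    # directly; swapping in a different kernel changes nothing else.
    clf = SVC(kernel="precomputed")
    clf.fit(rbf_gram(X, X), y)

    X_new = rng.normal(size=(5, 2))
    print(clf.predict(rbf_gram(X_new, X)))  # similarities to training points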

Finally, the SVM's geometry allows you to pose training as a relatively nice
convex optimization problem. Compare this to many other ML algorithms, which
have only heuristic optimization methods. It also makes the algorithm
surprisingly fast.
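
To spell that out: soft-margin SVM training reduces to a convex quadratic
program. The standard dual form is, up to notation,

    \max_{\alpha}\ \sum_{i} \alpha_i
      - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
    \quad \text{s.t.}\quad 0 \le \alpha_i \le C,\ \ \sum_{i} \alpha_i y_i = 0.

The data enter only through the kernel values K(x_i, x_j), which is exactly
where the kernel trick plugs in, and convexity means solvers find the global
optimum instead of getting stuck in local minima.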

------
therobot24
this may end up as an empty thread as SVMs are pretty simple and very well
known...i'd like to say that i enjoyed the post - one hobby of mine is reading
the same mathematical/technical material through different
writers/perspectives

~~~
wslh
How are SVMs doing compared to the neural network renaissance with deep
learning?

~~~
mjn
It's always hard to get a bird's-eye view of these things, but from my vantage
point, I think SVMs are much more widely used in applications, but have been
surpassed by deep learning as the current hot research area in machine
learning. There is still SVM research, but it's plateaued compared to the
frenzy of the late 1990s and early 2000s, while deep learning has taken off in
the past few years. SVM has instead entered the realm of "standard ML tool",
as an off-the-shelf technique with robust and well-documented implementations
that can be applied in a reasonably straightforward way.
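
As an illustration of the "off-the-shelf" point, a typical usage sketch with
scikit-learn (one such implementation; the dataset here is chosen
arbitrarily) looks something like:

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Scale features, fit an RBF-kernel SVM, evaluate on held-out data.
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X_train, y_train)
    print(clf.score(X_test, y_test))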

------
elchief
SVMs are great for text classification applications. They can handle a
stupidly large number of columns, and tend to be quite accurate.

Thorsten Joachims has some good papers on them (GScholar).
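
A minimal sketch of that setup (assuming scikit-learn and the 20 newsgroups
corpus; TF-IDF easily yields tens of thousands of sparse columns):

    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    train = fetch_20newsgroups(subset="train")
    test = fetch_20newsgroups(subset="test")

    # TfidfVectorizer produces a huge sparse matrix; a linear SVM
    # handles that many columns without trouble.
    clf = make_pipeline(TfidfVectorizer(), LinearSVC())
    clf.fit(train.data, train.target)
    print(clf.score(test.data, test.target))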

------
zmmmmm
Thanks for writing this. I love the illustration of the concept with the
video. I have little knowledge or background in the area so it is very useful.

I got confused where you introduced the objective function though. Is the
reader supposed to understand where that comes from or are you just presenting
it as a fact? It also doesn't make sense to me that the function is minimised
over 'b' but 'b' does not appear in the function?

~~~
rvlasveld
The minimization function is indeed presented "as a fact", in the sense that
it is (more or less) the formulation of Cortes and Vapnik
(http://link.springer.com/article/10.1007%2FBF00994018), which is the
standard formulation of Support Vector Machines.

The value of 'b' is minimized over, but it appears in the first constraint
rather than in the objective itself. The quantity |b|/||w|| determines the
offset of the hyperplane from the origin, and 'w' is the normal vector of
that plane.
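
For reference, the soft-margin problem from that paper is, up to notation,

    \min_{w, b, \xi}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i
    \quad \text{s.t.}\quad y_i (w \cdot x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0,

so 'b' enters through the constraints rather than the objective; that is why
it is minimized over even though it does not show up in the function being
minimized.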

