I don't understand the author's reasoning for preferring mean shift over k-means due to k-means needing the number of clusters as an input. It seems that in mean shift, choosing the kernel bandwidth parameter is just as arbitrary in that it requires domain knowledge. In k-means there are tests[0] for choosing an appropriate k. Is there a similar strategy in mean shift for choosing an appropriate kernel bandwidth parameter?
While choosing a kernel bandwidth may be just as arbitrary in some situations, the parameter does have some nice properties. The bandwidth informally defines how "close" points need to be to be considered similar, which works well in problem domains where that scale can be easily determined.
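To make that concrete, here is a minimal sketch (scikit-learn and synthetic blobs are my choices for illustration, not anything from the original article) of how the bandwidth acts as that closeness scale: shrinking it splits the data into more clusters, growing it merges nearby density modes.

    import numpy as np
    from sklearn.cluster import MeanShift
    from sklearn.datasets import make_blobs

    # Synthetic data: four well-separated Gaussian blobs.
    X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=0)

    # Smaller bandwidths treat only very nearby points as similar,
    # producing more clusters; larger bandwidths merge density modes.
    for bandwidth in (0.5, 2.0, 8.0):
        labels = MeanShift(bandwidth=bandwidth).fit(X).labels_
        print(f"bandwidth={bandwidth}: {len(np.unique(labels))} clusters")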
As with most clustering problems, if you can't choose reasonable parameter values from some domain-specific information, it is largely trial and error. Of course, there are several metrics that can be used to score certain clusterings (i.e., certain parameter values) over others, as described in your Wikipedia link.
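As a rough sketch of that trial-and-error loop (the silhouette metric and the candidate grid are my own picks here, not something the article prescribes), you can sweep bandwidths around scikit-learn's built-in estimate_bandwidth heuristic and keep the best-scoring one:

    import numpy as np
    from sklearn.cluster import MeanShift, estimate_bandwidth
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score

    X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=0)

    # estimate_bandwidth is scikit-learn's nearest-neighbor-distance
    # heuristic; use it to center the candidate grid.
    base = estimate_bandwidth(X, quantile=0.2)

    best = None
    for bw in base * np.linspace(0.5, 2.0, 7):
        labels = MeanShift(bandwidth=bw).fit(X).labels_
        n = len(np.unique(labels))
        if n < 2 or n >= len(X):   # silhouette needs 2 <= n < n_samples
            continue
        score = silhouette_score(X, labels)
        if best is None or score > best[0]:
            best = (score, bw, n)

    print(f"best silhouette={best[0]:.3f} at bandwidth={best[1]:.2f} "
          f"({best[2]} clusters)")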
The other nice advantage that mean shift has over k-means is that it does not make any assumptions about the cluster shape. K-means assumes spherical clusters. Mean shift allows for clusters of any shape, since it is driven by density.
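If it helps, here is a hedged toy comparison (the anisotropic data and all library calls are my own illustration): k-means assigns points to the nearest centroid, which biases it toward compact, roughly spherical groups, while mean shift groups points by the density mode they climb to.

    import numpy as np
    from sklearn.cluster import KMeans, MeanShift, estimate_bandwidth
    from sklearn.datasets import make_blobs

    # Two blobs stretched by a linear transform so the clusters are
    # elongated rather than spherical.
    X, _ = make_blobs(n_samples=300, centers=2, cluster_std=0.5, random_state=1)
    X = X @ np.array([[3.0, 0.0], [0.0, 0.3]])

    # k-means partitions by distance to centroids, so it can cut an
    # elongated cluster across its long axis; mean shift groups points
    # by the density mode they converge to, with no shape assumption.
    km = KMeans(n_clusters=2, n_init=10, random_state=1).fit(X)
    ms = MeanShift(bandwidth=estimate_bandwidth(X, quantile=0.3)).fit(X)
    print("k-means cluster sizes:   ", np.bincount(km.labels_))
    print("mean shift cluster sizes:", np.bincount(ms.labels_))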
You can tune any free parameters by defining a loss function. For example, your loss function can penalize two close points being separated into different clusters, and vice versa (two distant points ending up in the same cluster). Once you define the loss, it's just a matter of finding the optimal parameter value, although at extra computational cost.
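Something like the following sketch, assuming an illustrative pairwise loss and distance thresholds of my own choosing (nothing here is a standard recipe):

    import numpy as np
    from sklearn.cluster import MeanShift
    from sklearn.datasets import make_blobs
    from sklearn.metrics import pairwise_distances

    X, _ = make_blobs(n_samples=200, centers=3, cluster_std=0.7, random_state=2)
    D = pairwise_distances(X)
    near, far = 1.0, 5.0  # illustrative distance thresholds

    def pairwise_loss(labels):
        same = labels[:, None] == labels[None, :]
        split_close = np.sum((D < near) & ~same)  # close points, split apart
        merge_far = np.sum((D > far) & same)      # distant points, merged
        return split_close + merge_far

    losses = {bw: pairwise_loss(MeanShift(bandwidth=bw).fit(X).labels_)
              for bw in (0.5, 1.0, 2.0, 4.0)}
    best_bw = min(losses, key=losses.get)
    print(f"losses={losses}, best bandwidth={best_bw}")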
[0] http://en.wikipedia.org/wiki/Determining_the_number_of_clust...