

K-Means Clustering and Art - mdwrigh2
http://0xfe.blogspot.com/2011/12/k-means-clustering-and-art.html

======
tylerneylon
I wrote some similar code in Python recently -- I think color visualizations
like this are both fun and potentially useful for certain image manipulations.

This includes a readable 36-line implementation of k-means clustering that
could be shorter if one wanted to play some code golf :) I used a pie chart
layout, with pie slices proportional to their corresponding cluster sizes.
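
For anyone who wants the gist without opening the repo, here is a rough sketch of the idea. It is not the linked code: the names `kmeans` and `pie_angles`, the fixed iteration count, and the seeded random start are all my own assumptions for illustration.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd-style k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k starting exemplars drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

def pie_angles(labels, k):
    """Slice angle in degrees for each cluster, proportional to cluster size."""
    return [360.0 * labels.count(j) / len(labels) for j in range(k)]
```

Feeding it a list of `(r, g, b)` tuples gives back the cluster centers plus one label per pixel, and `pie_angles(labels, k)` gives the slice sizes. K-means only finds a local optimum, so a different seed can give different clusters.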

Code:
[https://github.com/tylerneylon/imghist/blob/master/imghist.p...](https://github.com/tylerneylon/imghist/blob/master/imghist.py)

Sample images: <http://blog.zillabyte.com/post/11193458776/color-as-data>
<http://blog.zillabyte.com/post/13141231882/hue-histograms>

If anyone else is interested in this stuff, Austin A made a great suggestion
on the original post to use the L*a*b* color space.
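
In case it helps, here is a minimal sketch of the conversion, so that distances between colors can be measured in L*a*b* instead of RGB. I am assuming 8-bit sRGB input and a D65 white point; the function names are made up for this example and nothing here comes from the original post.

```python
def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE L*a*b* (assumes a D65 white point)."""
    def linearize(c):
        # Undo the sRGB gamma curve to get linear light in [0, 1].
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    # Linear sRGB -> XYZ, each axis divided by the D65 reference white.
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def lab_distance(p, q):
    """Euclidean distance between two sRGB colors, measured in L*a*b*."""
    a, b = srgb_to_lab(p), srgb_to_lab(q)
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
```

The point of doing this before clustering is that Euclidean distance in L*a*b* tracks perceived color difference more closely than it does in RGB.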

------
gburt
I also did something similar to this recently, but in PHP. I grouped similar
colors (although I rendered mine as tiled squares) and tried a few other
things, like grouping by brightness. Some neat results, for sure. I can put
the code up for it if anyone is interested.

Here is the quick k-means implementation I threw together, if anyone wants to
play with it (my whole library is licensed under the GPL).

[https://github.com/gburtini/Learning-Library-for-PHP/blob/ma...](https://github.com/gburtini/Learning-Library-for-PHP/blob/master/lib/unsupervised/kmeans.php)

It could definitely use some serious cleaning up (and I will probably OO-ize
it when I get a chance -- or I'll take pull requests), but it definitely
works.

------
mturmon
In general, you can toss any set of numbers into a clustering algorithm, and
it's kind of interesting to puzzle over the structures that come out. The more
you know about the domain, the more interesting it tends to be.

PCA can be the same way. You toss images or whatever in, and out come either
eigenvectors or principal components of the images. Either way it's often
interesting to domain experts.

------
nmb
Worth noting that Google Chrome also uses K-Means to select the color of the
stripe below a website thumbnail:
[http://www.quora.com/Google-Chrome/How-does-Chrome-pick-the-...](http://www.quora.com/Google-Chrome/How-does-Chrome-pick-the-color-for-the-stripes-on-the-Most-visited-page-thumbnails)

------
Sharlin
Yes, k-means clustering is well known for its use in color quantization (for
instance, reducing the color depth of a 24-bit image to an 8-bit paletted
representation that most faithfully captures the original). Another popular
algorithm is median cut, which uses a k-d tree to recursively subdivide the
color space based on the median color values of the pixels in the source
image. Just about any image manipulation program that can output paletted
images probably uses one of these algorithms.
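
A rough sketch of median cut, to make the idea concrete. This is a simplified version under my own assumptions: it keeps a flat list of boxes (effectively the leaves of the k-d tree) rather than building the tree, always splits the box with the widest channel range, and uses the box mean as the palette entry. The name `median_cut` is just for this example.

```python
def median_cut(pixels, n_colors):
    """Return up to n_colors palette entries for a list of (r, g, b) tuples."""
    def ranges(box):
        # Value range of each of the three channels within this box.
        return [max(p[c] for p in box) - min(p[c] for p in box) for c in range(3)]

    boxes = [list(pixels)]
    while len(boxes) < n_colors:
        # Pick the box whose widest channel covers the largest range.
        i = max(range(len(boxes)), key=lambda n: max(ranges(boxes[n])))
        spread = ranges(boxes[i])
        if max(spread) == 0:
            break  # every remaining box holds a single color; nothing to split
        channel = spread.index(max(spread))
        # Sort along that channel and cut at the median pixel.
        box = sorted(boxes[i], key=lambda p: p[channel])
        mid = len(box) // 2
        boxes[i:i + 1] = [box[:mid], box[mid:]]
    # One palette color per box: the mean of the pixels it contains.
    return [tuple(sum(c) / len(b) for c in zip(*b)) for b in boxes]
```

Unlike k-means there is no iteration to convergence: each split is decided once, which makes the result deterministic for a given image.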

------
seanp2k2
OK, so I don't have much problem domain knowledge here, but couldn't you
optimize the cluster size based on algorithmic bounds on variation within the
cluster?
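
If "cluster size" here means the number of clusters, one way to read the idea is: keep increasing k until no cluster's variance exceeds some bound. A naive sketch of that reading, where the bound, the helper names, and the brute-force search over k are all assumptions of mine rather than an established method:

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd-style k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

def max_cluster_variance(points, centroids, labels):
    """Largest mean squared distance from a cluster's points to its centroid."""
    totals = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for p, j in zip(points, labels):
        totals[j] += sum((a - b) ** 2 for a, b in zip(p, centroids[j]))
        counts[j] += 1
    return max(t / n for t, n in zip(totals, counts) if n)

def smallest_k_within(points, bound, max_k=None):
    """Try k = 1, 2, ... and return the first k whose worst cluster meets the bound."""
    max_k = max_k or len(points)  # assumes the points are distinct
    for k in range(1, max_k + 1):
        centroids, labels = kmeans(points, k)
        if max_cluster_variance(points, centroids, labels) <= bound:
            return k, centroids
    return max_k, centroids
```

This reruns k-means from scratch for every k, so it is slow, and because k-means can settle in a poor local optimum the k it returns may be larger than strictly necessary.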

------
diiq
Wonderful. I am envisioning a live pair of images, palette and picture, so
that manipulating the palette alters the picture.

~~~
Newky
I could be wrong on this, but we learned about K-means clustering this year in
college. As far as I know, the K random exemplars that you use for K-means
reduce the number of colors in the image to a certain N (N <= K). This would
mean that any live manipulation of the K colors would actually only modify an
image that consists of only N colors.

The net result would be a live image like you suggest, but one with much less
detail. Still very interesting though.

~~~
gburt
You compute 'centroids' which denote the centers of the clusters, but you
don't have to change the values of all points to the centers of the clusters.

In other words, you can maintain the detail in RGB space (as this author has)
while reorganizing things in location space by their k-means clusters.
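
To make the distinction concrete, here is a small sketch with made-up helper names. It assumes the centroids have already been computed by some k-means run. `quantize` overwrites every pixel with its centroid, while `group_by_cluster` only reorders the pixels and leaves their RGB values alone.

```python
def nearest(pixel, centroids):
    """Index of the centroid closest to this pixel (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(pixel, centroids[j])))

def quantize(pixels, centroids):
    """Lossy: replace each pixel with its cluster center."""
    return [centroids[nearest(p, centroids)] for p in pixels]

def group_by_cluster(pixels, centroids):
    """Lossless: keep every original color, just reorder pixels by cluster."""
    # sorted() is stable, so pixels keep their relative order within a cluster.
    return sorted(pixels, key=lambda p: nearest(p, centroids))
```

The first gives you at most k distinct colors; the second gives you exactly the colors you started with, arranged so that each cluster is contiguous.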

~~~
Newky
I understand this, but does this mean that it is bi-directional, in that a
change to one of the palette colors would be reflected in the image? This is
what I understood from the comment. If so, how does this work? Sorry for any
misinformation in my comment.

------
bfrs
Does anyone know how k-means ranks as an image segmentation technique? How
does it compare to, say, watershed, mean shift, globalPb, etc.?

