Since reading it I've noticed how many friends have it on their bookshelves.
Here's a link: http://oreilly.com/catalog/9780596529321
Free PDF download, though probably not a one-flight book.
side note: Nat, did you intern at SGI in the late 90s, as the self-titled "armchair programmer of the apocalypse"?
The errata page: http://oreilly.com/catalog/errataunconfirmed.csp?isbn=978059...
Also, here are some additions for the online learning column:
* Online SVM: http://www.springerlink.com/index/Y8666K76P6R5L467.pdf
* Online Gaussian mixture estimation:
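For anyone curious what "online SVM" amounts to in practice, one common formulation (not necessarily the one in the linked paper) is plain stochastic gradient descent on the L2-regularized hinge loss. A minimal sketch, with illustrative parameter values:

```python
import numpy as np

def online_svm_step(w, x, y, lam=0.1, eta=0.1):
    """One SGD step on the L2-regularized hinge loss.

    w: weight vector, x: feature vector, y: label in {-1, +1}.
    lam (regularization) and eta (learning rate) are illustrative
    defaults, not values taken from the linked paper.
    """
    if y * np.dot(w, x) < 1:      # margin violated: step toward y * x
        grad = lam * w - y * x
    else:                         # margin satisfied: only shrink w
        grad = lam * w
    return w - eta * grad

# Toy stream: the label is the sign of the first feature, the second is noise.
rng = np.random.default_rng(0)
w = np.zeros(2)
for _ in range(1000):
    y = rng.choice([-1.0, 1.0])
    x = np.array([2.0 * y, 0.0]) + rng.normal(scale=0.1, size=2)
    w = online_svm_step(w, x, y)
# w ends up pointing mostly along the first coordinate.
```

Each example is seen once and discarded, which is what makes this "online": memory use is constant in the number of examples.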
One more thing: why no random forests? Or decision tree ensembles of any sort?
The course unfortunately couldn't cover every algorithm in full, so the cheat sheet basically reflects my own knowledge rather than what's possible. I've referenced the Online SVM and the online mixture model though, thanks for those.
Also, I'll have to look into stochastic gradient descent!
For some methods the sheet says online learning isn't applicable. As pointed out elsewhere, the objectives for K-means and mixture models can be fitted with stochastic gradient descent. In general there is always an online option: for example, keep a restricted set of items and discard the ones that seem least useful as new ones come in.
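To make the "there is always an online option" point concrete, here is a minimal sketch of sequential K-means, which is stochastic gradient descent on the K-means objective with step size 1/n_j for centre j (the names and the toy data are illustrative):

```python
import numpy as np

def online_kmeans(stream, k):
    """Sequential K-means: one SGD step on the K-means objective per point.

    stream: iterable of 1-D feature vectors; k: number of clusters.
    Centres start at the first k points; each later point nudges its
    nearest centre toward itself with step size 1 / (points it has seen).
    """
    stream = iter(stream)
    centres = np.array([next(stream) for _ in range(k)], dtype=float)
    counts = np.ones(k)
    for x in stream:
        j = int(np.argmin(((centres - x) ** 2).sum(axis=1)))  # nearest centre
        counts[j] += 1
        centres[j] += (x - centres[j]) / counts[j]            # eta = 1/count
    return centres

# Toy stream alternating between two well-separated blobs; the centres
# should end up near (0, 0) and (10, 10).
rng = np.random.default_rng(1)
stream = np.empty((1000, 2))
stream[0::2] = rng.normal(0.0, 0.5, size=(500, 2))
stream[1::2] = rng.normal(10.0, 0.5, size=(500, 2))
centres = online_kmeans(stream, k=2)
```

Each point is processed once and thrown away, so this handles data that doesn't fit in memory; with the 1/count step size each centre is exactly the running mean of the points assigned to it.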
(Aside: I have a very introductory lecture on machine learning on the web: http://videolectures.net/bootcamp2010_murray_iml/. It's not for anyone who already knows the methods on this cheat sheet!)
Good point about using cross-validation to learn K; I forgot about that. I've added it to the cheat sheet.
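One way the cross-validation idea could look in practice, sketched under illustrative names and thresholds (not taken from the cheat sheet): fit K-means on a training split for each candidate K, score mean squared distance to the nearest centre on a held-out split, and stop when an extra centre buys less than a fixed relative improvement.

```python
import numpy as np

def lloyd_kmeans(X, k, iters=30, restarts=5, seed=0):
    """Plain batch K-means (Lloyd's algorithm) with random restarts."""
    rng = np.random.default_rng(seed)
    best, best_sse = None, np.inf
    for _ in range(restarts):
        centres = X[rng.choice(len(X), size=k, replace=False)].astype(float)
        for _ in range(iters):
            d = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
            labels = d.argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):       # skip empty clusters
                    centres[j] = X[labels == j].mean(axis=0)
        sse = ((X - centres[labels]) ** 2).sum()
        if sse < best_sse:
            best_sse, best = sse, centres
    return best

def heldout_error(X, centres):
    """Mean squared distance from held-out points to their nearest centre."""
    d = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return d.min(axis=1).mean()

def choose_k(train, valid, k_max=5, min_gain=0.2):
    """Smallest K whose successor improves held-out error by < min_gain."""
    prev = heldout_error(valid, lloyd_kmeans(train, 1))
    for k in range(2, k_max + 1):
        cur = heldout_error(valid, lloyd_kmeans(train, k))
        if cur > (1.0 - min_gain) * prev:
            return k - 1
        prev = cur
    return k_max

# Two well-separated blobs in 5 dimensions; K = 2 should be selected.
rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(0.0, 0.5, size=(300, 5)),
                       rng.normal(10.0, 0.5, size=(300, 5))])
rng.shuffle(data)
train, valid = data[:400], data[400:]
best_k = choose_k(train, valid)
```

Scoring on a held-out split rather than the training data is what stops K from growing without bound, though the `min_gain` cutoff is still a judgment call.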
Also regarding online learning methods, I was probably a bit quick to dismiss certain algorithms as not supporting online learning; in coursework we unfortunately didn't have time to delve into all aspects of all algorithms. I've rewritten the Online column as "To be added." for those methods I'm not familiar with (yet). Someone else is, of course, free to fork it on GitHub: http://github.com/Emore/mlcheatsheet
Otherwise, this is awesome. Hopefully you will add to it, and make it available in web form.
I've changed the title to "Algorithms for Supervised- and Unsupervised Learning", which is definitely more appropriate. Initially the cheat sheet only contained linear classifiers, hence the misleading title.