Everything you need to know about Machine Learning in 30 minutes or less

monk_the_dog · on July 5, 2012

If you already have a little exposure to machine learing, let me recomend an interesting review paper [1] on random forests: http://research.microsoft.com/pubs/155552/decisionForests_MS...

It isn't everything you need know in 30 minutes, but it's a concrete coverage of lots of topics in machine learning in under 150 pages. Here's why I'm recomending this paper:

* The algoritm is easy to understand.

* It can handle classification, regression, semi-supervised learning, manifold learning, and density estimation. The paper gives an introduction to each of these topics as well as a unified framework to implement each algorithm.

* It can handle categorical data and missing data [2]

* It gives as good results as other state of the art algorithms.

* The paper is well-written and easy to understand for someone without a deep background in machine learning.

[1] It's mostly a review paper. Using random forests for density estimation is new.

[2] This review paper doesn't cover categorical data or missing data.

droz · on July 5, 2012

http://lasa.epfl.ch/teaching/lectures/ML_Phd/Notes/ML_Lectur...

Is another great resource that introduces many ML topics from the ground up.

five_star · on July 5, 2012

This is great! Thank you very much for sharing.

kqr2 · on July 4, 2012

Hilary Mason has a longer introduction to machine learning video using web data, however, it isn't free.

http://shop.oreilly.com/product/0636920017493.do

ramblerman · on July 5, 2012

This looks really good. Thanks for the link

marcelsalathe · on July 5, 2012

Nice talk. The example of google translate is not a good one though. Say you translate from language A to language B with 99% accuracy, and vice versa, which would be pretty awesome, you'd still have a substantial quality decay after only a few back and forth translations (0.99^x where x is the number of translation steps).

vecter · on July 5, 2012

That's not a very realistic model of communication. If A communicates to B with 99% accuracy in the text, it's likely that B will 100% understand what's going. He will reply back to A with a 100% accurate message that's 99% accurate after translation, and so forth.

marcelsalathe · on July 5, 2012

I agree. But my impression was that she took the fact that google translate rapidly decays into gibberish as an indication that it's not doing a good enough job. I don't think you can argue that exactly because google translate does not have the interpretation capability.

hmason · on July 5, 2012

Don't read too much into that example -- I chose it as a humorous metaphor, not a mathematical argument, and I messed up the delivery in the talk, anyway.

I'll refine the example for the next time!

orenmazor · on July 5, 2012

I'm really interested in doing more machine learning work (my current projects, as interesting as they are, dont really require it).

I've done a few weirdo projects with NLTK, tho, and its great fun. By stream hacking do you mean offloading learning sets (active or initial) and that heavy overhead into the "cloud", or am I misunderstanding the terminology?

hmason · on July 6, 2012

In most data analysis work, we assume that the data resides in some database and that you have the luxury of iterating over that data as many times as you like to get to a final result.

The challenge with stream analysis is that you are dealing with a continuous stream of data where you can see each element of the stream only one time and must still be able to cluster/classify/analyze it. There are still few algorithms and tools designed explicitly for that purpose.

ninjin · on July 5, 2012

Not that I am doing Machine Translation (MT), but saying accuracy is a bit vague. The whole notion of what is lost in a translation using MT is to the best of my knowledge not fully captured with any well-established measure.

Fair warning, I haven't had time to have a look at the video (short break at work). I'll do it once I get home.

taliesinb · on July 5, 2012

For more advanced audiences, here's a great resource I discovered recently: http://videolectures.net/mlss09uk_cambridge/ -- 60 hours of lectures from the giants of machine learning delivered at a summer school held at Cambridge in 2009.

jtagen · on July 5, 2012

From the video, seems like this was a really tough crowd. Every other video I've seen with her speaking had people laughing and enjoying themselves.

geoffw8 · on July 5, 2012

Gutted. I was there on the Friday, but not the Saturday (which is when the exciting stuff was). The BrewDog talk was best!