
I'm enrolled in the online Applied ML class from Stanford, and I've also been watching this course from CMU (I'm up to the Graphical Model 4 lecture - almost the midterm). If you've taken at least one stats class you'll get much more out of CMU's class.

BTW, here are some good online resources for machine learning:

* The Elements of Statistical Learning (free pdf book): http://www-stat.stanford.edu/~tibs/ElemStatLearn/

* Information Theory, Inference, and Learning Algorithms (free pdf book): http://www.inference.phy.cam.ac.uk/mackay/itila/

* Videos from Autumn School 2006: Machine Learning over Text and Images: http://videolectures.net/mlas06_pittsburgh/

* Bonus link. An Empirical Comparison of Supervised Learning Algorithms (pdf paper): http://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icm... (Note the top 3 are tree ensembles, then SVM, ANN, KNN. Yes, I know there is no 'best' classifier.)




About the bonus link:

It does not make sense to compare ensemble methods (bagging & boosting) with single-instance classifiers. In practice, you try all the classifiers and then use the best one to create an ensemble. The paper leaves me unsatisfied, suspecting that bagging or boosting an SVM would probably give the best results.


I don't see why not. Different classifiers have different bias/variance characteristics. If you want to decrease bias at the cost of some variance, then boost your classifier. (This is why boosting is usually applied to simple classifiers.) But whether that will actually help depends on the characteristics of the problem and the classifier used.
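The boosting mechanism is easy to sketch in plain Python. This is a toy AdaBoost over 1-D decision stumps, not anything from the paper; the stump learner and the dataset are invented for illustration:

```python
# Toy AdaBoost sketch (labels in {-1, +1}); the 1-D decision-stump
# learner and the dataset are invented for illustration.
import math

def fit_weighted_stump(data, w):
    # Brute-force the (threshold, sign) pair minimizing weighted error.
    best = (0.0, 1, float("inf"))
    for t in sorted(x for x, _ in data):
        for sign in (-1, 1):
            err = sum(wi for (x, y), wi in zip(data, w)
                      if y != (sign if x >= t else -sign))
            if err < best[2]:
                best = (t, sign, err)
    t, sign, err = best
    return (lambda x: sign if x >= t else -sign), err

def adaboost(data, rounds=10):
    n = len(data)
    w = [1.0 / n] * n            # start with uniform example weights
    stumps, alphas = [], []
    for _ in range(rounds):
        h, err = fit_weighted_stump(data, w)
        err = max(err, 1e-10)    # guard against a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        stumps.append(h)
        alphas.append(alpha)
        # Upweight misclassified examples so later stumps focus on them;
        # this is the mechanism that drives bias down (and variance up).
        w = [wi * math.exp(-alpha * y * h(x)) for (x, y), wi in zip(data, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    # Final classifier: sign of the alpha-weighted vote.
    return lambda x: 1 if sum(a * h(x) for a, h in zip(alphas, stumps)) >= 0 else -1

# Toy 1-D data: label +1 iff x >= 0.5.
data = [(i / 10, 1 if i >= 5 else -1) for i in range(10)]
predict = adaboost(data)
```

Each round fits a stump to the reweighted data, so the ensemble keeps correcting its own residual mistakes; that's why even very weak base learners end up with low bias.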

I guess bagging is a different story. As far as I know, bagging usually decreases variance with no bias penalty, so it's more a trade-off between variance and training time.
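Bagging's variance reduction can be sketched just as briefly: bootstrap-resample the training set, fit one learner per resample, and majority-vote. Again a toy in plain Python (the stump learner and dataset are invented, not from the paper):

```python
# Toy bagging sketch (labels in {0, 1}); the 1-D decision-stump
# learner and the dataset are invented for illustration.
import random
from collections import Counter

def fit_stump(data):
    # Brute-force the (threshold, sign) pair minimizing training error.
    best = (0.0, 1, len(data) + 1)
    for t in sorted(x for x, _ in data):
        for sign in (0, 1):
            errs = sum(1 for x, y in data
                       if y != (sign if x >= t else 1 - sign))
            if errs < best[2]:
                best = (t, sign, errs)
    t, sign, _ = best
    return lambda x: sign if x >= t else 1 - sign

def bag(data, n_estimators=25, seed=0):
    rng = random.Random(seed)
    # One stump per bootstrap resample of the training set.
    stumps = [fit_stump([rng.choice(data) for _ in data])
              for _ in range(n_estimators)]
    # Majority vote; averaging over resamples is what reduces variance.
    return lambda x: Counter(s(x) for s in stumps).most_common(1)[0][0]

# Toy 1-D data: label 1 iff x >= 0.5.
data = [(i / 10, int(i >= 5)) for i in range(10)]
predict = bag(data)
```

Each stump sees a perturbed copy of the data, so its individual errors are partly decorrelated and the vote averages them out, without changing what any single stump can express (hence no bias penalty).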


It's actually fine to compare an ensemble method (using weak base learners) to a single strong learner. That way, you compare the benefit of combining weak learners against the benefit of using one strong classifier. I see where you're going with that, but comparing ensemble methods with a single classifier is often a useful measurement.


That's true when the experiments are designed to show a gain in performance due to some aggregation technique. The article in question achieved that only for decision trees, and the body of the article doesn't seem to focus on the effects of ensemble methods.


You make a good point. Ensemble methods do seem to outperform single classifiers, and there's no reason you can't have an ensemble of SVMs. The paper should have included ensembles of something other than trees.

I tried to find a paper comparing an ensemble of SVMs to an ensemble of trees and came up empty (after a quick search). I did find papers showing ensembles of SVMs outperforming a single SVM. I also found a comment on a paper claiming an ensemble of trees outperformed a "Parallel Mixture of SVMs" (see here: http://www.mitpressjournals.org/doi/abs/10.1162/089976604323...). Of course, that's not a great source.

I absolutely agree they should have included ensembles other than trees. I don't necessarily agree that an ensemble of SVMs would have beaten an ensemble of trees. It would have been interesting to see.


There was a suicide note emotion classification challenge:

http://computationalmedicine.org/home-0

Very noisy and sparse data. 25 teams, 22 system description papers. The winner used an SVM ensemble.


Zeratul, you're obviously into ML. Would you mind if I asked what your application is? I'm just curious.

I work in computer vision. When I do a machine learning problem, I spend most of my time brainstorming and implementing good features. I'm getting deeper into ML (and loving it). I'm always curious what other people are doing with ML.


Medical language processing, information extraction from patient data, text classification, and clustering.

Yes, it would be great to get a list of hackers that do ML and the domain that they are working with.


I'm using ML in the casino industry, with multiple forms of classification and forecasting.


Once upon a time I thought about using ML/vision in slot machines. I would try to read the gamblers' emotions/age/sex, and the slot machine would change its stimulation (music, lights, etc.; not mess with the odds) to try to keep them at the machine longer.

I thought it was a good idea until I actually visited a casino. People sit at the slots in what looks like a hypnotic state. The emotions don't change much. I don't think I could have made a measurable difference.

I'm not surprised the gambling industry is using ML, but it's cool to hear about it. Thanks.


To your list, I'd like to add Jaynes's 'Probability Theory'. A few chapters are freely available here: www-stat.wharton.upenn.edu/~steele/Publications/PDF/PT.pdf

(The publisher asked the book's editor to stop distributing the whole PDF.)


I made the mistake of enrolling in a graduate level ML class without a strong foundation in statistics - my transcript is now going to be defaced permanently. But thanks for the inference text - is there an OCW version of an inference course?


I love it when people link to freely available academic texts, thank you.

Here's another one from Stanford: Mining of Massive Datasets http://infolab.stanford.edu/~ullman/mmds.html


I just took a quick look on the chapter on clustering. Looks good! I'll put it on the ever growing stack. Thanks!



