
A Tour of Machine Learning Algorithms - ColinWright
http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms
======
billmalarky
Here's an interesting series of introductory tutorials on data mining / machine
learning. I've read through several of them, and they are very
beginner-friendly but still useful.

Main website:
[http://guidetodatamining.com/chapter-1/](http://guidetodatamining.com/chapter-1/)

The actual PDFs:

[http://guidetodatamining.com/guide/ch1/DataMining-ch1.pdf](http://guidetodatamining.com/guide/ch1/DataMining-ch1.pdf)

[http://guidetodatamining.com/guide/ch2/DataMining-ch2.pdf](http://guidetodatamining.com/guide/ch2/DataMining-ch2.pdf)

[http://guidetodatamining.com/guide/ch3/DataMining-ch3.pdf](http://guidetodatamining.com/guide/ch3/DataMining-ch3.pdf)

[http://guidetodatamining.com/guide/ch4/DataMining-ch4.pdf](http://guidetodatamining.com/guide/ch4/DataMining-ch4.pdf)

[http://guidetodatamining.com/guide/ch5/DataMining-ch5.pdf](http://guidetodatamining.com/guide/ch5/DataMining-ch5.pdf)

[http://guidetodatamining.com/guide/ch6/DataMining-ch6.pdf](http://guidetodatamining.com/guide/ch6/DataMining-ch6.pdf)

[http://guidetodatamining.com/guide/ch7/DataMining-ch7.pdf](http://guidetodatamining.com/guide/ch7/DataMining-ch7.pdf)

(edit: formatting)

------
rm999
Good write-up, but IMO these kinds of posts are not useful for most people
(despite there being many just like it): the 30,000-foot view is too much
information for a beginner but too basic for someone who is knowledgeable in
the field. In my ten years of being fairly immersed in machine learning there
was maybe a one-month period where an article like this would have been very
useful, and that was a couple of years in, after I had mastered the basics of
the more useful techniques (like generalized linear models and SVMs) and was
ready to learn what else was out there. Honestly, learning the high-level
overview was not hard at that point.

I think a solid machine learning book coupled with a chart like this
([http://scikit-learn.org/stable/tutorial/machine_learning_map...](http://scikit-learn.org/stable/tutorial/machine_learning_map/)) is the most useful way to
approach machine learning.

~~~
nostrademons
This table of contents is beginning to seem very familiar (I've done Google's
internal machine-learning course and looked at several other courses/tutorials
on the net), but what I'm wondering is: how do you do the more basic
operations of feature and parameter selection? When I've tried using machine
learning in my daily work, I usually top out at "not quite good enough",
largely because the features I end up choosing have a lot of noise in their
data. Meanwhile, more experienced experts (those who have been doing this for
10+ years) produce much better results in much shorter time periods because
they use totally off-the-wall features that I would never have considered.
When they've presented at these classes, they've said that feature selection
is really the vast majority of the work in practical machine learning. How do
you develop an intuition about which features will be useful? Are there any
resources that are more like "A practitioner's guide to machine learning",
talking about the stuff that's useful in practice and not just in theory?

~~~
gms
Experience, as there is no canonical feature set that applies to all problems.
Feature sets instead tend to be domain-specific.

The usual iteration loop for a machine-learning system is:

1) Try features.

2) Evaluate your system. If you are happy with your scores/metrics, stop.

3) Do error analysis/run diagnostics. Go to 1.

It is the last step that I find inexperienced people usually neglect. You need
to examine your errors and find commonalities among them.

For example, is there a certain class of error that hints at a certain feature
you haven't tried? Do your existing features provide decent results to begin
with? Or maybe you should just get more data? Whatever path you take, it needs
to be defensible.
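
As a rough illustration of "find commonalities among your errors", here is a minimal sketch that groups evaluation results by one attribute and reports the error rate per group. The field names (`actual`, `predicted`, `source`), the helper `error_rate_by_slice`, and the data are all hypothetical, made up for this example; a real system would slice on whatever metadata its own domain provides.

```python
from collections import Counter


def error_rate_by_slice(examples, slice_key):
    """Group examples by `slice_key` and return rows of
    (slice_value, n_errors, n_total, error_rate), worst slice first."""
    totals = Counter()
    errors = Counter()
    for ex in examples:
        value = ex[slice_key]
        totals[value] += 1
        if ex["predicted"] != ex["actual"]:
            errors[value] += 1
    rows = [(v, errors[v], totals[v], errors[v] / totals[v]) for v in totals]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Hypothetical evaluation results from step 2 of the loop.
results = [
    {"actual": "spam", "predicted": "spam", "source": "email"},
    {"actual": "spam", "predicted": "ham", "source": "sms"},
    {"actual": "ham", "predicted": "spam", "source": "sms"},
    {"actual": "ham", "predicted": "ham", "source": "email"},
    {"actual": "ham", "predicted": "ham", "source": "sms"},
    {"actual": "spam", "predicted": "spam", "source": "email"},
]

for value, n_err, n_total, rate in error_rate_by_slice(results, "source"):
    print(f"{value}: {n_err}/{n_total} wrong ({rate:.0%})")
```

In this toy data every error comes from the `sms` slice, which is the kind of pattern that might suggest a missing feature or a need for more data of that type.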

So, in terms of general advice:

1) Realise it's an iteration loop, not a fixed recipe.

2) Be able to justify why you are trying whatever it is you are trying.

With time, you'll develop intuition for given domains.

~~~
jimbokun
"2) Be able to justify why you are trying whatever it is you are trying."

This is just a restatement, I think, of the question you are responding to. I
suspect a book on this topic would be quite popular.

------
j2kun
I wrote a series of didactic posts actually deriving and implementing some of
these algorithms, with long-term plans to get to the heftier ones like SVM and
boosting.

[http://jeremykun.com/main-content/](http://jeremykun.com/main-content/)

~~~
gone35
For an earlier, non-stultifying version of the site with an actually usable,
tablet-scrollable wordpress theme, see:

[https://web.archive.org/web/20121114212050/http://jeremykun....](https://web.archive.org/web/20121114212050/http://jeremykun.wordpress.com/main-content/)

Great content, otherwise. But I swear all this 'web 3.0' gimmickry has set the
web back at least a decade.

~~~
j2kun
TIL that "stultify" is a word. And that my theme is bad for tablets.

------
ColinWright
Some are saying they're getting a 500 response "Internal server error." It's
working for me, but just in case, here's the Google cache:

[http://webcache.googleusercontent.com/search?q=cache:wt-Qc3D...](http://webcache.googleusercontent.com/search?q=cache:wt-Qc3DtiBYJ:machinelearningmastery.com/a-tour-of-machine-learning-algorithms/+)

 _Edit: fixed link_

------
willis77
This is a tour in the same way that being handed a list of animals is a tour
of a zoo.

------
weavie
It seems every week now there is a new guide to Machine Learning coming out.
Is it just that I have started noticing them, or is there a real resurgence of
popularity for these algorithms?

------
genofon
Why are there so many posts about machine learning for beginners? All these
posts basically say the same things (I'm part of the problem, as I wrote one
myself some time ago).

My explanation is that it's easy to write about and feels good afterwards.

------
cjauvin
I wasn't aware of Association Rule Learning, which seems like a very
interesting and useful concept.

------
weishigoname
There are so many machine learning algorithms out there that it is hard to
decide where to start. I know it isn't realistic to learn them one by one; no
one has that much energy. Any suggestions for someone who loves this field but
doesn't know much about it? Much appreciated.

------
waterloong
Nothing for recommender systems?

------
Kip9000
broken

~~~
ColinWright
In what sense "broken"? It loads for me and is extremely useful as a catalog
and description.

Can you be more specific?

~~~
cruise02
I'm getting a 500 response "Internal server error."

~~~
riffraff
same here, 500 and "This site is hosted by HostGator!"

~~~
angersock
Snappy strikes again!

