

Machine Learning Course Materials - cdl
http://cs229.stanford.edu/materials.html

======
fintler
Stanford's CS229 was also on Coursera.

[https://class.coursera.org/ml/lecture/preview](https://class.coursera.org/ml/lecture/preview)

~~~
dvdhsu
I've posted about this many times, but it's worth noting that the Machine
Learning course on Coursera is CS229A(pplied), which is different from CS229,
the one most Stanford students take. The Coursera one is more useful if you
want to apply machine learning; the Stanford one linked by the poster is more
useful if you want to enter the field. The Coursera one glosses over a lot of
the mathematics behind the algorithms; the Stanford one delves into the
mathematics supporting them.

Here are the complete course materials (including video lectures!) for CS229,
courtesy of Stanford Engineering Everywhere:
[http://see.stanford.edu/see/courseinfo.aspx?coll=348ca38a-3a...](http://see.stanford.edu/see/courseinfo.aspx?coll=348ca38a-3a6d-4052-937d-cb017338d7b1)

~~~
signa11
I did take CS229A when it was first offered on Coursera, and your comment is
_exactly_ right. Most of the mathematics was glossed over, and the focus was
not on "worrying" about it but on implementing various algorithms in Octave
and observing the results, which for neural nets mostly boiled down to some
_slightly_ complicated matrix multiplication.
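
To make that concrete, here is a minimal sketch of a one-hidden-layer forward
pass; it uses numpy rather than the course's Octave, and the sizes and names
are purely illustrative, not taken from the course materials:

    # A one-hidden-layer forward pass is essentially two matrix multiplies
    # plus a nonlinearity. Layer sizes here are arbitrary placeholders.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    X  = rng.standard_normal((5, 3))   # 5 examples, 3 input features
    W1 = rng.standard_normal((3, 4))   # input -> hidden weights
    W2 = rng.standard_normal((4, 2))   # hidden -> output weights

    hidden = sigmoid(X @ W1)           # one matrix multiply, then a squashing function
    output = sigmoid(hidden @ W2)      # and another
    print(output.shape)                # (5, 2)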

Having said that, it was an excellent _overview_ of a broad spectrum of ML
techniques, and I, for one, would heartily recommend it to anyone with a
high-school background in maths and a deep interest in the field.
Supplementing the material with Simon Haykin's text or Christopher Bishop's
(most excellent) book would make it slightly tougher.

Unfortunately, due to some time constraints I could not partake of Geoff
Hinton's wisdom on deep learning. Would you happen to have material for that
stashed somewhere?

~~~
James_Duval
Ooh, matrix multiplication? Would I get to use Strassen's sub-cubic algorithm?
I've been looking for an excuse to use it.

~~~
signa11
AFAIK, Octave uses BLAS, which in turn _should_ be using Strassen's for its
sgemm, dgemm, etc. computations. At least, it would be _very_ surprising if it
didn't...

~~~
James_Duval
Having done a very small amount of research into it, I actually think it would
be surprising if it _did_ use Strassen's. EDIT: at least not frequently; I
assume it would use Strassen's only for larger matrices, much in the same way
introsort switches strategies.

Apparently it's becoming irrelevant/unwieldy as computing power increases.

"Practical implementations of Strassen's algorithm switch to standard methods
of matrix multiplication for small enough submatrices, for which those
algorithms are more efficient. The particular crossover point for which
Strassen's algorithm is more efficient depends on the specific implementation
and hardware. Earlier authors had estimated that Strassen's algorithm is
faster for matrices with widths from 32 to 128 for optimized
implementations.[1] However, it has been observed that this crossover point
has been increasing in recent years, and a 2010 study found that even a single
step of Strassen's algorithm is often not beneficial on current architectures,
compared to a highly optimized traditional multiplication, until matrix sizes
exceed 1000 or more, and even for matrix sizes of several thousand the benefit
is typically marginal at best (around 10% or less)." -
[http://en.wikipedia.org/wiki/Strassen_algorithm](http://en.wikipedia.org/wiki/Strassen_algorithm)
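
To illustrate the crossover idea, here is a rough Python sketch of Strassen's
with a cutoff back to the standard multiply for small blocks. The CUTOFF value
is a placeholder, and the sketch assumes square, power-of-two sizes; real
implementations tune the cutoff per architecture and handle padding:

    # Strassen's algorithm with a crossover to ordinary multiplication for
    # small blocks. Assumes square matrices whose size is a power of two.
    import numpy as np

    CUTOFF = 64  # placeholder; below this size plain multiplication wins

    def strassen(A, B):
        n = A.shape[0]
        if n <= CUTOFF:
            return A @ B  # fall back to the standard (BLAS-backed) multiply
        m = n // 2
        A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
        B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
        # The seven recursive products that replace the usual eight
        M1 = strassen(A11 + A22, B11 + B22)
        M2 = strassen(A21 + A22, B11)
        M3 = strassen(A11, B12 - B22)
        M4 = strassen(A22, B21 - B11)
        M5 = strassen(A11 + A12, B22)
        M6 = strassen(A21 - A11, B11 + B12)
        M7 = strassen(A12 - A22, B21 + B22)
        C = np.empty_like(A)
        C[:m, :m] = M1 + M4 - M5 + M7
        C[:m, m:] = M3 + M5
        C[m:, :m] = M2 + M4
        C[m:, m:] = M1 - M2 + M3 + M6
        return C

For example, strassen(np.random.rand(256, 256), np.random.rand(256, 256))
recurses twice (256 -> 128 -> 64) and then hands the 64x64 blocks to the
ordinary multiply, which is exactly the hybrid behaviour the quote describes.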

Coppersmith-Winograd, assuming it is not another galactic algorithm, looks
better -
[http://en.wikipedia.org/wiki/Coppersmith%E2%80%93Winograd_al...](http://en.wikipedia.org/wiki/Coppersmith%E2%80%93Winograd_algorithm)

However, I do strongly suspect that this _is_ a galactic algorithm.

In fact, judging from this
([http://www-cs.stanford.edu/~virgi/matrixmult-f.pdf](http://www-cs.stanford.edu/~virgi/matrixmult-f.pdf)),
the Big O of time taken for matrix multiplication has been decreasing steadily
and without much fuss ever since Strassen.

Again, I suspect these are _even more_ galactic algorithms than Strassen's is.

~~~
RogerL
Right. Basically, Strassen's is not friendly to the cache - any improvement
you get in the asymptotic behavior is usually swamped by the cost of cache
misses.

In that vein, the naive O(n^3) implementation is also not cache friendly - if
you flip the inner loops you will get far better performance (in row-major
languages).
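
A rough sketch of what flipping the inner loops means, written in plain Python
just to show the access pattern; the actual speedup shows up in compiled,
row-major code such as C, where the strided access really hurts:

    # Both versions compute C = A * B in O(n^3). The textbook ijk order
    # strides down B's columns in the innermost loop, while the ikj order
    # walks B and C row by row, which is contiguous in row-major storage.

    def matmul_ijk(A, B, n):
        C = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    C[i][j] += A[i][k] * B[k][j]   # inner loop jumps across rows of B
        return C

    def matmul_ikj(A, B, n):
        C = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for k in range(n):
                a = A[i][k]
                for j in range(n):
                    C[i][j] += a * B[k][j]         # inner loop sweeps one row of B
        return C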

But I mainly replied because I love the concept of "galactic algorithms" and
Regan & Lipton (originators of the idea and name).

EDIT: the BLAS routine in question is dgemm.f (double-precision general matrix
multiply) and is easily googled, so I won't paste it here. No Strassen's in
sight.

------
pauloortins
I wrote a blog post with some suggested materials for learning machine
learning. If you want, take a look:

[http://pauloortins.com/resources-to-become-a-ninja-machine-l...](http://pauloortins.com/resources-to-become-a-ninja-machine-learning/)

------
gtani
As a reference, you'll want one or two of the Big 6 texts, by Murphy,
Koller/Friedman, Bishop, MacKay, and Hastie et al. (ESL). The first review is
good:
[http://www.amazon.com/product-reviews/0262018020/ref=dp_top_...](http://www.amazon.com/product-reviews/0262018020/ref=dp_top_cm_cr_acr_txt?showViewpoints=1)

Also, there are many freely available texts on ML, data mining, stats/prob
distributions, linear algebra, optimization, etc., including Barber, MacKay,
and ESL. See
[http://www.reddit.com/r/MachineLearning/comments/1jeawf/mach...](http://www.reddit.com/r/MachineLearning/comments/1jeawf/machine_learning_books/)

------
tomcrisp
For a second, I misread this as "Machine Learns Course Materials". Must make
this happen - wonder if I can use these machine learning course materials to
help.

