Hacker News
Level-Up Your Machine Learning (metacademy.org)
336 points by cjrd on July 20, 2014 | 46 comments

I think this is fantastic advice.

As someone who has spent an embarrassing amount of time on various forms of independent education, one of the key things I've taken from it is just how efficient textbooks are. Not only can you read more quickly than people can speak, but reading is also active by nature. I've often found my attention wandering during videos, but it's just not possible to read without putting in a minimum amount of focus. It's also a lot easier to modulate your reading speed based on how easy the material is for you than it is to do the same during a lecture video.

Some general thoughts on MOOCs:

Coursera and edX tend to be great for small, self-contained topics, and the automated graders for programming assignments are great as well. The forums are also useful, though not ideal (since there are no practice questions students can get help with that don't fall under the honor code).

Where modern MOOCs really fall down is prerequisites. It's surprisingly difficult to do something like structure an entire CS degree from Coursera classes. Though many classes are taught by famous CS professors, they are from different institutions that break material into courses in different ways. Worse still, a lot of the classes are either watered down, shortened, or both.

MIT's OpenCourseWare archives are actually a lot better for this. There are no certificates and no credentials, but nearly all the material is freely available. The biggest inefficiency, though, is all the time spent on the lectures. At least they can be played back at a higher speed, but the lectures really do take a lot more time and cover less than the textbooks. For courses that have good textbooks, I think the best approach is to skip the lectures except in the portions where you feel you need more review.

Finally, Khan Academy is fantastic for answering specific, mechanical questions (e.g. how to calculate eigenvalues), but it's a bit light on material. I'd use it as a supplement to the other resources.
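(As a concrete instance of the kind of mechanical question meant here, a quick eigenvalue sanity check in Python with NumPy, assuming it's installed; the matrix is just an illustrative example:)

```python
import numpy as np

# Eigenvalues of a symmetric 2x2 matrix: for [[2, 1], [1, 2]],
# the characteristic polynomial is (2 - x)^2 - 1 = 0, so x = 1 or x = 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues = np.linalg.eigvals(A)
assert np.allclose(sorted(eigenvalues), [1.0, 3.0])
```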

I tend to watch lectures at 2x or 1.75x speed, and I tone it down for more difficult stuff, or pause, or rewatch. I think at 2x it's pretty even with how fast I can read, maybe even a bit faster. Modulating it isn't perfect but it's workable. Also, I struggle very hard to pay attention to almost any lecturer, my mind drifts very quickly and takes a while before I realize it's drifted. But at 2x I find my ADD is mostly defeated.

I also watch at 2x speed. I do the same for audiobooks. Your brain adapts. Sometimes I play them in 1x speed just for a bit and it sounds like extreme slow motion, and I think, people really listen to this?

I have the same problem: either I speed it up or I play a passive game on my second monitor. Something like Frozen Synapse works well.

Talking about MOOCs and Coursera: the course that started Coursera was a machine learning course by Andrew Ng.

Which is a great course but very shallow. I feel like the OP's advice applies.

I think Coursera has actually started tackling the issue by offering "specializations", e.g. the "Data Science" specialization[0] is a set of 9 small classes which should in the end provide the student with the tools needed to be a "junior number hacker" or something like that.

It's not a full CS curriculum, but I'd think vertical narrow "mini curricula" may be good enough.

[0] https://www.coursera.org/specialization/jhudatascience/1?utm...

I love textbooks and spend more of my children's inheritance on them than I should.

But what MOOCs give me is the exercises. I often think I understand a problem, but it's only after getting 2.1/10 on a Coursera quiz that I realize I've missed a key step or concept.

Many textbooks have exercises but few have solutions. I've been working through Barto and Sutton's Reinforcement Learning for example (again), and although I do the exercises and programming questions, I never know if I've gotten it right. My experience with MOOCs shows I probably haven't in a large number of cases.

The best of both worlds is when I can follow a MOOC with the textbook to gain more depth, for example with the PGM course on Coursera.

Textbooks also have "examples" which are exercises with worked answers. If you cover up the solution and try to work through it on your own first, a worked example is very instructive.

It's just as instructive (and usually pretty embarrassing) to go back and work through the examples a day or two after you first read them. And then a day or two after that :)

Yes indeed. It's the exercises and feedback that help a lot.

I believe the optimized future is MOOC problems to start, texts and recorded lectures to help where you struggle, and humans to help when the texts and recordings miss your mistakes.

Incidentally, I liked Barto and Sutton's book, but it feels a bit dated. Does anyone have another reinforcement learning book to recommend?

I chatted with Rich Sutton recently and he said he was writing an update - not sure how far along in the process he is, though.

Also Csaba Szepesvári (a colleague of Rich's at the U of A) has a free RL book you can download. http://www.ualberta.ca/~szepesva/RLBook.html

Csaba's book is the most up-to-date on RL I know of. Sutton and Barto is very old by now. For the POMDP side of things there are no recent books I know of, but http://www.cs.mcgill.ca/~jpineau/files/sross-jair08.pdf is a recent enough survey.

A related book is "Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems" which you can find at http://www.princeton.edu/~sbubeck/index.html

The bandit problem is very strongly related to the reinforcement learning problem, so you'll get some mileage out of studying bandits. Be aware this area is very maths heavy, which is good or bad depending on your background. If you like this stuff, also check out "Prediction, Learning, and Games", which deals more with the "adversarial" setup.
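To make the bandit setup concrete, here's a minimal sketch of an epsilon-greedy learner on a Bernoulli bandit in plain Python. The function name, parameters, and arm probabilities are all my own illustration, not anything from the books mentioned:

```python
import random

def epsilon_greedy_bandit(true_means, steps=10000, epsilon=0.1, seed=0):
    """Minimal epsilon-greedy learner for a Bernoulli multi-armed bandit.

    true_means: success probability of each arm (hidden from the learner).
    Returns the learner's estimated value for each arm after `steps` pulls.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm

    for _ in range(steps):
        # Explore a random arm with probability epsilon,
        # otherwise exploit the arm with the best current estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental mean update (no need to store reward history)
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates

# After enough pulls, the estimate for the best arm should dominate.
est = epsilon_greedy_bandit([0.3, 0.5, 0.7])
```

The incremental-mean update is the same trick used throughout the tabular methods in the RL books above; the adversarial setting in "Prediction, Learning, and Games" replaces the stochastic reward assumption entirely.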

Agree with others that Barto & Sutton is dated, but it's not obsolete. I was chatting a couple weeks ago with one of Sutton's former students, who has since Made Good in both academia and industry working on AI, and he said that the B&S book is all you need to be able to read the literature, so it's still a good place to start.

Apparently Sutton was going to do a second book, but has since retreated to a second edition. I'll take it though. :)

The Barto and Sutton book is available online too, http://webdocs.cs.ualberta.ca/~sutton/book/the-book.html although the printed copy is a high-quality publication so maybe worth it.

Agreed, it's also a bit topical.

Check out Wiering and van Otterlo's "Reinforcement Learning: State-of-the-Art." It covers many newer techniques--I used it as a reference for a project earlier this year:


Wow, $280. I wonder if my public library will spring for it...

If you're a student you can likely get a PDF for $0 through your library. But...yeah. Steep price.

Two free books that I haven't seen mentioned, from more of a stats perspective:

* James, Witten, Hastie, and Tibshirani's An Introduction to Statistical Learning, with Applications in R


* Hastie, Tibshirani, and Friedman's The Elements of Statistical Learning (more advanced)


PGM is a tough book. I'm not sure it's the right book for "level 3" unless you want to be a level-3 who is good at PGMs.

The problem with ML is that there are so many different kinds. Bishop's book is a decent lightweight survey, but it doesn't come close to covering all the interesting fields. You could read that and Hastie/Tibshirani's book and still know almost nothing about online training (hugely important for "big data" and time series), reinforcement learning (mentioned, but not in any depth), agent learning, "compression" sequence-prediction techniques, time-series-oriented techniques (recurrent ANNs for starters, but there is a ton to know here, and most interesting data is time ordered), image recognition tools, conformal prediction, speech recognition tools, ML in the presence of lots of noise, and unsupervised learning. I don't own PGM, but it probably wouldn't help much in these matters either. I know guys who are probably level 4 at machine learning who don't know about most of these subjects. On the other hand, Peter Flach's book "Machine Learning" at least mentions them and gives pointers to other resources.

"Deep learning" is becoming kind of a buzzword for a big basket of tricks. I think it's worth knowing about drop-out training, and the tricks used to do semi-supervised learning, but the buzzword is silly. Technically "deep learning" just means "improved gradient descent." I figure level-4 is anyone making progress coming up with new techniques.
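For what it's worth, the drop-out trick mentioned above fits in a few lines. This is a purely illustrative sketch of "inverted dropout" in plain Python (the function name and rescaling convention are my own choices, not from any particular book):

```python
import random

def dropout(activations, p=0.5, training=True, seed=None):
    """Inverted dropout: during training, zero each unit with probability p
    and rescale the survivors by 1/(1-p), so the expected activation is
    unchanged and no rescaling is needed at test time."""
    if not training or p == 0.0:
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

At test time you just call it with `training=False` and get the activations back untouched, which is exactly why the inverted form is the one usually implemented.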

That said, reading good books is one way to make progress. Knowing the right people is the other way.

aaaand, after a late night coding session, this turned into a 2000 word tirade which might be of interest: http://scottlocklin.wordpress.com/2014/07/22/neglected-machi...

PRML is great. I haven't read PGM, but I took a relatively intensive course on it which had great lecture notes. Which I'd like to also suggest—lecture notes are often "skeletal books" which can bring you up to speed on a topic quickly given that you (a) are willing to work a bit more and (b) can fill in the missing fleshy bits with your own experience.

I'd also really like to suggest DGL (http://books.google.com/books/about/A_Probabilistic_Theory_o...) and Bickel and Doksum (http://www.amazon.com/Mathematical-Statistics-Basic-Selected...). These are two of my favorite core ML/stats books.

There are some (free) good books that haven't been mentioned yet:

1) "Data Mining and Analysis: Fundamental Concepts and Algorithms" by Zaki and Meira http://www.cs.rpi.edu/~zaki/PaperDir/DMABOOK.pdf

This book covers many ML topics with concrete examples.

2) "Computer Vision: Models, Learning, and Inference" by Simon Prince: http://web4.cs.ucl.ac.uk/staff/s.prince/book/book.pdf

Despite being a CV book, the first half of it reads like a statistics book, with CV examples that are very easy to follow.

I would also suggest K. Murphy's Machine Learning for the journeyman level. At the intermediate apprentice-to-journeyman level, Alpaydin's Introduction to Machine Learning is very friendly.

I second the recommendation of Murphy. It's very comprehensive and well written.


From my own journey I would say that a good place to start for graphical models might be "Bayesian Reasoning and Machine Learning" by Barber. It's free (http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=...). I haven't read through it, but I've heard good things. However, it doesn't cover some basic things like SVMs, RVMs, and neural networks...

For those I'd suggest "Pattern Recognition and Machine Learning" by Bishop. I've read through this and it's really well organized and thought out. For more mathematically advanced ML I'd suggest "Foundations of Machine Learning" by Mohri. As a good reference for anything else I'd suggest "Machine Learning: A Probabilistic Perspective" by Murphy. For more depth on graphical models, look at "Probabilistic Graphical Models: Principles and Techniques" by Koller.

On the NLP front there are the standard texts "Speech and Language Processing" by Jurafsky and Martin and "Foundations of Statistical Natural Language Processing" by Manning and Schütze.

I also like "An Introduction to Statistical Learning" by James, Witten, Hastie and Tibshirani.

I skimmed over Mohri's book and I think the topics it covers are quite narrow.

For mathematical foundations of ML, I would recommend the book "Understanding Machine Learning: From Theory to Algorithms" by Shai Shalev-Shwartz and Shai Ben-David.

A brief version of the book is available to download on the author's website: http://www.cs.huji.ac.il/~shais/Handouts.pdf

Yes, Mohri's book takes a strong learning theory approach.

At the same time, it's the only book I've seen that covers online learning well. Can you think of any others?

Great recommendations. Some people might also find this interesting as a general guideline to "data science": http://nirvacana.com/thoughts/becoming-a-data-scientist/

[edit] scroll down and look at the map.

What I'd really appreciate is ideas on how to learn or review the core math concepts. I haven't actually done any multivariable calculus, vector/matrix calculus, or linear algebra in years, even though I took them in undergrad.

Wholeheartedly agree with the author's sentiments on the value of textbooks. Not because of the medium itself, but because they're (usually) accompanied by well thought-out examples and practice problems.

When I first started a dense subject such as PGMs, having my hand held through the introductory material, with incremental practice problems as the topic developed, helped me get a much more intimate grasp. When I initially only read superficially and watched lectures, I kept getting stumped trying to form a cohesive mental map of all the interleaved concepts.

What are the HN community's thoughts on Learning from Data by Abu-Mostafa, Magdon-Ismail, and Lin (http://amlbook.com/)? The lectures from their course are here: http://work.caltech.edu/lectures.html

I haven't started it yet, but this book was recommended by some folks at my company.

I really liked it - it was a lot smaller than I thought it was going to be (I didn't look at the page count), but they definitely explain some really important core ideas in ML, like the bias/variance tradeoff, the curse of dimensionality, etc., in a clear way.


I'm in the middle of PGM right now. It's actually really easy to follow if you put some time into it. I'm reading PRML next. I didn't realize there was a 'path' to learning ML though, thanks for that.

We could use some more ML recs on https://books.techendo.com/

I would like to start my ML journey in Python, then get to R.

How about learning things in Python? Any good recommendations?

If you go Python, you could start here: http://nbviewer.ipython.org/github/ptwobrussell/Mining-the-S...

Jake Vanderplas and Olivier Grisel: Exploring Machine Learning with scikit-learn - PyCon 2014:


For a really nice introductory book, try "Machine Learning" by Tom Mitchell.

Textbooks? Really?

How about starting with a great lecturer instead:

Nando de Freitas - https://www.youtube.com/channel/UC0z_jCi0XWqI8awUuQRFnyw

David MacKay - http://videolectures.net/course_information_theory_pattern_r...

or the (sometimes too dense) Andrew Ng - https://www.coursera.org/course/ml


The article addresses this almost immediately.

It's fine to disagree, but crapping a 'Really?' plus some contextless links onto someone who put forth general reasoning for the nature of the recommendations and spent 1300+ words describing expectations, key takeaways and projects for those recommendations is just lame.

On a given topic, I believe the best textbook is an inferior medium to the best lecture. Rather than blathering for 1300 words, I provided links to some excellent machine learning lectures.

You sound like you've had some bad experiences with books. If you learn better with lectures, great. However, it's not for everybody -- I personally think spending time away from the computer (until I'm programming something), with a book, paper and a pen is very good time spent.

I read textbooks on my computer actually. Not by choice though, textbooks are difficult to find and too expensive.

I used to hate lectures when I was in school but now I sort of prefer them. It's easier for some reason. It's more passive; you just sit there and listen rather than actively read. It doesn't seem to be slower like others complain, and may even be faster. I read difficult texts very slowly and methodically, and often have to reread stuff.

It's because textbooks usually go more in depth than lectures can due to time constraints; this is why any graduate program is 90% papers and textbooks.

A great example of this is Andrew Ng's course: even though he is a co-inventor of LDA (a fairly involved Bayesian network model), he does not explain Bayesian analysis in his course.

Not sure how you saw the books and missed the explanation, but

"But why textbooks? Because they're one of the few learning mediums where you'll really own the knowledge. You can take a course, a MOOC, join a reading group, whatever you want. But with textbooks, it's an intimate bond. You'll spill brain-juice on every page; you'll inadvertently memorize the chapter titles, the examples, and the exercises; you'll scribble in the margins and dog-ear commonly referenced areas and look for applications of the topics you learn -- the textbook itself becomes a part of your knowledge (the above image shows my nearest textbook). Successful learners don't just read textbooks. Learn to use textbooks in this way, and you can master many subjects -- certainly machine learning."

I saw the explanation and I don't buy it. In pretty much every occasion that I've found a textbook valuable, I've found lectures by the authors far more valuable.

David MacKay's lectures are incredible and go much further in explaining the material in an understandable fashion than his excellent book "Information Theory, Inference and Learning Algorithms".
