As someone who has spent an embarrassing amount of time on various kinds of independent education, one of the key things I've taken from it is just how efficient textbooks are. Not only can you read more quickly than people can speak, but reading is also active by nature. I've often found my attention wandering during videos, but it's just not possible to read without putting in a minimum amount of focus. It's also a lot easier to modulate your reading speed based on how easy the material is for you than it is to do the same during a lecture video.
Some general thoughts on MOOCs:
Coursera and edX tend to be great for small, self-contained topics, and the automated graders for programming assignments are great as well. The forums are also useful, though not ideal (since there are no practice questions students can get help with that don't fall under the honor code).
Where modern MOOCs really fall down is prerequisites. It's surprisingly difficult to do something like structure an entire CS degree from Coursera classes. Though many classes are taught by famous CS professors, they are from different institutions that break material into courses in different ways. Worse still, a lot of the classes are watered down, shortened, or both.
MIT's OpenCourseWare archives are actually a lot better for this. There are no certificates and no credentials, but nearly all the material is freely available. The biggest inefficiency, though, is all the time spent in the lectures. At least they can be played back at a higher speed, but the lectures really do take a lot more time and cover less than the textbooks. For courses that have good textbooks, I think the best approach is to skip the lectures except in portions where you feel like you need more review.
Finally, Khan Academy is fantastic for answering specific, mechanical questions (e.g. how to calculate eigenvalues), but a bit light on material. I'd use it as a supplement for the other resources.
It's not a full CS curriculum, but I'd think narrow, vertical "mini curricula" may be good enough.
But what MOOCs give me is the exercises. I often think I understand a problem, but it's only after getting 2.1/10 on a Coursera quiz that I realize I've missed a key step or concept.
Many textbooks have exercises but few have solutions. I've been working through Sutton and Barto's Reinforcement Learning (again), for example, and although I do the exercises and programming questions, I never know if I've gotten them right. My experience with MOOCs shows I probably haven't in a large number of cases.
The best of both worlds is when I can follow a MOOC with the textbook to gain more depth, for example with the PGM course on Coursera.
It's just as instructive (and usually pretty embarrassing) to go back and work through the examples a day or two after you first read them. And then a day or two after that :)
I believe that the optimized future is MOOC problems to start, texts and recorded lectures to help where you struggle, and humans to help when the texts and recordings miss your mistakes.
Also Csaba Szepesvári (a colleague of Rich's at the U of A) has a free RL book you can download. http://www.ualberta.ca/~szepesva/RLBook.html
A related book is "Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems" which you can find at http://www.princeton.edu/~sbubeck/index.html
The bandit problem is very strongly related to the reinforcement learning problem, so you'll get some mileage out of studying bandits. Be aware this area is very maths heavy, which is good or bad depending on your background. If you like this stuff, also check out "Prediction, Learning, and Games", which deals more with the "adversarial" setup.
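For anyone who hasn't met bandits before, here's a minimal sketch of about the simplest setup: a Bernoulli multi-armed bandit played with an epsilon-greedy policy. The function name run_epsilon_greedy, the arm probabilities, and the parameter defaults are all made up for illustration and aren't taken from any of the books mentioned here. It's roughly reinforcement learning with a single state, which is one way to see why the two problems are so closely related.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli multi-armed bandit with an epsilon-greedy policy.

    true_means: hypothetical success probability of each arm (hidden from the agent).
    Returns (estimates, pull_counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running average reward observed per arm
    counts = [0] * n_arms       # number of times each arm has been pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment pays 1 with the arm's hidden probability, else 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    est, pulls, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("estimates:", est)
    print("pulls:", pulls)
    print("total reward:", total)
```

Calling run_epsilon_greedy([0.2, 0.5, 0.8]) returns the agent's estimate of each arm's payoff, how often each arm was pulled, and the total reward collected. The regret analysis in the bandit literature is about how much reward a policy like this gives up compared with always pulling the best arm.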
Apparently Sutton was going to do a second book, but has since retreated to a second edition. I'll take it though. :)
The Sutton and Barto book is available online too, although the printed copy is a high-quality publication, so it may be worth it.
Check out Wiering and van Otterlo's "Reinforcement Learning: State-of-the-Art." It covers many new techniques; I used it as a reference for a project earlier this year:
* James, Witten, Hastie, and Tibshirani's An Introduction to Statistical Learning, with Applications in R
* Hastie, Tibshirani, and Friedman's The Elements of Statistical Learning (more advanced)
The problem with ML is there are so many different kinds. Bishop's book is a decent lightweight survey, but it doesn't come close to covering all the interesting fields. You could read that and Hastie/Tibshirani's book and still know almost nothing about online training (hugely important for "big data" and time series), reinforcement learning (mentioned, but not in any depth), agent learning, "compression" sequence prediction techniques, time-series-oriented techniques (recurrent ANNs for starters, but there is a ton to know here, and most interesting data is time ordered), image recognition tools, conformal prediction, speech recognition tools, ML in the presence of lots of noise, and unsupervised learning. I don't own PGM, but it probably wouldn't help much in these matters either. I know guys who are probably level 4 at machine learning who don't know about most of these subjects. On the other hand, Peter Flach's book "Machine Learning" at least mentions them and gives pointers to other resources.
"Deep learning" is becoming kind of a buzzword for a big basket of tricks. I think it's worth knowing about drop-out training, and the tricks used to do semi-supervised learning, but the buzzword is silly. Technically "deep learning" just means "improved gradient descent." I figure level-4 is anyone making progress coming up with new techniques.
That said, reading good books is one way to make progress. Knowing the right people is the other way.
I'd also really like to suggest DGL (http://books.google.com/books/about/A_Probabilistic_Theory_o...) and Bickel and Doksum (http://www.amazon.com/Mathematical-Statistics-Basic-Selected...). These are two of my favorite core ML/stats books.
1) "Data Mining and Analysis: Fundamental Concepts and Algorithms" by Zaki and Meira
This book covers many ML topics with concrete examples.
2) "Computer Vision: Models, Learning, and Inference" by Simon Prince: http://web4.cs.ucl.ac.uk/staff/s.prince/book/book.pdf
Despite being a CV book, the first half of it reads like a statistics book with examples from CV that are very easy to follow.
For those I'd suggest "Pattern Recognition and Machine Learning" by Bishop. I've read through this and it's really well organized and thought out. For more mathematically advanced ML stuff I'd suggest "Foundations of Machine Learning" by Mohri. For a good reference for anything else I'd suggest "Machine Learning: A Probabilistic Perspective" by Murphy. For more depth on graphical models look at "Probabilistic Graphical Models: Principles and Techniques" by Koller.
On the NLP front there are the standard texts "Speech and Language Processing" by Jurafsky and Martin and "Foundations of Statistical Natural Language Processing" by Manning and Schütze.
I also like "An Introduction to Statistical Learning" by James, Witten, Hastie and Tibshirani.
For the mathematical foundations of ML, I would recommend the book "Understanding Machine Learning: From Theory to Algorithms" by Shai Shalev-Shwartz and Shai Ben-David.
A brief version of the book is available to download on the author's website: http://www.cs.huji.ac.il/~shais/Handouts.pdf
At the same time, it's the only book I've seen that covers online learning well. Can you think of any others?
 scroll down and look at the map.
When initially starting a dense subject such as PGM, having my hand held through the introductory material, with incremental practice problems as the topic was elaborated, helped me get a much more intimate grasp. Initially, when I was only reading superficially and watching lectures, I kept getting stumped trying to form a cohesive mental map of all the interleaved concepts.
I haven't started it yet, but this book was recommended by some folks at my company.
We could use some more ML recs on https://books.techendo.com/
How about learning things in Python? Any good recommendations?
How about starting with a great lecturer like -
Nando de Freitas - https://www.youtube.com/channel/UC0z_jCi0XWqI8awUuQRFnyw
David MacKay - http://videolectures.net/course_information_theory_pattern_r...
or the (sometimes too dense) Andrew Ng -
The article addresses this almost immediately.
It's fine to disagree, but crapping a 'Really?' plus some contextless links onto someone who put forth general reasoning for the nature of the recommendations and spent 1300+ words describing expectations, key takeaways and projects for those recommendations is just lame.
I used to hate lectures when I was in school but now I sort of prefer them. It's easier for some reason. It's more passive; you just sit there and listen rather than actively read. It doesn't seem to be slower like others complain, and may even be faster. I read difficult texts very slowly and methodically, and often have to reread stuff.
A great example of this is Andrew Ng's course: even though he is a co-inventor of LDA (a complicated Bayesian network), he does not explain Bayesian analysis in his course.
"But why textbooks? Because they're one of the few learning mediums where you'll really own the knowledge. You can take a course, a MOOC, join a reading group, whatever you want. But with textbooks, it's an intimate bond. You'll spill brain-juice on every page; you'll inadvertently memorize the chapter titles, the examples, and the exercises; you'll scribble in the margins and dog-ear commonly referenced areas and look for applications of the topics you learn -- the textbook itself becomes a part of your knowledge (the above image shows my nearest textbook). Successful learners don't just read textbooks. Learn to use textbooks in this way, and you can master many subjects -- certainly machine learning."
David MacKay's lectures are incredible and go much further in explaining the material in an understandable fashion than his excellent book "Information Theory, Inference, and Learning Algorithms".