Machine Learning Video Library - Learning From Data (Abu-Mostafa) (caltech.edu)
193 points by mikhael on July 6, 2012 | 23 comments

I am taking the Stanford ML class by Andrew Ng (https://www.coursera.org/course/ml) and I definitely recommend it.

I'm really excited by all of this free university-level material flooding the web, as I never even started college due to financial concerns (i.e., I didn't want to take out any loans).

Do you know whether Prof. Ng has updated the material since the first run of the class?

We are still in the honeymoon phase of free, online university courses, so I think there's been relatively little criticism of what's available now, but I'll go for it: I was disappointed by the Coursera/Stanford ML class. It was obviously watered down, the homeworks were very (very) easy, and I retained little or nothing of significance.

In contrast, the Caltech class was clearly not watered down, and, as the material was much more focused (with a strong theme of generalization, an idea almost entirely absent from the Stanford class, as I recall) I feel I learned far more.

Another big difference: the Caltech class had traditional hour-long lectures, a simple web form for submitting answers to the multiple-choice* homeworks, and a plain vBulletin forum. The lectures were live on ustream, but otherwise, no fancy infrastructure.

So I think that some interesting questions will come up. Do we need complex (new) platforms to deliver good classes? For me, the answer right now is no -- what clearly matters is the quality and thoughtfulness of the material and how well it is delivered. Can a topic like machine learning be taught effectively to someone who doesn't have a lot of time, or who doesn't have the appropriate background (in CS, math)? Can/should it be faked? I don't think so, but I think there are certainly nuances here.

* Despite being multiple-choice, the homeworks were not easy -- they typically required a lot of thought, and many required writing a lot of code from scratch.

One of the conscious aims of the undergraduate Coursera classes has been to lower the bar (in terms of assumed prerequisites, pace, and scope) in order to increase participation.

Daphne Koller's Probabilistic Graphical Models was their first graduate class and it was definitely tougher than other Coursera offerings have been.

This. The Coursera PGM class is the only free online class I've enrolled in that felt similar in difficulty to a slightly harder than average undergrad course at Caltech (where I go to school).

Somewhat of a side topic, but I just finished the Coursera compilers class. It didn't seem watered down to me, covering regular expressions (including NFA and DFA representations), parsing theory and various top-down and bottom-up parsing algorithms, semantic checking (including a formal semantics notation), code generation (with formal operational notation), local and global optimization, register allocation and garbage collection.

I guess it was partially watered down in that the programming part of the class was optional.

The Coursera ML class is nowhere near the Stanford-level class in terms of academic rigor.

That being said, several of my peers who didn't go to the school really appreciated it for its accessibility.

I think the expectation of that class is to render ML education accessible and palatable, not to train everyone at an elite level. As this field grows, I'm sure the needs of various parties would be filled to an extent.

I don't think he has. In hindsight, I guess it does seem watered down, but personally that is OK: I enjoy the pace and difficulty level right now.

However, I'm glad you pointed it out, because I'm eager to learn about ML and hope to use this (Caltech) material to augment the foundation I get from the Coursera class.

I think the courses are great for getting an idea of what the subject is about. If you face a related problem, at least you will know whether it can be solved efficiently. It will also allow you to speak to an expert in the field, at a basic level at least. That said, they can certainly be greatly improved.

Don't miss out on the original (i.e. pre-Coursera) Andrew Ng lectures, starting here: http://tinyurl.com/6uqeoo2 These are also more mathematically rigorous.

I have also watched the course, and I definitely recommend it too.

I took this course last term after reading an introductory book on machine learning and skimming through Andrew Ng's CS 229 lecture notes. I thought this class was particularly excellent at emphasizing the theoretical aspects of machine learning, as well as emphasizing some underlying themes (like avoiding overfitting with regularization and cross validation). The class didn't cover as many models and algorithms as many of the other ML classes, but I've found those relatively easy to learn with the intuition this course gave me.
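
For anyone who wants to see the two themes mentioned above in code, here is a minimal sketch in plain Python. It is not taken from the course: the toy data (a noisy line with slope 2), the 1-D ridge model, the candidate values of `lam`, and all function names are assumptions made purely for illustration. It picks the regularization strength by k-fold cross validation, i.e. by the error on points the model was not fitted on.

```python
import random

def fit_ridge(xs, ys, lam):
    """1-D ridge regression without intercept: w = sum(x*y) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(w, xs, ys):
    """Mean squared error of the line y = w * x on the given points."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cv_error(xs, ys, lam, k=5):
    """Average held-out error over k folds (fold f holds out indices f, f+k, ...)."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train = [i for i in range(n) if i not in held_out]
        w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], lam)
        total += mse(w, [xs[i] for i in held_out], [ys[i] for i in held_out])
    return total / k

# Hypothetical toy data: y = 2x plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(20)]
ys = [2.0 * x + rng.gauss(0, 0.3) for x in xs]

# Try a few regularization strengths and keep the one that generalizes best.
candidates = [0.0, 0.1, 1.0, 10.0]
errors = {lam: cv_error(xs, ys, lam) for lam in candidates}
best_lam = min(errors, key=errors.get)
print("best lambda:", best_lam, "cv error:", errors[best_lam])
```

The point of the design is that `lam` is never chosen using the error on the training points themselves, which would always favor the least regularized fit.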

I began the mlclass from coursera but I'd like a more advanced approach.

I found some courses (I don't know if there are more):

Andrew Ng Stanford CS229: http://cs229.stanford.edu/info.html

Caltech (the one from the OP link): http://work.caltech.edu/telecourse.html

Tom Mitchell Carnegie Mellon: http://www.cs.cmu.edu/~tom/10701_sp11/

I'm considering following the Tom Mitchell course as it seems to go deeper into the details, also because it uses a pretty cool bibliography.

What do you think, am I making the right choice?

I really liked the Google talk http://www.youtube.com/watch?v=AyzOUbkUf3M and there are a bunch of advances in machine learning mixing technologies, like inductive learning & genetic programming. The Google video also shows some combinations of techniques to make it learn much faster.

Fortunately I can find videos and whitepapers on all those subjects, but it seems the libraries are all very much in 'the past'. Maybe I don't know about some, but is there a library/toolbox like Weka that implements all the modern and old algorithms and allows you to play with datasets, mixing and matching them? Maybe I just couldn't find it, but Weka seems too primitive for that.

Disclaimer: I majored in AI a long time ago and I understand most of these concepts, but I have never touched it after I finished, so I'm not up to date/aware of everything, so sorry if I missed a famous tool or something.

If you enjoyed Geoff Hinton's talk, you will probably find the Theano 'deep learning' library (http://deeplearning.net/software/theano) to be of use. It is still undergoing quite a lot of iteration, but it is powerful, and you get to run your stuff on the GPU for added fun. Incidentally, Hinton gave another Google tech talk in 2010: http://www.youtube.com/watch?v=VdIURAu1-aU

Thank you for that deeplearning link; I guess that's my missing link! I did Google for that many times and it's the first hit, so I have no clue how I missed that. Anyway, thanks!

Being a newbie in ML, I found the intro video quite helpful. I had difficulty grasping the idea of training and why it is needed, and Abu-Mostafa's explanations cleared that up. I have taken Ng's ML class as well, but due to its heavy use of stats I could not grasp it.

Now I am learning stats from Prof. Thrun at Udacity, so I assume I will be able to grasp it much better.

P.S. Is anyone else trying to learn ML basics? Why not learn together and solve interesting problems together? Contact me via the email given in my profile.

I am not sure if this helps, but this was posted yesterday on HN: http://news.ycombinator.com/item?id=4200931 (good for newbies like myself)

No genetic algorithms :( I love genetic algorithms.

Although certainly a related field (to the point that the UCL ML MSc offers an Evolutionary Computation course), genetic algorithms are not ML per se.

I strongly disagree, though it depends on your definition of what machine learning really is.

If you define machine learning algorithms as those that learn from data, then okay. In that case, though, EAs are reinforcement learning algorithms, and other such methods like Q-learning are also not ML algorithms.

However, I use the canonical definition by Mitchell: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.[1] In this case, it's clear that EAs are in fact very much an ML method. One could even go so far as to say they are more appropriately ML than things like SVMs, as they are truly learning from experience rather than just from data being handed to them.

[1] http://en.wikipedia.org/wiki/Machine_learning#Definition
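
To make the E/T/P reading concrete, here is a minimal sketch of a genetic algorithm in plain Python. Everything in it is an illustrative assumption, not something from the course or from Mitchell: the task T is the toy "OneMax" problem (evolve a bit string of all 1s), the performance measure P is the number of 1 bits, and the experience E is the number of generations evaluated. Selection scheme, mutation rate, and function names are arbitrary choices.

```python
import random

def fitness(bits):
    """Performance measure P: the number of 1 bits in the candidate."""
    return sum(bits)

def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Run a simple GA and return the best fitness seen at each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0][:]]          # elitism: keep the best unchanged
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)     # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = children
    return best_per_generation

history = evolve()
print("best fitness, first vs. last generation:", history[0], history[-1])
```

Because the best individual is carried over unchanged, the recorded best fitness can never decrease from one generation to the next, which is the "performance improves with experience" property in its simplest form. Note that no labeled data set is involved anywhere.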

I don't necessarily disagree with you. However, I find that most contemporary real-world ML applications (from face recognition to collaborative filtering to computer vision) do use algorithms that "learn from data", using some kind of learning technique, i.e. supervised, unsupervised or a combination of those.

The process of creating that kind of algorithm is also very different: it is based heavily on mathematical/statistical/probabilistic methods, and hence the resulting algorithm can be proven to work with some degree of certainty. In contrast, creating an EA is mostly some kind of "art" (as one UCL professor put it).

All in all, I can't help but feel that even though they are both approaches to the same problem ("how can a computer program learn?"), data-driven methods and EAs don't share much more. And since the results produced by the former are what most people expect an ML algorithm to accomplish, I tend to think of those when using the phrase "machine learning".

But it's all about semantics in the end.

// I noticed that you are an ML PhD student. So you know exactly what I'm talking about.

True enough. I just love genetic algorithms so much <weeps into YouTube>

This link summarizes it all! Other links are needed in order to strengthen your knowledge!

