Good, freely-available textbooks in machine learning (metaoptimize.com)
133 points by alextp on Sept 28, 2010 | hide | past | favorite | 33 comments


For anyone interested in Machine Learning let me again link to the entire lecture series from Stanford.

http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewiTunes...


Andrew Moore's lecture slides are also very useful: http://www.autonlab.org/tutorials/index.html


YouTube playlist for those who prefer to download Flash instead of iTunes (at least Flash has open alternatives and doesn't hijack my whole computer):

http://www.youtube.com/view_play_list?p=A89DCFA6ADACE599

class info: http://www.stanford.edu/class/cs229/


AARGGH! I paid close to a hundred bucks for the Tibshirani book a few years ago! On the plus side, I think it's a beautiful, beautiful machine learning book that is a welcome replacement to Tom Mitchell's book, which is showing its age. I highly recommend the Tibshirani et al. book to anyone with college-level math and a desire to understand what the hell this machine learning thing REALLY is.


Not free, but I highly recommend "Programming Collective Intelligence" if you are looking for python examples applicable to web applications. http://oreilly.com/catalog/9780596529321


That book has the worst Python code ever in a print publication.


The book is quite approachable for people who are curious about ML and don't necessarily have a strong math background, though. After reading it, other ML literature has been quite a bit easier for me to follow.

In its defense, I think the Python code was written to be readable, rather than necessarily idiomatic. I started doing the exercises in Lua, and haven't had a hard time translating them thus far.


I'm not a computer scientist nor really a programmer and I found it very readable. Care to elaborate?


I read it when it came out which was a while ago. I found it to be very unidiomatic Python at the time.

More critique at http://news.ycombinator.com/item?id=208895 if you'll chase the parent links.


Idiomatic, no (agreed). Very readable, yes (I think the style was intentional). For tutorials, I generally implement everything myself as part of the learning process and avoid source copy-pasting. So I wasn't irked but YMMV depending on learning style.


ML definitely seems to be more on people's radar these days but as far as I can tell the job market is pretty small.


My experience has been the opposite:

http://metaoptimize.com/qa/questions/2542/job-prospects-in-m...

I guess it depends what you're comparing against. Against general purpose programming jobs, yes, the market is small. But, of all CS PhDs, people who do ML are the most sought-after.


That was actually my question. :)

I'm sure CS PhD's are in good shape but breaking into ML from another programming discipline seems pretty hard.


I think most of this is due to the fact that most ML techniques need lots of specific knowledge to apply correctly. Many introductions focus too much on specific algorithms, for example, while ensemble methods are probably the best choice on real-world data. The practices of building training and test sets, of regularizing, of doing proper feature engineering, and of making structured models (for example) are more easily learned in a mentor-student relationship than by reading blogs or AI books (reading papers might take you a long way, but you won't build an intuition as to why things work and why they fail without lots of experimentation, or advice from someone who already has that intuition), and ML books are usually far too technical (or too superficial on the technical side, as is Programming Collective Intelligence).
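The train/test/regularize workflow mentioned above can be sketched in a few lines. This is an illustrative numpy-only toy with made-up synthetic data (the feature count, split point, and lambda values are arbitrary choices), not code from any of the books in the thread:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 100 samples, 20 features, only 3 truly informative weights.
X = rng.normal(size=(100, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.0, 0.5]
y = X @ w_true + 0.1 * rng.normal(size=100)

# Hold out a test set so we measure generalization, not memorization.
split = 80
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

def fit_ridge(X, y, lam):
    # Closed-form ridge regression: w = (X'X + lam*I)^(-1) X'y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Compare regularization strengths on held-out data, not training data.
for lam in [0.0, 1.0, 100.0]:
    w = fit_ridge(X_train, y_train, lam)
    test_mse = np.mean((X_test @ w - y_test) ** 2)
    print(f"lambda={lam:6.1f}  test MSE={test_mse:.4f}")
```

The point is the discipline, not the model: every lambda is judged only on data the fit never saw.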


Most of the ML books I've worked with so far seem a bit overly formal. Steven Marsland's book seems to strike the best balance between theory and implementation I've seen, even if the Python code is a little clumsy.


> I'm sure CS PhD's are in good shape but breaking into ML from another programming discipline seems pretty hard.

That's because ML isn't a "programming discipline". It's pretty much pure statistics, optimization algorithms, and linear algebra these days, and those algorithms are HARD to code, HARD to scale.


Those algorithms are actually surprisingly simple to code most of the time, if you're using a high-level language and aren't worried about scaling. Scaling them is hard, but for that you can already find good, scalable implementations out there of the basic building blocks.

In my experience, coding machine learning algorithms is actually easier than the hairier sorts of programming (multithreaded, distributed, very low-level, etc.) if you do the math first. Most errors come from doing the math wrongly (or not doing it at all), and most slowness is due to missing very obvious optimizations (which a good programmer will sometimes pick up on even if it's not explicit in the papers that describe the technique; some papers unfortunately assume you will spot the obvious optimizations yourself).
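As an illustration of "do the math first": logistic regression is little more than its gradient written down. A minimal numpy sketch on made-up data (the learning rate, iteration count, and blob positions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy binary classification: two overlapping Gaussian blobs in 2D.
X = np.vstack([rng.normal(-1, 1, size=(50, 2)),
               rng.normal(1, 1, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The whole "algorithm" is the gradient of the log-likelihood,
# X'(sigmoid(Xw) - y), applied repeatedly.
w = np.zeros(2)
for _ in range(500):
    grad = X.T @ (sigmoid(X @ w) - y) / len(y)
    w -= 0.5 * grad

accuracy = np.mean((sigmoid(X @ w) > 0.5) == y)
```

Once the gradient is derived on paper, the code is three lines; the hard part was the calculus, not the programming.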


That's been my experience so far as well. The algorithms are surprisingly straightforward. Understanding the underlying theory well enough to get good results is the hard part.


This is true - my boss has been pressuring me to do the legwork required to get at least a masters in stats or CS so I can carry a higher title in our institutional research department.

Just for those outside of academia - state schools really drink their own Kool-aid. Associate/assistant directors require a masters degree in anything while directors require PhD credentials.


You think so? Well, okay, it's not a topic for everybody, but with increasing data loads it's about time people think more about which machine learning techniques to use to learn more about their data. I do both operations research and machine learning and see many opportunities to improve a company's knowledge of the data it has - even if it doesn't know it (yet).

Especially with unsupervised ML, you can find patterns in your data - groups of similar items, trends - do forecasts, infer relationships, and much more.
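The "groups of similar data" case can be sketched with k-means in plain numpy. An illustrative toy with made-up clusters (the number of clusters, their positions, and the iteration count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# Unlabeled data with three hidden groups.
X = np.vstack([rng.normal(c, 0.3, size=(40, 2))
               for c in [(0, 0), (3, 0), (0, 3)]])

def kmeans(X, k, iters=20):
    # Start from k random data points, then alternate assign / re-center.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center (squared distance).
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

centers, labels = kmeans(X, k=3)
```

No labels went in, yet the algorithm recovers group structure - which is exactly the "find patterns you didn't know about" pitch.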

It's a hard topic, and you definitely need a lot of time to get into the really bloody details, but it is definitely worth it. And it's a lot of fun if you like maths and statistics.


I agree it's a fascinating field. I've been doing nothing for the past few months but intensive study of the fundamentals. I've had to review a lot of math but I've really been enjoying it. It also seems reasonable to me to expect that demand for those skills will grow.

My impression of the current job market, however, is that it's tough for somebody without formal academic qualifications to crack.


There I agree: without the academic qualifications, people will hardly listen to you talk about Hilbert spaces, the Jaccard index, etc.


Depends on how you define the market. I think ML is still too much of a pie-in-the-sky research topic for a job, but it's in huge "demand" as a startup founder.

i.e. If you can actually do ML, you're much more valuable as a startup founder than looking for a 9-5 job, unless it happens to be at say, Google.

My guess is it'll be another decade before ML and statistics become seriously sought after in "normal" corporate programming.


We are in the process of writing up and hiring three positions from Associate director to database admin here at KU for Institutional Research (the focus is on machine learning and prediction). I'd say it's a fair indicator in a 'business' that's faced 2%-8% annual budget cuts in the past three years that people even here are paying attention.

From my IR background - it's about bloody time.


I've read a good 1/4 of Artificial Intelligence - A Modern Approach, and scanned a handful of pages further on, and thus far it's been excellent.

So far, it looks to be very much rule / first-order logic oriented, but that makes sense, as it's been the most well understood, and much more easily provable / understandable once something is built. There's very little about neural networks, for instance, or genetic algorithms, aside from an explanatory section or so. Though I may have the first edition, and apparently the 3rd came out last year.

So, if anyone is interested in a relatively-introductory, extremely readable book on machine learning, it's the best I've come across so far.


> There's very little about neural networks, for instance, or genetic algorithms

For good reason. Outside of neuroscience, there's very little reason to use neural networks for classification or regression problems, and outside of very ill-posed optimization problems, there's very little reason to use genetic algorithms. If you want a good MODERN machine learning book (which is where your post seems to imply your interests lie), try The Elements of Statistical Learning by Tibshirani, Hastie and (I forgot the third guy's name, sorry). Anyway, it's included in the posted link.


Not sure neural net expert Yann LeCun (yann.lecun.com) would agree with your neural networks comment. Neural nets have taken on a life of their own since the time they were originated and actually have little direct connection with neuroscience these days.


This is true, but the very good deep learning results are very recent (from less than 5 years ago), and they generally require very different models and/or training strategies from the standard multilayer perceptron you find in most AI textbooks. So it's an omission, but I'm not aware of any textbooks that cover these new neural networks in depth, as basic research in this area is still going on at a frantic pace.


Yeah, I'm finding that there's little book-format info on deep neural networks as well. Similarly for belief networks. Gotta hunt down some journals, and wield some Google-fu.


The most promising approaches for neural nets these days involve learning generative models using Restricted Boltzmann Machines (RBMs). Geoffrey Hinton at UToronto has some good seminal papers on these, like http://www.cs.toronto.edu/~hinton/absps/ncfast.pdf. See http://www.cs.toronto.edu/~hinton/ for excellent videos of digit generation / classification.

As a note, this approach outperforms all others (sans tweaks), whereas neural nets used to be beaten by SVMs and the like.
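For the curious, a single contrastive-divergence (CD-1) update for a tiny binary RBM fits in a few lines of numpy. This is an illustrative toy (the layer sizes, learning rate, epoch count, and data patterns are all made up), not Hinton's actual training code:

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny binary RBM: 6 visible units, 3 hidden units.
n_visible, n_hidden = 6, 3
W = 0.1 * rng.normal(size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)  # visible biases
b_h = np.zeros(n_hidden)   # hidden biases

# Toy data: two repeating binary patterns.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]] * 10, dtype=float)

lr = 0.1
for epoch in range(200):
    for v0 in data:
        # Positive phase: hidden probabilities given the data.
        h0 = sigmoid(v0 @ W + b_h)
        # Negative phase: one Gibbs step (sample hidden, reconstruct visible).
        h_sample = (rng.random(n_hidden) < h0).astype(float)
        v1 = sigmoid(W @ h_sample + b_v)
        h1 = sigmoid(v1 @ W + b_h)
        # CD-1 update: data-driven minus model-driven statistics.
        W += lr * (np.outer(v0, h0) - np.outer(v1, h1))
        b_v += lr * (v0 - v1)
        b_h += lr * (h0 - h1)

# After training, reconstructions should resemble the training patterns.
recon = sigmoid(sigmoid(data @ W + b_h) @ W.T + b_v)
```

Stacking several of these, each trained on the hidden activations of the layer below, is the layer-by-layer pretraining idea in the linked Hinton paper.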


Actually, the best recent results in this area involve discriminatively trained stacked auto-encoders using second-order algorithms (for which the code, as of Sept 2010, is still private to the group at the University of Toronto), as described in James Martens's paper at this year's ICML: http://www.cs.toronto.edu/~jmartens/docs/Deep_HessianFree.pd...


This is a very popular undergraduate book which I hated. I simply hated how the material is presented - instead of good mathematical examples (and a clear notation!), there's a lot of text and a bad, substandard notation.


Then this is probably right up your alley: http://research.microsoft.com/en-us/um/people/cmbishop/prml/

"Pattern Recognition and Machine Learning". And I dare say there may be more space used on the equations than on the text.



