Hacker News
A New Book: Building Machine Learning Systems with Python (metarabbit.wordpress.com)
134 points by derpapst on May 31, 2013 | 19 comments

The book covers building ML systems with Python; it does not necessarily teach ML itself. It is a good time to write an ML book based on Python, particularly given the efforts to make Python scale to Big Data [0].

Which material you should refer to depends entirely on what you want to do. Here are some of my recommendations:

Q: Do you want an "Introduction to ML" with some applications, using Octave/Matlab as your toolbox?

A: Take Andrew Ng's ML course on Coursera [1].

Q: Do you want a complete understanding of ML, including the mathematics and proofs, and to build your own algorithms in Octave/Matlab?

A: Take Andrew Ng's ML course as taught at Stanford; the video lectures are available for free download [2]. Note: this is NOT the same as the Coursera course. For textbook lovers, I have found the handouts distributed in this course far better than textbooks full of obscure and esoteric terms. They are entirely self-contained. If you want an alternative opinion, try Yaser Abu-Mostafa's ML course at Caltech [3].

Q: Do you want to apply ML along with NLP using Python?

A: Try the Natural Language Toolkit (NLTK) [4]. The HTML version of the NLTK book is freely available (jump to Chapter 6 for the ML part) [5]. There is also an NLTK cookbook with simple code examples to get you started [6].

Q: Do you want to apply standard ML algorithms using Python?

A: Try scikit-learn [7]. The OP's book also seems to be a good fit in this category (disclaimer: I haven't read the OP's book and this is not an endorsement).

[0] http://www.drdobbs.com/tools/us-defense-agency-feeds-python/...

[1] https://www.coursera.org/course/ml

[2] http://academicearth.org/courses/machine-learning/

[3] http://work.caltech.edu/telecourse.html

[4] http://nltk.org

[5] http://nltk.org/book/

[6] http://www.amazon.com/Python-Text-Processing-NLTK-Cookbook/d...

[7] http://scikit-learn.org

For anyone curious to learn more about machine learning, I would recommend: http://www.amazon.com/Machine-Learning-Algorithmic-Perspecti...

To tag onto this, I found "Learning from Data" [0] by Abu-Mostafa to be a great intro to the field. It's not heavy on the math, but it doesn't gloss over it either.


Have to agree. And it's very inexpensive because Yaser refused to give in to academic publishers, who would have charged the typical $70-80, and self-published so he could offer it for less than half the cost.

Not only is the book great, but his lectures are PHENOMENAL. He breaks concepts down in such a careful, accessible way. It's a bit late to join the online course, but you can see all the lectures on YouTube (work.caltech.edu/telecourse.html) or iTunes U. I prefer the latter, using the app on iOS: it's awesome because you can bookmark and record notes at those marks. Otherwise I find these video courses way less useful, since there is no way to review. I wish Coursera/Udacity/edX had that feature.

Yaser is an awesome guy btw - he's very active on the forum (see the link on the right-hand side of the Caltech site above). He is very gracious with his time - I'm not a Caltech student, and yet he has answered all my questions and even helped me find a tutor for the course who was a previous student at Caltech (I live in Pasadena). He truly cares - and that comes across in the lectures as well. Enjoy!

Agreed with everything you said. The only thing missing from his lectures is the homework assignments, which are only available to those who signed up for the online course (signups are closed now), and I can't even post about it on the forums, because I don't have the book. :(

I take notes on all videos with http://videonot.es

Unfortunately Amazon won't ship this book outside of the United States.

You should be able to get (illegal) PDFs of most popular books with a simple Google search. I found a PDF of the ML book I mentioned earlier as the top result on Google for "<name of book> pdf".

Admittedly EPUB is a better format, because it naturally reflows on smaller screens, but "free" EPUBs are harder to come across. I've been thinking of converting some really good PDFs that I have to EPUB myself, but just haven't gotten around to it yet.

Actually, according to the authors' website (http://amlbook.com/), Amazon does ship the Learning from Data book to many different countries outside the US.

For the curious, I think nothing beats the introduction to ML class from Stanford's Andrew Ng [1]. He lectures and explains with a clarity and consistency I don't often see.

[1]: https://www.coursera.org/course/ml

I really want to take the course -- but I wish there was a textual version of it. I strongly strongly prefer text to audio/video.

Also, the voices of some of these lecturers have a sort of monotone to them that tends to let your mind wander off. They're just not "arresting" enough.

For instance, I took the Crypto I class part-way on Coursera, and had this experience. The instructor's voice was slow and drawn-out, and kind of put you to sleep. I actually downloaded the videos and just played them in VLC at 1.25x or 1.5x speed (because he spoke so annoyingly slowly).

On the other hand, Tim Roughgarden (I think that's his name), who teaches an Algorithms class on Coursera, has an amazing "video personality". Just the way he speaks -- it catches your attention. His passion and enthusiasm for the topic really come across. Now, I'm not saying the other professors aren't as passionate about what they teach -- it's just that some of these lecturers have a really good way of bringing their love for the topic through on video. Not everyone can do it (or is doing it).

You get annotated as well as original PPT slides, along with clear text transcripts of what he says in the videos. It can be a bit awkward, since it is not textbook prose, but it gets the job done. I honestly think it is hard to do a better course than what you get from Ng's Machine Learning on Coursera.

Taking this course right now. Awesome.

I have this book, and would not recommend it as a stand-alone learning guide. It gives a decent intuitive treatment of some topics but is inconsistent. It also jumps from, e.g., an explanation of neural networks without any mathematics to the full derivation of backpropagation. It tries to hit a sweet spot between rigor and intuition, but in my opinion it largely fails to bridge the two.

Unfortunately, I'm not aware of any good ML books that are current. Mitchell's was really good but is out of date. Bishop is a megalithic tome of statistical mathematics and is better as a reference than a textbook. I think that a good MOOC paired with selected readings is the best currently available option.

Just checked out Mitchell's site. Looks like he's been working on a 2nd ed for a while. He should just open-source the first one since it's so old. But I found he just released all the vids from an ML class he gave in 2011 - wonder if they are as good as Yaser's? http://www.cs.cmu.edu/~tom/10701_sp11/lectures.shtml

It appears that the deal doesn't apply to eBooks, as there is no place to enter the code. Also, they have gigantic red text informing the potential purchaser that they don't offer Android or "normal" eBooks (PDF, ePub, etc).

I appreciate the second perspective.

This book was recommended to me by a friend (who is a genius and great at ML), and I've just begun reading it.

My problem with MOOCs is that I strongly dislike the audio/video format. I love textbooks. I learned a lot of what I know about computer science from books, not lectures. I went through many of Tanenbaum's books throughout high school -- and was more addicted to his textbooks than to many of the novels that I read at the time.

I would really like to get some recommendations on some good _textual_ ML material.
