Critique: "Programming Collective Intelligence"
48 points by aitoehigie on June 4, 2008 | 28 comments
I have just gotten the book "Programming Collective Intelligence" and, having read a few chapters, I wish I had gotten it earlier! At the same time, it shows how much most web applications being developed today lack deep technology (I am also guilty of this). What's your take on this?



It's a very good introductory text. If you get interested in going deeper then try: http://www-csli.stanford.edu/~hinrich/information-retrieval-...


Just took a class with this as the text. Very readable, without dumbing anything down.

This book (by two of the same authors) is also good:

http://nlp.stanford.edu/fsnlp/


thanks


"At the same time, it shows how much most web applications being developed today lack deep technology"

Just my 2 cents but I find PCI a very shallow book without much "deep technology" in it either.

AI cannot be divorced from the underlying math. PCI takes (in my opinion, feel free to differ) a math-lite, "dummies guide" approach to AI algorithms. I realize that my opinion is in the minority, and a lot of people think that the book is very cool. So take it with a grain of salt.


I really appreciate the feedback (I'm the author). I definitely expected this sort of response as I was writing it.

I think I got a lot right when I set the level, but having seen the book in the wild, there are things I would have done differently too.

Mostly I wanted people to get excited and see that they could do this right now, and that there were good reasons to. I never intended to compete with Norvig; the point was more to create something that would inspire people to go on and read him.


While I agree with your assessment, IMO you don't seem to be the target audience.

I loved PCI because it was math-lite.

I'm a business applications programmer by trade and fun stuff like this isn't anywhere on our radar. I also have a family and the associated obligations like T-Ball and dance recitals so I don't have the time to dig deeply into the algorithms and math. PCI was a good balance to me between showing the theory and also providing concrete working code.

My only quibble with the book was that a lot of the code was not idiomatic Python (un-Pythonic, if you will).
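To illustrate with a made-up snippet (mine, not the book's), a lot of it reads like this index-based style:

    nums = [3, 1, 4, 1, 5]

    # C-style indexing, similar in spirit to much of the book's code
    squares = []
    for i in range(len(nums)):
        squares.append(nums[i] * nums[i])

    # the idiomatic Python equivalent
    squares = [n * n for n in nums]

Both work fine; the first just reads like someone thinking in another language.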

Just my $0.02.


The biggest thing I got from PCI was a basic discussion of a lot of algorithms that I was able to research further on my own. I had no idea that I wanted to know more about K-means clustering until I read the book.

If that's all you get out of it, great, but there's a lot of interesting stuff in there, covered in enough detail to get you started, which is more than enough to make me happy.
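For anyone who hasn't met it, here's roughly the whole idea of k-means in a toy sketch of my own (not the book's code): pick k centroids, repeatedly assign each point to its nearest centroid, then move each centroid to the mean of its points.

    import random

    def kmeans(points, k, iterations=20):
        # points: list of equal-length tuples of floats
        def dist2(a, b):
            # squared Euclidean distance
            return sum((x - y) ** 2 for x, y in zip(a, b))

        centroids = random.sample(points, k)
        for _ in range(iterations):
            # assignment step: each point joins its nearest centroid
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
                clusters[nearest].append(p)
            # update step: each centroid moves to its cluster's mean
            for c, members in enumerate(clusters):
                if members:
                    centroids[c] = tuple(sum(xs) / len(members)
                                         for xs in zip(*members))
        return centroids

    print(kmeans([(1.0, 1.0), (1.2, 0.8), (5.0, 5.1), (4.8, 5.3)], 2))

That's the entire core loop; everything beyond it is refinement.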


It is certainly a "dummies guide" approach, and probably the best book applying this framework to machine-learning problems. It introduces a lot of ideas to people for whom the math would simply not be approachable.

That said I have three complaints:

1. I think it tries to stay too high-level in some areas where depth is truly unavoidable, and adding a little wouldn't scare anyone off.

2. I think it too often labels its implementation with a name that is actually associated with a whole family of techniques. This is simply going to embarrass or mislead the reader. In the interest of maintaining simplicity, the book states several things that are flat-out wrong.

3. No references. If someone wants to take the next step, there is zero guidance. This is important because someone learning about a deep topic could read the book and (a) think they understand it (not likely) and (b) not have a clue where to look next.


I thought the lack of references was a major failing of the book. Some of the algorithms barely scratch the surface, and it does a disservice to the reader to give them no place to go. Here are a few to get started: 1. Russell and Norvig's AI text, 2. The Elements of Statistical Learning by Hastie et al., 3. Pattern Recognition and Machine Learning by Chris Bishop.

On the other hand, going right into code examples is useful, including jumping straight into downloading and working with real data.


I agree about #2. I have Russell and Norvig and agree that it is an excellent book, but I am not sure how much overlap there truly is here.

I also do not have PRML, but Neural Networks for Pattern Recognition by Bishop is excellent (and includes many non-NN related items).

ESL is excellent and, to me, the best modern text in machine learning. It covers many of the topics in PCI both at a reasonable level and in much more depth and provides MANY references (100s?) to dig deeper.


Can you elaborate on the things it simply gets wrong?


One example, simply because I opened the book to this page after reading your comment and happen to know a bit about the subject (p49):

"[t]he algorithm takes the difference between every pair of items and tries to make a chart in which the distances between the items match those differences"

This is referring to multidimensional scaling, which is a large family of techniques, some of which attempt to achieve this goal (delta ~= d), but not necessarily all. Granted, in this case the algorithm presented DOES attempt to do this. BUT the algorithm presented is not "multidimensional scaling"; it is one implementation (not even a good one) attempting to minimize one of many possible objective functions.

I noticed several more when I flipped through it initially, but this is the type of thing I'm referring to.
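To make that concrete, here is a toy sketch of my own (not the book's code) of ONE member of the family: metric MDS by gradient descent on the squared error between the target distances and the distances in the 2-D layout.

    import math, random

    def mds(dist, lr=0.01, steps=1000):
        # dist: n-by-n matrix of target pairwise distances.
        # Objective (one of many possible for "MDS"):
        #   sum over pairs of (dist[i][j] - euclidean(pos[i], pos[j]))^2
        n = len(dist)
        pos = [[random.random(), random.random()] for _ in range(n)]
        for _ in range(steps):
            for i in range(n):
                gx = gy = 0.0
                for j in range(n):
                    if i == j:
                        continue
                    dx = pos[i][0] - pos[j][0]
                    dy = pos[i][1] - pos[j][1]
                    d = math.sqrt(dx * dx + dy * dy) or 1e-9
                    # gradient of (dist[i][j] - d)^2 w.r.t. pos[i]
                    coef = -2.0 * (dist[i][j] - d) / d
                    gx += coef * dx
                    gy += coef * dy
                pos[i][0] -= lr * gx
                pos[i][1] -= lr * gy
        return pos

Swap in a different objective (Sammon's stress, say, which weights small distances more heavily) and the same input gives a visibly different layout; that's exactly why calling any single implementation "multidimensional scaling" is misleading.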


Does this mean that implementations of the algorithms in PCI won't work as well as something one would write after first learning all the deep math those techniques are built on? If so, then sure, this is a problem. If not, then the 'deep math' is perhaps a bit superfluous to the problem of programming applications that make use of AI techniques?


I agree with plinkplonk, but if you'd like examples:

1) I mentioned multidimensional scaling elsewhere. In the book there is an implementation, and it doesn't even say what the goal is (in terms of an objective function) or how well it performs. There are many different goals in multidimensional scaling. If one understands even just the objective function being minimized, the algorithm for minimizing it will make much more sense.

2) The book glosses over SVMs and makes a few blanket statements that I haven't heard from serious researchers. E.g., "[t]he one that is usually recommended ... is called the radial-basis function. The radial-basis function is like the dot-product in that it takes two vectors and returns a value. Unlike the dot-product, it is not linear and can thus map more complex spaces." Most of the kernel/SVM folks I know would never be so confident in making such a blanket statement, since it depends entirely on the data/situation.
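For reference, the kernels under discussion are one-liners; here's my own sketch (not the book's code):

    import math

    def linear_kernel(x, y):
        # plain dot product: K(x, y) = <x, y>
        return sum(a * b for a, b in zip(x, y))

    def rbf_kernel(x, y, gamma=1.0):
        # K(x, y) = exp(-gamma * ||x - y||^2)
        # gamma is a free parameter you have to tune; whether RBF
        # beats the linear kernel depends entirely on the data.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-gamma * sq_dist)

Note the extra gamma parameter: "usually recommended" quietly hands you a knob that can ruin your results if you don't understand what it does.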


"Does this mean that implementations of the algorithms in PCI won't work as well as something one would write if they first learned all the deep math those techniques are built on? "

That is exactly what it means. "Deep math" is not learned because people have nothing else to do with their time. The moment you try to apply or extend the techniques in PCI beyond the toy examples in the book you will see the need for the "deep math".

Just one example: backpropagation neural networks are covered very superficially in PCI. To get a real understanding of them, read Chris Bishop's Neural Networks for Pattern Recognition. That's a whole book on one type of neural network.

I quote Peter Norvig's Amazon review of Bishop's book (emphasis mine):

"To the reviewer who said "I was looking forward to a detailed insight into neural networks in this book. Instead, almost every page is plastered up with sigma notation", that's like saying about a book on music theory "Instead, almost every page is palstered with black-and-white ovals (some with sticks on the edge)." Or to the reviewer who complains this book is limited to the mathematical side of neural nets, that's like complaining about a cookbook on beef being limited to the carnivore side. If you want a non-technical overview, you can get that elsewhere, but if you want understanding of the techniques, you have to understand the math. Otherwise, there's no beef."

"If you want understanding of the techniques, you have to understand the math". Without understanding you can't (a) judge the appropriateness of a particular algorithm to a data set (b) know which variant of the algorithm to apply (c) understand the results, especially when they don't make sense (d) debug the implementation if required. In short at best you are making calls into a black box library and praying the results make sense.

PCI skips the "understanding" part of AI algorithms (which does need math, as Norvig points out; most AI is applied math) and provides a superficial outline and many black boxes. E.g., the "use libSVM for Support Vector Machines" idea PCI propagates (after a very superficial overview of SVMs).

LibSVM is a great library, but trying to use it to "SVM-ize" your code without understanding exactly how SVMs work (which does need "deep math" ;-) ) will give you... ummm... suboptimal... results.

One of my friends tried to use algorithms from PCI to add "intelligence" to some code for his startup (he first ported the PCI code to Ruby) and ended up getting nonsensical results. Eventually I looked at the data, threw away his code, and replaced it with a different algorithm that was appropriate. But hey, I got some equity in return for the AI code, so that turned out all right :-P

All that said, I have no problems with folks using/liking PCI. I was just reacting to the "deep tech" phrase in the original post.


Thank you. That's the sort of answer to the question I was looking for.


What would you recommend, then?


I am not the 'parent', but as I said elsewhere I would recommend The Elements of Statistical Learning. It gets quite a bit deeper than PCI, but I'm fairly confident you could learn almost everything in PCI from ESL.

It doesn't come with pre-canned Python, but honestly almost everything (in terms of code) in PCI is available somewhere on the web and/or already built into Python libraries, MATLAB, and/or R.


I just got it in the mail yesterday, and after a first reading I am really impressed. Most of the published material on these topics is academic and heavily mathematical. This type of practical guide is very useful. The things that I like the most so far:

1) The examples are real Python code, and in many chapters the code samples are small enough that you can follow the text by typing the code into an interpreter to try it out.

2) He doesn't just test algorithms on standard datasets like MNIST or MovieLens: he gives programming examples showing how to extract your own datasets, e.g. how to use the eBay API to download prices from eBay, or how to use the Facebook API.


"after a first reading I am really impressed. Most of the published material on these topics is academic and heavily mathematical."

To each his own. I thought this book could easily be condensed into a list of mathematical theorems: a page's worth of links to Wikipedia.

"1) The examples are real Python code"

Really crappy Python code... definitely not the kind of code one should look to as an example.

"He doesn't just test algorithms on standard datasets like MNIST or MovieLens: he gives programming examples showing how to extract your own datasets, e.g. how to use the eBay API to download prices from eBay, or how to use the Facebook API."

He combines math theory with code and APIs? What's the point of that?

If you want to learn how to code Python, read a Python book. If you want to learn math, read a math book. If you want to learn APIs, read the API docs. If you want to learn all three, read all three, but don't think that you know any of them just because you've seen a bunch of badly coded examples produced by someone who does.


I agree with you about the non-idiomatic Python code. It reminds me of my earliest attempts at Python after switching from MATLAB. I saw a lot of criticism that PCI should have been done in pseudocode, and I wonder if the book's coding style was a deliberate attempt to satisfy this complaint.

I think that any maturing field needs a hierarchy of explanations that go progressively deeper into the subject. The explanations of the second law of thermodynamics that you get in a freshman chemistry course are superficial compared to what you would deal with in a graduate course, but this doesn't mean that Chem 101 should be taught at the grad level, or that it is pointless to teach freshmen about the second law. The real danger is if you lie or misrepresent the facts in an attempt to simplify. I'm sure there are some errors and falsehoods in PCI, but I didn't see anything too bad in the one day that I've had the book.

Learning APIs by reading the docs sounds like trying to learn English by reading the dictionary. It's just my personal preference, but I feel that I learn more effectively by reading real source code that works (even if it is horribly written).


I subject myself to constant distractions, but I did get through the first couple of chapters and enjoyed them. I forced myself to focus on the material by rewriting the examples in Ruby (http://romej.com/archives/590/programming-collective-intelli...). I've only skimmed the other chapters.


I got the book yesterday and I've read the first chapters. It covers very interesting stuff in an introductory, easy-to-follow way. I like its pragmatic approach a lot, but I would have liked a bit more depth on the theoretical side. I guess that is material for a different kind of book, but an appendix would have been much appreciated.


It doesn't go as deep as I wish it did. However, it still gives an excellent introduction to a lot of techniques I had never heard of before and can now look up in more detail when I need to.


It makes me wish all of the AI courses I took that used generic pseudocode really just used Python. It would have felt more tangible to me and given us more examples to hack on.


I think many pseudocode examples in textbooks are almost Python already, and that speaks well of Python as a language. "Executable pseudocode", if you will.
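For example (my own paraphrase of the usual textbook style, not from any particular book):

    # typical textbook pseudocode:
    #   best <- L[0]
    #   for each x in L:
    #     if x > best then best <- x

    L = [3, 1, 4, 1, 5]
    best = L[0]
    for x in L:
        if x > best:
            best = x
    print(best)

The translation is nearly character for character.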


I found the book really captivating and very useful. It's seldom that a book about programming contains so much stuff that one can put to use.


Looks very interesting. Ordered it right away.



