
Critique: "Programming Collective Intelligence" - aitoehigie
I have just gotten the book "Programming Collective Intelligence" and, having read a few chapters, I wish that I had gotten the book earlier! At the same time, it shows how much most web applications being developed today lack deep technology (I am also guilty of this). What's your take on this?
======
jgrahamc
It's a very good introductory text. If you get interested in going deeper then
try: <http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html>

~~~
jimbokun
Just took a class with this as the text. Very readable, without dumbing
anything down.

This book (by two of the same authors) is also good:

<http://nlp.stanford.edu/fsnlp/>

------
plinkplonk
"At the same time, it shows how much most web applications being developed
today lack deep technology"

Just my 2 cents but I find PCI a very shallow book without much "deep
technology" in it either.

AI cannot be divorced from the underlying math. PCI takes (in _my_ opinion,
feel free to differ) a math-lite, "dummies guide" approach to AI algorithms. I
realize that my opinion is in the minority, and a lot of people think that the
book is very cool. So take it with a grain of salt.

~~~
brent
It is certainly a "dummies guide" approach, and probably the best book
applying this framework to machine-learning-related problems. It introduces a
lot of ideas to people for whom the math would simply not be approachable.

That said I have three complaints:

1. I think it tries to stay too high level in some areas where depth is truly
unavoidable and where adding a little wouldn't scare anyone off.

2. I think it too often labels its own implementation with a name that is
associated with a whole family of techniques. This is simply going to
embarrass/mislead the reader. In the interest of maintaining simplicity the
book states several things that are flat out wrong.

3. No references. If someone wants to take the next step there is zero
guidance. This is important because someone learning about a deep topic could
read the book and a) think they understand it (not likely) and b) not have a
clue where to look next.

~~~
me2i81
I thought the lack of references was a major failing of the book. Some of the
algorithms barely scratch the surface, and it does a disservice to the reader
to provide nowhere to go. Here are a few to get started: 1. Russell and Norvig's
AI text, 2. Elements of Statistical Learning by Hastie et al., 3. Pattern
Recognition and Machine Learning by Chris Bishop.

On the other hand, going right into code examples is useful, including jumping
right into getting real data downloaded and worked on.

~~~
brent
I agree about #2. I have Russell and Norvig and agree that it is an excellent
book, but I am not sure how much overlap there truly is here.

I also do not have PRML, but Neural Networks for Pattern Recognition by Bishop
is excellent (and includes many non-NN related items).

ESL is excellent and, to me, the best modern text in machine learning. It
covers many of the topics in PCI both at a reasonable level and in much more
depth and provides MANY references (100s?) to dig deeper.

------
kurtosis
I just got it in the mail yesterday, and after a first reading I am really
impressed. Most of the published material on these topics is academic and
heavily mathematical. This type of practical guide is very useful. The things
that I like the most so far:

1) The examples are real Python code, and in many chapters the code samples are
small enough that you can follow the text by typing the code into an
interpreter to try it out.

2) He doesn't just test algorithms on standard datasets like MNIST or
MovieLens - he gives programming examples for how to extract your _own_
datasets, e.g. by showing how to use the eBay API to download prices from eBay,
or how to use the Facebook API.

~~~
andreyf
_after a first reading I am really impressed. Most of the published material
on these topics is academic and heavily mathematical._

To each his own - I thought this book could easily be condensed into a list of
mathematical theorems - a page's worth of links to Wikipedia.

 _1) The examples are real Python code_

Really crappy Python code... definitely not the kind of code one should be
looking at as an example.

 _He doesn't just test algorithms on standard datasets like MNIST or MovieLens
- he gives programming examples for how to extract your own datasets, e.g. by
showing how to use the eBay API to download prices from eBay, or how to use
the Facebook API._

He combines math theory with code and APIs? What's the point of that?

If you want to learn how to code in Python, read a Python book. If you want to
learn math, read a math book. If you want to learn APIs, read the API docs.
If you want to learn all three, read all three, but don't think that you know
any of them just because you have seen a bunch of badly coded examples of the
results of knowing them.

~~~
kurtosis
I agree with you about the non-idiomatic Python code - it reminds me of my
earliest attempts at Python after switching from MATLAB. I saw a lot of
criticism that PCI should have been done in pseudocode and I wonder if the
book's coding style was a deliberate attempt to satisfy this complaint.

I think that any maturing field needs a hierarchy of explanations that go
progressively deeper into the subject - the explanations of the second law of
thermodynamics that you get in a freshman chemistry course are superficial
compared to what you would deal with in a graduate course, but this doesn't
mean that chem101 should be taught at the grad level, or that it is pointless
to teach freshmen about the second law. The real danger is if you lie or
misrepresent the facts in an attempt to simplify. I'm sure there are some
errors and falsehoods in PCI but I didn't see anything too bad in the one day
that I've had the book.

Learning APIs by reading the docs sounds like trying to learn English by
reading the dictionary - it's just my personal preference, but I feel that I
learn more effectively by reading real source code that works (even if it is
horribly written).

------
azsromej
I subject myself to constant distractions but I did get through the first
couple of chapters and enjoyed them. I forced myself to focus on the material
by writing the examples in Ruby
(<http://romej.com/archives/590/programming-collective-intelligence-with-ruby>).
I've only skimmed the other chapters.

------
ctbk
I got the book yesterday and I've read the first chapters. It covers very
interesting stuff in an introductory, easy to follow way. I like its
pragmatic approach a lot, but I would have liked more depth on the theoretical
side of things, maybe. I guess that would be material for a different kind of
book, but an appendix would have been much appreciated.

------
dfranke
It doesn't go as deep as I wish it did. However, it still gives an excellent
introduction to a lot of techniques I've never heard of before and can now
look up in more detail when I need to.

------
thorax
It makes me wish all of the AI courses I took that used generic pseudocode
really just used Python. It would have felt more tangible to me and given us
more examples to hack on.

~~~
jimbokun
I think many pseudo-code examples in textbooks are almost Python already, and
that speaks well of Python as a language. "Executable pseudo-code", if you
will.

------
sunkencity
I found the book really captivating and very useful. It's seldom that a book
about programming contains so much stuff that one can put to use.

------
berryg
Looks very interesting. Ordered it right away.

