

Bayes nets by example with Python and Khan Academy data - kohlmeier
http://derandomized.com/post/20009997725/bayes-net-example-with-python-and-khanacademy

======
hucker
After taking multiple courses that touch on Bayesian nets and other AI
techniques in general, I feel I have a good general grasp of how it all works
and of the mathematical underpinnings. However, I still stumble when trying to
implement what I've learned in code, so thank you for this!

My school uses AIMA for every introductory AI course, and if you're looking
for pointers on how to implement some of the algorithms and methods described
in there in Python, have a look at
<http://code.google.com/p/aima-python/>. Since it's Norvig himself coding, the
quality of the Python is practically flawless.

~~~
abecedarius
A little correction: aima-python is primarily by Norvig, but it's unfinished
and less polished than the essays on his website. I took it up last fall to
get it into better shape for the free AI class, and about 2/3 of the code is
still his, unchanged from before then. He is keeping an eye on things, but,
you know, busy guy.

That said, Bayes nets are fully implemented (as presented in the book, i.e.
for boolean variables) and I think worth reading and trying out if you're
learning the subject. Also worth checking out is aima-java, in a much more
complete state, though of course a lot wordier the way it is with Java.
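To give a feel for what the boolean-variable version involves, here's a
from-scratch toy (not the aima-python API, and the numbers are made up): a
two-node net queried by enumerating the joint distribution and normalizing,
which is the same inference-by-enumeration idea the book presents for bigger
nets.

```python
# Toy two-node boolean Bayes net: Rain -> WetGrass.
# A from-scratch sketch, NOT the aima-python API; all numbers are made up.
P_RAIN = 0.2
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}  # P(WetGrass=true | Rain)

def joint(rain, wet):
    """Full joint probability of one complete assignment."""
    p_r = P_RAIN if rain else 1 - P_RAIN
    p_w = P_WET_GIVEN_RAIN[rain]
    return p_r * (p_w if wet else 1 - p_w)

# Query P(Rain=true | WetGrass=true): enumerate the joint and normalize.
posterior = joint(True, True) / (joint(True, True) + joint(False, True))
print(posterior)  # 0.18 / 0.26 ~= 0.692
```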

If you'd like to contribute, there's plenty to do. (For example the E-M
algorithm used in this post to learn the Bayes net parameters.) I set up a
mirror on github the other day to make it easier:
<https://github.com/darius/aima-python-mirror>

(I don't have the spare energy right now to do a lot myself, including
reviewing too many incoming patches, though it'd be a nice surprise to have
that problem.)

~~~
hucker
Ah, you're correct. I can only speak for the code in there that I've used
myself for self-study, which was great, but I see now that a lot is missing. I
wouldn't mind contributing to such a project myself, but I'm afraid my Python
is not as uber-"Pythonic" as Norvig's is...

~~~
abecedarius
Understood. :) I'm a big fan of his work, too.

------
a1k0n
Isn't the coefficient on 'T', the predicted 'mastery' variable, pointing the
wrong way in the final logistic regression? And 'E', which is just getting 85%
of the exercises right, is completely dominating all the other stuff
(exponential moving average, etc) they did in the model.

So while all this is really cool, and I like the idea of doing EM on missing
data (I've done it myself), it doesn't seem like it actually adds much, if I'm
reading that right.

~~~
kohlmeier
The 'E' in the regression is the inferred/predicted value of the E variable
for that exercise, using no problem history from that exercise--only what's
pulled in through the Bayes net. (Sorry that wasn't clear)

The 'T' variable is likely just a case of multicollinearity with the 'E'
variable and should go away on a full-scale data set. If not, it can easily be
removed from the model. The 'E' variable is dominating because it additionally
captures cross-sectional effects across the various exercises in the
regression.
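For anyone unfamiliar with that failure mode, here's a tiny made-up
demonstration (ordinary least squares rather than the post's logistic
regression, but the mechanism is the same): when two predictors are nearly
identical, the partial coefficient on one can come out negative even though
its marginal correlation with the outcome is strongly positive.

```python
import numpy as np

# Hypothetical data: x2 is x1 plus a tiny wiggle, so the two are
# almost perfectly collinear (like T and the inferred E).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = x1 + np.array([0.1, -0.1, 0.1, -0.1, 0.1])
y = -1.0 * x1 + 2.0 * x2  # constructed so the true partial coef on x1 is -1

# Marginally, y rises with x1 ...
print(np.corrcoef(x1, y)[0, 1])  # ~0.99

# ... yet the regression assigns x1 a negative coefficient.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # ~[0, -1, 2]
```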

~~~
a1k0n
Ah, ok. So is this a sort of Markov model, where you are predicting the
probability of getting an exercise right after observing (some subset of) the
previous exercises? And E is not 1 or 0, but the expected probability of
getting it right? I'm still confused where all the different E_i's fit in.

That would explain the magnitude, and I agree the negative weight on T would
just be due to the direct correlation between E and T.

Edit: I just realized that an exercise consists of multiple problems, so
you're predicting whether or not the student will get >= 85% of the problems
right on an exercise.

------
mamp
Very helpful article. Note that this is an implementation of a naive Bayes
model, sometimes called 'idiot' Bayes. It assumes independent observations and
therefore can be overconfident. More complex Bayes net models are way harder
to implement. Here's a good overview of general networks:
<http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html>

~~~
kohlmeier
This is not naive Bayes and does not assume independent observations on the
exercises. The point of using a network is to model the joint distribution
with dependencies.
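A tiny numeric contrast (all numbers made up) of what that buys: once there is
an edge between two observed exercise outcomes, the likelihood no longer
factorizes the way naive Bayes forces it to, and the posterior on the latent
variable shifts; the naive version comes out more confident, as the parent
comment warns.

```python
# Hypothetical mini-net: mastery M -> E1, and both M and E1 -> E2.
# Naive Bayes would instead force P(e1, e2 | m) = P(e1|m) * P(e2|m).
P_M = 0.5
P_E1 = {True: 0.8, False: 0.3}                  # P(E1=1 | M)
P_E2 = {(True, True): 0.9, (True, False): 0.5,  # P(E2=1 | M, E1)
        (False, True): 0.4, (False, False): 0.1}

def posterior_net():
    """P(M=1 | E1=1, E2=1) using the E1 -> E2 dependency."""
    def lik(m):
        return P_E1[m] * P_E2[(m, True)]
    a, b = P_M * lik(True), (1 - P_M) * lik(False)
    return a / (a + b)

def posterior_naive():
    """Same query, but with E2's edge to E1 dropped (naive Bayes style)."""
    def lik(m):
        p_e2 = P_E1[m] * P_E2[(m, True)] + (1 - P_E1[m]) * P_E2[(m, False)]
        return P_E1[m] * p_e2
    a, b = P_M * lik(True), (1 - P_M) * lik(False)
    return a / (a + b)

print(posterior_net())    # 6/7 ~= 0.857
print(posterior_naive())  # ~0.920: more confident than the full net
```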

