

Ask HN: Is anyone here using Bayesian Networks? - mathgladiator

I'm just curious because I've thought about building a library for it.
======
danger
I use them, but there are plenty of libraries out there.

<http://compbio.cs.huji.ac.il/FastInf/fastInf/FastInf_Homepage.html>

<http://robotics.stanford.edu/~sgould/svl/>

<http://code.google.com/p/factorie/>

<http://people.kyb.tuebingen.mpg.de/jorism/libDAI/>

<http://research.microsoft.com/en-us/um/cambridge/projects/infernet/>

BTW, factor graphs are generally more popular than Bayes nets these days, at
least in research circles. (Bayesian networks can be seen as a special case of
factor graphs.)
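
To make that "special case" point concrete, here's a minimal sketch in plain
Python (toy numbers, no particular library): each CPD of a tiny Bayes net
becomes one factor, and the joint is just the product of the factors, which is
exactly a factor-graph factorization.

    import itertools

    # Toy Bayes net: Rain -> Wet <- Sprinkler, all variables binary (0/1).
    p_rain = {0: 0.8, 1: 0.2}                      # P(Rain)
    p_sprinkler = {0: 0.7, 1: 0.3}                 # P(Sprinkler)
    p_wet1 = {(0, 0): 0.0, (0, 1): 0.9,            # P(Wet=1 | Rain, Sprinkler)
              (1, 0): 0.8, (1, 1): 0.99}

    # Factor-graph view: one factor per CPD; the joint is their product.
    factors = [
        lambda r, s, w: p_rain[r],
        lambda r, s, w: p_sprinkler[s],
        lambda r, s, w: p_wet1[(r, s)] if w else 1.0 - p_wet1[(r, s)],
    ]

    def joint(r, s, w):
        out = 1.0
        for f in factors:
            out *= f(r, s, w)
        return out

    # Sanity check: the factorization is a proper distribution (sums to 1).
    total = sum(joint(*v) for v in itertools.product((0, 1), repeat=3))
    assert abs(total - 1.0) < 1e-9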

~~~
mitmatt
Those are a lot of great links! But I disagree with your "BTW" line:

I don't think factor graphs are more popular than Bayes nets (a.k.a. directed
graphical models), at least not in general machine learning (though perhaps in
some particular subfield?). Directed graphs are usually most appropriate for
generative models, which are very popular, especially in Bayesian approaches.
In graphical models, directed, undirected, and factor graphs are all used in
their appropriate contexts.

And it's not accurate to say that directed models (Bayes nets) are a special
case of factor graphs: there are conditional independence structures that can
be represented by Bayes nets that can't be captured exactly with factor
graphs. The canonical example is

O --> O <-- O
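
Here the two parents are marginally independent but become dependent once the
child is observed ("explaining away"). A quick numeric check in plain Python,
with arbitrary probabilities of my own choosing:

    import itertools

    # A --> C <-- B, all binary; the numbers are arbitrary.
    p_a = {0: 0.6, 1: 0.4}
    p_b = {0: 0.7, 1: 0.3}
    p_c1 = {(0, 0): 0.1, (0, 1): 0.8,   # P(C=1 | A, B)
            (1, 0): 0.8, (1, 1): 0.95}

    def joint(a, b, c):
        pc = p_c1[(a, b)]
        return p_a[a] * p_b[b] * (pc if c else 1.0 - pc)

    def prob(query, given):
        """P(query | given) by brute-force enumeration over a, b, c."""
        def total(cond):
            return sum(joint(a, b, c)
                       for a, b, c in itertools.product((0, 1), repeat=3)
                       if all(dict(a=a, b=b, c=c)[k] == v
                              for k, v in cond.items()))
        return total({**query, **given}) / total(given)

    # Marginally, observing A tells us nothing about B:
    assert abs(prob({'b': 1}, {}) - prob({'b': 1}, {'a': 1})) < 1e-9

    # But given C, A and B become dependent ("explaining away"):
    print(prob({'b': 1}, {'c': 1}))            # ~0.49
    print(prob({'b': 1}, {'c': 1, 'a': 1}))    # ~0.34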

~~~
danger
Glad you like the links, but I disagree with your disagreement =P

You can use factor graphs for either directed or undirected models. See this
paper:

B. J. Frey, "Extending factor graphs so as to unify directed and undirected
graphical models," Proceedings of the 19th Conference on Uncertainty in
Artificial Intelligence, pp. 257-264, Morgan Kaufmann, San Francisco, CA,
August 2003.

------
ifesdjeen
I'm using Mahout, which has Bayesian classifiers (Naive Bayes and
Complementary Naive Bayes). However, Bayesian algorithms never worked well on
my datasets, so I went with LDA for my tasks :)

BTW, if you do want to develop a library for working with classifiers, my only
"wish" would be good export/import tools. Most libraries I've worked with use
their own serialization formats, so it's quite difficult to import/export data
without overhead. If you write a few adapters that let people work with their
datasets through a common interface (see the sketch below), I'm quite sure an
ecosystem will grow around it extremely fast. Some people use Mongo, some use
Postgres, others use files, CSV, or serialized formats. It's great to be able
to keep using your own setup without changes when running that kind of
library.
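
Something like this is what I mean; a hypothetical sketch (the names are mine,
not any real library's API): the library only ever talks to a small adapter
interface, and each backend implements the same two methods, so CSV, Mongo,
Postgres, etc. stay interchangeable.

    import csv

    class DatasetAdapter(object):
        """What the library talks to; storage details live behind it."""
        def read_rows(self):
            raise NotImplementedError
        def write_rows(self, rows):
            raise NotImplementedError

    class CsvAdapter(DatasetAdapter):
        def __init__(self, path):
            self.path = path

        def read_rows(self):
            with open(self.path, newline='') as f:
                for row in csv.DictReader(f):
                    yield row

        def write_rows(self, rows):
            rows = list(rows)
            with open(self.path, 'w', newline='') as f:
                writer = csv.DictWriter(f, fieldnames=rows[0].keys())
                writer.writeheader()
                writer.writerows(rows)

    def train(adapter):
        # The library never touches storage directly, only the adapter.
        for row in adapter.read_rows():
            pass  # feed each row into the classifier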

Other "wish" thing is to have an ability to run classifier as a map/reduce
job, and do it easily :)
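
Training Naive Bayes boils down to counting, which fits map/reduce naturally.
A toy sketch of what I mean, in plain Python as a stand-in for Hadoop
streaming (not actual Mahout code): the mapper emits ((label, feature), 1)
pairs and the reducer sums them.

    from collections import defaultdict

    def mapper(label, doc):
        # One record per labeled document.
        yield ((label, None), 1)              # per-label document count
        for token in doc.split():
            yield ((label, token), 1)         # per-label token count

    def reducer(key, values):
        yield (key, sum(values))

    # Tiny local driver standing in for the shuffle phase:
    shuffled = defaultdict(list)
    for label, doc in [('weather', 'rain wet grass'),
                       ('finance', 'stocks rise')]:
        for key, value in mapper(label, doc):
            shuffled[key].append(value)

    counts = dict(kv for key, values in shuffled.items()
                  for kv in reducer(key, values))
    # 'counts' now holds all the statistics Naive Bayes needs.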

~~~
mitmatt
Just so you know, Latent Dirichlet Allocation is very much a Bayesian model,
as the original paper's abstract says [1] (first Google hit).

(LDA is a bit of an overloaded acronym in machine learning, but I'm assuming
that's what you mean.)

[1] <http://www.cs.princeton.edu/~blei/papers/BleiNgJordan2003.pdf>

~~~
ifesdjeen
You can check out the Mahout implementation to see the differences as well.
LDA isn't supervised/trained on labeled data.

Thanks for pointing that out, but I was speaking more about the concrete
algorithm implementation than about abstracts.

------
w00t
<http://www.cs.waikato.ac.nz/ml/weka/>

has a pretty good Bayesian library

------
zdw
Just so you don't reinvent the wheel, here's a Perl module that provides
Bayesian text categorization:

<http://search.cpan.org/~kwilliams/AI-Categorizer-0.09/>

I'm currently using this (and String::Approx) on a not-so-secret project...

