
List of probabilistic model mini-language toolkits - blasdel
http://anyall.org/blog/2009/12/list-of-probabilistic-model-mini-language-toolkits/
======
andrewcooke
i don't know much about machine learning, but is the idea that i can use these
to generate synthetic random data? for example, samples from a given
distribution? or, in astronomy, start with the distribution of stars
(brightness and spatial clustering), and hardware (sensitivity, resolution)
and generate images, complete with noise?

~~~
imurray
The idea of some of these packages (Blog, BUGS, Church, HBC) is that you
describe how you _would_ generate synthetic random data if you had to. If you
don’t know any intermediate quantities, you specify distributions from which
plausible values could be drawn. This is a way of defining your model and your
uncertainties in a way that is easy to critique (“Your synthetic data looks
rubbish!”). Then, given some observed images, complete with noise, the systems
can infer the underlying quantities that generated the images. A
simple example: “is that smudge of pixels one star or two, and how probable is
each option?”
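
To make the star example concrete, here is a minimal Python sketch of the
generative approach. It is not taken from any of the packages above. It
assumes a hypothetical 1-D strip of pixels, a known Gaussian blur and noise
level, and equal prior odds of one or two stars; every name and number
(`render`, `FLUX`, `NOISE_SD`, and so on) is made up for illustration.
Inference here is a crude average of the likelihood over positions drawn from
the prior, which real systems would replace with much smarter algorithms.

```python
import math
import random

# All constants below are illustrative assumptions, not from any real survey.
N_PIXELS = 15
NOISE_SD = 0.3   # assumed known detector noise
PSF_SD = 1.0     # assumed known blur width
FLUX = 3.0       # assumed known brightness of every star


def render(positions):
    """Noise-free image: one Gaussian blob per star."""
    return [
        sum(FLUX * math.exp(-0.5 * ((x - p) / PSF_SD) ** 2) for p in positions)
        for x in range(N_PIXELS)
    ]


def generate(rng):
    """The generative story: draw every unknown, then draw the data."""
    k = rng.choice([1, 2])
    positions = [rng.uniform(0, N_PIXELS - 1) for _ in range(k)]
    image = [m + rng.gauss(0, NOISE_SD) for m in render(positions)]
    return k, positions, image


def log_likelihood(image, positions):
    """Gaussian log-likelihood, dropping constants shared by both hypotheses."""
    model = render(positions)
    return sum(-0.5 * ((obs - m) / NOISE_SD) ** 2 for obs, m in zip(image, model))


def posterior_num_stars(image, rng, n_samples=5000):
    """P(k | image) for k in {1, 2}, averaging the likelihood over prior draws."""
    log_evidence = {}
    for k in (1, 2):
        lls = [
            log_likelihood(image, [rng.uniform(0, N_PIXELS - 1) for _ in range(k)])
            for _ in range(n_samples)
        ]
        top = max(lls)  # subtract the max before exponentiating, for stability
        log_evidence[k] = top + math.log(sum(math.exp(l - top) for l in lls) / n_samples)
    top = max(log_evidence.values())
    weights = {k: math.exp(v - top) for k, v in log_evidence.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


rng = random.Random(1)
true_k, true_positions, image = generate(rng)
print("true number of stars:", true_k)
print("posterior over number of stars:", posterior_num_stars(image, rng))
```

The same `generate` function that defines the model is also what you would
run to eyeball whether the synthetic data looks plausible.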

Some of the packages (Alchemy, Factorie) ask you to define joint probability
distributions differently. Rather than asking for a sequence of steps that
would sample from the distribution, you specify _factors_ or soft constraints
amongst variables. The product of the factors, evaluated at a setting of the
variables, gives that setting's probability up to a normalizing constant that
is often hard to compute. The
point of these packages is also to infer interesting quantities given observed
data. The factors may have unknown parameters, which can be fitted to the data
(“learning”). In the first set of packages unknown parameters are just
variables, inferred along with everything else.
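
As a toy illustration of the factor view, here is a sketch in plain Python.
It is not the API of Alchemy or Factorie; the variables `a`, `b`, `c`, the
`agree` factor and its weights are all hypothetical. With only three binary
variables the normalizing constant can be found by brute-force enumeration,
which is exactly what stops being feasible in realistic models.

```python
import itertools
import math

VARIABLES = ["a", "b", "c"]  # three binary variables, purely illustrative


def agree(weight):
    """Soft constraint: multiply the score by e^weight when the two values match."""
    return lambda x, y: math.exp(weight) if x == y else 1.0


# Each factor is (names of the variables it touches, function of their values).
FACTORS = [
    (("a", "b"), agree(1.0)),
    (("b", "c"), agree(2.0)),
]


def unnormalized(assignment):
    """Product of all factors at one setting of the variables."""
    score = 1.0
    for names, factor in FACTORS:
        score *= factor(*(assignment[n] for n in names))
    return score


def all_assignments():
    for values in itertools.product([0, 1], repeat=len(VARIABLES)):
        yield dict(zip(VARIABLES, values))


def partition_function():
    """The normalizing constant: sum of the unnormalized score over every setting."""
    return sum(unnormalized(s) for s in all_assignments())


def conditional(query, value, evidence):
    """P(query = value | evidence) by enumeration; the constant cancels out."""
    consistent = [
        s for s in all_assignments()
        if all(s[name] == v for name, v in evidence.items())
    ]
    total = sum(unnormalized(s) for s in consistent)
    matching = sum(unnormalized(s) for s in consistent if s[query] == value)
    return matching / total


print("Z =", partition_function())
print("P(c=1 | a=1) =", conditional("c", 1, {"a": 1}))
```

Fitting the weights passed to `agree` to data would be the “learning” step;
this sketch just fixes them by hand.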

The first group of packages let you define _directed graphical models_: the
sampling schemes can be defined on a directed acyclic graph. Factor graphs,
used by the second group, can be defined on undirected graphs. Infer.NET lets
you express directed models, but internally for algorithmic reasons it
converts to undirected factor graphs.
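
The general idea behind such a conversion can be sketched in a few lines of
Python: each conditional distribution in the directed model becomes one
factor. This is only an illustration of the idea, not a description of what
Infer.NET does internally, and the rain/wet-grass model and its numbers are
hypothetical.

```python
import itertools

# Directed model: rain -> wet_grass. The probabilities are made up.
P_RAIN = {True: 0.2, False: 0.8}
P_WET_GIVEN_RAIN = {
    True: {True: 0.9, False: 0.1},
    False: {True: 0.1, False: 0.9},
}


def prior_factor(rain):
    """Factor touching only `rain`: the prior P(rain)."""
    return P_RAIN[rain]


def conditional_factor(rain, wet):
    """Factor touching both variables: the table P(wet | rain)."""
    return P_WET_GIVEN_RAIN[rain][wet]


def joint(rain, wet):
    """Product of the factors, which here is already the joint probability."""
    return prior_factor(rain) * conditional_factor(rain, wet)


def normalizer():
    """Sums to 1 because the factors came from a directed model."""
    return sum(joint(r, w) for r, w in itertools.product([True, False], repeat=2))


def posterior_rain_given_wet():
    """P(rain | wet grass), computed from the factor product alone."""
    return joint(True, True) / (joint(True, True) + joint(False, True))


print("normalizer:", normalizer())
print("P(rain | wet):", posterior_rain_given_wet())
```

Once the model is written as a product of factors, the direction of the
arrows no longer matters to the inference code.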

PMTK, while not finished, seems to be aiming to provide special support for a
variety of machine learning techniques, in addition to generic graphical model
support.

Dyna doesn't really fit in with the rest. It is a programming language that
makes dynamic programming easy, which is useful in some inference tasks.
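
For a flavour of the kind of dynamic programming that shows up in inference,
here is a sketch of the forward algorithm for a small hidden Markov model. It
is written in Python with memoization, not in Dyna syntax, and the weather
model and its probabilities are hypothetical.

```python
from functools import lru_cache

# A made-up two-state hidden Markov model.
STATES = ("rainy", "sunny")
START = {"rainy": 0.5, "sunny": 0.5}
TRANS = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.3, "sunny": 0.7},
}
EMIT = {
    "rainy": {"umbrella": 0.9, "none": 0.1},
    "sunny": {"umbrella": 0.2, "none": 0.8},
}


def sequence_probability(observations):
    """P(observations), summing over every hidden state path."""
    obs = tuple(observations)
    if not obs:
        return 1.0  # the empty sequence has probability 1

    @lru_cache(maxsize=None)
    def alpha(t, state):
        """P(obs[0..t], hidden state at time t), defined by a recurrence."""
        if t == 0:
            return START[state] * EMIT[state][obs[0]]
        return EMIT[state][obs[t]] * sum(
            alpha(t - 1, prev) * TRANS[prev][state] for prev in STATES
        )

    return sum(alpha(len(obs) - 1, state) for state in STATES)


print(sequence_probability(["umbrella", "umbrella", "none"]))
```

The cache is what turns the exponential sum over paths into work that grows
linearly with the sequence length. Because the recursion goes one level deep
per observation, very long sequences would need an iterative version.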

~~~
andrewcooke
ah, ok, thanks - i had it backwards.

thanks for the detailed reply. i keep reading it, understanding a little more
each time. this sounds like something i should look into more.

