
Compiling Knowledge into Probabilistic Models - wcrichton
http://willcrichton.net/notes/compiling-knowledge-probability/
======
nathcd
> In the general-purpose programming context, imagine if you could give
> examples of a program output (domain data) along with a skeleton of a
> program (source file with incomplete parts) and ask a system to fill in the
> holes.

This part reminds me of some of the capabilities of the Idris compiler [1]. In an
Idris program you can leave "holes" to stand in for incomplete parts of a
program [2], and the compiler can infer various bits of code from types and
holes. In a demo of the in-progress Idris 2 compiler [3], Edwin Brady refers
to it as a "lab assistant" and shows it writing a whole function when given a
function type.

[1] [http://docs.idris-lang.org/en/latest/tutorial/interactive.html#editing-commands](http://docs.idris-lang.org/en/latest/tutorial/interactive.html#editing-commands)

[2] [http://docs.idris-lang.org/en/latest/tutorial/typesfuns.html#holes](http://docs.idris-lang.org/en/latest/tutorial/typesfuns.html#holes)

[3] [https://www.youtube.com/watch?v=mOtKD7ml0NU](https://www.youtube.com/watch?v=mOtKD7ml0NU)

------
scribu
Modelling uncertainty is definitely a useful tool to have, but I'm not sure
why the author expects there to be a "scientific" (a.k.a. mechanistic) way of
doing it.

In normal programming, there's no foolproof formula for picking the best data
structure or the best algorithm. If there were, we could just write one
program to write all other programs and be done with it!

~~~
wcrichton
I’m definitely not advocating for an automated way of building these models,
for the reasons you point out. Rather, I’m saying that probabilistic models
aren’t taught in a way that lets us easily map our knowledge structures onto
them, so I’m advocating for more explicit guidance in that process.

------
i_am_proteus
Philosophically, one could consider model formulation to be the way in which
the author encodes prior beliefs about the system into the model.

------
nartz
If you look at kernel code for many flavors of Linux, you'll see annotations
that hint at which branches of code are more likely to be taken.
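
Concretely, those hints are the likely()/unlikely() macros, which are thin
wrappers over GCC's __builtin_expect. A minimal sketch (the real definitions
live in include/linux/compiler.h; process() here is a made-up example, not
kernel code):

    #include <stdio.h>

    /* Sketch of the kernel's branch hints: a hint of 1 means "expect this
     * condition to be true", 0 means "expect it to be false". */
    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

    static int process(int err)
    {
        if (unlikely(err)) {      /* rare branch: moved off the hot path */
            fprintf(stderr, "error %d\n", err);
            return -1;
        }
        return 0;                 /* common case: laid out as fall-through */
    }

    int main(void)
    {
        return process(0);
    }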

Similarly, many JIT compilers already collect branch statistics on the fly;
these are used, for instance, to predict which branches are most likely to be
taken so that the hot path can be laid out and fetched more efficiently.

------
marmaduke
> seems to me that the act of compiling knowledge into probabilistic
> models is still more art than science

Because there’s no modularity: writing probability models is still like
writing unstructured assembly.

~~~
wcrichton
Having written a handful of probabilistic programs, I don’t think there’s no
modularity at all: you can still abstract a complex stochastic process behind
a function (sketched below). I think it’s more that certain tasks are
necessarily antimodular, e.g. conditional inference can introduce unexpected
dependencies between modules; see d-separation [0].

[0] [https://www.andrew.cmu.edu/user/scheines/tutor/d-sep.html](https://www.andrew.cmu.edu/user/scheines/tutor/d-sep.html)
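
For a minimal sketch of what that abstraction looks like (plain C with rand()
standing in for a real probabilistic language; sample_burst() and its
parameters are made up):

    #include <stdio.h>
    #include <stdlib.h>

    /* uniform01() draws a pseudo-random number in [0, 1]. */
    static double uniform01(void)
    {
        return rand() / (double)RAND_MAX;
    }

    /* A two-component mixture with additive noise, hidden behind one
     * function: callers see a sampler, not the internal coin flips. */
    static double sample_burst(double p_spike, double base, double spike)
    {
        double level = (uniform01() < p_spike) ? spike : base;
        return level + (uniform01() - 0.5);
    }

    int main(void)
    {
        srand(42);
        for (int i = 0; i < 5; i++)
            printf("%f\n", sample_burst(0.1, 1.0, 10.0));
        return 0;
    }

Note this only shows the forward-sampling side, where abstraction works fine;
the antimodularity shows up once you condition on downstream observations,
which a snippet like this doesn't capture.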

~~~
eli_gottlieb
Well, as part of the probabilistic programming community, I can at least
report that inference programming and model synthesis are problems we're
actively working on!

