
An Introduction to Probabilistic Programming - lainon
https://arxiv.org/abs/1809.10756
======
tbenst
I was introduced to probabilistic programming by Frank Wood and Brooks Paige
at DARPA’s PPAML program, and they are excellent teachers. Highly recommend
this resource! Anglican is a great language embedded in Clojure so this is
particularly good for those on JVM

------
wodenokoto
How does this compare to probmod[1] and webppl[2]?

[1] [https://probmods.org/](https://probmods.org/)

[2] [http://dippl.org/](http://dippl.org/)

------
molteanu
[http://probabilistic-programming.org/](http://probabilistic-programming.org/)

------
__skeptic
I always feel like I'm missing something deeper when I hear or read about
"Probabilistic Programming". Am I wrong in thinking that it's a well-thought-
out way of expressing Bayesian models through the language of OOP? I always
feel like the marketing side of this is hinting at something greater, but I
haven't seen any deeper connections beyond a good design choice.

~~~
pedrosorio
Having a way to express Bayesian models is a requirement for probabilistic
programming but that’s not the “point”.

Representing a Bayesian model in some language is not useful by itself. Being
able to do efficient inference without requiring the programmer to implement
an inference algorithm for each model they want to specify is the point of
probabilistic programming.

~~~
__skeptic
I see that, so I should be thinking of the phrase "probabilistic programming"
as more analogous to "nonlinear programming" than to "functional programming".

~~~
alew1
Or maybe more analogous to “differentiable programming” or “logic
programming”: a style of programming that supports a broader range of
operations than “running.”

Also, it’s not quite that you are using a DSL to encode Bayesian models
traditionally represented in other ways; you are using a full-featured
programming language to express a set of models larger than the set that is
easy to express using existing formalisms. (That said, there are also some
models that are easier to express using those other formalisms; you can think
of them as being DSLs in a Turing-complete probabilistic programming language,
which make it easier to express certain limited classes of programs.)

------
mlevental
Can someone give a quick TL;DR of probabilistic programming? Is it a
convenient way to use Bayes (update priors), or is there more to it?

~~~
alew1
"Updating priors" is quite complex, and arguably describes every interesting
task you'd want to do as a learner, reasoner, analyst, etc. One view of
probabilistic programming languages is that they allow you to express your
priors—i.e., uncertain beliefs about the world—as programs, and help you to
answer all sorts of interesting questions in the contexts of those programs.

As a concrete example, you might write a program expressing your beliefs about
how the scenes you see come to be: some random objects are placed at some
random positions, lighting sources are placed, and then a rendering algorithm
converts that scene description into the light intensities perceived by your
eye. If you write this up in a probabilistic programming language, you can
then ask questions like: supposing I see this image, what are some likely
scenes (positions and shapes of objects) that might have produced it? How does
that distribution over scenes change if I know that the light source was here,
or that there was a small chair there, or that my vision isn't that great and
I should trust the image's pixels a little less? Section 1.3 of the
introduction linked above lists many more applications for which probabilistic
programming turns out to be useful: breaking CAPTCHAs, constrained procedural
graphics, automatic program induction/synthesis, recursive multi-agent
reasoning... and, as the authors write at the end of this section, "the list
goes on."
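A skeletal version of that scene prior might look like the following (every name, shape, and range here is invented purely for illustration; a real inverse-graphics model would attach an actual renderer to turn the scene description into pixels):

```python
import random

def scene_prior():
    # Toy generative "scene" program: a few random objects at random
    # positions, plus a light source. A real model would pass this
    # description to a renderer to get predicted light intensities.
    objects = [
        {"shape": random.choice(["cube", "sphere", "chair"]),
         "position": (random.uniform(-5, 5), random.uniform(-5, 5))}
        for _ in range(random.randint(1, 3))
    ]
    light = {"position": (random.uniform(-5, 5), random.uniform(0, 10))}
    return {"objects": objects, "light": light}

print(scene_prior())
```

Conditioning on an observed image then amounts to asking which draws from `scene_prior`, once rendered, would plausibly have produced those pixels.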

Of course, solving the inference task in full generality, for any program and
query that the user might write, is very difficult; you can design
pathological cases where it boils down to solving the halting problem, and is
therefore uncomputable. Different languages take different approaches to
addressing this difficulty. They might limit you to writing certain kinds of
programs for which a limited class of inference algorithms can always be made
efficient; they might try to plan a fancy inference algorithm based on the
specifics of your model; they might just expect you to learn which programs
will be tractable and which won't; or they might provide abstractions that
make it possible for you to safely experiment with writing new inference
algorithms (or combinations of existing ones) specialized for your use case.
Probabilistic programming is still a young field and this is an area of active
research.

~~~
pacala
> supposing I see this image, what are some likely scenes (positions and
> shapes of objects) that might have produced it?

Sounds like there exists a device where I can write some program, run it over
some inputs, pass the program and the outputs to the device, and have it
magically infer the possible inputs. This is basically logic programming.
Then we add a probabilistic layer on top of that.

Maybe the probabilistic part hides some significant simplifications. Even
relatively simple programs are nontrivial to express logically. See, for
example, the pure treatment of arithmetic in
[http://okmij.org/ftp/Prolog/Arithm/arithm.pdf](http://okmij.org/ftp/Prolog/Arithm/arithm.pdf).

What kind of programs are practically feasible with Probabilistic Programming?

~~~
alew1
Great connection. Here's one way of looking at it: when you add stochasticity
to your program, you make it easier to search the space of possible solutions
(and you also relax the criteria for what it means to be "a solution").

In (deterministic) logic programming, the default search strategy is
backtracking. We search the space of "inputs that could have produced this
output" exhaustively. Each set of inputs we try either works or doesn't, and
doesn't tell us anything about whether similar inputs will behave similarly.
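As a toy illustration of that all-or-nothing feedback (a hypothetical sketch over a small discrete domain, not how any real logic engine searches):

```python
def f(a, b):
    return a + b

def invert_exhaustively(target, domain=range(-10, 11)):
    # Deterministic, exhaustive search: each candidate pair either
    # reproduces the output exactly or it doesn't, and a near miss
    # (say, a + b == target + 1) gives the search no guidance at all.
    return [(a, b) for a in domain for b in domain if f(a, b) == target]

print(invert_exhaustively(2)[:3])  # [(-8, 10), (-7, 9), (-6, 8)]
```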

In probabilistic programming, we can use the information we have about
probability densities to tell whether we're on the right track; as we search,
we can tell if we're getting "warmer" or "colder." Suppose I have the simple
probabilistic program:

    
    
    def f():
        a = normal(0, 1)  # mean 0, stddev 1
        b = normal(0, 1)
        c = normal(a + b, 0.1)  # tightly concentrated around a + b
        return c
    

In this generative model, I choose two random numbers, add them together, and
then pick a third number very close to their sum. This could be thought of as
a probabilistic version of the deterministic program f(a, b) = a+b. If I want
to "invert" it, and condition on c=2, say, I can walk around the space of a
priori likely (a, b) values, "scoring" them based on how likely "c=2" would
have been given the particular (a, b) under consideration. The tight bell
curve around the value c=2 can be seen as an implicit "loss function," telling
us how good or bad the current solution is. This is very different from a
logic programming approach, which would be able to exactly invert the +
operation; but as you pointed out, it's nontrivial to formulate `+` correctly
as a logic program, whereas here, we are free to use an imperative
implementation of `+` (or much more complex deterministic operations, like a
3D renderer).
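That "warmer/colder" walk can be sketched as a random-walk Metropolis sampler (a toy in plain Python; names and tuning constants are invented, and the log-density drops additive constants):

```python
import math
import random

random.seed(1)

def log_joint(a, b, c_obs=2.0):
    # log p(a) + log p(b) + log p(c_obs | a, b), constants dropped.
    # The last term is the tight bell curve acting as an implicit loss.
    return -0.5 * a * a - 0.5 * b * b - 0.5 * ((c_obs - (a + b)) / 0.1) ** 2

def random_walk_posterior(steps=50_000, step=0.25, burn_in=5_000):
    a, b = 0.0, 0.0
    samples = []
    for i in range(steps):
        # Propose a nearby (a, b); accept moves that raise the score
        # (or don't lower it much) -- the "warmer/colder" feedback.
        a2, b2 = a + random.gauss(0, step), b + random.gauss(0, step)
        if math.log(random.random()) < log_joint(a2, b2) - log_joint(a, b):
            a, b = a2, b2
        if i >= burn_in:
            samples.append((a, b))
    return samples

samples = random_walk_posterior()
mean_sum = sum(a + b for a, b in samples) / len(samples)
print(mean_sum)  # close to 2, since we conditioned on c = 2
```

Note that the sampler only ever evaluates the forward direction of `+` inside `log_joint`; that is why an arbitrary imperative implementation (or a 3D renderer) slots in without any logical inversion.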

There is research in probabilistic logic programming languages, which can be
seen as combining some of the strengths of both paradigms, though that's not
covered by the introduction linked in the OP.

