
Pyro: PyTorch-Based Deep Universal Probabilistic Programming - singhrac
http://pyro.ai/
======
ineedasername
"Deep" probabilistic? A quick search & it seems the term "Deep Probabilistic"
was coined by the package author. Which, hey, nice work and all, but "deep" in
this context looks like pure marketing fluff.

Maybe it's not, I'm no longer cutting edge on this stuff-- my grad school days
were a decade ago and the day job ::sigh:: doesn't require the more
interesting stuff.

But I'm gonna get a bit "get off my lawn" on this one and say that, in my day
(woohoo!), neural nets could be deep; they had hidden depths (& layers).
Belief networks could be deep, and they were adding depth to learning too.
Much of the "deep" stuff today seems to use that word the same way Tide &
OxiClean have "deep" cleaning technology in their laundry detergent.

All of which is to say, this is a question from someone in the early stages of
cruftiness, meant in good humor, to ask "What makes them there probabilistics
'deep'?" :)

~~~
fritzo
Deep Probabilistic Programming

Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy,
David M. Blei

[https://arxiv.org/abs/1701.03757](https://arxiv.org/abs/1701.03757)

~~~
ineedasername
That's very recent, and authored by d.tran, package maintainer for this
project on github. Before that paper I see no other reference to it. Makes me
think it was coined in the paper and this package partly for trendy "deep"
buzzword value. An impression bolstered by pyro's "about" section, which gives
a vague generality and not much else.

This isn't a criticism of the work done which, on looking at the details, is
useful and interesting. This is a comment on the difficulty of finding out
what the actual interesting bits are.

I looked further: Uber's more detailed writeup has a statement on "why deep
probabilistic?", but it too is vague, about learning generative knowledge from
data. And a claim about generative knowledge is extremely ambitious for any
learning method to make without still more explanation.

EDIT: I'll be honest, it was really this project's connection to Uber, and
Uber's corporate culture, that prodded me to think a bit more about the
project claims & terminology use. Maybe I'm being unfair.

~~~
fritzo
Thanks for being honest :)

> ...difficulty of finding out what the actual interesting bits are.

> Uber's more detailed writeup ... is vague

We indeed struggled to balance accessibility and depth in that blog post. One
place to go deeper is the Pyro tutorials:
[http://pyro.ai/examples](http://pyro.ai/examples). For more technical
framing, take a look at Noah's paper [1]. Broadly, I think the "interesting
bits" are the integration of newer deep neural net methods with older Bayesian
methods of flexible probabilistic modeling.

[1] Deep Amortized Inference for Probabilistic Programs, by Daniel Ritchie,
Paul Horsfall, and Noah D. Goodman:
[https://arxiv.org/abs/1610.05735](https://arxiv.org/abs/1610.05735)

~~~
ineedasername
The paper clarifies a lot, and makes me think you meant something like
"generative model" instead of "generative intelligence".

The problem with using the latter is that it only rarely (maybe a couple dozen
times over the last few years) appears in conjunction with ML in rigorous
research publications. When it does, it is used with extremely precise and
well-defined parameters (see GPRS from M.Golumbic) or used in reference to the
traditional "home" of generative intelligence at the crossroads of cognitive
science and pedagogical theory & methods. There, it is a close cousin and
necessary component of "general intelligence."

That's why its use here set off alarm bells of "Alert! Marketing speak at
work! Gross overstatement of capabilities has been detected!" Because it
doesn't fall much short of claiming "general intelligence" as a practical
result of & use for Pyro.

------
juxtaposicion
How does this compare to Edward, PyMC, Stan, et al? Is the primary distinction
due to PyTorch’s imperative, dynamic programming?

~~~
fritzo
Edward: like Edward, Pyro is a deep probabilistic programming language that
focuses on variational inference but supports composable inference algorithms.
Pyro aims to be more dynamic (by using PyTorch) and universal (allowing
recursion).

PyMC, Stan: Pyro embraces deep neural nets and currently focuses on
variational inference. Pyro doesn't do MCMC yet. Whereas Stan models are
written in the Stan language, Pyro models are just python programs with
pyro.sample() statements.

One unique feature of Pyro is the probabilistic effect library that we use to
build inference algorithms:
[http://docs.pyro.ai/advanced.html](http://docs.pyro.ai/advanced.html)
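
The "models are just python programs" point can be sketched without Pyro at
all: a generative model is an ordinary function that draws random values as it
runs (Pyro attaches names to those draws via pyro.sample(), which is what lets
its inference machinery reason about them). A stdlib-only sketch with made-up
names, not Pyro's actual API:

```python
import random

def weight_model(guess):
    # Prior belief about an object's latent weight, centered on our guess;
    # in Pyro this would be a named pyro.sample() site.
    weight = random.gauss(guess, 1.0)
    # A noisy measurement of that latent weight.
    measurement = random.gauss(weight, 0.75)
    return weight, measurement

random.seed(0)
samples = [weight_model(8.5) for _ in range(10000)]
mean_measurement = sum(m for _, m in samples) / len(samples)
# Forward samples of the measurement center near the prior guess of 8.5.
```

Because the model is plain Python, ordinary control flow (loops, recursion,
branching on sampled values) comes for free, which is what "universal" refers
to here.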

~~~
psinger
This still doesn't really explain the difference from PyMC. What is the
advantage of using Pyro over PyMC, which supports a multitude of inference
algorithms as well as mini-batch ADVI?

~~~
f00_
0.1.0 is definitely not feature-complete, but Pyro seems promising.

PyMC3 is fine, but it uses Theano on the backend. Theano will stop being
actively maintained in a year, with no new features in the meantime. That was
announced about a month ago, so it seems like a good opportunity to get out
something that fills a niche: a probabilistic programming language in Python
backed by PyTorch. They are taking cues from Edward and webppl, which from a
casual glance seem to be the best such libraries for python and javascript
respectively: [http://edwardlib.org/](http://edwardlib.org/) and
[http://webppl.org/](http://webppl.org/)

But Edward is backed by TensorFlow.

The announcement, by Theano's main developer Pascal Lamblin and Yoshua Bengio:
[https://syncedreview.com/2017/09/29/rip-theano/](https://syncedreview.com/2017/09/29/rip-theano/)
[https://groups.google.com/forum/#!topic/theano-users/7Poq8BZ...](https://groups.google.com/forum/#!topic/theano-users/7Poq8BZutbY)

"Dear users and developers,

After almost ten years of development, we have the regret to announce that we
will put an end to our Theano development after the 1.0 release, which is due
in the next few weeks. We will continue minimal maintenance to keep it working
for one year, but we will stop actively implementing new features. Theano will
continue to be available afterwards, as per our engagement towards open source
software, but MILA does not commit to spend time on maintenance or support
after that time frame. "

[https://www.wired.com/2016/12/uber-buys-mysterious-startup-m...](https://www.wired.com/2016/12/uber-buys-mysterious-startup-make-ai-company/)

Uber acquired Geometric Intelligence and renamed it Uber AI. From this
article:

"But it hasn't published research or offered a product. What it has done is
assemble a team of fifteen researchers who can be very useful to Uber,
including Stanford professor Noah Goodman, who specializes in cognitive
science and a field called probabilistic programming, and University of
Wyoming's Jeff Clune, an expert in deep neural networks who has also explored
robots that can "heal" themselves."

------
yodon
Can someone eli5 probabilistic programming?

~~~
nightski
I honestly don't think it would be possible for a five-year-old to learn or
even grasp the concept of probabilistic programming. So maybe wait until high
school, or, if very gifted, middle school, where you have a more solid
mathematical foundation. It's the process of inferring the parameters of a
Bayesian probabilistic model from data and then making predictions with that
model.

A probabilistic programming language lets you express the model in code, and
then it performs inference automatically using various methods. Note that
doing it manually may produce better results, but it is a very involved and
time-consuming process.
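
A toy instance of that infer-then-predict loop, using a conjugate model where
the "inference" step has a closed form (the names here are made up for
illustration): infer a coin's bias from observed flips, then predict the next
flip. A PPL automates the same update for models where no closed form exists.

```python
from fractions import Fraction

def posterior_predictive(flips):
    # With a Beta(1, 1) prior and Bernoulli observations, the posterior
    # over the coin's bias is Beta(1 + heads, 1 + tails).
    heads = sum(flips)
    tails = len(flips) - heads
    # The posterior mean is the probability the next flip is heads
    # (Laplace's rule of succession).
    return Fraction(1 + heads, 2 + heads + tails)

p = posterior_predictive([1, 1, 0, 1])  # 3 heads, 1 tail -> 4/6 = 2/3
```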

~~~
geoelectric
eli5 usually just means simplest terms that make sense for the answer.

------
tbenst
How flexible is this compared to Church/Venture, Webppl, Anglican, etc? Does
it support recursively-defined generative processes?

Edit: nvm, Noah Goodman is behind this, who created Webppl. This looks super
flexible and awesome, congrats all!

------
nl
The combination of probabilistic programming and deep learning is pretty
interesting to me, because that's what I have going on in two of my work
projects.

What we do is build features using deep learning models, then use those to
extract simple linear or categorical features, which we condition our
probabilistic model on.
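
A minimal sketch of that pipeline shape, with every name made up for
illustration: a stubbed-out "deep" feature extractor, a reduction to one
categorical feature, and a count-based conditional model fit on top of it.

```python
from collections import Counter, defaultdict

def extract_embedding(text):
    # Stand-in for a DNN feature extractor (hypothetical, not a real model).
    return [len(text) % 3, text.count("a")]

def to_categorical(embedding):
    # Collapse the embedding to a single coarse categorical feature.
    return "short" if embedding[0] == 0 else "long"

def fit_conditional(examples):
    # Estimate P(label | feature) from counts -- a minimal probabilistic
    # model conditioned on the extracted feature.
    counts = defaultdict(Counter)
    for text, label in examples:
        feat = to_categorical(extract_embedding(text))
        counts[feat][label] += 1
    return {feat: {lbl: n / sum(c.values()) for lbl, n in c.items()}
            for feat, c in counts.items()}
```

The hard part the comment raises is real: once the conditioning feature set
grows, a count table like this fragments the data, which is one reason the
downstream model is usually kept small.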

We've found it quite hard to use very high numbers of variables in the
probabilistic model.

Has anyone found a better way of doing this?

~~~
wakkaflokka
Do you have a recommended resource on how to feature engineer with DNNs for
use as inputs in other models?

~~~
nl
I don't really understand the question.

For classification we just use the softmax over the classes. We had a long
discussion about whether this was close enough to a probability to use, and I
think the conclusion was that it is.
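
For reference, softmax does produce a valid probability distribution over the
classes (positive values summing to 1), though how well calibrated those
values are is a separate question:

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
# probs sums to 1.0, and the largest logit gets the largest share
```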

------
orbifold
It was bound to happen; the dynamic control flow of PyTorch makes this really
interesting compared to Edward.

------
catchmeifyoucan
Blocked on our corporate network at CapitalOne as "suspicious"

~~~
Diederich
Pyro is a universal probabilistic programming language (PPL) written in Python
and supported by PyTorch on the backend. Pyro enables flexible and expressive
deep probabilistic modeling, unifying the best of modern deep learning and
Bayesian modeling. It was designed with these key principles:

Universal: Pyro can represent any computable probability distribution.

Scalable: Pyro scales to large data sets with little overhead.

Minimal: Pyro is implemented with a small core of powerful, composable
abstractions.

Flexible: Pyro aims for automation when you want it, control when you need it.

Check out the blog post for more background or dive into the tutorials.

[https://eng.uber.com/pyro/](https://eng.uber.com/pyro/)

------
traverseda
How does it compare to Pyro, the python-remote-objects library?

~~~
pinouchon
They are similar in the way Java and JavaScript are similar.

~~~
traverseda
I mean, clearly they're a bit more similar than that.

------
l5870uoo9y
Anyone have a qualified opinion on how Pytorch compares to Tensorflow?

~~~
lowpro
Siraj Raval covers PyTorch very well in a 5 min video
([https://youtu.be/nbJ-2G2GXL0](https://youtu.be/nbJ-2G2GXL0)).

It comes down to design decisions, which I'm not qualified to go into. This
article made the front page last week, about the downsides of Tensorflow which
people rarely talk about:
[http://nicodjimenez.github.io/2017/10/08/tensorflow.html](http://nicodjimenez.github.io/2017/10/08/tensorflow.html)

And this interview with a Tensorflow engineer (10 mins) explains a little bit
about those design decisions
([https://youtu.be/axRHotkkTVI](https://youtu.be/axRHotkkTVI)).

------
polskibus
Is there a tensorflow based equivalent?

~~~
julien_c
Maybe [http://edwardlib.org/](http://edwardlib.org/)

------
bj0
There's already a really cool python project called Pyro (python remote
objects):
[https://pyro4.readthedocs.io/en/stable/intro.html](https://pyro4.readthedocs.io/en/stable/intro.html)

I haven't used it since I was in undergrad (>10 years ago), when I used it to
communicate between nodes on a small cluster, but it made RPC really easy.

~~~
zero_iq
Indeed. It was (is?) pretty well known, but I've not heard it mentioned for a
long time. With all the fashionable modern RPC and serialisation frameworks
around nowadays, perhaps the original Pyro is now obscure enough that the name
can be reused? Ideally, though, it would be nice to know for sure that an
existing project is considered obsolete before causing any confusion.

~~~
naturalgradient
Or it is just Uber being Uber and not caring.

