
End-User Probabilistic Programming [pdf] - matt_d
https://www.cs.uoregon.edu/research/summerschool/summer19/lecture_notes/DRAFT___Probabilistic_Programming_for_End_Users.pdf
======
exp1orer
Guesstimate [1] (cited in this paper in footnote 6 and previously discussed on
HN [2]) is a really nice implementation of some of these ideas.

Has anyone come across other good implementations?

As a side note, I've been doing more probabilistic programming with pymc3
recently, and it's pretty incredible how leaky the abstractions can be. I'm
not saying there's a way to do better, just that at present there's a huge gap
between the beautiful vision of "the inference button" and the current tools.

[1] [https://www.getguesstimate.com/](https://www.getguesstimate.com/)

[2]
[https://news.ycombinator.com/item?id=18785371](https://news.ycombinator.com/item?id=18785371)

~~~
refrigerator
We're working on something like this: [https://causal.app](https://causal.app)
:)

Guesstimate is awesome, but their team sadly stopped working on it a while
back. It's definitely early days for automated inference, but I think giving
people the tools to build "static" (non-learning) models that can account for
uncertainty is hugely valuable in itself. You need serious gymnastics to do
this in spreadsheets right now, and I wouldn't wish Excel's probabilistic
plugins (Palisade @RISK, Oracle Crystal Ball) on anyone.
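For a concrete sense of what such a "static" uncertainty-aware model looks like outside a spreadsheet, here is a minimal Monte Carlo sketch in numpy. The variable names and distributions are illustrative assumptions, not taken from Causal or Guesstimate:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 100_000  # Monte Carlo samples

# Uncertain inputs: instead of single cells, each quantity is a distribution.
price = rng.normal(loc=10.0, scale=1.5, size=n)             # e.g. unit price
units = rng.lognormal(mean=np.log(500), sigma=0.3, size=n)  # e.g. units sold

revenue = price * units  # ordinary arithmetic propagates the uncertainty

# Report a range rather than a point value.
lo, mid, hi = np.percentile(revenue, [5, 50, 95])
print(f"revenue: p5={lo:.0f}  median={mid:.0f}  p95={hi:.0f}")
```

The "gymnastics" in plain Excel amount to hand-building the sampling loop above with RAND()-driven formulas and a data table; here it is three lines.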

But progress towards the "inference button" dream is starting to accelerate:

- TensorFlow recently got its own PPL [1]

- The first international conference on probabilistic programming was held
(PROBPROG 2018) [2]

- Lots of PPL development going on in tech companies: Uber, FB, Google,
Microsoft, Stripe, Improbable, etc.

[1]:
[https://www.tensorflow.org/probability](https://www.tensorflow.org/probability)

[2]: [https://probprog.cc/](https://probprog.cc/)

~~~
cheerlessbog
Microsoft recently released a probabilistic programming library for .NET named
Infer.NET [1]

[1] [https://github.com/dotnet/infer](https://github.com/dotnet/infer)

~~~
rahulravu96
They recently open sourced it. It was previewed in lectures from Christopher
Bishop more than 10 years ago.

1) [https://www.microsoft.com/en-us/research/project/infernet/](https://www.microsoft.com/en-us/research/project/infernet/)

2) [http://videolectures.net/mlss09uk_bishop_ibi/](http://videolectures.net/mlss09uk_bishop_ibi/)

------
thanatropism
It's nowhere near as sophisticated as this, but I wrote a little utility for
Monte Carlo analysis of spreadsheet models with Python and xlwings:

[https://github.com/asemic-horizon/stanton](https://github.com/asemic-horizon/stanton)

As a bonus, since the spreadsheet model is exposed as a Python function,
emulating complex spreadsheets with simple ML models (decision trees...) is
easy.
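The two-step pattern described above (expose the spreadsheet as a plain Python function, then sample it or fit a cheap emulator) can be sketched without Excel on hand; here a toy formula stands in for the xlwings round trip, and a simple least-squares fit stands in for the decision-tree surrogate:

```python
import numpy as np

def model(price, volume):
    """Stand-in for a spreadsheet model exposed as a Python function
    (in stanton's case this call would round-trip through xlwings)."""
    return price * volume - 0.1 * volume**2  # revenue minus a cost term

rng = np.random.default_rng(seed=1)
n = 10_000
price = rng.normal(10.0, 1.0, size=n)
volume = rng.uniform(20.0, 40.0, size=n)

# Monte Carlo: push sampled inputs through the model.
profit = model(price, volume)
print("mean profit:", profit.mean())

# Emulation: fit a cheap surrogate on (inputs, output) pairs so later
# queries don't have to touch the (slow) spreadsheet again.
X = np.column_stack([price, volume, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, profit, rcond=None)
approx = X @ coef
r2 = 1 - np.var(profit - approx) / np.var(profit)
print("surrogate R^2:", r2)
```

The point of the interface is that any function-in, function-out tool (samplers, optimizers, sklearn estimators) composes with the spreadsheet for free.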

~~~
floki999
Nice. The Excel/Python combo is very powerful. Microsoft has kept Excel in
the dark ages and could have offered far better analytics than it currently
does. If they integrated Python in lieu of VBA and made it compatible with
their Form designer, it would become very popular. The ability to quickly
knock up desktop interfaces, closely integrated with powerful analytics and
the user’s data source, is priceless in certain environments (e.g. capital
markets).

------
floki999
Interesting but hardly novel. In certain fields, ‘end users’ have routinely
employed probabilistic tools via Excel or other spreadsheets for decades.
Wider adoption has been limited by the knowledge base required both to (a)
confidently build such models and (b) communicate probabilistic results to
stakeholders.

@RISK and Crystal Ball were among the earlier Excel add-ins that simplified
simulation-based spreadsheet development.

As someone else mentioned, the Excel/Python combination is really powerful,
although lower-level. DataNitro comes to mind, as well as a product by
Resolver Systems (?) which was essentially an IronPython-powered spreadsheet
interface.

------
lgessler
Dead link for some reason:
[https://web.archive.org/web/20190619190008/https://www.cs.uo...](https://web.archive.org/web/20190619190008/https://www.cs.uoregon.edu/research/summerschool/summer19/lecture_notes/DRAFT___Probabilistic_Programming_for_End_Users.pdf)

------
axpence
Let's say I want to predict an output `C` by multiplying two distributions:
`A * B = C`.

Assuming I am just guessing at the distributions of `A` and `B` (Uniform?
Bernoulli? Geometric? Log-Normal?), would I get a better estimate by just
multiplying `mean(A) * mean(B)`?

Point values suck. However, predicting the mean is often possible/realistic.
And to be honest, I feel like I am taking a wild guess when describing the
distribution of a data set.

TLDR: What results in a better prediction/guesstimate: multiplying incorrect
probability distributions, or multiplying more-correct means/point values?
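One fact worth noting here: if `A` and `B` are independent, then E[A·B] = E[A]·E[B] holds for any distributions, so multiplying the means already gets the mean of `C` right no matter which family you guessed. The distributional guess matters for the spread of `C`, and for the mean only once the inputs are correlated. A minimal numpy sketch (the distributions chosen below are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 200_000

# Independent case: whatever distributions you guess, E[A*B] = E[A]*E[B],
# so multiplying the means already gives the right mean for C.
A = rng.lognormal(mean=0.0, sigma=0.5, size=n)  # guessed log-normal
B = rng.uniform(1.0, 3.0, size=n)               # guessed uniform
C = A * B
mean_gap = abs(C.mean() - A.mean() * B.mean())
print("gap between mean(C) and mean(A)*mean(B):", mean_gap)  # ~0

# What the distributional guess buys you is the spread, which a point
# estimate cannot give:
print("C 5th/95th percentiles:", np.percentile(C, [5, 95]))

# Correlated case: the product mean shifts by Cov(A, B), so multiplying
# means is no longer right even on average.
shared = rng.normal(size=n)
A2 = np.exp(0.5 * shared)
B2 = 2.0 + 0.5 * shared
cov_shift = (A2 * B2).mean() - A2.mean() * B2.mean()
print("shift from correlation:", cov_shift)  # roughly Cov(A2, B2), nonzero
```

So for independent inputs the point-value product is a fine estimate of the mean; what it cannot tell you is how wide `C` is, which is usually the reason to bother with distributions at all.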

~~~
jmmcd
I don't have a good answer. But I wonder if there are some realistic
situations where we would have a good guess at the mean, but no clue about the
distribution.

