Probabilistic Programming and Bayesian Methods for Hackers (2013) (camdavidsonpilon.github.io)
302 points by jxub on Feb 28, 2018 | 36 comments

This looks amazing and I will definitely read it.

As an aside, does this line bother anyone else?

> programming from a computation/understanding-first, mathematics-second point of view

Mathematics is understanding. What's implicit? The suggestion that mathematics isn't understanding, it's just funny symbols that no one understands.

> The latter path is much more useful, as it denies the necessity of mathematical intervention at each step, that is, we remove often-intractable mathematical analysis as a prerequisite to Bayesian inference.

In fact, I would say that by removing maths you're removing at least some amount of understanding. You're left with a workflow of "if x use model a else use model b." That will work until it doesn't, i.e. until you have to face a problem that isn't predicted by the workflow. Not saying that's a bad approach (it's certainly the most pragmatic) but it shows that the phrase "understanding first maths second" is misguided at best and disingenuous at worst.

I think a more appropriate phrase for what the author describes in the prologue would be "practicalities first, theory second".

Some people think that mathematics can also be seen as a language. I use this categorization quite frequently nowadays and it really simplifies my work a lot. By not using a more elaborate mathematical notation, you are basically able to explain mathematical concepts to people who are not "speaking math", like you would explain the work of Montesquieu to anyone not speaking French: It takes more time and effort, and francophone readers of Montesquieu will hate you for doing it. ;)

Skipping the notation sure is nice and fine as long as you don't end up losing any of its benefits.

The whole idea of notation/language is:

1. to have a common, explicit reference for each definition;

2. to force you to define things and handle proofs at a proper level of detail (think of trying to write the specification of a data structure and an algorithm in prose instead of Haskell, C, or some assembly dialect). This is actually interesting because what people call "math language" comprises a wide variety of different styles, like programming languages do;

3. to be concise: using notation and previously well-defined notions again and again saves a lot of time and space; you don't get tangled up in the properties of registers for everything you do.

I would say that mathematics has a variety of languages that were, and still are, introduced area by area as the need arises. If you have a programming languages background you are familiar with the setting (and joke) that each new work defines a (somewhat) new programming language.

In general, there is a reason computer science is regarded as being so close to math.

On the other hand, some things can get lost in translation. This, of course, depends on the language (and culture), and mathematics is probably the strongest example of such language (with, say, physics being the "culture").

I agree that that would be a more appropriate phrase.

However, I can also understand what the author is getting at when they say "computation first", "mathematics second."

Here they are using the layperson's notion of "mathematics" as a bunch of Greek symbols detached from concrete meaning.

Of course, the layperson's notion is wrong, but nonetheless if that's how people think I can see why the author would try to communicate in their terms.

Real mathematics can of course be described with Greek letters, English sentences, or computer code - it's all the same mathematics!

The more experience you have in mathematical thinking the more you understand why using a "bunch of Greek letters" is actually easier than English sentences or computer code, but the layperson with limited mathematical experience doesn't grasp that, so they want to learn in code or plain English.

There is something inherently counter-intuitive about teaching probability theory. Consider the Monty Hall problem and how many smart, seemingly rational people will refuse to believe the outcome even after rigorous proof.

For more in-depth treatment, see Probabilistic Models of Cognition


I find the Monty Hall problem is simple to grasp once you explain it properly. But you need math for that.

I think it can be explained just using intuition too.

Instead of 3 doors, imagine there are 1e9 doors. Then you choose one door (at random), and the host reveals all but one other door.

I think then it's clear why you would switch doors.

The problem is that it is not obvious why the generalisation must be that the host opens n-2 doors. Many people view it as the host opening just one door, and doing so also in the 1e9 case.

What complicates things is that these people are right, too. The generalisation of the problem does not fix the number of doors opened by the host. It could be that the host opens just one door. Or 5e8 doors, splitting the set in the middle. Or any other number in between. In fact, the number of doors opened by the host could be decided by dice before each round starts.

So now the person you talk to goes, "I'm still not convinced, but let's say for the sake of the argument that switching is good in the situation you mention. Now tell me why it is better to switch in all other cases."

You're ending up in formal proof country a lot faster than you'd want for an intuitive explanation.

(Odd observation: if k is the number of doors opened by the host, I find any case where n=2k+1 a lot harder to explain than when k=1 and n is some large number. Maybe that can give some insight into why it's a tough problem.)

It's fairly obvious to me (after careful elaboration, that is): if you don't switch, you'll win iff you got the right door the first time (1/3). If you do switch you'll win iff you picked one of the wrong doors (2/3).
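For anyone who would rather check the 1/3 versus 2/3 split empirically, here is a minimal simulation sketch of the classic 3-door game. It assumes the standard rules (the host knows where the car is and always opens a goat door other than your pick); the function names `play` and `win_rate` are just illustrative.

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one round of the 3-door game; return True if the car is won."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumption: the host always opens a goat door that is not the pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Exactly one door is neither the original pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch: bool, rounds: int = 100_000, seed: int = 0) -> float:
    """Estimate the win probability for a fixed strategy by simulation."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(rounds)) / rounds


stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay:   {stay_rate:.3f}")
print(f"switch: {switch_rate:.3f}")
```

The estimates should land close to 1/3 for staying and 2/3 for switching, matching the "you win by switching iff your first pick was wrong" argument.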

My intuitive explanation when I was tutoring statistics was the following. (This works better while drawing I find.)

Assume you pick one door. It's the right door with a chance (1/3) and the wrong door with a chance (2/3).

Now if you could pick both other doors (2/3) you would choose those. And this is exactly what you're doing when you switch after the reveal.

The other two doors always have one wrong one, which can be revealed. So switching after the reveal is the same as having the chance to pick two doors before the reveal.

Sure, there are many possible generalizations, some of which are more intuitive than others. But it should be clear that n=3 is a specific case of this particular generalization, for which I think the intuition is clear, which should help illuminate why the n=3 case holds.

I don’t think you do. Imagine the Monty Hall problem had 1,000 doors. You choose one, then 998 others are opened to reveal goats. It’s abundantly clear that switching in this case is better odds, without even formalizing the mathematics.

Look at kqr's comment, which shows why this is not an equivalent problem, and also points out that questioning it is quite right in this case.

It is an equivalent problem. kqr's comment is wrong in its implications. Whether you open one door or 998 doors, you are revealing information about the likelihood of what is behind the last door. It's just more obvious that this is the case when N-2 doors are opened than just one.
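A rough sketch of the exact numbers for the generalised game, under assumptions not spelled out in the thread: there are n doors with one car, the host knowingly opens k goat doors (1 <= k <= n-2) among the doors you did not pick, and a switching contestant picks uniformly among the remaining closed doors. The helper names are hypothetical.

```python
from fractions import Fraction


def stay_win(n: int) -> Fraction:
    """Win probability when keeping the original pick among n doors."""
    return Fraction(1, n)


def switch_win(n: int, k: int) -> Fraction:
    """Win probability when switching after the host opens k goat doors.

    Assumes the host never opens the picked door or the car door, and the
    contestant switches to a uniformly random remaining closed door.
    """
    if not 1 <= k <= n - 2:
        raise ValueError("need 1 <= k <= n - 2")
    # The car is behind one of the other n-1 doors with probability (n-1)/n;
    # it is then one of the n-1-k doors that are still closed.
    return Fraction(n - 1, n) * Fraction(1, n - 1 - k)


for n, k in [(3, 1), (1000, 998), (1000, 1), (1001, 500)]:
    print(n, k, stay_win(n), switch_win(n, k))
```

Under these assumptions the switching probability is (n-1) / (n * (n-1-k)), which exceeds 1/n for every k >= 1. The advantage is large when k = n-2 and small but still positive when k = 1.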

I notice this in a lot of people. I am involved with an organization that teaches Python. A lot of people dealing with this organization mention Python and immediately follow up with "but I don't understand anything about that" as if to say "but thank God I am not one of these nerdy guys".

I have even noticed one otherwise intelligent guy pronounce it Bython.

> Mathematics is understanding.

Mathematics brings with it some necessary baggage that by itself not only does not help understanding but may interfere with it when one is trying to make sense of things. A mathematical model usually includes a lot of "scaffolding" that does not correspond to anything in the real world, and it is often all too easy to get confused about which parts of the model do and which do not. This problem of interpretation is an especially difficult one in cases when the reality is outside of the domain of one's immediate experience, quantum mechanics being one example.

> Mathematics brings with it some necessary baggage that by itself not only does not help understanding but may interfere with it when one is trying to make sense of things.

Give me a specific example.

> Mathematics is understanding. What's implicit?

Yes and no. Even as a mathematician I find it much easier to understand the actual mathematics if I first have an intuitive 'feel' for the thing. Even if my initial mental model is simple and incomplete, I find it much easier to go from there to the formal mathematical understanding than to start from the formalism. Conversely, I can also find myself in situations where I can understand all the math on the page, yet not really grasp what it is trying to tell me until I've had a chance to 'play' around with it in some form.

It's absolutely possible to learn some math without understanding it beyond a symbol-manipulation level. But if you have a strong enough intuition, you will be able to translate it into math. So yes, understanding does precede mathematical formalization.

Try doing actual calculations without mathematical formalization and let me know how it goes.

Being able to do calculations does not imply understanding. For instance I can apply the identity e^(pi*i) = -1 algebraically without having any understanding of why the identity is true, how complex numbers relate to rotation, etc.
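As a small illustration of the rotation reading of that identity, here is a sketch using Python's standard `cmath` module. Multiplying by e^(i*theta) rotates a point in the complex plane by theta radians, so a half turn takes 1 to -1. The variable names are only illustrative.

```python
import cmath
import math

# A half turn: e^(i*pi) equals -1 up to floating-point rounding.
half_turn = cmath.exp(1j * math.pi)

# A quarter turn takes the point 1 on the real axis to i on the imaginary axis.
quarter_turn = cmath.exp(1j * math.pi / 2)
rotated = (1 + 0j) * quarter_turn

# Two quarter turns compose into one half turn.
two_quarters = quarter_turn * quarter_turn

print(half_turn, rotated, two_quarters)
```

The printed values differ from -1 and i only by rounding error, which is why the checks are best done with a tolerance rather than exact equality.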

I didn't say that being able to do calculations implies understanding, did I?

Now, understanding implies being able to do calculations. And you said that you could understand without maths, which means you can do calculations without maths. So try that and let me know how it goes.

No, I said that if you have a strong enough intuition you will be able to recreate the math from that intuition. But knowing the math does not suffice to give you the intuition.

I recommend David MacKay's Intro to Info Theory class: https://www.youtube.com/watch?v=BCiZc0n6COY

There is a free book for it at: http://www.inference.org.uk/mackay/itila/

Daphne Koller's 3 Coursera classes on Probabilistic Graphical Models are also really good.

This is the "online" (abridged) version of this book:


There's a full version of another good book ("Think Bayes: Bayesian Statistics in Python") available as a PDF from the publisher here:


(related Amazon reviews)


This book is helpful - I went through it and did all the interactive examples.

Having said that, I much prefer "Doing Bayesian Data Analysis" by Kruschke. It is extremely clear. It introduces concepts with intuition first, then math, then code, which I find to be an extremely useful order.

I would recommend that incoming readers really invest time in theory before starting this book (I know that is exactly what this book does not want you to do) and use this book as the driver of your mental model the next time you encounter Bayesian methods.

I personally find exceptional clarity once I see the code for a certain technique in Machine Learning. Oftentimes, the theory skips certain implementation details, which always leaves a void for me. After having read quite a bit on Bayesian Learning some time ago, it is easy to connect this guide back to theory. That immediate click of ideas is rewarding!

Do you have any suggested reading for the theory? I'm going to be using QUESO[1] for some research with Bayesian statistics and am trying to learn more of the background knowledge.

[1]: https://github.com/libqueso/queso

I absolutely love David Barber's book (BRML). It is available here - http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=...

I'm inclined towards Machine Learning and hence the bias. Not sure if this would cover the statistics parts but I think at least the fundamentals are the same.

This is a must have, mostly because it serves as the best documentation / user guide for PyMC - a fantastic stats package that can be a little tricky to get started with.

Support great work like this and buy the book!

Is P(a|b) = ( P(a) * P(b|a) ) / P(b) more true when the probabilities are from a binomial distribution as opposed to Poisson, or am I just confused and it doesn't matter?
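Bayes' rule, P(a|b) = P(a) * P(b|a) / P(b), follows from the definition of conditional probability, so it holds whatever distribution supplies the likelihood P(b|a). Here is a minimal sketch with two made-up hypotheses ("low" and "high"), an invented observation of 7 events, and arbitrary illustrative parameters. The same `posterior` function is used with a binomial and with a Poisson likelihood.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(k successes in n trials) for success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(k events) for a Poisson distribution with rate lam."""
    return exp(-lam) * lam**k / factorial(k)


def posterior(prior: dict, likelihood: dict) -> dict:
    """Bayes' rule over a finite set of hypotheses a, given P(a) and P(b|a)."""
    evidence = sum(prior[a] * likelihood[a] for a in prior)  # P(b)
    return {a: prior[a] * likelihood[a] / evidence for a in prior}


# Hypothetical setup: equal prior weight on two hypotheses, 7 events observed.
prior = {"low": 0.5, "high": 0.5}
observed = 7

binom_post = posterior(prior, {"low": binomial_pmf(observed, 10, 0.3),
                               "high": binomial_pmf(observed, 10, 0.6)})
pois_post = posterior(prior, {"low": poisson_pmf(observed, 3.0),
                              "high": poisson_pmf(observed, 6.0)})
print(binom_post)
print(pois_post)
```

Only the likelihood values change between the two calls; the rule itself is identical, and each posterior sums to 1.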

I think this is quite good:


although if I knew more about Bayesian methods I would be in a better place to recommend it.

Nice to see a practical way to address this important topic. I will read it, thanks.

Looks good. Someone gave a lecture on PyMC at work last week and put PyMC on my radar. This book is also on O’Reilly Safari, and I just started reading it.
