What is “Bayesian” Statistical Inference? 26 points by fogus on Sept 10, 2009 | 11 comments

 Too dense, man. Only people who already get it will get it; people who don't get it still won't get it after reading. Try to explain it to your 5-year-old daughter, or your 80-year-old grandmother.
 Try this instead - http://yudkowsky.net/rational/bayes After that, this - http://yudkowsky.net/rational/technical
 Seems pretty clear to me; it doesn't claim to be an article for the non-mathematically inclined. Not that "pop" articles on this subject wouldn't be pretty cool too.
 Well, I'm pretty mathematically inclined, but ignorant of Bayesian statistics, and I still didn't really get it. For instance, the first full-length paragraph:

 "The full Bayesian probability model includes the unobserved parameters. The marginal distribution over parameters is known as the “prior” parameter distribution, as it may be computed without reference to observable data. The conditional distribution over parameters given observed data is known as the “posterior” parameter distribution."

 uses too much jargon; I'm sure I'd understand it if he'd defined "marginal distribution" and "conditional distribution" and clarified exactly what the difference between observable and unobservable data and/or parameters is. The hypothetical audience for this seems to be people who are intimately familiar with statistical terminology but know absolutely nothing about Bayesian statistics.
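 For what it's worth, that jargon can be unpacked with a toy example of my own (not from the article; the coin and its numbers are made up). The "unobserved parameter" is a coin's bias theta, the "observable data" is one flip y; the marginal over theta is the prior, and the conditional given an observed flip is the posterior:

```python
# Hypothetical toy model: a coin whose bias theta is either 0.3 or 0.7,
# each equally likely a priori, and a single flip y (1 = heads).
thetas = [0.3, 0.7]
prior = {0.3: 0.5, 0.7: 0.5}

# Joint distribution p(theta, y) = p(theta) * p(y | theta).
joint = {(t, y): prior[t] * (t if y == 1 else 1 - t)
         for t in thetas for y in (0, 1)}

# Marginal ("prior") distribution over theta: sum the joint over all y.
# Note it is computable without reference to any observed data.
marginal_theta = {t: sum(joint[(t, y)] for y in (0, 1)) for t in thetas}

# Conditional ("posterior") distribution over theta, given observed y = 1.
p_y1 = sum(joint[(t, 1)] for t in thetas)
posterior = {t: joint[(t, 1)] / p_y1 for t in thetas}

# marginal_theta recovers the prior (0.5 / 0.5); seeing heads shifts the
# posterior toward the higher bias (0.3 / 0.7).
```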
 I think those concepts are best understood by example.
 Amen... I'm a recovering mathophobe, and I was discouraged by how utterly impenetrable this article was. I mean, I honestly had absolutely no idea what was going on, despite really wanting to understand.
 I like Eliezer's page on this much better: http://yudkowsky.net/rational/bayes It has been discussed on HN before.
 Eliezer Yudkowsky's page is a nice intro to Bayes's theorem, the understanding of which is critical for understanding why the posterior is proportional to the prior times the sampling distribution. But Bayesian stats isn't just about applying Bayes's theorem: its key feature is using probabilities for model parameters and incorporating posterior uncertainty about their values into inference.
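 The "posterior proportional to prior times sampling distribution" relationship can be sketched numerically. This is a minimal grid example with my own made-up data (7 heads in 10 flips, uniform prior over candidate coin biases), not anything from the article:

```python
from math import comb

# Grid of candidate coin biases theta, with a uniform prior over the grid.
grid = [i / 100 for i in range(1, 100)]
prior = [1 / len(grid)] * len(grid)

# Sampling distribution (likelihood) of the assumed data: 7 heads in 10 flips.
heads, flips = 7, 10
likelihood = [comb(flips, heads) * t**heads * (1 - t)**(flips - heads)
              for t in grid]

# Bayes's theorem: posterior = prior * likelihood, renormalized to sum to 1.
unnorm = [p * l for p, l in zip(prior, likelihood)]
posterior = [u / sum(unnorm) for u in unnorm]

# The whole posterior curve, not just its peak, carries the remaining
# uncertainty about theta; here the peak sits near the observed frequency.
best = grid[posterior.index(max(posterior))]
```

 Keeping the full `posterior` list, rather than collapsing to a single estimate, is exactly the "incorporating posterior uncertainty into inference" part.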
 Here's what I actually said in response to all this on the original blog, which showed up as a huge spike in our traffic!

 There was a sudden spike in traffic, and it turns out it comes from Y Combinator Hacker News, where there's a discussion of this post with seven comments as of today.

 The criticisms were sound -- it's too technical (i.e., jargon-filled) for someone who doesn't already get it to understand. Ironically, I've been telling Andrew Gelman that about his Bayesian Data Analysis book for years. Unix man pages are the usual exemplar of doc that only works if you mostly know the answer: they're great once you already understand something, but terrible for learning. I think Andrew's BDA is that way -- it's clear, it's concise, it actually does explain everything from first principles, and there are lots of examples. So why is it so hard to understand?

 I usually write with my earlier self in mind as an audience. Sorry for not targeting a far-enough-back version of myself this time! The jargon should be familiar to anyone who's taken math stats, and I don't think it would have helped if I'd defined the prior as a marginal via the sum.
 Wow, thanks, this is a great, well-written intro that makes a great refresher! Anybody starting machine learning and data mining?