
Think Bayes - Bayesian Statistics Made Simple - SkyMarshal
http://www.greenteapress.com/thinkbayes/
======
equark
My problem with books like this is that they have almost no connection to why
Bayesian statistics is successful: Bayesian statistics provides a unified
recipe to tackle _complex_ data analysis problems. Arguably the only known
unified recipe.

The Bayesian book I want should emphasize how Bayes is a recipe for studying
complex problems and teach a broad range of model ingredients. Learning
Bayesian statistics is about becoming fluent in describing scientific problems
in probabilistic language. This requires knowing how to express and compose
traditional models and build new ones based on first principles.

An unfortunate reality is that you still need to know computational methods
too, but that should change soon enough.

~~~
AllenDowney
Yes, that's exactly what the objective of this book is! I am not using
computation out of necessity, but rather because I think it provides leverage
for understanding the concepts, and learning to (as you say) compose
traditional models and build new ones.

As the book comes along, I am finding that many ideas that are hard to explain
and understand mathematically can be very easy to express computationally,
especially using discrete approximations to continuous distributions.
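
A minimal sketch of what a discrete approximation can look like (this is not code from the book; the coin-flip data, the uniform prior, and the function names are assumptions made for illustration). The continuous parameter p is replaced by an evenly spaced grid, and Bayes's rule becomes multiply-then-normalize:

```python
def grid_posterior(heads, tails, n_points=101):
    """Posterior over a coin's bias p, evaluated on an evenly spaced grid."""
    grid = [i / (n_points - 1) for i in range(n_points)]
    prior = [1.0] * n_points  # uniform prior (an assumption for this sketch)
    # Unnormalized posterior: prior times binomial likelihood at each grid point.
    unnorm = [pr * (p ** heads) * ((1 - p) ** tails)
              for p, pr in zip(grid, prior)]
    total = sum(unnorm)
    return grid, [u / total for u in unnorm]


def posterior_mean(grid, probs):
    """Mean of a discrete distribution given as parallel lists."""
    return sum(p * w for p, w in zip(grid, probs))


# Hypothetical data: 140 heads and 110 tails.
grid, post = grid_posterior(heads=140, tails=110)
print(posterior_mean(grid, post))
```

With a uniform prior the exact posterior here is Beta(141, 111), whose mean is 141/252, so the grid result can be checked against a known answer.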

For example, I just posted a section on ABC

[http://www.greenteapress.com/thinkbayes/html/thinkbayes008.h...](http://www.greenteapress.com/thinkbayes/html/thinkbayes008.html#toc44)

that (I think) really demonstrates the strength of this approach.

Of course, my premise only applies for people who are as comfortable with
programming as with math, or more so.

~~~
equark
I'd recommend using as many real examples as possible. Things like
forecasting, product recommendations, topic modeling, etc. While you can
conceptually explain how Bayesian statistics is a unified recipe, it's
incredibly hard to have this sink in with toy problems. This is especially
true since many people using traditional tools are actually using advanced
methods to solve real problems, so when they start reading about urns or doors
it all comes across as rather academic. That's sad because the benefit of
Bayesian coherency is mostly that it leads to a highly productive mode of
practical data analysis.

Definitely shoot me an email at tristan@senseplatform.com if you're interested
in the computational side of this area. At Sense
(<http://www.senseplatform.com>), we're working on making applied Bayesian
analysis as amazing as it should be.

------
nowarninglabel
So, I'm going to counter here and say I don't find this to be a good intro. I
started reading, had not heard of the "Girl named Florida" problem, and so
went to the linked blog post:
[http://allendowney.blogspot.com/2011/11/girl-named-florida-s...](http://allendowney.blogspot.com/2011/11/girl-named-florida-solutions.html)

I found the way he explains it confusing and counter-intuitive. I took basic
stats in college and learned some of the associated problems, though not this
one, and not presented in this particular way. I have to agree wholeheartedly
with the commenter on that post, "JeffJo", who lays out why it's an ineffective
way to present the material. Furthermore, the author's dismissal of the valid
criticism was enough to make me not want to read further.

~~~
AllenDowney
I am coming around to the conclusion that this example is more trouble than
it's worth. I think it's kind of fun, but it does seem to annoy people.

This kind of feedback is exactly why I like to post drafts early. Expect this
example to magically disappear very soon :)

~~~
nowarninglabel
Wow, great to see a reply from you, and thanks for taking feedback :)

~~~
AllenDowney
And... it's gone!

I re-read the chapter and decided that example was doing nothing except
confusing half the audience and antagonizing the other half.

------
udit99
I'm really interested in knowing the prereqs I should have before picking up a
book like this. Coming from a weak math background I find these books highly
appealing but mildly intimidating. Also, could someone advise me on the
preferred order for tackling the following books?

1\. Think Bayes

2\. Think Stats

3\. Programming Collective Intelligence by T.Segaran

~~~
jey
3, 1, 2.

3: Get (back) into the swing of thinking about mathematics and algorithms.

1: Bayesian statistics is a principled, coherent, consistent, intuitive,
complete framework for reasoning about uncertainty. A good foundation.

2: Traditional statistics is more random and ad-hoc, but can be more
practical than Bayesian methods. (Bayesian models are well-motivated, but it
can be impractical to compute exact answers and you'll have to switch to
approximation techniques, some of which are simple/universal/slow, and others
get fairly complex.)

------
pmjordan
So this is all very well and good, I've had about 5 intros to Bayesian
Statistics. But those are a fair bit away from actually applying that
knowledge in practice in software.

Let's say we have N different kinds of events with unknown probabilities and
unknown dependence or independence between them. The naive approach to
gathering data on the probability of event n occurring following an occurrence
of event m would require O(N²) space. Let's say N is around 10⁹ to 10¹⁰.
Storing that much data as a raw matrix isn't practical in most cases, so we
have to find a data structure that is more efficient in terms of both space
and the operations we need to perform, taking into account the characteristics
of the storage medium (memory, disk, or a combination). And what happens if
the probabilistic properties of the system change over time?

Are there any introductory books or other resources on modeling this kind of
problem? Clearly this has been tackled before, but I'm having a hard time
making the leap from theory to practice - and I don't mean import the data
into R or SPSS or whatever and let that grind out a solution, but coming up
with approximations when you have runtime and space constraints that make that
approach impractical.

~~~
trhtrsh
I think the general approach is dimensionality reduction: start measuring, and
round down to 0 for the low-correlation pairs of events.

Do you actually have a stream of more than N^2 observations to process? If
not, then most of your correlations are in fact 0, and sparse-matrix
techniques apply.
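
A rough sketch of the sparse idea, assuming events arrive as a single ordered stream and that only pairs actually observed need storage (the class and method names are made up for illustration). Memory then grows with the number of distinct observed (m, n) pairs rather than with N²:

```python
from collections import Counter, defaultdict


class SparseTransitionCounts:
    """Counts 'event n followed event m' only for pairs that were observed."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # m -> Counter of following events
        self.prev = None

    def observe(self, event):
        """Feed the next event from the stream."""
        if self.prev is not None:
            self.counts[self.prev][event] += 1
        self.prev = event

    def prob(self, m, n):
        """Empirical estimate of P(next = n | current = m); 0.0 if m unseen."""
        row = self.counts.get(m)
        if not row:
            return 0.0
        return row[n] / sum(row.values())

    def stored_pairs(self):
        """Number of (m, n) pairs that actually take up space."""
        return sum(len(row) for row in self.counts.values())


model = SparseTransitionCounts()
for event in ["a", "b", "a", "c", "a", "b"]:
    model.observe(event)
print(model.prob("a", "b"), model.stored_pairs())
```

This sketch keeps raw counts forever, so it does nothing about probabilities that drift over time; decaying or windowing the counts would be one way to approach that, at the cost of extra bookkeeping.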

------
PostOnce
I'm a big fan of the author's other books, Think Python and Think Complexity
(haven't had the time for Think Stats). I found them more understandable than
most other books that purport to teach people of the same skill level.

I'm hoping this will be as good, but all the negative comments here leave me
skeptical. Perhaps this is the crowd that would enjoy K&R C more than Think
Python. The former is more of a reference to me than an introductory tome.
Perhaps everyone here is just better at math than I am.

~~~
AllenDowney
Ignore the haters -- Think Bayes is going to be awesome!

Just kidding (mostly), but your point is correct: there is no book that is
right for all audiences. But if you can program, and the mathematical approach
to this material doesn't do it for you, this book might.

------
SkyMarshal
FB announcement where I found this -
<https://www.facebook.com/thinkstats/posts/325778617519425>

------
roryokane
Link to _Think Stats_ , the author’s previous book: <http://thinkstats.com/>

------
hessenwolf
Bayesian is cool because you can make arbitrarily complex models, and when you
have the parameters estimated it is really easy to calculate all the cool
things you want to.

Bayesian is not cool because estimating the parameters takes bloody ages on a
supercomputer, unless you spend ages being really careful to specify your
model.

Frequentist statistics is cool because it is a massive big bag of tricks to
estimate all sorts of stuff, and pretty much all of the tricks are already in
R.

Frequentist statistics is not so cool because calculating all the specific
things you want to can be a pain in the ass.

Once either quantum computers kick in or a better algorithm than MCMC for
Bayesian is created, Bayesian will win.
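
For readers who have not seen MCMC, here is a minimal random-walk Metropolis sketch. It is only an illustration under stated assumptions: the target is a made-up coin-bias posterior (140 heads, 110 tails, uniform prior), and the step size and function names are arbitrary choices. Real models have many parameters, which is where the long run times come from.

```python
import math
import random


def log_post(p, heads=140, tails=110):
    """Unnormalized log posterior for a coin's bias p under a uniform prior."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return heads * math.log(p) + tails * math.log(1.0 - p)


def metropolis(log_density, start, n_samples, step=0.05, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional density."""
    rng = random.Random(seed)
    x, lp = start, log_density(start)
    samples = []
    for _ in range(n_samples):
        cand = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        lp_cand = log_density(cand)
        # Accept with probability min(1, density(cand) / density(x)).
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < lp_cand - lp:
            x, lp = cand, lp_cand
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


samples = metropolis(log_post, start=0.5, n_samples=20000)
kept = samples[2000:]  # discard early samples as burn-in
print(sum(kept) / len(kept))
```

The sampler only ever needs the density up to a constant, which is why it applies to models whose normalizing constant cannot be computed.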

There are some philosophical arguments about the objectivity of the prior in
Bayesian statistics, but these wash out in a decision theoretic framework
because of the subjectivity of the utility function at the other end of the
process.

Also, less than 5% of people reporting p-values really know what a p-value is.

~~~
zmjones
Definitely agree. Do check out Stan though. HMC is pretty fast compared to
BUGS/JAGS.

------
juanfatas
I found that Udacity's classes (CS373, ST101) and the 2011 AI class by
Sebastian Thrun also explained Bayesian statistics very well.

------
Groxx
> _This HTML version of is provided for convenience, but it is not the best
> format for the book. In particular, some of the symbols are not rendered
> correctly._

I would actually recommend the opposite - they have ASCII versions of the
symbols that e.g. Chrome might not render correctly, and all I checked looked
fine. The PDF meanwhile copies this text from the (not linked) link to the
"girl named Florida" article:

    
    
      ❤tt♣✿✴✴❛❧❧❡♥❞♦✇♥❡②✳❜❧♦❣s♣♦t✳❝♦♠✴✷✵✶✶✴✶✶✴❣✐r❧✲♥❛♠❡❞✲❢❧♦r✐❞❛✲s♦❧✉t✐♦♥s✳❤t♠❧
    

And the sections are linked in the HTML version, where they are not in the
PDF, which seems like a simple oversight (that infects the vast majority of
PDFs, sadly).

------
ninetax
Has anyone used the Think Stats book? Is it a good intro to stats?

~~~
stdbrouw
It's really, really good, especially if you're interested in the why and not
just the "give me the damn test I need to run in SPSS and what number to look
at". Plus, because you spend a lot of time coding, it's more fun and less dry
than most stats books.

~~~
disgruntledphd2
Think Java is also excellent. I think it may have been the first (non R)
programming book I read, and it helped me get more into programming which
wasn't just for stats.

I have recommended Think Stats to many people, and it appears to be somewhat
of a success. And it's free documentation, which is wonderful.

------
jcrubino
I think you and Prof Scott Page could make a great team combining your think
series with his model thinking class content.

~~~
jcrubino
U Mich. Prof Scott E. Page <http://masi.cscs.lsa.umich.edu/~spage/>

------
signa11
Thank you! This should help somewhat with Coursera's PGM course, I guess.

