
Statistical Inference for Everyone - jeffmax
http://web.bryant.edu/~bblais/statistical-inference-for-everyone-sie.html
======
edtechdev
Here are some more interactive and effective ways to learn stats online. The
Open Learning Initiative's open Probability & Statistics Course out of
Carnegie Mellon might just be the most researched and carefully designed
course out there. [http://oli.cmu.edu/courses/free-open/statistics-course-
detai...](http://oli.cmu.edu/courses/free-open/statistics-course-details/)
Students learn more statistics concepts in half the time of a traditional
stats course. [http://oli.cmu.edu/get-to-know-oli/see-our-proven-
results/](http://oli.cmu.edu/get-to-know-oli/see-our-proven-results/)

The Statistics Online Computational Resource (SOCR) site is also amazing for
actually learning and playing with common statistical tests and tools:
[http://www.socr.ucla.edu/](http://www.socr.ucla.edu/)

Collaborative Statistics is a free and interactive statistics textbook:
[https://www.kno.com/book/details/productId/txt9780983804905](https://www.kno.com/book/details/productId/txt9780983804905)

You can also run Sage, R, Python, Octave (Matlab clone) and other tools right
in the browser now: [https://cloud.sagemath.com/](https://cloud.sagemath.com/)

~~~
tlmr
What do you think of the CMU course vs this text book in question?

~~~
capnrefsmmat
The CMU course is very traditional. It covers basic exploratory data analysis
(summary statistics, plotting data), basic probability, and hypothesis testing
and estimation. There's no programming, nothing Bayesian, and only brief
discussion of regression.

(I taught 36-201, the intro stats course that was used to build the OLI
course, this summer.)

 _Statistical Inference_, on the other hand, seems to take a Bayesian
perspective and is very much not your standard intro stats class. It looks
interesting and I'll have to skim through some of it.

------
daniel-levin
This book is interesting because it forgoes the traditional approach of most
mathematical statistics books. The preface states that it is done like this in
order to avoid the "cookbook" approach taken by many statistics students. This
is why it is ironic that "Bayes' Recipe" appears 15 times in this text, that on
page 131 there is a five-step algorithm for parameter estimation, and that my
favourite, oft-repeated, never-explained recipe is "n > 30, you'll be fine".
There is no mention of the CLT, MLE, method of moments estimation, biasedness
of estimators, convergence in probability, how sampling distributions arise,
or any of the theory of distributions that underpin all of the inferential
procedures detailed in the book. I think that excluding these topics actually
increases the cookbooky-ness of the text.
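
As an aside for readers: the "n > 30, you'll be fine" recipe is shorthand for the central limit theorem, which says that for reasonably well-behaved populations the sampling distribution of the mean is close to normal once samples reach roughly that size. A quick simulation (a hypothetical Python sketch, not from the book; the Exp(1) population and function name are my choices) shows the effect:

```python
import random
import statistics

random.seed(0)

def sample_means(n, trials=10000):
    """Means of `trials` samples of size n drawn from a skewed Exp(1) population."""
    return [statistics.mean(random.expovariate(1.0) for _ in range(n))
            for _ in range(trials)]

for n in (2, 30):
    means = sample_means(n)
    # Exp(1) has mean 1, so the sample means center on 1, and by the CLT
    # their spread shrinks like 1/sqrt(n) while their shape grows more normal.
    print(n, round(statistics.mean(means), 2), round(statistics.stdev(means), 2))
```

At n = 30 the distribution of the means is already nearly symmetric, which is all the recipe promises; for heavily skewed or heavy-tailed populations it can fail, which is exactly the kind of failure mode worth spelling out.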

It is important that students understand the provenance of the inferential
techniques they use so that they don't end up doing bogus science (which
hurts the world) through not knowing the failure modes of those techniques. Of
course, not all students of statistics know the requisite mathematics to
understand it all, but at the very least the failure modes should be put into
cookbook form.

For the sake of science, please don't ever do any inferential statistics
without knowing when the method you're using works and when it breaks, what it
is robust to, and what assumptions it makes. Statistics is really easy to
break when used naively. The mathematics of statistics is not easy, and often
results are highly counter-intuitive.

~~~
bblais
"There is no mention of the CLT, MLE, method of moments estimation, biasedness
of estimators, convergence in probability, how sampling distributions arise,
or any of the theory of distributions that underpin all of the inferential
procedures detailed in the book."

Lots of good criticisms in this thread, which I'll have to look at. This one,
however, is not. :) How many intro stats books, of the traditional kind,
mention MLE, method of moments, biased vs unbiased estimators, etc.? None
that I've seen. So, you're right, it becomes more "cookbooky" as a result,
however, I would argue that all Bayes analysis follows the same recipe,
whereas frequentist analysis typically follows many recipes - not obviously
connected. It is that part that I criticize, not the fact that there is a
recipe for doing things.
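
For what it's worth, that "same recipe" claim can be made concrete in a few lines. This is a hypothetical sketch (the discrete grid, the coin example, and the function name are mine, not the book's): prior times likelihood, normalized, is the entire procedure.

```python
def posterior(heads, tails, grid_size=101):
    """Posterior over theta = P(heads) on a discrete grid, with a flat prior."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size                            # 1. choose a prior
    like = [t**heads * (1 - t)**tails for t in thetas]   # 2. likelihood of the data
    unnorm = [p * l for p, l in zip(prior, like)]        # 3. multiply
    z = sum(unnorm)
    return thetas, [u / z for u in unnorm]               # 4. normalize

thetas, post = posterior(heads=7, tails=3)
best = max(zip(post, thetas))[1]   # posterior peak, at the observed frequency 0.7
```

Swapping in a different likelihood or prior changes steps 1 and 2 but never the procedure; a traditional frequentist course instead teaches a separate test for each situation.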

~~~
daniel-levin
>> how many intro stats books, of the traditional kind, mention MLE, method of
moments, biased vs unbiased estimators, etc.? None that I've seen

Oh - there are quite a few. Here's a small sample (no pun intended):

\- Probability and Statistical Inference by Hogg & Tanis (we used this in my
stats course)

\- Modern Mathematical Statistics with Applications by Devore & Berk

\- Probability and Statistics by DeGroot & Schervish

~~~
bblais
Ah, yes. I concede the point. What I find interesting in all this is that the
term "Introduction" is used in so many ways. When looking, for instance, for
an intro Bayes book you get things like Lee and Bolstad, which for some is
intro. However, if you tried to teach med students or business students from
that it would be a disaster.

Personally, I see MLE as just an approximation of MAP, which is superior.
Biased vs unbiased also doesn't play into probability theory as logic, except
as a consequence of which parameters maximize the posterior.
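
That relationship is easy to demonstrate: MAP maximizes prior(theta) * likelihood(theta), so with a flat prior it has the same argmax as the likelihood alone, i.e. the MLE. A hypothetical sketch (the coin example, grid, and names are mine):

```python
def argmax_on_grid(score, grid):
    """Return the grid value that maximizes score()."""
    return max(grid, key=score)

grid = [i / 100 for i in range(101)]
heads, tails = 7, 3

def likelihood(t):
    return t**heads * (1 - t)**tails

def flat_prior(t):
    return 1.0

def fair_coin_prior(t):
    # Beta(2,2)-shaped prior: a mild belief that the coin is near fair.
    return t * (1 - t)

mle = argmax_on_grid(likelihood, grid)
flat_map = argmax_on_grid(lambda t: flat_prior(t) * likelihood(t), grid)
informed_map = argmax_on_grid(lambda t: fair_coin_prior(t) * likelihood(t), grid)
```

With the flat prior, MAP and MLE coincide at the observed frequency 0.7; the informative prior pulls the MAP estimate toward 0.5, which is where the two part ways.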

------
grayclhn
I haven't looked at it carefully, but it's hard to think of a setting where
I'd want to teach from this book: it's aimed at stats 101 students, but uses
Python as the programming language (a great language, but far beyond what I'd
expect a typical intro stats student to be able to handle); it advocates
Bayesian statistics, which is a reasonable decision, but seems to take it to
such an extreme that "hypothesis test" never appears in the table of
contents...

But, it's obviously a labor of love and it's an interesting take on intro to
stats. And, from skimming it, I don't see anything in it that's wrong. So this
might be a good intro to Bayesian stats for most HN readers.

edit: there is a _wide_ range of quality for the graphs, though. Some look
great, but some (the histograms especially) are... unappealing. And the
formatting for the code sections is quite at odds with the style of the rest
of the book. Those are minor, though.

second edit: not to start a license flamewar, but can this book be
redistributed? It's licensed under either CC or GNU FDL, but I don't see a way
to get the source code. So anyone hosting a copy would also need to license it
under the FDL (since they can't remove the FDL licensing from the pdf), which
they would then be violating. Am I understanding this correctly?

~~~
ced
_it advocates bayesian statistics, which is a reasonable decision, but seems
to take it to such an extreme that "hypothesis test" never appears in the
table of contents..._

That's not very unusual. It seems to follow the "logic of science" approach
from Jaynes. Hypothesis testing is covered in chapters 4 and 6. Other books
(MacKay, Jaynes, Murphy) only cover frequentist hypothesis testing to argue
against it, so this is rather refreshing.

~~~
grayclhn
It's very unusual to not cover hypothesis testing in an introduction to
statistics class. The students are going to see "testing" again. The passage
you quoted was about teaching from the book, not using it for self study.

------
Perceval
Hopefully they can fix the embarrassing typo: "Monte Hall problem" should be
"Monty." Not sure how that could have escaped notice. Maybe they were thinking
about Monte Carlo simulations when writing that bit, but someone should have
caught this.

~~~
tjradcliffe
Proofreading is one of the great unsolved technological problems. Human
attention is a fantastically limited resource, and even multiple layers of
checking frequently let what subsequently appear to be "obvious" errors slip
through.

The recent "cite crappy Whoever paper here" goof in a peer-reviewed journal is
a typical example, and is notable only in that it is so egregious that it was
caught and publicized. It is essentially certain that a large fraction of
published papers contain at least one significant typo. I know of one case
where two figures in a paper were identical (figure 2 was duplicated in figure
3) and it was missed by the co-authors (one of whom was fanatically careful),
the journal editors, and the referees.

We are never directly aware of our own inattentiveness, by definition, so the
reality of how inattentive we are comes as a constant surprise.

To twist this vaguely back on topic: as well as being attentionally blind, we
are also probability blind. I liken this to colour-blindness: we simply do not
see probability distributions and have a terrible time thinking about them,
yet we are completely immersed in them every day.

Between these two things--attentional blindness and probability blindness--we
frequently end up interacting with the universe in ways that make little or no
sense, as we behave as if we a) notice everything and b) live in a world of
certain outcomes. The modern revolution of treating probability theory as
logic is a huge deal, and people who adopt it are likely to have a
considerable advantage in the years ahead. For one thing, it makes dealing with
our attentional blindness easier, because it helps us understand, and represent
in our reasoning, our imperfect attentional capabilities.

~~~
grayclhn
A clear link to the LaTeX source file on GitHub would solve a large part of
the proofreading problem. Especially since the book is nominally licensed
under the GNU FDL.

