
The Probability and Statistics Cookbook - kercker
http://statistics.zone/
======
nonbel
In section 13, I see they are still teaching the Fisher/Neyman-Pearson hybrid
(i.e., the null ritual). For a brief overview see [1]. To start you off: Fisher
said the idea of power was nonsense[2], and Neyman and Pearson said a hypothesis
is either rejected or not (there is no gradient of evidence for/against).[3]

[1] Gigerenzer, G (November 2004). "Mindless statistics". The Journal of
Socio-Economics. 33 (5): 587–606. doi:10.1016/j.socec.2004.09.033

[2] 'The phrase "errors of the second kind", although apparently only a
harmless piece of technical jargon, is useful as indicating the type of
mental confusion in which it was coined.' -Ronald Fisher. "Statistical Methods
and Scientific Induction." Journal of the Royal Statistical Society. Series B
(Methodological) Vol. 17, No. 1 (1955), pp. 69-78
[https://www.jstor.org/stable/2983785](https://www.jstor.org/stable/2983785)

[3] 'no test based upon the theory of probability can by itself provide any
valuable evidence of the truth or falsehood of that hypothesis.' -Neyman, J;
Pearson, E. S. (January 1, 1933). "On the Problem of the most Efficient Tests
of Statistical Hypotheses". Phil. Trans. R. Soc. Lond. A. 231 (694–706):
289–337. doi:10.1098/rsta.1933.0009.
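
To make the distinction concrete, here is a minimal sketch of the two views for a one-sided z-test with known sigma. This is my own illustration, not something taken from the cookbook or the cited papers; the function names and the example numbers are assumptions. The Fisher-style function reports a p-value as a graded quantity, while the Neyman-Pearson-style function fixes alpha in advance, returns only reject/not reject, and has a power against a stated alternative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_statistic(sample_mean, mu0, sigma, n):
    """z statistic for H0: mean == mu0, with known sigma and sample size n."""
    return (sample_mean - mu0) / (sigma / n ** 0.5)


def fisher_p_value(z):
    """Fisher-style output: one-sided p-value P(Z >= z) under the null."""
    return 1.0 - STD_NORMAL.cdf(z)


def neyman_pearson_decision(z, alpha=0.05):
    """Neyman-Pearson-style output: a binary decision at a pre-chosen alpha."""
    critical = STD_NORMAL.inv_cdf(1.0 - alpha)
    return z >= critical


def power(mu0, mu1, sigma, n, alpha=0.05):
    """Probability of rejecting H0 when the true mean is mu1 (1 - beta)."""
    critical = STD_NORMAL.inv_cdf(1.0 - alpha)
    shift = (mu1 - mu0) / (sigma / n ** 0.5)
    return 1.0 - STD_NORMAL.cdf(critical - shift)


# Hypothetical numbers: H0 mean 10, observed mean 10.5, sigma 2, n 16.
z = z_statistic(10.5, 10, 2, 16)
print("p-value:", fisher_p_value(z))
print("reject at alpha=0.05:", neyman_pearson_decision(z))
print("power against mean 11:", power(10, 11, 2, 16))
```

The hybrid the comment objects to reports the p-value as evidence and also talks about alpha and power, which mixes the outputs of the first function with the framework of the last two.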

~~~
mavam
Indeed, when I took my introductory statistics courses at UC Berkeley (STAT
200A and STAT 200B), we discussed the pitfalls of blindly applying these
concepts, albeit at a very high level.

Initially, I wrote this cookbook/cheatsheet in order to structure and retain
the material in these courses, not to challenge them. Most of the content
comes from the cited references, all of which have a very terse and
mathematical presentation. It would be great to augment the current document
with pointers to literature that offers a critical discussion. As a
non-statistician, I lack the historical perspective, but I always appreciate
contributions from experts in the field. (The document is open-source:
[https://github.com/mavam/stat-cookbook](https://github.com/mavam/stat-cookbook))

~~~
nonbel
I don't have time to contribute, but the Wikipedia page on NHST[1] used to be
pretty good about refs. A lot of the material pointing at the controversy and
history has been slowly removed over the last few years, though it still isn't too bad.
Anyone interested can also try looking through old versions. There have been
thousands of papers published on that topic.

[1]
[https://en.wikipedia.org/wiki/Statistical_hypothesis_testing...](https://en.wikipedia.org/wiki/Statistical_hypothesis_testing#Origins_and_early_controversy)

------
mooneater
Cool, but I would call this a cheatsheet. Almost the inverse of a cookbook,
which I think of as a set of "how tos" for tasks you want to accomplish.

~~~
donw
Yeah. Don't suppose anybody has the equivalent of an actual "cookbook"? It
would be a great resource for helping to skill up product owners on A/B and
multivariant testing when they don't have a math-heavy background.

~~~
mch82
"The Seventh Edition of Introduction to Statistical Quality Control provides a
comprehensive treatment of the major aspects of using statistical methodology
for quality control and improvement."
[http://a.co/5xRV2UE](http://a.co/5xRV2UE)

Covers: Hypothesis testing (single population), Analysis of variance (multiple
populations), Control charts, Design of Experiments (A/B testing and beyond).

Edit: Used in the undergraduate industrial & systems engineering program at
USC when I was a student. The 7th edition has various cookbook-style
walkthroughs.

~~~
eitally
This was used at NC State in the industrial & systems engineering program,
too. Very good book.

------
varlock
Quite useful indeed! If I may, I'd love to see references for each concept.

I appreciate anyone can look "sample space" or "parametric inference" up on
Google, but it'd take some time to find some authoritative source (especially
for people like me who do not work with stats every day). It'd be awesome if I
could see a "[1]" and a reference (or list of references), either online or
offline, where the concept is defined.

~~~
mavam
I fully agree that tracing each concept to its originating source has great
value.

From a presentational point of view, I wonder if it would pollute the
plain/clean presentation of the formulas. Perhaps very small footnote-like
citations could work, but it has to be unobtrusive.

The hardest part, however, will be coming up with the authoritative source for
each concept. As it's outside my field of expertise, we would have to rely on
the community to fill in these details incrementally.

------
DevonSA
I'm struggling to understand the CDF for the discrete uniform distribution
(line 1 of the first page). I think the equation is missing a "+1" in the
denominator as otherwise it doesn't make sense. Although it has been a while
since I took a stats class, so I may be mistaken
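
For what it's worth, here is a quick sanity check, assuming the distribution is uniform on the integers a..b inclusive (the cookbook's own parametrization may differ, and I have not verified what the PDF actually prints). With that assumption there are b - a + 1 support points, so the usual CDF is (floor(x) - a + 1) / (b - a + 1) for a <= x <= b. The function name below is made up for illustration.

```python
from fractions import Fraction
from math import floor


def discrete_uniform_cdf(x, a, b):
    """CDF of the uniform distribution on the integers a, a+1, ..., b."""
    if x < a:
        return Fraction(0)
    if x >= b:
        return Fraction(1)
    # floor(x) - a + 1 support points are <= x, out of b - a + 1 in total.
    return Fraction(floor(x) - a + 1, b - a + 1)


# Brute-force check against summing the PMF for a fair six-sided die.
a, b = 1, 6
for k in range(a, b + 1):
    brute_force = sum(Fraction(1, b - a + 1) for _ in range(a, k + 1))
    assert discrete_uniform_cdf(k, a, b) == brute_force

print(discrete_uniform_cdf(3, 1, 6))
```

Without the "+1" in the denominator, the die example would give a value above 1 at x = 6, so under this parametrization the "+1" is needed in both numerator and denominator.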

------
geokon
Anyone have a recommendation for a good statistics textbook?

Everything I've tried has been absolutely horrible except for "An Introduction
to Error Analysis" by John Taylor (yeah the classical mechanics guy).
Unfortunately it's a bit basic...

~~~
pbowyer
Have you looked at "Statistics in a Nutshell"?
[http://shop.oreilly.com/product/0636920023074.do](http://shop.oreilly.com/product/0636920023074.do)

~~~
AimHere
I have this book, and it's not what I'd call a textbook.

It rarely _explains_ how any of it works (you'd be hard pressed to find the
formula for a probability distribution function, for instance), so it's just a
one-stop collection for a lot of useful tests and the circumstances under
which they should be used.

It's less of a textbook, and more of a reference for someone who needs to
occasionally work with statistics and can't remember offhand when the t-test
is appropriate or what procedure to follow for a chi-squared test or
whatever.

------
neves
Nice! Did anyone make a mobi file so I can read it on my Kindle? PDF s*cks!

~~~
politician
You can email your @kindle.com account with a PDF attachment, and it'll be
delivered to your device. Google the details.
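
A rough sketch of building such a message with Python's standard library, in case it helps. The helper name and all addresses are hypothetical, and the "convert" subject line is an assumption about how the service triggers conversion, so check Amazon's current instructions before relying on it. The code only builds the message; sending it (for example with smtplib and your own mail server) is left out.

```python
from email.message import EmailMessage


def build_kindle_email(sender, kindle_address, pdf_bytes, filename, convert=False):
    """Build an email with a PDF attachment addressed to a Kindle account."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = kindle_address
    # Assumption: a subject of "convert" asks the service to convert the PDF.
    msg["Subject"] = "convert" if convert else filename
    msg.set_content("Document attached.")
    msg.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=filename
    )
    return msg


# Placeholder bytes stand in for the real PDF file contents.
message = build_kindle_email(
    "me@example.com", "name@kindle.com", b"%PDF-1.4 placeholder",
    "cookbook.pdf", convert=True,
)
print(message["Subject"])
```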

~~~
neves
First, it will only convert it if the word "convert" is in the email subject.

Second, this is almost useless for multi-column PDFs. It only works well with
very simple PDFs. Anything with a minimum of complexity becomes junk.

Reading PDFs on the Kindle is a terrible experience. The text doesn't reflow.
You get lost in multiple columns. Even the next doesn't work fine.

Since it is open, with the original content published on GitHub, I
thought someone might have generated a mobi file.

