
How to Hang Yourself with Statistics - acangiano
http://math-blog.com/2012/09/02/how-to-hang-yourself-with-statistics/#.UEN3BG7Uva1.hackernews
======
capnrefsmmat
Unfortunately it looks like the author of this piece got caught in the p value
trap. He says:

 _Keep in mind that people flip coins and get five heads (or five tails) in a
row all the time. With a p value of only five percent, one in twenty published
papers reporting a p value of five percent will be wrong purely by chance._

That's only true if 50% of the hypotheses you test are true, and your
experiment is so good that you have no false negatives. In typical medical
trials, on the other hand, sample sizes are small enough that there's perhaps
a 50% chance of a false negative. (If you see a difference between groups with
small sample sizes, you can't tell whether it's due to the tested medication
or just chance, so you conclude there's no statistically significant
difference.)

Under realistic circumstances, the odds of a p < 0.05 result being true can be
as small as 45%.

I've written more on this problem here:
[http://www.refsmmat.com/statistics/#the-p-value-and-the-base-rate-fallacy](http://www.refsmmat.com/statistics/#the-p-value-and-the-base-rate-fallacy)
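
The base-rate arithmetic behind figures like that can be sketched in a few
lines. The prior, power, and alpha below are illustrative assumptions, not
numbers taken from the linked article:

```python
def positive_predictive_value(prior, power, alpha):
    """Chance that a statistically significant result reflects a real
    effect, given the fraction of tested hypotheses that are true
    (prior), the test's power, and the significance threshold (alpha)."""
    true_positives = prior * power          # real effects that reach p < alpha
    false_positives = (1 - prior) * alpha   # null effects that reach p < alpha anyway
    return true_positives / (true_positives + false_positives)

# Illustrative numbers: 10% of tested hypotheses are true, 50% power.
ppv = positive_predictive_value(prior=0.1, power=0.5, alpha=0.05)
print(round(ppv, 2))  # 0.53: only about half of "significant" findings are real
```

The takeaway is that the probability a published p < 0.05 finding is true
depends heavily on the base rate of true hypotheses and on power, neither of
which the p-value itself captures.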

~~~
tmoertel
I interpreted the author's claim to be that, assuming that all tested
hypotheses were false, 5% would by chance alone be publishable anyway. So one
must not take the fact that a hypothesis was published too strongly as
evidence of its truth. (Indeed, this observation has led to cautionary papers
like Ioannidis's "Why Most Published Research Findings Are False" [1].)

[1]
[http://www.plosmedicine.org/article/info:doi/10.1371/journal...](http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124)

~~~
sesqu
The issue I take with that (important) observation is that even though the
threshold is set at 5%, the actual results can be far more convincing. In an
analysis, you may see several components with p<10⁻⁶, a few with p=0.23,
several more with p>0.3, and one with p=0.06. They won't all have p=0.05 - in
fact, none of them will.

That said, it's worth being reminded occasionally that statistical
significance is in many ways just the first step on the road to knowledge, not
the last. But when it comes to false findings, things like false negatives and
exploratory analysis are far more impactful.

~~~
tmoertel
Yeah, it always worries me that so many people see a small p-value like 10^-6
and say, "Wow, that one's _definitely_ true."

But p=10^-6 doesn't mean, as commonly believed, that there's only a one-in-a-
million chance that the proposed hypothesis is really false, nor does it even
mean what many more-statistically-savvy people think it means, that if the
proposed hypothesis were false, there would only be a one-in-a-million chance
of observing test data as extreme as what was observed. No, what it really
means is that – and here's the part most people miss – _assuming that the
researchers' model of the underlying data-generating process is correct_,
_then_, if the proposed hypothesis were false, there would be only a one-
in-a-million chance of observing test data as extreme as what was observed.

Yes, as the p-value becomes smaller, it does indeed become easier to believe
that the hypothesis of interest is true, _assuming that the humans didn't
screw up the model._ But, in any complex work, I'm going to have a hard time
believing, sans replication, that there's not a reasonable chance of humans
screwing up.

To me, then, p=10^-6 is the new p=10^-2.

EDIT: Replaced Unicode superscripts (10⁻⁶) with circumflex notation (10^-6)
because the superscripts weren't showing up on my Nexus 7.
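
The point about model error putting a floor under how persuasive any p-value
can be works out as a crude sketch like this (the 1% model-error figure is the
commenter's guess, not a measured quantity):

```python
# If there's some chance the model itself is wrong, the overall chance of a
# misleading result can't drop below that, no matter how small p gets.
def error_floor(p_value, p_model_wrong):
    # The result is misleading if the model is wrong, or if the model is
    # right but the data were an extreme fluke anyway.
    return p_model_wrong + (1 - p_model_wrong) * p_value

print(error_floor(1e-6, 0.01))  # ~0.01: dominated by the model-error term
print(error_floor(1e-2, 0.01))  # ~0.02: here the p-value actually matters
```

On this reading, once p is much smaller than the chance of human error in the
modeling, making it smaller still buys essentially nothing.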

~~~
sesqu
Yes, the calculation of a p-value is always done by assuming a specific model,
though usually a less controversial one than the hypothesis being proposed.
But I wouldn't go so far as to demand smaller values. I would prefer we stop
thresholding so much altogether, and instead operate with the understanding
that, for all but the best-understood processes, what has or has not been
found is a suggestion with evidence behind it, rather than a fact.

It is a difficult task for many people, who have been taught facts for
decades, to accept that objective knowledge is hard to come by. But everyone
understands the value and properties of a crude model.

~~~
tmoertel
Sorry, I wasn't clear. I meant that when I see a p-value of 10^-6 in papers, I
expect that there's at least a 1% chance that the humans screwed up the models
somewhere, so I don't see it as more persuasive than 10^-2. That is, p-values
lose credibility once they start getting smaller than a few percent.

So I agree with you. If it were up to me, we'd all report evidence intensity
in decibels, anyway.
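
Reporting "evidence intensity in decibels" follows E. T. Jaynes's convention:
put the likelihood ratio on a log scale, 10·log10(ratio), so that independent
pieces of evidence add instead of multiply. A minimal sketch:

```python
import math

def evidence_db(likelihood_ratio):
    """Evidence in decibels, Jaynes-style: 10 * log10 of the likelihood
    ratio for the hypothesis versus its alternative."""
    return 10 * math.log10(likelihood_ratio)

print(evidence_db(2))    # ~3 dB: the data are twice as likely under H
print(evidence_db(100))  # 20 dB
# Independent evidence adds on this scale:
print(evidence_db(2) + evidence_db(100))  # same as evidence_db(200)
```

One appeal of this scale is that it has no privileged threshold like 0.05;
evidence just accumulates (or cancels) additively.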

------
tokenadult
After I read this article (which I see is submitted by the founder of this
interesting blog, now a blog with many guest articles), I went to the reading
list page on the blog site. There are many VERY GOOD books about mathematics
listed there,

<http://math-blog.com/mathematics-books/>

but the subtopic with the most disappointing recommendations was actually the
subtopic on statistics. Two of my favorite online articles on statistics
education

<http://statland.org/MAAFIXED.PDF>

and

<http://escholarship.org/uc/item/6hb3k0nz>

both point to better books on statistics and the key issues in the discipline.

~~~
tom_b
I would be interested in hearing specific recommendations for self-study of
statistics from you.

I currently have Feller (based on reading
www.ams.org/notices/200510/comm-fowler.pdf) but don't have a good "taste" for
what would be the best books for a self-study approach to statistics. There is
also the "Teaching Statistics: A Bag of Tricks" book, and I was considering
dropping it into the mix.

I am probably over-thinking the book choice and would be fine just diving in
to anything, but I would prefer not to pick up references that are going to
drive me off a good path to start . . .

------
ordinary
Meta: what's up with that URL?

~~~
cowsaysoink
The submitter is probably using it to track Hacker News visitors (especially
those using HTTPS).

The submitter is the owner of the blog.

------
de1978st
Good post, thanks.

