

Most published research results are false - henning
http://www.johndcook.com/blog/2008/02/07/most-published-research-results-are-false/

======
rubidium
Note well. This is for clinical research.

This says nothing about C.S., Math, Physics, etc...

It's harder to publish a big result in Physics, for example, and have it stand
for long without being questioned. Just ask Jan Hendrik Schön.

~~~
Lewisham
From what I've seen, I wouldn't say CS is immune to this at all. CS seems to
accept only papers that illustrate some big win.

In my short PhD career, I already have an article in the drawer where the
result was inconclusive. That result was, in and of itself, fairly valuable:
it said a lot about all the ways things can go wrong. However, wrong was not
acceptable for publication; it had to be right. Much hand-wringing went into
identifying the bits that could be pulled out, polished, and submitted (I was
not especially pleased about this). The paper was, rightly, rejected, but all
the valuable findings are still in my drawer. The pressure to produce Good
findings rather than Bad findings does not match my idea of what scientific
pursuit should be.

I took great joy in visiting a history of science exhibition recently and
reading the notebooks of people like Newton and Darwin, who themselves spent
a lot of time writing out all the things that went wrong and all the concerns
they had about their "results", with a general humility. It speaks volumes
that the greatest scientific minds that have ever lived put more caveats and
concerns into their published work than 98% of what is published today does.
To get published now, it seems you have to be 110% sure that your work is
120% amazing. I don't think this is a good state of affairs.

------
kenjackson
Given that this proclamation was itself a published research study, does that
mean it is also probably wrong? Does the barber also publish papers?

------
arctangent
I always found statistics deeply disturbing when I was studying it at school.

You can establish a null hypothesis and then test at some confidence level
(say, 99%) and find that your null hypothesis still holds.

But, being somewhat ingenious, you may decide to lower the confidence level
(to, say, 98%) and find that your experimental evidence is now significant
enough for you to reject your null hypothesis and accept your alternative
hypothesis.

Lies, damned lies, and statistics, indeed.

~~~
brent
This is why different communities settle on significance thresholds (typically
p < 0.05). You have to live with some amount of uncertainty when the
data-generating mechanism is random... that is, unfortunately, the nature of
the beast.

~~~
arctangent
I understand this. But let's say I ran some experiments and collected some
data in an attempt to disprove theory X. What does it mean for me to say that
at the 99% confidence level X is still true, but at the 98% confidence level
it is not true? I just find it a bit spooky, is all.
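To make that concrete, here is a minimal Python sketch. The p-value of 0.015
is made up for illustration; in practice it would come out of a test such as
scipy.stats.ttest_1samp. The evidence never changes, only the cut-off you
compare it to does:

```python
# Hypothetical p-value standing in for "my experimental evidence";
# in a real analysis it would come from a significance test.
p_value = 0.015

# The same number, judged against two different cut-offs.
for alpha in (0.01, 0.02):  # i.e. 99% and 98% confidence levels
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

Nothing spooky happens to the data; the only thing that moves is how much
evidence you demand before you are willing to reject the null hypothesis.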

------
jakestein
Link to the PLoS article that this post summarizes:
[http://www.plosmedicine.org/article/info:doi/10.1371/journal...](http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124)

------
j_baker
Just out of curiosity, what is the alternative to using p-values? What should
researchers really be doing?

~~~
Read_the_Genes
Most studies report raw p-values. This, coupled with "exploratory analyses"
that are not mentioned in the write-up, leads to a high false discovery rate.

Alternatives include using simulations to estimate false discovery rates (this
can be done analytically for some problems). Bayesian frameworks can also be
applied, depending on the amount and accuracy of prior knowledge of the system
under study.
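As a rough illustration (not from the parent comment, just a sketch) of what
such a simulation looks like, here is Python that runs 50 t-tests on pure
noise and counts how many come out "significant" at a nominal 0.05:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_tests, n_samples, alpha = 50, 30, 0.05

false_positives = 0
for _ in range(n_tests):
    # Data generated under the null hypothesis: the true mean really is 0.
    sample = rng.normal(loc=0.0, scale=1.0, size=n_samples)
    _, p = stats.ttest_1samp(sample, popmean=0.0)
    if p < alpha:
        false_positives += 1

# On average about alpha * n_tests = 2.5 tests will look "significant"
# even though there is nothing to find.
print(f"{false_positives} of {n_tests} null tests had p < {alpha}")
```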

That said, I see nothing wrong with publishing results with nominal p-values,
as long as the researchers indicate the weakness in their results. Meta-
analyses can always come back later and use the results they publish.

What annoys me is seeing 50 or so statistical tests done, and then researchers
stating they have found something when 1 of those tests shows a p-value of
0.05. Just using a Bonferroni correction, the simplest of all corrections,
would demonstrate that findings like this are not significant.
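For that 50-test scenario, the arithmetic of a Bonferroni correction is
simple enough to sketch in a few lines of Python (the numbers are the ones
from the comment above):

```python
n_tests = 50         # number of statistical tests performed
family_alpha = 0.05  # desired family-wise error rate
best_p = 0.05        # the single test that "showed something"

# Bonferroni: each individual test must clear alpha / n_tests.
per_test_alpha = family_alpha / n_tests  # 0.001
print(f"Corrected per-test threshold: {per_test_alpha}")
print(f"p = {best_p} significant after correction? {best_p < per_test_alpha}")  # False
```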

------
s3graham
I enjoyed this related article a few months ago on the problem:
[http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_...](http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer)

