by two experienced science journalists picks up many--but not all--of the cases of peer-reviewed research papers being retracted later from science journals.
Psychology as a discipline has been especially stung by papers that cannot be reproduced and indeed in many cases have simply been made up.
That has prompted statistically astute psychologists such as Jelte Wicherts
and Uri Simonsohn
to call for better general research standards that can be practiced as checklists by researchers and journal editors so that errors are prevented.
Jelte Wicherts, writing in Frontiers in Computational Neuroscience (an open-access journal), provides a set of general suggestions
Jelte M. Wicherts, Rogier A. Kievit, Marjan Bakker and Denny Borsboom. Letting the daylight in: reviewing the reviewers and other ways to maximize transparency in science. Front. Comput. Neurosci., 03 April 2012 doi: 10.3389/fncom.2012.00020
on how to make the peer-review process in scientific publishing more reliable. Wicherts does a lot of research on this issue to try to reduce the number of dubious publications in his main discipline, the psychology of human intelligence.
"With the emergence of online publishing, opportunities to maximize transparency of scientific research have grown considerably. However, these possibilities are still only marginally used. We argue for the implementation of (1) peer-reviewed peer review, (2) transparent editorial hierarchies, and (3) online data publication. First, peer-reviewed peer review entails a community-wide review system in which reviews are published online and rated by peers. This ensures accountability of reviewers, thereby increasing academic quality of reviews. Second, reviewers who write many highly regarded reviews may move to higher editorial positions. Third, online publication of data ensures the possibility of independent verification of inferential claims in published papers. This counters statistical errors and overly positive reporting of statistical results. We illustrate the benefits of these strategies by discussing an example in which the classical publication system has gone awry, namely controversial IQ research. We argue that this case would have likely been avoided using more transparent publication practices. We argue that the proposed system leads to better reviews, meritocratic editorial hierarchies, and a higher degree of replicability of statistical analyses."
Uri Simonsohn provides an abstract (which links to a full, free download of a funny, thought-provoking paper)
with a "twenty-one word solution" to some of the practices most likely to make psychology research papers unreliable. He has a whole site devoted to avoiding "p-hacking,"
an all-too-common practice in science that can be detected by statistical tests. He also has a paper posted just a few days ago
on evaluating replication results (the issue discussed in the commentary submitted to open this thread) with more specific tips on that issue.
"When does a replication attempt fail? The most common standard is: when it obtains p>.05. I begin here by evaluating this standard in the context of three published replication attempts, involving investigations of the embodiment of morality, the endowment effect, and weather effects on life satisfaction, concluding the standard has unacceptable problems. I then describe similarly unacceptable problems associated with standards that rely on effect-size comparisons between original and replication results. Finally, I propose a new standard: Replication attempts fail when their results indicate that the effect, if it exists at all, is too small to have been detected by the original study. This new standard (1) circumvents the problems associated with existing standards, (2) arrives at intuitively compelling interpretations of existing replication results, and (3) suggests a simple sample size requirement for replication attempts: 2.5 times the original sample."
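To make the sample-size rule in that abstract concrete, here is a rough sketch in Python. The function names are hypothetical, not from the paper. The second function rests on assumptions the quoted abstract does not state: that "too small to have been detected" means an effect the original study had only about 33% power to detect, that the design is a simple two-group comparison, and that a normal approximation is good enough. Treat it as an illustration, not as Simonsohn's actual procedure.

```python
import math
from statistics import NormalDist


def replication_sample_size(original_n: int, multiplier: float = 2.5) -> int:
    """Sample size suggested by the '2.5 times the original sample' rule, rounded up."""
    return math.ceil(multiplier * original_n)


def smallest_detectable_effect(n_per_group: int,
                               power: float = 0.33,  # assumed threshold, not in the abstract
                               alpha: float = 0.05) -> float:
    """Approximate Cohen's d that a two-group study with n_per_group
    participants per group would detect with the given power.

    Uses a two-sided normal approximation, so it slightly understates
    the value an exact t-based calculation would give for small samples.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the significance test
    z_power = z.inv_cdf(power)           # negative when power < 0.5
    return (z_alpha + z_power) * math.sqrt(2 / n_per_group)


if __name__ == "__main__":
    original_n = 20  # hypothetical original study: 20 participants per group
    print("replication n per group:", replication_sample_size(original_n))
    print("effect the original could barely detect (d):",
          round(smallest_detectable_effect(original_n), 2))
```

Under those assumptions, a replication would count as a failure when its estimate is significantly smaller than the effect size returned by the second function.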
The writers of scientific papers have a responsibility to do better. And the readers of scientific papers that haven't been replicated (or, worse, press releases about findings that haven't even been published yet) also have a responsibility not to be too credulous. That's why my all-time favorite link to share in comments on HN is the essay "Warning Signs in Experimental Design and Interpretation" by Peter Norvig, LISP hacker and director of research at Google, on how to interpret scientific research.
Check each submission to Hacker News you read for how many of the important issues in interpreting research are NOT discussed in the submission.
On a side note, has anyone found a good alternative to Mendeley? I heard http://bohr.launchrock.com/ is working on something cool.