
We found only one-third of published psychology research is reliable – now what? - tempestn
https://theconversation.com/we-found-only-one-third-of-published-psychology-research-is-reliable-now-what-46596
======
vox_mollis
Sneaking suspicion that this will readily extend to other soft sciences,
particularly sociology.

A good chunk of our social and educational policy is informed by psychology
and sociology.

Since we have all grown up within various systems and institutions that have
been largely informed by bad research, it's unlikely much will happen: we see
the claims made by these charlatans as normal or common sense.

Nonetheless, it will be amusing watching the charlatans try to defend
themselves in the coming years.

~~~
unprepare
This seems to me to be a clear-cut case of poor incentives in the academic
research sector.

No graduate student is incentivized to re-perform existing experiments to
verify them; instead, they find much more success in creating a novel
experiment that tests a previously untested hypothesis.

This is modern academic science. If you aren't doing something newsworthy,
you're not getting any funding.

~~~
EdiX
>No graduate student is incentivized to re-perform existing experiments to
verify them

If you look at how this whole thing with psychology started ("replication
bullies") or at what Broockman said about the LaCour scandal earlier this year
(he didn't pursue publication initially because he thought he'd risk his
career), it's not that they aren't incentivized, it's that they are actively
discouraged.

------
analog31
One small suggestion: Eliminate "publish and outta here," the policy where
students have to publish their MS and PhD work in order to receive those
degrees. All that does is fill the journals with crap.

~~~
omginternets
I agree in principle, but oftentimes the research done by grad students is
part of a project spearheaded by a career researcher. I question the notion
that MS and PhD work is filling journals with crap.

Instead, I think there are three problems:

1\. The inherent problem of inference. When you set p < .05, then 5% of
studies of true-null effects will yield false positives (see the sketch at
the end of this comment). This can be mitigated by a variety of statistical
approaches, but it's an irreducible problem at a fundamental level.

2\. Certain (sub)disciplines of science aren't very scientific. In particular,
they suffer from a high incidence of unfalsifiable claims and hand-wavy
definitions. A good example, I think, is the shame vs guilt literature.

3\. Certain (sub)disciplines of science are highly politicized. I point my
finger at anything involving "identity" (in the sense of "identity politics"),
"diversity" (sex, age, ethnicity, sexual orientation, etc) or "bias".

Regarding point 3, take a look at the faculty and students of any university
you'd like and take stock of how many Black people are working on racism,
women on sexism, LGBT people on sexual orientation, etc. As a cognitive
neuroscientist, I don't want to claim that people can't study themselves, but
in light of the serious reproducibility issues that social psychology is
facing, I think this correlation is anything but trivial.

I strongly suspect that the "hard psych vs social psych" dichotomy can be
further broken down, and that the sub-disciplines listed in point 3 account
for a disproportionate share of false-positive results, on account of flawed
methodology. I suspect that much of this research is actually an attempt to
validate political opinion.

I may be wrong about all of this, but at least the core claim is testable
with a handful of linear regressions.
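
To make point 1 concrete, here's a minimal sketch (Python with numpy and
scipy, my choice of tools) of what that irreducible false-positive rate looks
like when there is no effect at all:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_studies, n_subjects = 10_000, 30

    false_positives = 0
    for _ in range(n_studies):
        # Two groups drawn from the SAME distribution: no real effect.
        a = rng.normal(0, 1, n_subjects)
        b = rng.normal(0, 1, n_subjects)
        if stats.ttest_ind(a, b).pvalue < 0.05:
            false_positives += 1

    print(false_positives / n_studies)  # ~0.05 by construction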

~~~
analog31
All good stuff. Thanks. I'm just saying, don't make publication a
_requirement_ for graduation.

I'm not a statistician, but as I understand it, a problem with the p-value is
that the standard calculation hinges on assumptions that are not necessarily
met by the data, and if those assumptions were correctly accounted for, p
would be higher.

If p < 0.05, and in practice 50% of published results are bunk, then
regardless of the underlying math, a practical rule of thumb seems to be:
Multiply p by 10.
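
Here's a rough sketch of the assumption problem (Python with numpy/scipy; the
unequal-variance setup is just one hypothetical way an assumption can fail).
A pooled-variance t-test is run on data that violate its equal-variance
assumption; there is no real effect, yet far more than 5% of tests come out
significant:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_studies, hits = 10_000, 0
    for _ in range(n_studies):
        a = rng.normal(0, 3, 10)   # small group, large variance
        b = rng.normal(0, 1, 40)   # large group, small variance
        # equal_var=True assumes equal variances, which these data violate
        if stats.ttest_ind(a, b, equal_var=True).pvalue < 0.05:
            hits += 1

    print(hits / n_studies)  # well above the nominal 0.05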

~~~
omginternets
The practical rule of thumb for a p-value is that it's nothing more than the
expected rate of false positives when there is no real effect: p=.05 means
that about 5% of studies of null effects will come up positive.

If more than 5% of your studies are failing to replicate, then something else
is involved, but it's not necessarily fraud. Post-hoc hypotheses are usually
the biggest culprit, and the practice is insidious enough that researchers
often don't realize they're doing it.
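
A quick sketch of how post-hoc hypothesis hunting inflates that rate (Python
with numpy/scipy; the 20-outcome design is hypothetical). Each "study"
measures 20 outcomes with no real effects and reports whichever one happened
to be significant:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n_studies, n_outcomes, n = 5_000, 20, 30

    found_something = 0
    for _ in range(n_studies):
        pvals = [stats.ttest_ind(rng.normal(0, 1, n),
                                 rng.normal(0, 1, n)).pvalue
                 for _ in range(n_outcomes)]
        if min(pvals) < 0.05:   # "discover" the best-looking outcome
            found_something += 1

    print(found_something / n_studies)  # ~0.64, not 0.05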

------
geomark
Let's be clear on the difference between _reproducible_ and _replication_.
Reproducible means an experiment's data and methods are published and other
researchers can rerun them and get the same statistical results. Replication
is a _new_ experiment that applies the same methods to a new set of subjects
and arrives at statistically equivalent results.

The distinction is important. Reproducible research is a lot easier but there
are still famous examples of attempts to reproduce published results that
revealed serious flaws. Cases where the data are kept secret to thwart
reproducibility are thoroughly suspect. Replication is a harder problem, both
in being able to perform a potentially costly and difficult experiment, and in
the statistical analysis to show equivalent results.

~~~
nonbel
>"Let's be clear on the difference between reproducible and replication."

I did not find this clear. What is the distinction you are trying to make?

~~~
geomark
Reproduce: take the published data and methods, rerun the analysis, see if you
get the same results. No new experiment.

Replicate: perform a new experiment with new subjects, apply the same methods,
see if you get statistically equivalent results.

~~~
nonbel
Thanks, I had not heard the terms distinguished that way before. I suppose it
is a good distinction to have, but I would not assume people are using the
terms that way.

~~~
geomark
Some people are making a lot of noise about making the distinction, like the
guys teaching data science at Johns Hopkins. It's because both are important
but they are distinctly different activities.

------
DennisP
A friend of mine is a social psychology researcher who has published work
destroying the statistics in other papers. He says this claim of
unreliability is a lot more dubious than the media portrays.

In many cases, they say a finding was not reproduced simply because the new
study didn't quite meet the cutoff for "significance." However, if you do a
meta-analysis of both studies, you find that the likelihood of the effect
being real is higher, rather than lower.
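
As a sketch, here's what combining an original study with a just-missed
replication looks like under Fisher's method (Python with scipy; the .04 and
.06 p-values are hypothetical):

    from scipy import stats

    # hypothetical p-values: original study, then a "failed" replication
    stat, p = stats.combine_pvalues([0.04, 0.06], method='fisher')
    print(p)  # ~0.017: the combined evidence is stronger, not weaker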

In other cases, the replications were done poorly. The first step of many
studies is calibration. You do tests to find out what sort of things your
subjects are familiar with, and calibrate questions accordingly. A common
mistake in the replications was to skip the calibration step, and simply reuse
the questions on subjects with different backgrounds.

~~~
Asbostos
So we need the original authors to come back with their own re-replication.
Instead of throwing their published paper over the wall and moving on, they
should defend it: they are the people in the best position to address
problems with the replications.

This publish-and-forget behavior annoys me about researchers. Sometimes I
find results that look like they'll be useful for my (non-academic) work, but
the paper containing them is the final say on the matter. There's no version
2, there's no other work improving on it. It's just a dead end.

------
omginternets
I actually have an answer for this:

Stop talking about "psychology" like it's a monolithic thing. Start by
separating "hard" psychology (perception, attention, psychophysics, etc) from
social psychology.

Papers about saccadic adaptation are, I suspect, much more reproducible than
those about implicit association.

~~~
Lawtonfogle
One thing I noticed is that the more focused the subset of psychology, the
better the results. For example, psychology that looks at a group of neurons
and how they function is vastly better than psychology looking at how some
early life event X leads to later problems in life.

~~~
drumdance
Isn't that really neuroscience?

~~~
omginternets
When neuroscience involves predicting behavior and/or mental state, it usually
falls under the umbrella of "cognitive neuroscience", which in turn is a way
of saying "psychology that's not soft as all hell".

The boundaries are fuzzy, and you'll find people with psych, medical, neuro,
math, CS and philosophy degrees in experimental cogsci labs.

------
nonbel
Now what? You need to attempt independent replications of every published
claim going forward. It is clear a single published result is unreliable. This
is well known but apparently needed to be rediscovered by those who
misunderstand the meaning of a p-value.

>"The first is a p-value, which estimates the probability that the result was
arrived at purely by chance and is a false positive. (Technically, the p-value
is the chance that the result, or a stronger result, would have occurred even
when there was no real effect.) Generally, if a statistical test shows that
the p-value is lower than 5%, the study’s results are considered “significant”
– most likely due to actual effects."

No, there is endless literature on this. Just to start:

[http://andrewgelman.com/2015/07/21/a-bad-definition-of-
stati...](http://andrewgelman.com/2015/07/21/a-bad-definition-of-statistical-
significance-from-the-u-s-department-of-health-and-human-services-effective-
health-care-program/)

~~~
geomark
More to the point is how "replication" is defined. This quote covers it: "Note
that the 36% figure comes from a definition of replication that mimics the
definition used by regulatory agencies: results are considered replicated if a
p-value < 0.05 was reached in both the original study and the replicated one."

If you read the post where that quote comes from [1], they make a number of
points about how a better and more rigorous definition of replication is
needed, because the p-value alone doesn't tell the whole story.

[1] [http://simplystatistics.org/2015/10/20/we-need-a-
statistical...](http://simplystatistics.org/2015/10/20/we-need-a-
statistically-rigorous-and-scientifically-meaningful-definition-of-
replication/)
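
A rough sketch of why the p < .05-in-both criterion is weak (Python with
numpy/scipy; the effect size, sample size, and resulting ~80% power are
hypothetical). The effect is real in every pair of studies, yet a sizable
share of replications still "fail":

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    n_pairs, n, effect = 10_000, 30, 0.74   # d of ~.74: roughly 80% power

    both = 0
    for _ in range(n_pairs):
        sig = 0
        for _study in range(2):
            a = rng.normal(effect, 1, n)   # the effect exists in every study
            b = rng.normal(0, 1, n)
            sig += stats.ttest_ind(a, b).pvalue < 0.05
        both += (sig == 2)

    print(both / n_pairs)  # roughly two-thirds; the rest "fail to replicate"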

~~~
DennisP
So say one study has a p-value of .04, meaning there's a .04 probability that
there's no effect and the results occurred by chance. A replication comes
along and gets a p-value of .06, so it gets counted as a failed replication.
And yet, the probability that both results happened by chance is only .0024.

~~~
nonbel
>"there's a .04 probability that there's no effect and the results occurred by
chance."

No, the p-value is the probability of observing a result at least as extreme
as your own given that the null hypothesis is true. There are two errors
here:

1) Transposing the conditional: P(A|B) != P(B|A) (see the sketch below)
[http://rationalwiki.org/wiki/Confusion_of_the_inverse](http://rationalwiki.org/wiki/Confusion_of_the_inverse)

2) Deviations from the null hypothesis can occur even in the absence of a
treatment effect, i.e. one of your model assumptions is wrong, there are
baseline differences, etc.
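
To make error 1 concrete, here's a minimal sketch (the 10% base rate and 50%
power are hypothetical assumptions):

    # P(significant | null) is fixed at 5%, but P(null | significant) is not.
    prior_true = 0.10   # assumed fraction of tested hypotheses that are real
    power = 0.50        # assumed probability of detecting a real effect
    alpha = 0.05

    p_sig = prior_true * power + (1 - prior_true) * alpha   # P(significant)
    p_null_given_sig = (1 - prior_true) * alpha / p_sig
    print(p_null_given_sig)  # ~0.47, despite P(significant | null) = 0.05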

Anyway, that is why researchers need to estimate the size of the effect
instead of chasing statistical significance. If you estimate the effect is in
the range -1 to +2, then you publish that range. Others also estimate the
range and see if the estimates are consistent with each other.
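
A minimal sketch of what estimating the range looks like (Python with
numpy/scipy; the group data are simulated purely for illustration):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    a = rng.normal(0.5, 1, 30)   # simulated treatment group
    b = rng.normal(0.0, 1, 30)   # simulated control group

    # 95% confidence interval for the difference in means
    diff = a.mean() - b.mean()
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    t_crit = stats.t.ppf(0.975, df=len(a) + len(b) - 2)
    print(diff - t_crit * se, diff + t_crit * se)  # a range, not a verdict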

~~~
DennisP
> probability of observing a result at least as extreme as your own given the
> null hypothesis is true.

My wording was poor but that's what I meant.

~~~
nonbel
They are two completely different things. Check out the link. Would you
accidentally say "It's cloudy therefore there is a high probability of rain"
when you meant "It's raining so there is a high probability of clouds"?

------
douche
Reach for an extra-large grain of salt whenever the nightly news trumpets some
new breakthrough in the psychological field.

Continue avoiding psychologists whenever possible.

------
medymed
Redistribute most of the $5 billion of psychology- and behavior-related NIH
research funding to historically high-ROI areas like infectious disease
research.

[http://report.nih.gov/categorical_spending.aspx](http://report.nih.gov/categorical_spending.aspx)

------
gregn
From 10 years of working as a sysadmin in a Psych department, one constant in
life I've learned is that Psychologists are the most incompetent, fanciful
egoists in all the sciences. They have no idea what is going on basically. Or,
put as a friend of mine said, "People who go into Psychology are precisely
those that do not intuitively understand it already." Implying that most of us
have a decent, commonsense appraisal of how things work, but psych people are
the most clueless of the clueless, that's why they go into a field (first
falling for its false promises) that professes to explain life's mysteries to
them. None of this is surprising, except perhaps that they scored so far away
from the median.

------
MollyR
This is quite frightening considering how people try to set government
policies based on such research.

------
themetrician
The reliability of psychology throughout the 20th century is covered in depth
in the book The Culture of Critique, where the exact people responsible for
fostering an anti-science environment in psychology are discussed.

------
matthferguson
Now what? Accurately label psychology and the social sciences as philosophy,
not science.

------
rjruxhchd
Far more worrying is that only a third of economics findings can be
replicated without the authors' help, and yet these are the supposedly
scientific findings our politicians insist on basing policy on. Even when a
policy seems counterintuitive and clearly harmful to the general public,
we're given the explanation "because economics says so." It's clear now from
the Federal Reserve replicability study that we're being duped.

[http://www.federalreserve.gov/econresdata/feds/2015/files/20...](http://www.federalreserve.gov/econresdata/feds/2015/files/2015083pap.pdf)

~~~
chongli
Economists are the modern equivalent to the oracles at Delphi. They speak in
gibberish and the priesthood (politicians) make the interpretations which
coincidentally happen to favour themselves and their friends.

~~~
sopooneo
I feel there's truth to that, but mostly at the edges. At the core there is
stuff economists tend to very widely agree on. There was a Planet Money
episode outlining some pretty drastic policy changes that a panel of
economists from across the political spectrum all stood behind. Here's a
write-up:
[http://www.npr.org/sections/money/2012/07/19/157047211/six-p...](http://www.npr.org/sections/money/2012/07/19/157047211/six-
policies-economists-love-and-politicians-hate)

------
subliminalzen
It's not just psychological research that is deeply flawed. Only a quarter of
scientific drug research is successfully reproduced as well.[1]

Carl Jung claimed one of the chief factors responsible for mass brainwashing
is scientific rationality.[2] Society worships the Goddess of Reason while
frowning on "irrational" and non-verifiable religious testimony.

Now that science is proven to be systemically corrupt, what will "rational"
people base their understanding on?

Aside: I designed a personality test / psychoanalytical tool that was inspired
by Carl Jung. It's called Critical Stimulus and it can be found at
[https://www.thegamecrafter.com/games/critical-
stimulus](https://www.thegamecrafter.com/games/critical-stimulus). The
printable version can be downloaded at gumroad:
[https://gumroad.com/l/criticalstimulus](https://gumroad.com/l/criticalstimulus)

[1]
[http://www.economist.com/news/briefing/21588057-scientists-t...](http://www.economist.com/news/briefing/21588057-scientists-
think-science-self-correcting-alarming-degree-it-not-trouble)

[2]
[https://en.wikipedia.org/wiki/Self_in_Jungian_psychology](https://en.wikipedia.org/wiki/Self_in_Jungian_psychology)

~~~
omginternets
>Now that science is proven to be systemically corrupt, what will "rational"
people base their understanding on?

This is not a new notion in the least. The "rational" people will continue
doing what they've always done: revising their conclusions.

Science is a process, and while we can question the notion of "scientific
progress" in philosophical terms, this is neither a new idea nor evidence that
science doesn't work. It's certainly not evidence that science is no better
than irrational thinking.

>Carl Jung claimed one of the chief factors responsible for mass brainwashing
is scientific rationality.

Few people take psychoanalysts seriously these days, in large part because of
their long record of absurd claims and shoddy clinical work. Jungian theory
has its place in a conversation about literary theory, but not in a
conversation about science.

~~~
drumdance
I don't know about psychoanalysis as defined by Freud and Jung, but therapy in
general is an intensely personal experience.

I'm sure there are scientific aspects that can be brought to bear on a
situation, but for a lot of people the "literary theory" part is just as
helpful, especially when you stop to think how much neurosis is fueled by pop
culture (e.g. status envy).

~~~
omginternets
>but therapy in general is an intensely personal experience.

And empirically speaking, psychoanalytic therapy has a piss-poor record in
dealing with mental illness.

If you're looking for spiritual guidance, then _maybe_ a psychoanalyst can
help. If you're looking for clinical efficacy, they demonstrably can't
deliver it.

I'll leave it as an exercise to the reader to make a relevant Google Scholar
query, but be careful not to confuse _psychoanalysis_ with _clinical
psychology_.

> for a lot of people the "literary theory" part is just as helpful.

Fine, but this is a conversation about psychology as a science.

