
Psychology’s Replication Crisis Is Real - zwieback
https://www.theatlantic.com/science/archive/2018/11/psychologys-replication-crisis-real/576223/
======
MAXPOOL
You would think that computer science can't have replication failures, but it
can. And I'm talking about my own field: machine learning. There is so much
hype that I suspect that people are pushing papers and trying to actively hide
the irrelevance of the methods and algorithms they develop.

Artificial intelligence faces reproducibility crisis
[http://science.sciencemag.org/content/359/6377/725](http://science.sciencemag.org/content/359/6377/725)

Reproducibility in Machine Learning-Based Studies: An Example of Text Mining
[https://openreview.net/pdf?id=By4l2PbQ-](https://openreview.net/pdf?id=By4l2PbQ-)

Missing data hinder replication of artificial intelligence studies
[http://www.sciencemag.org/news/2018/02/missing-data-hinder-replication-artificial-intelligence-studies](http://www.sciencemag.org/news/2018/02/missing-data-hinder-replication-artificial-intelligence-studies)

> In a survey of 400 artificial intelligence papers presented at major
> conferences, just 6% included code for the papers' algorithms. Some 30%
> included test data, whereas 54% included pseudocode, a limited summary of an
> algorithm.

~~~
androidgirl
Why isn't more research and research materials open source in the AI world? I
don't really understand.

If someone doesn't have your training dataset or your code, how could they
possibly replicate your results?

~~~
MAXPOOL
If you have a good paper with an important result, providing code and data is not
necessary.

Providing the code to replicate is good form. It shows good faith and
confidence. Exact replication (exactly replicating the study) is just the
starting point to check that the code works and no obvious mistakes were made.

replication / reproducibility / hyperparameter sensitivity

If the research yields something really important and the method is well
documented, usually it can be easily checked without having the data and the
code. Things like dropout, batch normalization, residual learning, etc. work
over multiple different datasets and hyperparameters. You can reproduce the
results without faithfully replicating the experiment.

If the claimed result vanishes unless you have the exact data, code or the
hyperparameters, the research can't be said to be meaningfully reproducible in
the scientific sense. Hyperparameter sensitivity is the ML equivalent of
p-hacking.
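
As a rough illustration (a minimal sketch only, assuming scikit-learn and a
throwaway synthetic dataset; the model, grid, and numbers are placeholders, not
from any particular paper): a claimed gain should survive a sweep over seeds and
reasonable hyperparameters, not just one hand-picked setting.

    # Robustness-check sketch: rerun across seeds and a small hyperparameter
    # grid; a real effect shows a small spread, a fragile one does not.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    scores = []
    for seed in range(5):
        X, y = make_classification(n_samples=1000, n_features=20, random_state=seed)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=seed)
        for lr in (1e-3, 1e-2, 1e-1):  # hyperparameter sweep
            clf = MLPClassifier(hidden_layer_sizes=(32,), learning_rate_init=lr,
                                max_iter=300, random_state=seed)
            clf.fit(X_tr, y_tr)
            scores.append(clf.score(X_te, y_te))

    print("accuracy: mean %.3f, std %.3f, min %.3f"
          % (np.mean(scores), np.std(scores), np.min(scores)))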

~~~
alevskaya
As someone who's often implementing models from ML papers, the issue is that
English-language descriptions of methods are often sorely lacking.
Even good authors simply forget to mention small critical details of model
design or training that wouldn't remain ambiguous if they simply released
their code. Part of the effort of reproducing work is certainly figuring out
which aspects of the model design are critical or incidental - but this is all
greatly aided by not having to guess at what was actually done from a heap of
English and LaTeX generally written feverishly over a few days before a
deadline!

~~~
iguy
> Part of the effort of reproducing work is certainly figuring out which
> aspects of the model design are critical or incidental

Isn't this the work of writing up? A paper is a claim that you have discovered
something, and implicitly that other details are standard / unimportant. If it
turns out that some hidden assumption was in fact doing all the work, then the
claim you made was false.

I'm all in favor of sharing working code. But working code which magically
does something... amounts to an anomaly awaiting an explanation. Or an
advertisement.

~~~
sdenton4
Alas, training times are quite long... At any given time, there's probably a
couple best-in-class architectures out there for the problem you're interested
in, and one or two dozen interesting bells and whistles one can add as
decoration, each with a paper that makes pretty reasonable arguments and has
some stats demonstrating modest gains.

The right thing to do in this circumstance is an ablation study - throw
together your best-possible model and then test different subsets of features
sitting between your model and the 'basic' prior work. For large datasets,
though, each of these models might take a very long time to train (especially
if you don't work at a place with a stupid number of GPUs available).

So, lacking resources, you get your new best-ever accuracy number with your
'everything' model, and do an extensive write-up about how awesome the new
bell and/or whistle that you added to the pile is... (The problem is
compounded by a need to publish quick, lest someone else describe your
bell/whistle first.)

Another big problem is that adding a bell/whistle to the base model often
means adding more parameters to the model. There's decent evidence coming out
of the AutoML world that the number of parameters matters a hell of a lot more
than how you arrange them. (It's real real easy to convince yourself that your
clever new idea is more important than the shitpile of new parameters you've
added to the model, after all.) So a really solid ablative study probably
needs to scale the number of parameters in a reasonable way as you add/remove
features... And there may not be obvious ways to do that smoothly.
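
For what it's worth, the bookkeeping itself is simple; the cost is the training
time. A toy sketch of the basic ablation loop (assuming scikit-learn, with
made-up "feature groups" standing in for the bells and whistles; it doesn't
address the parameter-matching problem above, just the accounting):

    # Toy ablation loop: evaluate every subset of candidate components against
    # the same target, so each addition's contribution can be isolated.
    from itertools import combinations
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 2000
    groups = {  # three candidate components; only two carry signal here
        "base": rng.normal(size=(n, 5)),
        "bell": rng.normal(size=(n, 4)),
        "whistle": rng.normal(size=(n, 3)),
    }
    y = (groups["base"][:, 0] + 0.5 * groups["whistle"][:, 0]
         + rng.normal(size=n) > 0).astype(int)

    for k in range(1, len(groups) + 1):
        for subset in combinations(groups, k):
            X = np.hstack([groups[g] for g in subset])
            acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
            print("%-20s CV accuracy: %.3f" % ("+".join(subset), acc))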

And this is closely related to the replication crisis in psych: it's real real
easy to do kinda sloppy work with big words attached, and convince yourself
and all your peers that you're a genius.

I think a big database for reporting and searching for results with various
architecture+dataset combinations would be much more useful than pushing more
papers to the arxiv in many cases. (though, really, whynotboth.gif) Let me do
some searches to see if a particular bell/whistle actually adds value across
the god-knows-how-many-times someone's used it to train up imagenet from
scratch...

~~~
iguy
Then it sounds like people are doing something quite far from science. That's
not a criticism, there is lots of knowledge which isn't scientific... for
example if you want to know how to run a restaurant, you need to talk to lots
of people who do, and try to learn what they know about the art. Maybe you
even go to trade shows and listen to talks. But if you follow what they said
and you fail, it's not really a replication crisis.

~~~
sdenton4
It's a mix; some groups are more disciplined than others (and have better
resources for ablation studies).

To be sure, some sciences are more rigorous than others. Various natural
sciences are reliant on transient observations, which might or might not be
'reproducible' in any sense... And still progress is made, though the results
might contain more prejudices than one would encounter in other areas.

It's also worth noting that some areas within machine learning are extremely
difficult to test rigorously (e.g., generative processes), but are still
totally worth pursuing. So be careful with cries of 'it's not even science,
man!'

~~~
iguy
Right I genuinely don't mean that as an insult, perhaps my cooking example was
flip. There is lots of genuine progress made by other means -- does anyone think
that Toyota's process inventions were done by double-blind trials on a
large sample of factories?

And of course some guys a few doors down are proving theorems. And lots of
other things.

It is slightly strange though that we try to vet all of these with the same
peer review mechanism. This is one major source of the differing opinions in
this thread about how this ought to work, I think.

------
njarboe
Richard Feynman gave a speech [1] warning about this problem back in 1974, complete
with a story about how a student was excited about doing some replication
experiments on rats and was told "no, you cannot do that, because the
experiment has already been done and you would be wasting time".

Others have also been worried about replication problems from the 1960's on.
Hopefully some of this worry sticks this time and we can get a better
understanding of what is really true in these fields. Physicists like to have
very well-defined uncertainties on everything they observe and don't like to
say something is "true" unless the result is significant at 5 to 7 sigma. That seems
like a good margin. In psychology, 2 sigma is the standard for publishing a
result.

[1][http://calteches.library.caltech.edu/51/2/CargoCult.htm](http://calteches.library.caltech.edu/51/2/CargoCult.htm)
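
(For a rough sense of the gap, here is the sigma-to-p-value conversion that
comparison rests on; a minimal sketch assuming scipy and a two-sided normal
tail, with only the thresholds mentioned above:)

    # Translate "sigma" thresholds into two-sided p-values under a normal model.
    from scipy.stats import norm

    for sigma in (2, 3, 5, 7):
        p = 2 * norm.sf(sigma)  # two-sided tail probability
        print("%d sigma -> p ~ %.2e" % (sigma, p))
    # 2 sigma -> p ~ 4.6e-02 (the common publishing bar in psychology)
    # 5 sigma -> p ~ 5.7e-07 (a typical physics discovery threshold)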

~~~
cultus
P-values are not good measures for reliability, because they absolutely are
not the same thing as the chance that a result is wrong. This confusion, and
ignorance of the prior probability of hypotheses is the reason why replication
rates are so bad.

In physics you can get away with not caring about the distinction, because of
how accurate the measurements and precise the theories are. That doesn't fly
in most other sciences, though.
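
A back-of-the-envelope illustration (the numbers are made up; only the
arithmetic matters): even with everything done honestly, the fraction of
significant results that are false depends on the prior plausibility of the
hypotheses being tested and on statistical power, not just on the p-value
threshold.

    # Why p < 0.05 is not "a 5% chance the result is wrong": the false
    # discovery rate among positives depends on the prior and on power.
    prior_true = 0.1   # fraction of tested hypotheses that are actually true
    power = 0.8        # chance a true effect reaches significance
    alpha = 0.05       # significance threshold

    true_pos = prior_true * power
    false_pos = (1 - prior_true) * alpha
    fdr = false_pos / (true_pos + false_pos)
    print("fraction of significant results that are false: %.0f%%" % (100 * fdr))
    # ~36% with these (illustrative) numbers, despite alpha = 0.05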

~~~
jmcgough
> P-values are not good measures for reliability, because they absolutely are
> not the same thing as the chance that a result is wrong

As an undergrad, I even had tenured professors try to tell me that a p-value
is the chance that a result is wrong. Most researchers in psych or bio
sciences have a weak understanding of statistics, usually taking a single
statistics class in undergrad, then a single one in grad school.

------
seibelj
Even in hard sciences, there are papers that everyone in the niche knows are
false, but no one dares to speak out or publish a refutation, for fear of
their career being damaged.

My wife is a biochemist, and during her PhD they were working on an area of
research with a handful of labs publishing about it. One lab in particular was
known for a decent amount of questionable publications, but the PI was a big
deal and no one would officially question anything. So they would whisper
amongst each other and just ignore that paper (and all the additional papers
built on top of it) because they all knew it was bullshit.

~~~
zorga
Wow, that's a clear sign science has become too political. Proving other
scientists wrong used to be how careers were made, not something to fear.

~~~
danharaj
Science has always been political, within itself and in relation to the rest
of society.

[https://en.m.wikipedia.org/wiki/Oil_drop_experiment](https://en.m.wikipedia.org/wiki/Oil_drop_experiment)

~~~
whatshisface
Your link is not a story about politics, it's a story about psychological
bias.

~~~
wpietri
Political has multiple meanings in English, and I think this use is
legitimate.

This could be just anchoring bias. But I think it's much more likely that
people after Millikan did not want to look bad in front of their professional
peers for fear of social and professional consequences. Which I think is
something we can reasonably call political.

~~~
whatshisface
But I think it's more likely that the bias was internal, in which case it
could not be reasonably called political. Without arguing further we can agree
that the parent's claim was unfounded, because the linked article did not
demonstrate that the motivation was social.

(The reason I know we can agree is that unfounded is not the same as false, it
just means it's a non-statement.)

~~~
wpietri
We cannot agree on that. Feynman's talk is mainly about social phenomena, so
it's entirely reasonable to assume that this anecdote is also meant to
illustrate a social point:
[http://calteches.library.caltech.edu/51/2/CargoCult.htm](http://calteches.library.caltech.edu/51/2/CargoCult.htm)

------
HaukeHi
I've started a crowdfunding campaign that aims to tackle the replication
crisis:

Check out

[https://www.lets-fund.org](https://www.lets-fund.org)

We're raising funds for Professor Chris Chambers at Cardiff University.
Chambers is a leading proponent of a new and better way of doing and
publishing research, called ‘Registered Reports’, where scientific papers are
peer-reviewed before the results are known.

This might:

- make science more theory-driven, open and transparent

- find methodological weaknesses prior to publication

- get more papers published that fail to confirm the original hypothesis

- increase the credibility of non-randomized natural experiments using observational data

The funds will free up his time to focus on accelerating the widespread
adoption of Registered Reports by leading scientific journals.

This ‘meta-research’ project might be exceptionally high-impact because we can
cause a paradigm shift in scientific culture, where Registered Reports become
the gold standard for hypothesis-driven science.

------
jcaldas
It's somewhat heart-warming to read the comments here about machine learning.
I did my PhD in machine learning from 2007 to 2012, and the main reason I left
research was the widespread fraud.

Most papers reported an improved performance over some other methods in very
specific data sets, but source code was almost always not provided. Once, I
dug so deeply into a very highly cited paper that I understood not only that
the results were faked, but precisely the tricks that were used to fake them.

I believe scientific fraud arises primarily from two causes:

\- Publish or perish. Everyone's desperate to publish. Some Principal
Investigators have a new paper roughly every other week!

\- Careerism. For some highly ambitious people, publishing papers comes before
everything else, even if that means committing fraud. This happens even with
highly successful researchers, who have the occasional brilliant, highly cited
paper, but who also publish a lot of incremental, dubious work.

P.S. Mildly off-topic, but I love the Ethereum research community at
[https://ethresear.ch/](https://ethresear.ch/) , precisely because it is so
open and transparent! I wish an equivalent community existed for machine
learning.

~~~
seibelj
One thing I love about Ethereum is that it is self-funded, open, and basically
separated from mainstream academia. They created their own money, convinced
everyone that it had value, and then used it to self-fund their own
fundamental research. It's an incredible alternative to academic research.

~~~
jcaldas
I suspect Ethereum itself may provide a feasible basis for supporting open,
transparent research.

------
oldgradstudent
There's no replication crisis. The replication failures are the only thing
that is properly done.

We have a pseudoscience crisis.

And as others pointed out, it is not limited to psychology.

~~~
tmalsburg2
There is no easy way to tell whether something is real or pseudoscience. A lot
of things that we consider real today were perceived as pseudoscience in
their time. Science is largely a gigantic collection of dead ends that look
silly in hindsight, and finding out what's good science and what's not is an
integral part of the scientific process. The idea that there can be a science
of just Newtons, Einsteins, and Feynmans is naive.

~~~
viridian
The problem is that replication studies are almost nonexistent for most
discoveries. Science is to some extent a business, and very few people with a
budget have any interest in allocating resources to replicating the work
someone else did, for little to no gain.

------
stanfordkid
The replication failure in psychology is inevitable. The whole field boils
down to taking an incredibly complex system (human beings) ... and measuring a
few variables on them across a few people.

Most of the "statistical" justifications stem from assuming distributions are
normal and samples are uniformly collected (both false).

This is why I take the work of clinical psychologists like Jung and Freud
much more seriously than the current approaches.

It just makes a lot more sense to observe something complex descriptively,
rather than start with a yes/no hypothesis and then test it with a few dozen
people.

~~~
FiberBundle
I'm not an expert on experimental psychology, but I'm fairly certain that the
principles of how experiments are conducted in the sense that they control for
other factors, choose sufficiently large samples etc., are pretty sound. An
often mentioned problem in psychology research is "p-hacking", where
researchers manipulate the data in a way that validates their hypotheses
statistically, which seems like a more logical reason for a reproducibility
crisis than the reasons you mentioned.

Also, the fact that there are some problems in the field should not cause you to
dismiss science completely in favor of purely subjective speculation about
how the mind works. That certainly also has value, but I don't think it's
warranted to dismiss actual science in favor of it.

~~~
stanfordkid
But they're just not -- that's exactly my point. If you test a bunch of
Harvard undergrads (the most common subject type in these studies) -- that's
not a uniform sample. In any way whatsoever.

Furthermore, even if you get a statistically significant result ... your
experimental design can introduce so much noise.

Even when you do find these statistically significant correlations -- they don't really
ever say much about "why" ... because they can't. So much of that is the study
design.

Let's say I run a study on how well people navigate a maze under the
influence of alcohol vs. not. You can probably get a significant result showing
that they do worse while under the influence of alcohol... but it reveals very little about
why.

So much of that is dependent on the _particular_ maze. In fact I bet I could
design a maze specifically to prove whatever conclusion I wanted. That is the
whole problem in psychology. There is very little focus given to actually
characterizing cognition in any meaningful way.

I like the clinical psychologists because they attempt to do exactly that,
even though their findings are less "scientific".

------
mcguire
" _But skeptics have argued that the misleadingly named “crisis” has more
mundane explanations. ... Third, people vary, and two groups of scientists
might end up with very different results if they do the same experiment on two
different groups of volunteers._ "

Isn't that the definition of a failure to replicate?

~~~
anvandare
Yes, but the difference is subtle. The theory might still be true after all,
but only for a particular subset.

"We've tested the theory that 'swans are white', and it turns out the swans in
our test were all black". (The conclusion isn't that swans aren't white, it's
that _some_ swans are, while others are black.)

(add.: the article mentions the 'WEIRD' category. Obviously the
applicability/set range of a theory matters. If it only holds for "people who,
between 2 and 5 pm on Sunday, are walking around in Reading with a bowler
hat", well, that's nice, but not very useful.)

~~~
hn_throwaway_99
But the issue is still major, because the results are nearly always announced
with broad applicability. I.e. the headline was "The simple act of smiling can
make you happier!" not "The simple act of smiling can make you happier if you
are an 18-22 year old college student living in Virginia." Plus, a lot of
these explanations seem really statistically dubious. The whole point of
applying p-values is to determine the probability that your sample represents the
population at large. If you did a very poor job picking that sample and
extrapolated to a larger population when that wasn't warranted, that's a real
problem in and of itself.

~~~
anvandare
Absolutely. And it's sadly also correlated with the overall poor quality of
science news reporting in our societies. One of my favorite PhD comics:
[http://phdcomics.com/comics/archive.php?comicid=1174](http://phdcomics.com/comics/archive.php?comicid=1174)

------
jdoliner
There's no such thing as a replication crisis in science, that's just a
conflict-averse way of saying that some "scientists" aren't actually doing
science. It's not limited to Psychology and this type of conflict-averse
approach is a big part of the problem. Other scientists, the "good ones," if
there are indeed any good ones left, need to realize that what's at stake is
the trust of their profession as a whole. If you're not willing to expose
false research in your field, then you're no longer working in science.

~~~
dustycat
Science (academic work) is a career for many people, rather than a vocation.
Rewards include high status, stable income, opportunities to travel, long
vacations, etc.

You can say it's not science but most "scientists" are just normal people with
everyday priorities, rather than Einstein-like people who go on doing science
work while working as patent clerks.

------
ineedasername
I don't see how people can doubt this. Positive results get published. Even
without p-hacking, that means a significant number of experiments at the
p < 0.05 level are published when they're among the 1 in 20 that come up
significant by chance. Combine that with p-hacking and small samples with
small-magnitude effects... well, replication is going to be a problem. Part of
the solution here might be more impetus and incentive to publish negative results.
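
A quick simulation makes the point (toy numbers only, numpy/scipy assumed):
when underpowered studies of a small true effect are run many times and only
the significant ones get written up, the published estimates are a biased,
inflated sample.

    # Toy publication-bias simulation: small studies of a small true effect,
    # where only results with p < 0.05 get "published".
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    true_effect, n_per_group, n_studies = 0.2, 20, 5000

    published = []
    for _ in range(n_studies):
        a = rng.normal(0.0, 1.0, n_per_group)
        b = rng.normal(true_effect, 1.0, n_per_group)
        _, p = ttest_ind(b, a)
        if p < 0.05:
            published.append(b.mean() - a.mean())

    print("true effect: %.2f" % true_effect)
    print("published: %d of %d studies" % (len(published), n_studies))
    print("mean published effect: %.2f" % np.mean(published))  # inflated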

~~~
mamon
I’ve always thought that p<0.05 is a ridiculously low bar, missing at least one
zero after the dot.

~~~
vharuck
It should be more than enough _if_ the experiment's replicated enough times.

~~~
ineedasername
Yes, and with larger samples. There are too many small-sample experiments which, at
most, should be considered "pilot" studies, and when results are positive even
with p < 0.01 they should still only be considered "suggestive" until
replicated, preferably with larger samples.
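
For a rough sense of what "larger samples" means (a minimal sketch assuming
statsmodels; the effect size is illustrative, not from any particular study):
detecting a small effect at 80% power takes hundreds of participants per group,
not a few dozen.

    # Required sample size per group for a two-sample t-test (Cohen's d = 0.2,
    # alpha = 0.05, 80% power) -- roughly 400 per group.
    from statsmodels.stats.power import TTestIndPower

    n = TTestIndPower().solve_power(effect_size=0.2, alpha=0.05, power=0.8)
    print("participants needed per group: %.0f" % n)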

------
lifeisstillgood
It's positive news mostly - they took 15,000 volunteers and ran 28 "big"
studies across them, working with original teams to get the nuances right. So
they had larger samples than the originals and ran the experiments multiple times
across multiple samples.

Where a study failed in one sample it mostly failed everywhere, and where it
worked it mostly worked across all the experiments.

So, yes, priming someone with the number 32 won't get them to bet it in the
casino, but it does mean that social science can be experimental - you can
build and run lab-scale social science - a positive!

~~~
ALittleLight
This is a really optimistic take. I agree with you, ultimately, this is
positive because it shows that it's possible for psychology to be done and
have widespread meaningful results.

In the short term though this seems extremely negative as it indicates that
the psychology field and researchers we have today are producing substandard
results. Two parts of this article increase my negative feelings. First, they
mention that these 28 studies were selected because they were well known and
influential. If well known and influential studies only have a 50%
reproducibility rate - what about the average study?

Another point I find very troubling is that people betting online were able to
predict, with high accuracy, which studies were accurate and which weren't.
This tells me that the problem is an "Emperor's New Clothes" type of problem,
not just a challenge posed by the fundamental difficulty of the subject, since
the subject yields so readily to the public's intuition. Rather than the
fundamental difficulty, the problem seems to be the psychology of the
researchers - being unable to diagnose and prevent methodological flaws
in their research. This class of failure is much less excusable to me.

~~~
repolfx
The article that goes deeper into the betting markets says that they suspect
many of the traders were people invested in the reproducibility crisis and
related research, that is, they are more knowledgeable than the general public
would be.

That doesn't take away from your point, but the problem here may be more
refined than "the public" vs "psychologists". It may be more like "some
psychologists" vs "journal editors and other psychologists".

~~~
ALittleLight
What I wrote was influenced by the fact that I had taken a "Psychology
Replication Quiz" which I'll link to below. The quiz presents you with the
thesis of a study and asks whether you think it will replicate. When I took it
the first time I got ten questions correct out of ten. I took the quiz again a
moment before writing this comment and got nine out of ten (I think I was
hurrying this time). The average person taking the quiz gets seven out of ten
correct.

The depressing part about this quiz is how easy it is. You just have to think
for a moment if what they are describing matches what you know about human
nature. Some statements just feel very intuitive and likely, and those
replicate. Other statements seem quite wild and abstract, and those don't
replicate.

I haven't seen a list of the 28 papers they looked at here, but I'd bet it is
similar. Some of the findings are incredible (and not reproducible) while
others are intuitive and reproducible. I'd encourage you to take the quiz
below and see if you feel the same way about the results as I do.

Psychology Replication Quiz - [https://80000hours.org/psychology-replication-quiz/](https://80000hours.org/psychology-replication-quiz/)

------
Invictus0
I'm not sure that there's necessarily a problem with psychology itself, but
just that the number of participants in studies is consistently too small.
Just like how physicists increasingly have to spend more money to get more
accurate results (LIGO, LHC, redefinition of the kg, etc.), psychology needs
to realize that 1000+ study participants is necessary to get valid results.
Unlike physics, though, where it's easy to show that results are bad when the
equipment is bad, the nature of psychology makes it easy to publish spurious
results without enough support. When the psych community starts to demand
higher N values, you can expect fewer, lengthier, more costly, but more accurate
results.

------
beefman
Paper is here: [https://psyarxiv.com/9654g/](https://psyarxiv.com/9654g/)

~~~
entwife
Thanks. It's usually more informative to read the original source.

------
75dvtwin
50% replication success rate.

I suspect the authors of the article are being a bit too generous in calling the
original work 'sloppy', and that in their view the alternative -- no experiments
with conclusions -- would be worse.

I disagree with that; I think 'being lied to' is worse than 'not being told'.

What if the original work was purposely faked just to get a grant from some group
(political or commercial)?

If that's the case, then the field of psychology is composed of a big number of
'full-stack con-artists'.

------
gboudrias
As someone who is both a developer and a psychology student, I'd say your average
coder knows about as much psychology as your average psychologist knows about
coding. That is to say, not a whole lot.

I am saddened but not surprised by psychology's replication crisis. But really
it applies to all of academia, psychology is only in the spotlight because
some of its branches have less than stellar methodology in the first place. Social
psychology gets criticized a lot, partly because it's one of the only branches
rigorous enough to even be tested!

Obviously I'm not in favour of this. But it's a very big machine that needs
fixing, and I don't think it starts with tweaking the details; drastic changes
are required if we are to gain back the lost credibility. To me, this applies
both to social and hard sciences. Since this is a general problem with
academia, either your field has already been called out or it has yet to be. I
don't believe the current system offers an alternative (yet), though I'm
cautiously optimistic about preregistration.

------
SeanLuke
> * A mention of the marshmallow test was removed from an early paragraph,
> since the circumstances there differ from those of other failed
> replications.

I tried digging into the study webpage, but holy cow it's not organized in an
easily digestible fashion.

So does anyone know what happened with the marshmallow test replication?

~~~
ddebernardy
It failed.

[https://www.theatlantic.com/family/archive/2018/06/marshmallow-test/561779/](https://www.theatlantic.com/family/archive/2018/06/marshmallow-test/561779/)

~~~
hn_throwaway_99
Reading your article, I wouldn't necessarily interpret it as "failed":

> Instead, it suggests that the capacity to hold out for a second marshmallow
> is shaped in large part by a child’s social and economic background—and, in
> turn, that that background, not the ability to delay gratification, is
> what’s behind kids’ long-term success.

I.e. it would still seem to say that the capability to delay gratification is
tied to economic success, except that's really a confounding variable. It's
also very possible that the ability to delay gratification _that one gets_
from being more affluent is also a factor in future success.

~~~
taeric
Both of those, though, spell failure for the original experiment. It claimed
that the predictor was delayed gratification. That is as false as saying the
predictor of future success is how successful you are in 10 years. :)

------
stuart78
"Third, people vary, and two groups of scientists might end up with very
different results if they do the same experiment on two different groups of
volunteers."

This seems to me to be such an important key to understanding this.
Considering even myself, I'm not sure I would behave the same way as a subject
from one study to the next (pretending there was a way to blind myself). And
when you look across regional and global cultural variation, it must have an
effect on participants.

The results are less headline-getting if they seem to be overly qualified
(e.g. we find that midwestern white women with the middle name Susan have a
positive psychological response to salsa dancing) but a deeper appreciation
for the complexity and diversity of individual psychology would seem superior
to the current state of affairs.

------
aj7
They succeeded in replicating “half the time.” If they did a second replication
trial, I wonder what the overlap of replicated studies with the first
replication trial would be.

~~~
ThrowMeDown01
They did?

> _And they repeated those experiments many times over, with volunteers from
> 36 different countries, to see if the studies would replicate in some
> cultures and contexts but not others._

So different sets of people, and different researchers (the article doesn't
say it, but I doubt the same researchers traveled across the globe
repeatedly).

------
forapurpose
I haven't looked at this study, but prior replication studies have found that
the effects were reproduced in almost every study, but that the effects were
weaker in the replication than in the original study.[0]

That raises different questions than 'psychology studies are all wrong', and
I'm still waiting to see a serious discussion of them. So far, what I see is
news (studies not replicated) and condemnation (like this article), from which
I don't learn much.

[0] That's not necessarily a condemnation of the researchers: Imagine a set of
studies that, each time they are performed, have a 50-50 chance of producing
the strong effects or weak effects. Which studies in that set are going to be
the ones that you read about? It could be selection by randomness.

------
niceworkbuddy
We are seeing actual progress in science! This is good.

------
viburnum
Is there a list of the successfully replicated experiments? I haven't been
able to find it.

~~~
clord
I'd hope for a journal that only publishes replicated results. I think there is
limited funding for this right now.

------
LifeLiverTransp
It might sound like a conspiracy theory, so I will formulate it as one right
away. Why, if I had exploits and knowledge of the weaknesses of the human mind,
would I share them?

Basically, in a world where the mind's hardware cannot be fixed or improved,
he who goes full black-hat first rules forever unchallenged, even unperceived.
So wouldn't it be reasonable to set loose a quack science, to keep those with a
similar thirst for powerful knowledge busy? Maybe it's tabula rasa time when it
comes to psychology. Only replicated studies count, and everything before
simply never existed - and with it, the works and titles referencing it.

------
narrator
Economics is in worse shape in some regards. Nobel prize winner Paul Romer has
called out most macroeconomics as being completely detached from reality
because economic concepts and conditions that are talked about in scholarly
papers cannot be identified in the real world by any method. [1]

[1] [https://paulromer.net/trouble-with-macroeconomics-update/index.html](https://paulromer.net/trouble-with-macroeconomics-update/index.html)

------
chiefalchemist
It seems to me that how the test subjects are selected likely matters. I'm presuming
most studies advertise (in some form) for test subjects. That ad appeals to
some but not to others, and that leaning could influence the results. In a
way, the study starts there, not in the lab.

If the study were a landing page with a call to action, the conversions would be
influenced by the ad.

No doubt, 50% replication feels ugly. Nonetheless, the fact is, the studies
are slightly different. Maybe in theory that shouldn't matter, but perhaps the
reality is it does. Again, per the PPC analogy.

------
patsplat
That and we're all fools for doubting the scientists diagnosing ADHD
[https://news.ycombinator.com/item?id=18562975](https://news.ycombinator.com/item?id=18562975)
/eyesroll

------
londonlover
dang, what an absolutely horrible time for this to occur, as public faith in
institutions is deteriorating.

~~~
pessimizer
Public faith in institutions is deteriorating because they do not do this. The
more shit that gets thrown out, the better institutions look.

------
Voyage_wanderer
Assuming we are living in a simulation, every attempt of ours to understand its
rules and break through the simulation would be considered a security breach by
the other side. So it gets fixed. Hence no replication. It had already been
fixed, dummies... it's my theory and I will stick with it...

------
crimsonalucard
They successfully replicated a study involving replicating other studies.

------
jamp897
Psychology in the West is really a form of cultural anthropology, in that it's
just measuring behaviors for a specific set of people but not the underlying
mind directly. So it would make sense IMO that you can't repeat many of the
studies, because they're not grounded in the fundamentals of the mind but in a
layer abstracted away by cultures and habits. Akin to studying a Toyota car
and trying to generalize all of physics from it.

~~~
pessimizer
I have a more cynical view: it's a field that almost exclusively deals in the
creation and enforcement of behavioral norms; entirely political, verging on
religious in its continual reification of theory-theory. I don't know what
that has to do with the West, though.

Behaviorism is the only real psychology.

~~~
quickthrower2
What's your basis for this view? Any good reads you can recommend?

~~~
jamp897
The core issue is that you need to establish a norm, but without an absolute
point of reference such as enlightenment it's like trying to walk straight in a
desert, which is not possible. But if you bring up this point of going in
circles, they act just like Plato describes in his Cave Allegory. So it's not a
new issue. I don't know if anyone has written a book specifically about
psychology being a faux science, like eugenics in a way, but someone should.

