Hacker News
Laypeople can predict which social-science studies will replicate successfully (sagepub.com)
150 points by apsec112 39 days ago | 69 comments

According to the "method" section, 138 out of 257 candidate (233 final) participants in the study were first-year psychology students - I can't help but wonder if this might have biased the study's results. The investigators also classified graduate students as laypeople, which doesn't quite feel right to me.

Not exactly the same, but there's this interesting quiz that presents a number of studies that appeared in Science and Nature and asks you to guess which failed to replicate: https://80000hours.org/psychology-replication-quiz/ .

After taking that, the result in this paper doesn't surprise me. I'm educated, but not in psychology, and I think most people can tell which studies are more likely to fail to replicate.

The last time I took this quiz I got them all right. I just used the heuristic "Would I expect this to be true regardless of whether this paper existed?"

Yeah, that works unless the quiz is cherry-picked to demonstrate the opposite.

So the things you believe are common sense fail to replicate?

The reverse. Things that seem to be common sense are true, and things that seem surprising and weird are false.

As an example:

"If subjects hold a heavier clipboard they feel more authoritative" - that's a pretty unexpected and surprising result. I expect it won't replicate.

"People will give more to a charity if the overhead for that charity is already covered." That feels pretty common sense to me. If you're donating, you probably want your money to go to the cause, and not the overhead of the organization gathering donations etc.

Great quiz! I got 14/15 right. Pretty easy to tell I thought - it seems like all the bullshit ones are "people who do some activity then behave differently when doing some completely unrelated task". Do psychologists dream up interesting self-help style headlines and then try and prove them? "Spending 5 minutes a day thinking about sex could make you live 10 years longer!"

Yeah, they’re all priming studies. Afaik that was hot shit for a while but has been mostly discredited. See “power poses” for a dumb way it trickled into pop culture.

My six year old got nearly all of these correct (including one I guessed incorrectly).

Well, "laypeople" in this case necessarily refers to people who can actually read the actual study instead of some pop-sci article about it. (That excludes many of the journalists writing these pop sci articles and practically all of the editors writing the headlines. /sarcasm)

Actually not so:

"The materials for each study included a short description of the research question, its operationalization, and the key finding. These descriptions were inspired by those provided in the SSRP and ML2, but were rephrased to be comprehensible by laypeople. In the description-only condition, solely these descriptive texts were provided; in the description-plus-evidence condition, the Bayes factor and its verbal interpretation (e.g., “moderate evidence”) were added to the description of each study."

Well most social science studies are conducted over the same sample - it just proves that all their results are biased.

As a layperson, I can thus predict that this study won't replicate. (liar paradox)

You're only 50% accurate though.

I'm in a superposition state of being right and wrong

Besides, this is more or less an ode to "common sense" which is in my experience actually not so common.

And there is definitely some form of age-group bias: if the large majority were students (first year or otherwise), the sample must be dominated by 20-25 year olds.

Maybe (or maybe not) a sample of older people (i.e. with more life experience) would have scored higher than 59%.

This is extremely common in social science, for obvious reasons (imagine you're a social scientist and need to find 200 people to answer your questionnaire and you don't have any money...), and I wouldn't be surprised if it has some part in the "replication crisis".

A personal favourite example is that meme darling, the Dunning-Kruger effect. In addition to often being an extremely simplified version of the original proposition, which is a lot more subtle than "dumb people think they're smart", it's often presented as if (social!) science has uncovered a universal trait in the human race. But really this effect is based on very few studies, and they were mostly done on American college students. I wonder how well it replicates...

For accidental reasons I've been around psychology students for a long time and there was a joke along the lines of "psychology is the study of people who study psychology".

Students are expected to participate in a few studies now and then as they will be on the other side at some point.

Maybe this particular study, but I have to say I believe it in general.

I remember working for a manufacturing company where I drove a forklift (during the summer).

One of my most vivid memories is walking into the breakroom and seeing on the television a scrolling banner that declared homosexual couples were less likely to have children than heterosexual couples.

And I remember everyone in that room laughing their asses off. And it's not as if no one in that room realized the reason for the study was so they could take into account things like adoption, etc. It's that the result was so obvious, even taking that into account, that it was amazing that someone was PAID to come to a conclusion that everyone knew without the money.

And this is the crux of the problem with "science". It wants to be "interesting", so it will literally try to drum up something against what "everyone knows".

So the idea that what everyone over generations "knows" is generally more applicable than "science" is not surprising at all.

A lot of the things that "everyone knows" are wrong. The point of science is to have a systematic way to tell the difference between things that seem intuitively true and things that are actually true.

I did a quick search and found a study about percentages of couples raising children [1]. They have a table by couple type and marital status. The difference isn't as stark as you might think.

Looking at the smallest difference in the table, married female/female partnerships have a 30.2% chance of currently raising children, while married male/female partnerships have a 38.7% chance. Is it obvious to you that this would be so close? Would you be surprised if there were countries where the numbers are reversed?

[1] https://williamsinstitute.law.ucla.edu/publications/same-sex...

I also doubt that the point of the study was to find out which type of couple had "more" children - even if that is what the news picked up.

It is obviously important for population and demographic research to know just how age, type of couple, divorce, income, etc. affect family size, with actual estimates.

The issue is that you don't really need a study to open up a table of statistics in a demographics department.

Your post just did a study on that topic. Should you get funded for that?

I spent a minute looking up the results of existing studies. If a professor at a public university spent a minute to look up an answer for a reporter, I'm ok with that professor having a salary.

The qualitative factor is obvious. But the quantitative may hit that sweet spot. Studying just how many fewer children they have and how that interplays with free time, work and wealth. Or how the decreased contact with the younger generation changes your worldview and psychology.

The scientific process should actually result in people verifying things that seem obvious all the time. Admitting that you might be wrong and checking it even though it seems obvious was a major factor in starting to really increase the density of correct knowledge.

How can we know that eggs are tasty unless a 3rd party tells us!

I won't quote the science, I'll simply say that things are knowable outside of a system...

There's still a reason to quantify the taste of eggs. It can tell you how many eggs to bring on an expedition to minimize waste, or how to improve the flavor of eggs or other things, or predict the revenue of a new egg farm. The techniques used to quantify tastiness could then be applied to newly developed foods.

Criticizing scientific research by reading a news report is like criticizing a computer by asking your grandparent's opinion about the command line.

But still 41% didn’t think it would replicate. Proving a result to 41% of the population still seems useful.

You pulled those numbers out of your ass...

The truth is that 99% of the population believes that homosexual couples will have fewer children than heterosexual couples...

The surprising result for the uneducated? That homosexual couples have any children...

>You pulled those numbers out of your ass...

They're using the percentage in the original comment/study and transferring it to the recent example.

That doesn't mean 41% would doubt that this particular result replicates, though, since the figure likely depends on the example.

> Maybe this particular study, but I have to say I believe it in general.

recursion :D

TIL: People are able to predict studies from people in their fields who probably have a similar bias.

I was surprised to read about the low reproducibility rate, how can you call it social "science" if you can't reproduce it?

> how can you call it social "science" if you can't reproduce it?

NB: this is not specific to the social sciences. psychology, cognitive sciences, medicine(!), and other fields have a 'replication crisis' also. there are some really good papers on this. i am not aware of systematic queries into CS/ML/DL papers, but from my own experience: when you pick up a paper that claims 9x% accuracy, the chance that you are able to replicate that number is in line with replication % in other fields.

That's very interesting! I remember this happening a lot with sports / nutrition studies. I'm not surprised to see medicine in there as well. Overall, understanding the brain is hard, understanding our bodies is hard, it's normal that we can come up with a limited amount of consistently reproducible knowledge per year.

From all my friends who pursued STEM, I assume this is happening because of the pressure on people pursuing higher education at all costs, not finding enough qualified jobs, getting stuck in academia and printing low quality papers to get a promotion / keep getting another public grant.

If you add in to the mix results influenced by the current political climate, what your peers think, what your sponsors want to prove for financial gain, the situation gets bleak very fast.

Irreproducible papers are the abandoned OSS projects of software engineers in private tech + corruption + public money, a recipe for disaster.

I don't think that's quite fair. Irreproducible papers have supposedly gone through a fairly rigorous review process by their peers, and are usually published by someone with a PhD (or who will soon have one) at an institution of higher learning. Many papers (which often end up being irreproducible) use language that says "This is the thing I found, and it is/is not true."

OSS is often someone learning, or who has an itch to scratch, and put some code out there that may or may not solve your problem or work right. The "lack of warranty" clause in most open-source licensing is really important here, because it's largely designed to say "hey, I did a thing but make no guarantees so use at your own risk", whereas a published paper says "hey, a number of experts and people who should know all agree that this is a thoroughly-researched and well-thought-out position, and you can probably consider it to be true (or nearly so) and base some of your decisions on it." I think that change in context is really important to consider when making the analogy here.

> Irreproducible papers have supposedly gone through a fairly rigorous review process by their peers

This is a myth. Modern "peer review" simply means checking to see if the claim is interesting and if the proper Word or LaTeX template was used. Peer review meant replication in the distant past before the modern Grants and Impact Factor system was built.

> This is a myth. Modern "peer review" simply means checking to see if the claim is interesting and if the proper Word or LaTeX template was used.

I got peer review from two reviewers on a paper submitted to Frontiers last week--it consisted of much more than that. Sure, the reviewers couldn't replicate our exact work themselves, but the paper's methodology was also heavily considered, and both reviewers asked for modifications to be made on different parts of the paper.

It's an imperfect system, but it's not as if there's a rubber stamp floating around. The goals of the peer review system vs. open source are fundamentally different, and the way the products for both are treated should reflect that.

I'm curious how much of this is lay people evaluating plausibility vs. politics. I only know the mandatory minimum of social science I was required to take in college but if you understand that gender and lgbtq+ matters are de facto immune to real criticism then you can pretty easily guess which hypotheses are likely to be right/wrong by whether or not them being correct would validate a negative interpretation of any of those groups. Basically anything that harshes the vibe is a no go in American social science unless you can use it to criticize white people or cis people, if you know that then you can predict most big stuff since everything else is just rhetorical frameworks masquerading as science.

Read the abstract of the studies involved. None of the ones evaluated negatively involve anything like you mention. No gender studies, lgbt, things like that. The most woke one was "Analytic Thinking Promotes Religious Disbelief".

But the newest bad one was 2012 and you could say most of the bad ones had a socialist lean.

Citations needed? This comment smells a lot like:


I'll believe this study when it's replicated successfully.

I'm no expert, but I predict it will be.

59% possible


They claim that "laypeople" can predict whether a study is reproducible, but they only got 59% accuracy. It's technically true, but it's too close to 50%. I can't find how many of the 27 studies were "reproducible" and how many were not. Especially since half of the subjects were first-year psychology students, who should have at least a minimal idea of the subject.

And the 67% when people are informed about the strength of the evidence is even less impressive. How accurate would a parrot be that just repeats the information?

To be fair, they translate the strength of the evidence from a numeric scale to a simple verbal scale, and the participants have to translate it back to a numeric scale. The real question is how well people do this task with random numbers, without additional information like the description of the study.
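
A rough sketch of the baseline question raised above: how impressive 59% is depends on how many of the 27 studies actually replicated, which the comment could not find. The replication counts below are hypothetical placeholders, not figures from the paper, and the function names are made up for illustration.

```python
def majority_baseline(base_rate: float) -> float:
    """Accuracy of always guessing the more common outcome."""
    return max(base_rate, 1.0 - base_rate)


def matching_baseline(base_rate: float) -> float:
    """Expected accuracy of guessing 'replicates' at random with
    probability equal to the base rate (no study-specific insight)."""
    return base_rate ** 2 + (1.0 - base_rate) ** 2


TOTAL_STUDIES = 27  # from the comment above
# Hypothetical numbers of studies that replicated (assumption, not data).
for replicated in (9, 14, 18):
    rate = replicated / TOTAL_STUDIES
    print(
        f"{replicated}/{TOTAL_STUDIES} replicated: "
        f"majority guess {majority_baseline(rate):.0%}, "
        f"random matching {matching_baseline(rate):.0%}"
    )
```

If the split were close to even, 59% would sit clearly above both baselines; if it were lopsided, a constant guess alone could match or beat it. Without the real split, the comparison stays open.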

Considering they only got it right 60% of the time, that still means there’s a large proportion that are unintuitive, so I don’t think it makes sense to trumpet this as if it’s super clear cut which are obvious and which aren’t.

Heh, pretty clever humour if intentional :-)

Yeah, this checks out

Edit: but seriously, I would assume that this applies to hard sciences as well, if you replace “laypeople” with “domain experts”. Plausible hypotheses are cheap; designing a study and collecting data to prove them is hard. Many studies confirm something that feels obvious to researchers in the field, but being able to write that thing down as fact with a citation allows people to move a step forward. I would say it’s the extreme outlier study where an implausible hypothesis is proven true.

Why is this surprising at all?

Surely it’s the expected outcome.

The opposite would be something like ‘most social science results are counterintuitive’.

59% of the time, that doesn’t seem very interesting at all. So ~10% of studies where results are strong enough to be replicated also have predictable outcomes (presumably these are just simple, obvious situations), the rest are a coin toss.

I actually would think it's the other way around: ~10% of studies that sound like obvious nonsense and have p of ~0.049 are probably NOT going to replicate.
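
The "some studies are obvious, the rest are a coin toss" reading in the two comments above can be checked with a line of arithmetic. This is a minimal sketch under a strong assumption that is mine, not the paper's: a fraction f of studies is called perfectly, and every other call is a 50/50 guess, so accuracy = f + (1 - f) * 0.5. The function name is hypothetical.

```python
def predictable_fraction(accuracy: float, chance: float = 0.5) -> float:
    """Solve accuracy = f * 1.0 + (1 - f) * chance for f.

    f is the share of studies that would have to be called perfectly
    if all remaining calls were pure guesses at the given chance level.
    """
    return (accuracy - chance) / (1.0 - chance)


# 59% and 67% are the accuracies quoted in this thread.
for acc in (0.59, 0.67):
    print(f"accuracy {acc:.0%} -> predictable share {predictable_fraction(acc):.0%}")
```

Under that toy model, 59% corresponds to about 18% of studies being fully predictable rather than ~10%, since each guessed study still contributes half a correct answer on average. It says nothing about whether the predictable ones are the obvious replications or the obvious failures.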

Isn't this why we have intuition? Since we don't ever have complete knowledge, our brain has to predict/guess at some sort of probability curve for an event happening all the time.

So much of our political rhetoric is just basic one dimensional math. "have and have nots", "wage gap", and so on.

The World would be a better place if more people understood exponential growth, statistics, game theory, and so on. It hampers our politics, when the rhetoric needs to be reduced into basic algebra.

Just trying to imagine some rhetoric recast in the language of exponential growth, statistics and game theory....

"In a sense we've come to our nation's capital to cash a check. When the architects of our republic wrote the magnificent words of the Constitution and the Declaration of Independence, they were making a costly signal of commitment. But it's obvious today that America has failed to support the separating equilibrium... We have also come to this hallowed spot to remind America of the fierce urgency of hyperbolic discounting.... I have a dream that my four little children will one day live in a nation where their skin colour will not be a sufficient statistic of the content of their character!"

"Reproducibility crisis" doesn't really cover it. Would a layperson term like "bullshit" be more accurate?

See related recent discussion about what’s wrong with social science, which also touches on prediction markets for replicability: https://news.ycombinator.com/item?id=24447724

I know hackernews is not the place for this, but my mobile browser truncated studies to stud, and the headline about a successfully replicating social-science stud had me stumped for a moment.

The problem with all this is that the interesting study is the one that replicates that people didn't expect to.

I find that I can usually predict which topics on HN are likeliest to spark low-effort comments :)




4chan is a case study in why arguing with people arguing in bad faith is a waste of time

> It is a slam dunk to get hoaxes accepted, if they follow the party line:

That's a real bad example. If you claim to be doing a detailed survey of dog behavior in a park people will reasonably assume that you've put in the work, regardless of the surrounding politics. The authors did not do that; their entire "research" about canine rape culture was simply made up. That's obviously unethical behavior in any academic context.

Behavior leading to reproducibility issues is also unethical. P hacking etc is just as bad as simply making up experimental data from scratch. At a minimum it wastes the time of everyone reading that junk, and it also wastes whatever funding paid for that research.

The only reasonable option I see is a culture of reproducing results from independent data and blacklisting people when a high percentage of their papers turn out to be bogus. Because what’s happening today in many fields is simply wasting everyone’s time and money.

I don't know, anything that implied dogs being raped would have been rejected by me, a layperson who grew up with dogs.

Your argument seems to be more about "because some people were duped, it's ok to assume reasonable people were duped" and that doesn't seem right to me.

Even obvious quackery like: https://en.wikipedia.org/wiki/Masaru_Emoto still wastes people’s time. Worse it tends to get spread around and keeps showing up.

So just because something is obviously fake doesn’t really mean much in context. After all the goal of science is to step beyond people’s intuition to discover what’s actually happening. Thus at some level people need to actually consider very odd ideas as possible and then collect data etc.

And yet almost anyone in the street could have identified this study as nonsense. Whether they would have identified it as a hoax is another question. But it is difficult to identify anything as a hoax when many "serious" studies look exactly like that.

> The authors did not do that; their entire "research" about canine rape culture was simply made up. That's obviously unethical behavior in any academic context.

It was so incredibly absurd that the hoax ought to have been obvious from the abstract; that the journal editors and reviewers could not recognize the hideously obvious hoax right in front of their noses is the scandal.

That would make social sciences half-bullshit and half-obvious :)

“Are the social sciences merely dressed up pseudoscience?”


The ones making conjectures divorced from a strong grounding in biological purpose likely are.
