An erroneous paper on religion and generosity is retracted (psychologytoday.com)
213 points by okfine 16 days ago | 158 comments

I think it's situations like this that result in society having such a hard time with 'science'. The term has been so heavily co-opted by fields that just don't have sufficient rigger for the term to hold weight. Yet, on various topics, we have this publicized attack of "you're a science denier!". At the end of the day, there are two 'types' of science, one where I can take the results and make accurate predictions, and one where I can't. The latter just amounts to 'our best guess' where the accuracy is entirely unknown. If we want the general populace to 'trust science', we need to stop calling the latter science. In short, if you can't repeat and predict, stop calling it science.

Scientists usually try to distance themselves by saying those are soft science or even pseudo-science. This leads to the embarrassment of the demarcation problem[1] which is that no one can give a bright-line rule[2] to distinguish between the "real" science and pseudo-science. All of the demarcation criteria that have been proposed (such as Popper's falsifiability[3]) are inadequate in one way or another. In particular, they don't seem to capture the reasons a scientist would give about why a particular nutrition or social science paper is bad. The scientist would say things like, "Well, your sample size is small and not representative of anything except psych undergrads, you didn't control for age or gender, the participants and experimenters weren't properly blinded, you tested 15 hypotheses and only reported the p-value for the one that was under 0.05, and even that is wrong because you didn't apply Yates's continuity correction on your chi-squared test, AND NONE OF THAT EVEN MATTERS because the effect size you report is too small to be of practical consequence!" Nothing in there about the hypothesis not being testable; yet this is the kind of stuff that really separates the wheat from the chaff.

So we're left with a "No True Scotsman fallacy" where we have to say that some science is "good" and some is "bad" and the only way to tell is to ask someone knowledgeable to evaluate each paper on a case by case basis. Not terribly useful to the layman.

And why do we want any kind of science to automatically get respect anyway? Good science is good because it's already been subjected to an incredible degree of scrutiny. It will hold up to a little more. The real problem is disingenuous, bad faith arguments which are allowed to dominate the conversation. The real problem is to teach the general public to distinguish between sincere, good faith arguments and patent bullshit. This is much more difficult than it sounds because bullshit can easily conform to any merely superficial characteristics.

[1]: https://en.wikipedia.org/wiki/Demarcation_problem

[2]: https://en.wikipedia.org/wiki/Bright-line_rule

[3]: https://en.wikipedia.org/wiki/Falsifiability
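Two of the complaints in the comment above are easy to put numbers on. The following is a minimal Python sketch, not anything taken from the retracted paper: it assumes the 15 tests are independent and that every null hypothesis is true, and the 2x2 counts are made up purely for illustration. The function names are hypothetical.

```python
import math
from typing import Tuple


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests
    when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_tests


def chi2_2x2(a: int, b: int, c: int, d: int, yates: bool = True) -> Tuple[float, float]:
    """Chi-squared statistic and p-value (1 degree of freedom)
    for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates's continuity correction: shrink |ad - bc| by n/2, never below zero.
        diff = max(0.0, diff - n / 2.0)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # 15 tests at alpha = 0.05: roughly a 54% chance of at least one spurious "hit".
    print(f"family-wise error rate: {familywise_error_rate(15):.3f}")

    # Made-up counts, only to show that the correction raises the p-value.
    for use_yates in (False, True):
        stat, p = chi2_2x2(12, 8, 6, 14, yates=use_yates)
        print(f"yates={use_yates}: chi2={stat:.3f}, p={p:.3f}")
```

With small counts like these, the corrected test is noticeably more conservative than the uncorrected one, which is why skipping it can push a borderline result under 0.05.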

Why not have a general checklist with a minimum set of requirements for scientific papers that are relevant across all branches of science? The people putting their names on the paper would have to show that they followed everything or give reasons for skipping a step. The receiving journals would have their editors re-check the checklist. As part of reporting the results of the paper, the level of completeness of the checklist would also be in the report.

Yes, the checklist would not be all-encompassing or foolproof, and there would likely be revisions to the checklist, and maybe even domain-specific variants, but it would be an extra level of caution that the media could report or choose to ignore at their will. Over time, the apparent level of scientific rigour would improve. No, it’s not bulletproof and there will be people who will try to meet the checklist and still present erroneous conclusions as reliable science, but it would be an improvement in the status quo for a layperson who is aware and values said checklist.

There are things like CONSORT[1] which kind of do this. Statisticians like Fisher[2] have a ton of good general advice on the design of experiments. (A plug for The Lady Tasting Tea[3] and the 7 Pillars of Statistical Wisdom[4] feels appropriate here.)

On the whole though, most of the things you should and should not do are so domain specific it's very hard to give much useful advice at the level of "all science." Right now this seems to work because researchers are so eager to anticipate objections and avoid unnecessary arguments during peer review that they stick slavishly to the same methods used by seminal papers in their field, and this has the same effect as running down a checklist.

There probably is a case to be made for using an actual checklist, though[5].

[1]: http://www.consort-statement.org/

[2]: https://en.wikipedia.org/wiki/Design_of_experiments#Fisher's...

[3]: https://en.wikipedia.org/wiki/The_Lady_Tasting_Tea

[4]: https://www.goodreads.com/book/show/27311742-the-seven-pilla...

[5]: http://atulgawande.com/book/the-checklist-manifesto/

> which is that no one can give a bright-line rule[2] to distinguish between the "real" science and pseudo-science

"if you can't repeat and predict, stop calling it science" seems like a nice bright line.

Peer-review obviously isn't enough, I'd like to see peer-replicated studies become a thing.

What about astronomy, cosmology, archeology, paleontology, volcanology, evolutionary biology, cladistics, macroeconomics, etc., which do not allow us to "repeat and predict," as you say?

Some of these cases can be rescued by considering "retrodiction"[1][2] as a valid substitute for prediction in the right circumstances, but not all.

I personally think the analysis of the Mott problem[3] points the way to the solution to some of these kinds of issues. That is, a prediction can take the form of a likelihood function which assigns high probabilities to certain combinations of events and low probabilities to others. Theories with low perplexity[4] can be considered correct even if they can't make predictions, and the study of such theories can be scientific. But as far as I know I am the only one who thinks so.

[1]: https://en.wikipedia.org/wiki/Retrodiction

[2]: https://afdave.wordpress.com/2007/09/04/sir-karl-popper-and-...

[3]: https://en.wikipedia.org/wiki/Mott_problem

[4]: https://en.wikipedia.org/wiki/Perplexity
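To make the perplexity idea in the comment above concrete, here is a small Python sketch. It assumes each "theory" is just a list of the probabilities it assigned to the events that were actually observed. The two theories and their numbers are hypothetical, invented only to show the comparison.

```python
import math


def perplexity(probabilities):
    """Perplexity of a model, given the probabilities it assigned
    to the events that actually occurred (lower is better)."""
    if not probabilities:
        raise ValueError("need at least one observed event")
    if any(p <= 0.0 or p > 1.0 for p in probabilities):
        raise ValueError("probabilities must lie in (0, 1]")
    mean_log_likelihood = sum(math.log(p) for p in probabilities) / len(probabilities)
    return math.exp(-mean_log_likelihood)


if __name__ == "__main__":
    # Hypothetical likelihoods two rival theories gave to the same four observations.
    theory_a = [0.6, 0.5, 0.7, 0.4]
    theory_b = [0.1, 0.2, 0.1, 0.3]
    print(f"theory A perplexity: {perplexity(theory_a):.2f}")
    print(f"theory B perplexity: {perplexity(theory_b):.2f}")
```

A theory that consistently assigns high probability to what was actually observed ends up with the lower perplexity, even if it never issued a forward-looking prediction.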

> What about ... which do not allow us to "repeat and predict," as you say?

"if you can't repeat and predict, stop calling it science" seems like a nice bright line.

For what it's worth, there is science meeting the "repeat and predict" definition that can be done in each of the fields you list.

Astronomy / cosmology absolutely makes testable predictions.

Not that I necessarily agree, but the suggestion was to find a different word for these fields.

(Also, most of the fields you listed do allow for replication.)

I think a lot of what you are talking about is due to a huge amount of 'papers' being nothing but correlation tests. Again, I think terminology is of utmost importance when discussing public perception so I would rather not talk about this as 'science'. It's 'research'.

I don't see a "No True Scotsman fallacy" here because I think I define it quite clearly: Can you provide accurate predictions? I'll concede that there exists a bit of grey area in that question, but the answer is heavily bi-modal.

even statisticians call social science - psychology especially - a pseudo science

For me personally, situations like this make me trust science that much more. If someone finds that a previously published paper contained errors, as any human work is likely to, and the response is to go back and correct those errors or publicize whatever was wrong with the research, well that's science working as intended. That's freaking awesome.

What I get really, really, really frustrated with is the attitude that some people have where retractions of stories or articles makes them trust the source less, rather than more. If a journal publishes a retraction, I have that much more faith in that journal, and that much more faith in the scientific process in general.

Similarly: news outlets retracting stories or issuing corrections makes me trust those outlets that much more, because it shows that they care about making a best effort to present truth, and they care more about their reputation as being a source of facts than they do about whatever short-term backlash there might be.

Any media outlet, journal, web site, or other publication that doesn't regularly issue retractions and corrections is not to be trusted. It'd be better if the erroneous information didn't get out there in the first place, but there's no scenario where everybody gets everything right the first time, all the time- it's going to happen.

You can really see the different perspectives highlighted from people posting to social media in response to things like changing estimates of the age of the universe. There's always a contingent of outraged morons screaming stuff like, "now they're saying it's 5 billion years older than they thought it was before, and people continue to trust them?!? All those scientists are such hypocrites for doubting my belief in anti-vax/Austrian economics/chemtrails/Noah's Ark/etc., how dare they!" They act like changing your mind based on new data, or admitting you were wrong about something is a sign that you shouldn't trust someone, whereas I would say the ability to constantly revise your beliefs is a fundamental requirement for trusting someone's judgement. The inability to do so is a reason to not trust anything someone says.

Scientists bear a public responsibility. Studies like these are used to justify public policy, opinion pieces, etc. If a civil engineer makes a mistake that impacts the public (say a malfunctioning building structure) that could have been prevented were it not for lack of controls, he or she would be investigated, and potentially be basically blacklisted in their field. If a doctor made a negligent mistake because he was unaware of how to use a tool and a patient was injured, he would be held accountable.

In this case, social science analyses are used to answer questions of major public importance. Governments are constantly trying to reduce suicide rates, and make their populaces happier. Papers making claims that religion makes children selfish and unhappy are used to make public policy. In this case, if this paper were used as justification for legislation, we now know that the policies it would tend to suggest would be bad for the population. Someone, somewhere probably ought to be held accountable in the same way as any other professional. I do hope that journals take appropriate precautions with this researcher in the future, and that the peer reviewers assigned to this case are duly sanctioned. This is a complete failure as professionals.

There are situations in which scientists can and do get things wrong through no fault of their own. For example, during the highly publicized EM drive tests a while back, an initial NASA report indicated that thrust was observed after careful evaluation. This is fine... they reported what they saw. However, after some additional engineering and measurement tuning and stronger sensors, the thrust was attributed to another source, so the claims were retracted. This is science. At every step the scientists demonstrated competence and professionalism. Nowhere did anyone say 'oops we forgot to use the sensor the right way that's why it didn't work, and in the meantime our paper was used to engineer other solutions'. There is a fundamental difference between being wrong and misrepresenting what you saw, whether through mistake or ignorance.

I've always thought this is what gives science a strong position. Science can always reverse itself and change positions (180 degrees if needed) to support the latest findings.

Politics, religion, etc. don't have the same advantage.

Well then, I'll create a journal which constantly publishes garbage and instantly retracts 99% of it. By your logic, the remaining 1% should be ultra, iron-clad trustworthy! :)

So here's the issue. There is a segment of the population that is skeptical and a segment of the population that is not. When the skeptical segment expresses distrust of a scientific paper and names specific grievances, oftentimes the unskeptical population will accuse them immediately of being 'anti-science'. This is a major problem, because when -- inevitably -- some problem in a paper is acknowledged by the wider 'science' community, the unskeptical 'pro-science' people who have framed the entire issue as being 'for' or 'against' science automatically make the less educated skeptical people all the more skeptical of science, until they actually become the caricature they were believed to be.

The solution to this of course is for everyone to remain more skeptical when it comes to science, and -- for the times when you do have reason to believe particular studies -- to respond in good faith to other skeptics, rather than to resort to name calling. If you are unable to defend why a certain piece of research should be believed, you should probably either accept that you are not educated enough to be able to comment and thus are also not doing 'science', or that the guy you disagree with may have a good point and the researcher in question bears the burden of proof.

On its own, I think scientists making bad research and then retracting it would not cause people to distrust science. The 24-hour pop science news cycle combined with the massive rise of scientism as a religion (and the skeptics the equivalent of a medieval atheist) has.

> There is a segment of the population that is skeptical and a segment of the population that is not.

I think you missed the biggest segment: not paying attention. It's for these folks where terminology and clarity is important.

I'd divide the world into more than that. For me most of the world is superstitious (I realize that's my judgement). In my experience 95% of the people I meet believe in homeopathy, acupuncture, aroma therapy (not just that things smell nice but that they have strong medicinal benefits), spirits, ghosts, astrology, blood type personality, supreme beings, MSG causing headache, changes in weather causing colds, air conditioning making people sick, blowing fans causing sudden death, and a million other discredited things. Some of those people overlap with the type of skeptics you mention but most do not.

That's not the problem here.

A statistical error was made, published, and then corrected.

That's possible in physics, chemistry, or biology just as well.

> That's possible in physics, chemistry, or biology just as well.

Yet it seems to happen far less often, or at the very least, in less publicly impacting ways. Part of this is the media, but part of it is the participants themselves. If you discover something new in physics for example, either the public doesn't care, or they don't really know, they just get a faster iPhone processor a couple years down the line when the predictions hold. The folks in your field are even skeptical at first glance, "do you have a 6 sigma result?". "Ok, well, let's talk then, but I still wanna see it reproduced". In psychology, on the other hand, some person does a half assed 'study' and uses it to claim knowledge of some important aspect of humanity.

I'd really encourage looking at the Higgs discovery press conference as a perfect example. To my recollection, there was little to no mention of the Higgs, just cold hard facts, perhaps at the end there was a 'this is consistent with the Higgs'. Only months later were many of those involved even comfortable enough with their level of certainty to really say 'this is the Higgs'. They are searching out knowledge, and don't want to declare having found it unless they are certain. This is _good_, it's what we want science to be, the summation of our current, highly confident, view of the world. We don't want 'science' to encompass all untested and unproven hypotheses about the world.

The difficult point which I have to concede is that terminology is important. Certainly many folks in physics that maybe haven't hit on a predictive result would like to be recognized as scientists, and rightly so if they are on the path toward this endeavor. But the populace is simple, they want clearly defined words, they want 'science' to be known fact. Outside introducing new words, I don't know how to resolve this. If we don't define 'science' as denoting that which is rigorous, then we can't use "science denier" as a term, regardless of the topic.
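For a sense of scale on the "sigma" thresholds mentioned above, here is a short Python sketch that converts a number of standard deviations into a one-sided tail probability. It assumes the test statistic is normally distributed. The function name is hypothetical, and the threshold usually quoted for a particle physics discovery is 5 sigma.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Upper-tail probability of a standard normal variable
    exceeding n_sigma standard deviations."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


if __name__ == "__main__":
    # 1.645 sigma is roughly the one-sided p < 0.05 bar common in social science.
    for n_sigma in (1.645, 3.0, 5.0, 6.0):
        print(f"{n_sigma:>5} sigma -> p = {one_sided_p_value(n_sigma):.3g}")
```

The gap between the conventions is several orders of magnitude: about 0.05 at 1.645 sigma versus about 3 in 10 million at 5 sigma.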

Your argument seems to be purely emotional disdain.

> Yet it seems to happen far less often

Citation needed. And if so, that is a reason to draw a line where on one side is "not-science"? That is just absurd. Does car A stop being a means of transport when it subjectively breaks down more often than car B?

> very least, in less publicly impacting ways

Excuse me?

* A literal century ago someone failed to translate a German study correctly, so now just about every child in Western civilization gets a good dose of distrust in science when they get indoctrinated that the tongue has separate regions for taste, which is ridiculously easy to refute for yourself in about 15 seconds.

* The coup of the cereal industry to fund some studies telling everyone that breakfast is the most important meal of the day, which is still misguiding health guides today.

* Schrödinger telling the world how stupid it would be to assume quantum principles in the visible world, still happily recited with the complete opposite meaning by about 500 media entities per day.

* Scientific entities failing to have any impact on people about the dangers of X-rays until people got impotent from having their shoes measured via X-rays in the local shopping mall

To be clear, this is not intended as some sort of smear campaign against science itself. I want to illustrate that all science is vulnerable to even dumb mistakes and that this dumb "social sciences ain't real" meme is only slowing down a much overdue conversation!

You're supporting my point here, all your examples did not make an accurate prediction. Thus, they are not known 'facts'.

Your definition of 'science' is 'the best we know', mine, and I think what is meaningful for public discourse is 'this is true'.

Sure, but what's your definition of "true"?

Derivable from first principles? General consensus? Observed once and seems to fit with the current model?

"One idea is truer than another if it allows us to explain and understand more of our experience.

The idea that the sun and stars move around the Earth explained only why they move across the sky, but the idea that the Earth orbits the sun while rotating on its axis is more true, because it explains also why we have seasons. Strictly speaking, however, we will never know whether the Earth really revolves around the sun; another, even truer, theory could conceivably come along.

In support of his view, James pointed out that in practice all scientific theories are approximations. Rarely, if ever, does one theory explain all the facts of experience. Instead, one theory often does well with one set of phenomena while the other theory does well with another set.

A scientific theory that explains more is truer than one that explains less, and the truer theory is preferred. Kuhn might add that even a paradigm that explains no more phenomena than a rival but explains those phenomena better is preferred—as for example Copernicus’ heliocentric model of the solar system was preferred to Ptolemy’s geocentric model, because Copernicus’ model was simpler and more elegant than the cumbersome epicycles of Ptolemy’s model, even though at the time the two models fitted astronomical data about equally well. If scientists prefer theories that explain more phenomena and paradigms that make more sense of our experience more plausibly, then the progress of science no longer seems so unreasonable. It is the result of selection, the exercise of scientists’ preference for theories and paradigms that make better sense of our experience."

Taken from the book "Understanding Behaviorism" by William M. Baum.

I don't really know how to make it more clear: Can I make accurate predictions from it? That's it.

I tend to agree. There is a plausible hypothesis to explain it too. Namely, that many social scientists don't particularly like doing statistics or calculations or that they are not naturally very good at it or that they are not very conscientious. None of these factors contribute to reliable analyses. People mostly go into social sciences when they are more interested in people than in statistics or calculations.

A study friend of mine who studied an exact science ended up working in the social science department where he also was asked to teach some research methods class. Suffice it to say that even PhD students are not very good at it. Even at things where you would suppose they would be closer to their core competency, i.e., they were not that good at avoiding the pitfall of putting leading questions in a questionnaire.

> Yet it seems to happen far less often, or at the very least, in less publicly impacting ways.

Does it? Do you have any data to indicate that, or even a good way to define your terms?

This is not simply a matter of social science being soft. This was caused by exactly 1 thing: researchers not releasing the code they write or use. Their code was bad, and it produced an incorrect result. There is probably TONS of this in research from the past 30 years, and most of it goes uncaught because, for some insanely unwise reason, the scientific community has tolerated researchers keeping their procedures, tools, and experimental processes secret... but only when software is involved.

Science isn't just a matter of what people believe. It often gets considered when governments create policy. Bad science can kill millions. I've run into many 'bugs' in the algorithm descriptions published in computer science and mathematics papers, I shudder to think what the code backing nontechnical research is like.

How does your criticism apply to this scenario? The conclusion of the study (if correct) would in fact provide some predictive power: one would be able to make some predictions about a random child's generosity given their religious upbringing. It should also be repeatable, assuming the methodology for gathering and analyzing the data was published.

The experiment may be repeatable, but that does not mean it's reproducible or even useful. Many of these "soft science" fields are plagued by

1) replication failures

2) scientific errors both accidental (coding errors) and intentional (p-hacking)

I think part of it has to be physics envy, where many disciplines are attempting to adopt either the statistical techniques used by physicists or at least similar language and ideology in order to conduct "experiments" in fields as varied as sociology and psychology.

This seems like cargo cult science. Who says that, say, measuring the tendency of churchgoers to donate to a cause will reveal a truth similar to the mass of an electron -- something that can be accurately measured once and then you know a good measurement will return the same result? It's not at all clear that any real insight is gained of either philanthropy or church attendance from such an "experiment", nor is it clear to me that these fields contain statements with truth values at all, at least truth values as would be expected by mathematicians, physicists, chemists, etc.

But by adopting the methodology of science to fields which may not have scientific results to yield, a lot of people are creating a body of fake knowledge, or the appearance of knowledge.

They didn't confirm the model worked by testing the predictions. No successful predictions = not worthy of being considered as 'fact'.

Have you never tried to use a tool from a cs paper and found that their code didn't even build or barely worked? This isn't a question of rigorous and non-rigorous fields. This is a challenge with artifact evaluation in all fields.

I'd encourage you to read some of these 'papers', there is often nothing to evaluate, nothing to make predictions on. Usually, just a survey and a statistical correlation test.

I think the bigger issue is intellectually lazy readers and media which don't really care to understand nuances or spend any effort reviewing and fact checking things.

You can't control labels, no one can control what is labeled science or not, we just need to educate people, demand more of our media, and breed a culture that takes more pride in truth, accuracy, precision, and intellect.

> I think the bigger issue is intellectually lazy readers and media which don't really care to understand nuances or spend any effort reviewing and fact checking things.

Sure, I don't disagree, but do you have any path toward this 'enlightenment'? I mean, the average college attending student has a literacy level of about 7th grade. How would you propose we move from there to a point where even the average, much less most, voting age adults can not only digest a paper but also be capable of poking holes in its reasoning?

> You can't control labels

Of course you can, it's what politics and marketing are all about.

I partially agree. Perhaps your comment is related to Breiman's Two Cultures? Paper: https://projecteuclid.org/euclid.ss/1009213726

I think you're right that they are two different things, but I'm not sure that the latter isn't science.

Perhaps the real problem is that reality does not punish being wrong and reward being right enough.

Another thing that needs to stop is the overuse of the word "theory" where "hypothesis" is more accurate. But everyone wants to be a scientist with big ideas.

I think you're looking for the word rigor

Computer Science


uh... rigour

Rigour is the correct spelling in the UK.

that's what i thought, google only showed the medical definition at first, and i trusted it

"The paper received a great deal of attention, and was covered by over 80 media outlets including The Economist, the Boston Globe, the Los Angeles Times, and Scientific American."

And how many of these will cover the retraction? A dozen at most? And all those articles will be sitting out there, getting cited and read on occasion.

Really sad to see and feels like it's becoming more common (maybe just because I'm paying closer attention). If it fits the narrative, accept first, retract later. It would be interesting to see view statistics on the original article vs the retraction.

It's not just narrative fitting. There is also a strong bias towards publishing results that seem surprising because that gets more readers. Of course, that also biases toward wrong results because wrong results are likely to be surprising.
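The claim that surprising results are more likely to be wrong can be illustrated with a standard Bayes calculation. This is only a sketch: it treats "surprising" as "low prior probability that the effect is real", and the power (0.8), alpha (0.05), and priors are assumed values chosen for illustration.

```python
def positive_predictive_value(prior: float, power: float = 0.8, alpha: float = 0.05) -> float:
    """Probability that a statistically significant finding reflects a real effect,
    given the prior probability that the tested hypothesis is true."""
    true_positives = power * prior            # real effects that reach significance
    false_positives = alpha * (1.0 - prior)   # null effects that reach significance anyway
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical priors: an unsurprising hypothesis versus a surprising one.
    for label, prior in (("unsurprising", 0.5), ("surprising", 0.05)):
        ppv = positive_predictive_value(prior)
        print(f"{label} (prior={prior}): P(real | significant) = {ppv:.2f}")
```

Under these assumptions, a significant result for the unsurprising hypothesis is real about 94% of the time, while one for the surprising hypothesis is real less than half the time.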

"wrong results are [more likely] to be surprising".

This case is interesting because there's a large population who would find these (unproven as it turns out) results confirmatory rather than unexpected.

In the end they were neither.

Narrative fitting and surprise bias are largely orthogonal biases. Both are at work in this case. In the western press it is trendy to paint religion in a negative light, which is the narrative the paper reinforces. Imagine a bogus paper painting in an unfavorable light one of the western virtuous identities: not white, not religious, not male, not hetero. For example, "The Negative Association between Atheism and Children’s Altruism across the World". Or "The Negative Association between Homosexual Parents and Children’s Altruism across the World". Such papers will never be covered by over 80 media outlets without questioning, in spite of being surprising with respect to the narrative.

It's not sad, it's great.

The history of science is full of drama where most issues took multiple generations to resolve. It's easy to forget, in those interim periods, people would build all kinds of castles on total BS all the time that cost society in so many ways.

Today stuff gets resolved faster and that's a good thing. People, qualified or not, who get carried away by hype or bias look foolish much much faster. And thanks to how hard it is to erase mistakes from the internet good luck rebuilding lost cred.

This is a very reassuring viewpoint. I think the onus of acceptance relies on each individual perception and how they can relate to such facts. I will disagree on your usage of 'foolish' because I don't think it's in anyone's best interest to declare foolishness, but to paint a holistic, optimistic picture of a realistic future. It is my opinion to embrace misconceptions about past mistakes that can have astounding detrimental effects and not to look down, but to unabashedly represent a truth and allow others to accept it.

I collected some headlines in 2015:

Study Shows Non-Religious Kids Are More Altruistic and Generous Than Religious Ones (TIME)

Religious children more punitive, less likely to display altruism (Boing Boing)

Study: Religious children less altruistic, more mean than nonreligious kids (Chicago Sun-Times)

Religious Kids Aren’t as Good at Sharing, Study Finds (Yahoo Parenting)

Study: Religion Makes Children Less Generous (Newser)

Religion doesn’t make kids more generous or altruistic, study finds (LA Times)

Religious children are meaner than their secular counterparts, study finds (The Guardian)

Being religious makes you less generous, study finds (metro.co.uk)

Kinder Without God: Kids Who Grow Up In A Religious Home Less Altruistic Than Those Without Religion (Medical Daily)

Surprise! Science proves kids with religious upbringings are less generous — and so are adults (rawstory.com, the Medical Daily story reposted with a new headline)

The article said that the retraction was covered by 4 outlets.

Boy how common and sad that is. Sensational findings on the front page. Correction saying sensational finding totally wrong buried on page ten or not published at all.

Which furthers the "fake news" narrative because now even more people can point out the BS on the front page.

Why don't reporters get angry about this and demand more and better retractions? This hurts their credibility more than anyone.

Apparently the journal published the original paper in 2015, published someone else's correction in 2016, and only published a retraction of the original paper just last month.

there should be a law that makes them keep reposting the story every year until they reach the numbers from the original release

I think there is no need to be concerned that the retraction won't be widely covered in the long run in this particular case though -- all the religious interest groups will make a big fuss about this...

That's almost like saying it's OK to falsely report about religion in the first place because "the religious interest groups will defend themselves adequately", etc.

False reporting of any sort (including the failure to correct prior false reports, whether intended or not) always causes harm. Truth and principles matter.

Furthermore, reducing ideas to power struggles between those who speak them is to commit both the genetic fallacy and the ad-hominem fallacy.

Why would an idea exist other than to serve a purpose, and why would that purpose be distant from the mindset which benefits the most from it?

> Why would an idea exist other than to serve a purpose

Because ideas are the substrate of thought, belief, motivation, desire, and human life itself. Without ideas we lack the ability to understand anything at all.

Not sure how that relates to my question. Why would an idea exist without someone to give it purpose?

Because it constitutes belief and/or knowledge.

Because it is true, or someone believes it to be so.

Because it carries explanatory power - something we all strongly desire.

You might call those reasons “purposes” - if so, I don’t disagree. But the truth or falsehood of an idea transcends any “purpose” someone may have for speaking it. This is the academic posture: to dispassionately evaluate truth claims without fearing the speaker.

Yes I understand that the naive position may be used as a default position for understanding ideas, and that is fine.

That doesn't stop us from acknowledging the fact that ideas and worldviews have strong ties and are intermingled enough that they almost always warrant an underlying motivation whether that be a noble search for truth or a way of digging further into denial.

A good religious institution should rise above the fray and not even bother fussing about something like this. No-one who would otherwise have accepted Christ would have rejected him based on that study. It's the religious equivalent of a political cartoon--it changes no one's mind, it just fills up space and generates clicks/citations.

"Then Abraham said to him, 'If they do not listen to Moses and the prophets, they will not be persuaded even if someone rises from the dead.'" (Luke 16:31)

> No-one who would otherwise have accepted Christ would have rejected him based on that study.

It may be true that reading about the study would not have convinced anyone to become a practicing Christian. But there were undoubtedly people who were on the fence about going back to church, either for themselves or for their kids, who decided not to based on the purported conclusion that it made kids less generous. So just because the study alone might not have convinced anyone to follow Christ, there still might be more practicing Christians if the study had been done correctly (and therefore garnered very little coverage) or if the correction had been covered as widely as the original flawed study.

Releasing your data should be a requirement for publication. If the original author had wanted to keep this a secret he could've withheld his data and nobody would've been able to correct him, there simply would've been discrepant studies.

I see where you're coming from, but would the subjects be comfortable with all their data becoming public?

I think you could, and should anyways, make the data anonymous. Just give every participant a GUID for a participant ID and add a step to purge personally identifiable information. Then you can share records without identity.
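A minimal sketch of that idea in Python, assuming the records are plain dictionaries. The field names (`name`, `email`, `address`, `stickers_shared`) are invented for illustration and are not from the study:

```python
import uuid

# Hypothetical field names; a real study would maintain its own list of PII columns.
PII_FIELDS = {"name", "email", "address"}

def anonymize(records):
    """Drop direct identifiers and attach a random participant ID to each record."""
    cleaned = []
    for record in records:
        row = {k: v for k, v in record.items() if k not in PII_FIELDS}
        # uuid4 is random, so the ID is not derived from the person's identity.
        row["participant_id"] = str(uuid.uuid4())
        cleaned.append(row)
    return cleaned

raw = [
    {"name": "Alice Example", "email": "alice@example.com",
     "country": "Canada", "stickers_shared": 4},
    {"name": "Bob Example", "email": "bob@example.com",
     "country": "Turkey", "stickers_shared": 3},
]
shared = anonymize(raw)
```

This only strips direct identifiers. The columns that remain can still single someone out when combined, so it is a first step rather than a guarantee.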

That didn’t work for the AOL research several years ago. https://arstechnica.com/tech-policy/2009/09/your-secrets-liv...

Making things like medical records actually anonymous, especially in the face of bad actors, is an unsolved problem.

Anonymizing data is, yes, a difficult problem, but aggregated data in particular can be, and has been, reliably anonymized. For example, the problem with this dataset would have been visible in aggregated data (e.g. aggregated by nationality).

Alas, these kinds of problems are not restricted to the social sciences. Case in point, this retraction from a couple of days ago: https://retractionwatch.com/2019/09/25/nature-paper-on-ocean... Very similar to this one really; the paper claimed to overturn our existing knowledge in a way that fitted a narrative people were inclined to believe (in that case: we're all doomed) and was immediately seized on by all the news sites because of it, except the statistics were mucked up and it couldn't show what it claimed to. The fact that it was so surprising should've been even more of a massive warning sign in that case though.

Interesting. Hadn't seen the Retraction Watch website before.

Wonder if it'd be possible to automatically scan new papers added to (say) arxiv.org, for retracted papers in their references?

eg to warn the authors, and maybe eventually automatically as part of the upload process for arxiv.org (and similar)
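A rough sketch of what such a check could look like, assuming you already have a list of retracted DOIs from somewhere and the DOIs cited by a new paper. The DOIs below are made up, and how you would obtain either list (from Retraction Watch, arXiv, or anywhere else) is left open here:

```python
def normalize_doi(doi):
    """Lowercase and strip common prefixes so equivalent DOIs compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi

def find_retracted(references, retracted_dois):
    """Return the references whose DOI appears in the retracted set."""
    retracted = {normalize_doi(d) for d in retracted_dois}
    return [ref for ref in references if normalize_doi(ref) in retracted]

# Made-up DOIs for illustration only.
retracted_list = ["10.1234/example.retracted.1"]
paper_refs = [
    "https://doi.org/10.1234/EXAMPLE.retracted.1",
    "doi:10.5678/example.fine.2",
]
flagged = find_retracted(paper_refs, retracted_list)
```

The hard part in practice is probably extracting clean DOIs from reference lists, not the lookup itself.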

Conclusion of the new analysis:

In sum, Decety et al. [1] have amassed a large and valuable dataset, but our reanalyses provide different interpretations of the authors’ initial conclusions. Most of the associations they observed with religious affiliation appear to be artifacts of between-country differences, driven primarily by low levels of generosity in Turkey and South Africa. However, children from highly religious households do appear slightly less generous than those from moderately religious ones.


From the article:

Although Decety’s paper had reported that they had controlled for country, they had accidentally not controlled for each country, but just treated it as a single continuous variable so that, for example “Canada” (coded as 2) was twice the “United States” (coded as 1).

I mean I don't even understand how this seemed like a normal thing to do?

The variable for Country should have been treated as a categorical variable, but was instead processed as a numeric variable.

This mistake would be downright trivial to make in R. Just declare that Country is a Factor (which is the built-in type for categorical variables), and then throw the data into a library whose attitude towards errors is to coerce everything to numbers until the warnings go away.

Background: Factors in R are the idiomatic way to work with categorical data, and they work somewhat like C-style enums except the variants come from the data rather than a declaration. So if you take a column of strings in a data frame and cast it to a Factor, it will generate a mapping where the first distinct value is coded as 1, the second distinct value is coded as 2, etc. Then it replaces the strings with their integer equivalents, and saves the mapping off to the side.

I forget the exact rules (if there are rules, R is a bit lawless), but it's not very hard to peek under the hood at the underlying numeric representation. Many built-in operations "know" that Factors are different (e.g. regressing against a Factor will create dummy variables for each variant), but it's up to each library author how 'clever' they want to be.
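To make the failure mode concrete, here is a toy version in Python rather than R. `make_factor` is a hypothetical helper that mimics the description above (codes in order of first appearance); it is not R's actual implementation:

```python
def make_factor(values):
    """Toy factor: integer codes plus a mapping kept 'off to the side'."""
    levels = []
    for v in values:
        if v not in levels:
            levels.append(v)
    # 1-based codes, matching the description in the comment above.
    codes = [levels.index(v) + 1 for v in values]
    return codes, levels

countries = ["US", "Canada", "US", "Turkey", "Canada"]
codes, levels = make_factor(countries)

# A careless consumer that only sees the codes will happily do arithmetic on them,
# even though "average country" means nothing.
mean_country = sum(codes) / len(codes)
```

Nothing errors out here, which is the problem: once the labels are gone, the codes look like any other numeric column.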

This makes the most sense to me. I don't work in the dataframes world but without this explanation it seemed like someone would have to go out of their way to make that error.

Right then...

...strong typing: for or against?

(To be fair even strong typing won't save you if you don't use it. But fuuuuuk, what an error. I noted that paper mentally and would have quoted from it)

Yup, I'm all for extremely strong typing. In 40 years of writing code I can't say I've ever had any real trouble with strong typing other than when dealing with libraries that reinvent wheels. Weak typing, though--nuke it from orbit.

It's really very easy to do, roughly the stats equivalent of declaring a variable signed instead of unsigned.

Many algorithms work on both categorical and continuous variables, with different results depending on the variable's type.

At the risk of embarrassing myself statistically, what exactly happens when you do this?

I.e., if you're controlling for country, that means you're bucketing by country, and looking at each subset, right? So if country is represented by a non-discrete value... what exactly happens?

So let's pretend there's three types of trees we want to study: Oak, Maple and Aspen, which we code as 0, 1, and 2 for reasons (there are some good reasons to do this).

Statistically, if you treat them as a continuous variable, the estimates you get will act like there's an ordering there, and give you the effect of a one unit increase in tree. So it will report a single effect for Oak vs. Maple and for Maple vs. Aspen, assuming those two steps are the same size and that Oak vs. Aspen is twice that.

This is...nonsense, for most categorical variables. They don't have a nice, ordinal stepping like that.
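A small worked example with invented numbers, using a hand-rolled least-squares line so it runs without any libraries. The growth values are made up so that Maple is the tallest group:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Oak = 0, Maple = 1, Aspen = 2; two invented measurements per tree type.
tree_code = [0, 0, 1, 1, 2, 2]
growth = [10.0, 10.0, 30.0, 30.0, 20.0, 20.0]

intercept, slope = fit_line(tree_code, growth)
names = {"Oak": 0, "Maple": 1, "Aspen": 2}
fitted = {name: intercept + slope * code for name, code in names.items()}

# What the data actually say: the mean for each tree type.
group_means = {
    name: sum(g for c, g in zip(tree_code, growth) if c == code) / tree_code.count(code)
    for name, code in names.items()
}
```

Here the line predicts 15, 20, 25 for Oak, Maple, Aspen while the real group means are 10, 30, 20. The model is forced into equal steps, so it cannot represent Maple being the highest.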

In short, ANOVA is usually what you want to do: https://en.wikipedia.org/wiki/One-way_analysis_of_variance

In practice, if you have n countries, you'll add n-1 binary variables to your regression equation. The first country is the reference level (all zeros), for the second country set the first new variable to one, the rest to zero, etc.
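A sketch of that encoding in plain Python. `dummy_code` is a hypothetical helper written for this example; stats packages normally do this for you when the column is marked as categorical:

```python
def dummy_code(values, reference=None):
    """Turn a categorical column into n-1 binary columns.

    The reference level gets all zeros; every other level gets a single 1.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]
    others = [lv for lv in levels if lv != reference]
    rows = [[1 if v == lv else 0 for lv in others] for v in values]
    return others, rows

columns, rows = dummy_code(["US", "Canada", "Turkey", "US"], reference="US")
# columns -> ["Canada", "Turkey"]
# rows    -> [[0, 0], [1, 0], [0, 1], [0, 0]]
```

One level is dropped because, in a regression with an intercept, a full set of indicator columns would sum to the intercept column and the coefficients could not be separated.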

So one-hot encoding, plus one "none-hot" base case. Why not just one-hot for all? To save one input?

You've never had a bug in your code that seemed insane after the fact?

This is why we have code review processes. It's long past time for that to be part of formal scientific peer review.

A few CS conferences have artifact evals. Almost all research in CS doesn't actually have any sort of code review at all. No field is implementing the thing you are expecting.

I think it will be an uphill battle no doubt, but I think the only alternative would be to share the whole dataset and have reviewers re-implement the analysis to confirm the results. That would also be a huge improvement, but it seems like a much bigger burden on reviewers.

Well theoretically it already is, given you normally have multiple authors and reviewers. It's just done poorly, just as a code review can be done poorly.

This seems like 101, 'another pair of experienced eyes would catch this' kind of malpractice.

Analyzing data in academia seems like a disaster. It's almost guaranteed to produce errors like this.

You have:

- people with no coding experience and, in some cases (especially in social sciences), a strong aversion to math

- code that isn't unit tested, so answering the question, "Did it run correctly?" is often softened into, "Does this look plausible to me?"

- a strong incentive to end up with certain results

I dated a quantitative geneticist for a while, and her coding education was almost zero. She was writing code in R and essentially just changing lines until the output "looked right". It was insanely complicated math, so there was no way to make sure the output was good. The code had to be an exact match for the algorithm she had written out in mathematical notation, and there was essentially no chance of that.

It got worse. She'd write the algorithm in R and then end up with batches that would take, in some cases, years to finish running. Obviously she ended up with even more dubious hacks.

(For anyone curious, she's had a fairly decorated academic career under an acclaimed advisor who reviewed all of this code to some extent, and she's worked with most of the top genetics programs in the US.)

Maybe that can't be true as it took them 3-4 years to close this error?

Simple...data representation is not the same as data meaning.

I teach an introductory stats course and we hammer this in. Categorical data are often represented as numbers or other short indicators for storage purposes. Typically, for multiple choice, the encoding follows the order of the choice options.

I not infrequently see average of gender because male = 0 and female = 1 or vice versa and someone generates a table without thinking.

The bigger issue here seems to be the use of ordinals in the data collection process. For instance, a lot of my CSVs don't have them and R and pandas are perfectly capable of enumerating. Why do you even need to put ordinals in the dataset? Does excel want this sort of thing or something?

You don't need to, people just do... I've kinda always assumed it connects to the olden days and efficient storage and memory use. "Male" is a four-character string, 1 is an integer.

Aha! Now we see why there are so many transgendered people these days! :):):)

Agreed, but maybe you have to assume that the scientist knows very little about coding for data science, which is effectively what we're talking about here.

I think a major contributing factor to problems like this is people going into the soft/social sciences being more likely to be math/stats AND programming averse. Meanwhile, all sciences continue their long term trend towards applied math via programming. This leads to people using the math/stats via code without understanding very well what it is they are doing, and, naturally, the end result is lots and lots of mistakes.

Social sciences programs require students to take statistics courses. That's no guarantee that statistics will be correctly applied.

Or that the material will be taught effectively. Or that the students are contextually prepared to understand the material at that point. Or any of a number of other things that go wrong when people suggest that education is the solution/root to a very hard problem :)

Yeah, that goes beyond sloppy and into negligence.

Another, more consequential paper that popped up on HN recently has been retracted as well. The one about ocean warming. The retraction notice is a masterclass on weasel language, worth reading in its own right.


Sometimes I feel weird coding zip codes as strings but this is a great example why. If my program ever treats a zip code like a number I would like it to throw an error. At least in this case the error looks like an accident.

On topic, from yesterday: https://news.ycombinator.com/item?id=21067764

It's another social sciences paper but in this case a co-author has requested a retraction over his strong belief that the paper includes fabricated data. The retraction request has been denied. It differs from this paper in that the data anomalies look intentional.

My rule of thumb is that anything that’s not part of a calculation will be a string. Any chance I ever want to multiply a zip code? Doubtful. String, it is.

> If my program ever treats a zip code like a number I would like it to throw an error.

One interesting thing you can do though, is sort by zipcode. This sorts your mail from East to West in the US. You can use that as a rough estimate of shipping time.

That's really cool, even if I never find a place to use it.

Anyway you can still sort strings.

So? You make sure the zip code is left zero filled and the strings sort fine. Admittedly, numbers would sort a bit faster.
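For what it's worth, a quick Python sketch of both points: zero-filled strings sort in the expected order, and arithmetic on them fails loudly. The zip codes are just sample values:

```python
# Zip codes that were (wrongly) stored as integers lose their leading zeros.
zips_as_ints = [2134, 90210, 501, 60614]

# Restore the five-character form; str.zfill pads on the left with zeros.
zips = [str(z).zfill(5) for z in zips_as_ints]
sorted_zips = sorted(zips)  # plain string sort now matches numeric order

# Adding a number to a string zip code raises instead of silently computing.
try:
    "02134" + 1
    arithmetic_allowed = True
except TypeError:
    arithmetic_allowed = False
```

Note that Python would still accept `"02134" * 2` as string repetition, so strings are a guard rail rather than a complete type check.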

Non-US postal codes often include letters and spaces, and some areas don't use postal codes at all.

All US postal codes have an optional - (hyphen) in them as well, and so you should always encode them as strings anyway.

I wrote a webpage in 2015 about the Decety paper, what it was and how it was presented in the media, which might be of interest. The paper seemed highly suspect in various ways even at the time. I added an update in 2017 that "new analysis shows that the original study failed to adequately control for the children’s nationality". On the (unfinished) page I'm using the paper as an excuse to teach myself basic statistics and research methods.


so a categorical variable got mixed up as a numerical one and produced misleading results.

to the credit of the authors, they released their data sets. -- however, i suspect that proper data exploration and visualisation would have prevented all this. visual inspection would have most likely revealed that there is no visible effect, or even an effect in the opposite direction, and once you see this, all alarm bells should go off if your model predicts otherwise. so i suspect that the authors skipped some basic steps and got carried away by results that promised a nice headline.

Agreed. However, another factor may be that we tend to scrutinize more closely results that don't line up with what we expected. In fact, in this case it was another researcher, who had gotten results that pointed in the opposite direction, who convinced the original researcher to release their data (which, to their credit, they did).

Which is one reason why having higher and higher percentages of academics and science researchers come from the same part of the political spectrum is worrisome to me. If you have more diversity in ideology, there is more likely to be someone in each field with the instinct to closely scrutinize a result which, when scrutinized, won't hold up.

There were even news sites that published articles about the original article AFTER the retraction was announced! The state of science reporting is very sad.

"But when they included their categorically-coded country (1 = US, 2 = Canada, and so on) in their models, it was entered not as fixed effects, with dummy variables for all of the countries except one, but as a continuous measure. This treats the variable as a measure of ‘country-ness’ (for example, Canada is twice as much a country as the US) instead of providing the fixed effects they explicitly intended"

How did this not get caught immediately? If I did a study and found out that kids in Zambia are 47 times as generous as American kids, that'd make me instantly suspicious.

Or maybe the reviewers were all Canadian /s

I don't think it's quite as obvious of an error as you are suggesting.

They were trying to correct for scenarios like this: Hypothetically, Canadians are twice as generous as Americans and twice as religious, but religious Canadians are equally generous as non-religious Canadians and religious Americans are equally generous as non-religious Americans. On the surface, it appears that religious people are more generous, but really it's just that Canadians are more generous.

Instead of treating the countries as discrete groupings, they treated them as points on a spectrum with each country being assigned an arbitrary place on the spectrum.

If #3 happened to be China, they would be assuming that people in China should be very similar to people in the US and Canada, because 1 vs. 3 on a scale that goes to 200 is hardly any difference at all, but really the numbers are just arbitrary identifiers.
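The hypothetical Canada/US scenario can be put into numbers. Everything below is invented to match that description: Canadian kids are twice as generous and twice as likely to be religious, and religion makes no difference within either country:

```python
from collections import defaultdict

# (country, religious?, generosity score); all values are made up.
kids = (
    [("US", True, 2.0)] * 20 + [("US", False, 2.0)] * 80
    + [("Canada", True, 4.0)] * 40 + [("Canada", False, 4.0)] * 60
)

def mean(xs):
    return sum(xs) / len(xs)

def gap(rows):
    """Mean generosity of religious kids minus that of non-religious kids."""
    religious = [g for _, r, g in rows if r]
    secular = [g for _, r, g in rows if not r]
    return mean(religious) - mean(secular)

# Ignoring country: religious kids look more generous.
pooled_gap = gap(kids)

# Comparing within each country: the gap disappears.
by_country = defaultdict(list)
for row in kids:
    by_country[row[0]].append(row)
within_gaps = {country: gap(rows) for country, rows in by_country.items()}
```

The pooled gap is positive only because the religious group contains proportionally more Canadians. Comparing within each country is what "controlling for country" is supposed to achieve, and it only works if countries are treated as separate groups.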

I'm a data scientist and this is an incredibly embarrassing 'n00b' error to make. If these researchers were using anything other than deep learning, it's almost certain that each parameter of the model was manually selected. That the author made the mistake is bad, that no one caught the error is a disaster.

What likely happens is that since the country variable is basically random, you will get a model where the country has negligible effect on the prediction.

As much as I like spreadsheets and other general purpose numerical analysis tools, I wish there was a restricted subset of their functionality that could be used that was more formally verifiable, to prevent issues like this and Reinhart–Rogoff (see: https://en.wikipedia.org/wiki/Growth_in_a_Time_of_Debt#Metho... ) from being a common occurrence.

If a researcher does not release the code used in their research, their papers should not be trusted. We wouldn't trust a paper where the researchers hid their experimental methods or analysis - it wouldn't even be allowed to be published. But if their experimental methods or analysis are done in code, those parts are allowed to be a black box and we're supposed to trust them. If we were willing to just trust them, there wouldn't even be a peer review process and it certainly wouldn't deserve the imprimatur of 'science'. This has been a problem for years, and I will personally not be the least bit surprised if we eventually see a death toll attached to it.

I wonder if a religious education resulting in some positive effects might be akin to the "any diet is effective" thing, i.e. it's not the specific upbringing as much as being exposed to _any_ moral thinking in general.

I don't mean to imply that there aren't moral a-religious educators of course, just that it seems likely to have less discussion of ethics for any kid once the temple/church/mosque/whatever is removed.

Wait a second. If the issue with the analysis is as the article says, that is, that some countries were weighted far higher, that means that in some countries the original conclusion does hold (probably for a small sample size). Perhaps this is because some religions promote generosity and some do not? Would be interesting to look into.

Fun to see all these software engineers criticizing scientists for a bug in their code

And even though it was corrected and the result retracted, it was already cited by many papers, the media didn’t report the retraction etc. How much of scientific reporting gets skewed by media interests (controversy and sensationalism) and funding and political interests?

Furthermore, this whole field was called cargo cult science by Feynman, using correlations, p-hacking, data dredging, and more


The best analysis I have seen of the corner we have painted ourselves into


Can we go back to basic principles and all agree that psychology and social studies are not true sciences anyways?

Science is to me something where you extract natural laws that predict phenomena will occur 100% of the time given certain conditions. Physics, chemistry, computer science, most branches of medicine operate this way.

A field like psychology that says "well sometimes people will..." or "we found in 60% of cases that..." is not science. It's a comment upon society maybe, but it does not produce broadly repeatable, predictable results.

Not true science.

I think you're forgetting that the "replication crisis" also occurs in your so called "100% deterministic fields". CS has the issue of ML models, medicine drug studies, etc.

The bottom line is science is hard to do, and even harder to do right. I think all fields do try to follow the scientific method to the best of their ability, but for some fields it is simply harder to do because of the sheer complexity of the subject matter being hard to model, experiment in, and understand.

Those sciences not developed at Edinburgh are not developed by a true Scotsman, either.

> most branches of medicine operate this way.

No they don't.

This is fundamentally one of the many disastrous outcomes of scientific publishing being controlled by greedy conglomerates.

For the uninitiated: If you want to publish a scientific paper today, you basically sign up to sign over the rights to any publisher that's interested (please someone publish me). That publisher will then review that paper in a more-often-than-not undisclosed process and publish it. Everyone knows that to be a highly regarded publisher one must have one's very own typographic formatting: unreadable font, weird multi-column layout to prevent accessibility, and tables disregarding standards are a good start. Afterwards, the paper gets published on the publisher's website, which again follows as few agreed-upon standards as possible. Data is of course excluded; the study itself is in PDF. Also put up a fat paywall, don't want those pesky poor people to be scientifically literate. Give a few cents of your $70 fee to the authors, it is not an unethical business you're doing here!

THIS is the systemic error people seem to be so happy to ignore because they're neck deep in social science not being real science memes.

This prevents interesting new startups for fact-checking or meta analysis, which are e.g. happening in journalism because that field has a lot of the things science is sadly lacking.

This creates a drift between extremely rich and rather poor countries/unis/humans in scientific ability.

This generates a tar pit for scientific process.

This wastes billions in funds because of people unaware of each other doing redundant studies (and not referencing/refuting/supporting each other in the process either, of course) because there are literally better search engines for finding Harry Potter fan fiction than there are for finding studies.

And finally, this of course allows anything from honest statistical mistakes to snake oil sellers to slip through and do generations' worth of damage, because correcting, fact checking, re-researching, comparing, meta research, anything is slowed to a crawl.

So please stop embracing scientific elitism and gatekeeping, for this is exactly what brought us here in the first place...

> Decety

What an apt name.

I wonder about the damage to the public this unintentional deceit will bring...

Go science! After all, a retraction is part of discovery.

Interesting, waitstaff on reddit always say religious people tip worse or not at all.

Probably because most religious people don't tend to flaunt it everywhere they go, and in my experience the ones that do also tend to be jerks. It's also possible that wait staff tend to remember the ones that do more than the ones that don't because being religious and a jerk is more memorable than religious and not a jerk.

Disclaimer: I am a religious person.

I worked in breakfast and brunch restaurants for a while, and my anecdotal experience is that the post-church rushes (we were across the street from a couple churches) had the worst overall tipping percentages that I would see all week.

One could argue that those are the folks that aren’t religious enough to feel that they should not make others work on a sabbath as a result of their personal commerce. The religious folk that make it rain come in on other days of the week.

> In fact, Decety’s paper has continued to be cited in media articles on religion. Just last month two such articles appeared (one on Buzzworthy and one on TruthTheory) citing Decety’s paper that religious children were less generous.

“Media articles” is carrying a lot of weight there

My default presumption is that all results from the "social sciences" are false if they contradict my intuition. Whatever these fields are producing, it's not science in the Popperian or practical sense. A ton of policy has been built on the bad research and wishful thinking of the past few decades, and it's going to take a long time to unwind it all.

The problem, really, has been accelerating for a while. Something really went off the rails after WWII.

My theory is that it's engineering, not science, that confirms our knowledge. You don't really know something about the world until you can use it to do things. WWII-era science was being incorporated into tools (well, weapons), so it had to work.


I'd rather not be classified as insane, thanks.

If you read the OP article, the research suggests religion has positive effects on children. Your response to the corrected findings?

> I'd rather not be classified as insane, thanks.

What would I be classified as if I started worshipping, say, Thanos, and began preaching to everyone about him?

> the research suggests religion has positive effects on children. Your response to the corrected findings?

If it was safe for me to do so, my response would be to go out in my own religious country and film for you all the children throwing stones at animals (which they're encouraged to do by their elders), all the beggars sleeping next to literal mounds of garbage, shunned by everyone, and the many other problems that religion is supposed to fix.

Given the context, I would classify you as insincere.

What context? Insincere how? I do strongly like a few deities from various fictions. I would gladly eat your ear off about them.

The context is that you clearly believe religion is nonsense, and a destructive force. Therefore if you were to advocate for a religion, you would be either lying or encouraging what you believe to be evil, and therefore evil yourself. Since poverty and ignorance bother you, I think lying would be the more likely explanation.

Liking deities from various fictions is not the same as seeking to open oneself to God.

add: the many other problems that religion is supposed to fix, but it actually produces them or makes them worse.

The negative effects (sexism, homophobia, general bigotry) greatly outweigh any positive effects that might happen. And the things they're generous towards are usually some anti-abortion or anti-LGBT organisations and politicians.

Not everybody wants to appease an imaginary entity called "the classification of typical human behavior as a mental disorder", either. Who would classify it, and how would they enforce it? Would use of force toward a country full of mentally-ill people with their potential being crippled by their manipulative leaders be appropriate medical intervention, or would it be starting a war? One might need a monument and a few hymns to the DSM to encourage the troops in this case.

It is not "typical" human behavior. See all the societies that are doing just fine without pervasive religion.

Curious: why create a new account to post this comment? (unless it's your very first account on HN)

While I'm an atheist and agree with you in part, I don't see how that's related to this story.

There aren't many discussions about religion on HN but a topic related to it should prompt some, and I have no other outlet. Reddit isn't a very good place for it either.

You pretty much torpedoed any chance of a productive discussion on the subject when you opened with "religion should be classified as a mental disorder." That comment is weapons-grade flamebait on a discussion board and probably not something you'd say in mixed company in real life either.

It's the conclusion I've come to after spending literally an entire life in a religious culture.

What would be your argument and which experience is it based upon?

> not something you'd say in mixed company in real life either.

Some people here criticize religion in private company, but would not dare to do so publicly, because of the very real threats to their very life they would face precisely because of religion.

Does the last paragraph of the article irk you a bit? The author is basically saying "Yes, we could abandon religious practices and eliminate all potential sexual abuse in religious communities, but then people would be less likely to volunteer and be slightly less happy" and somehow the author is interpreting this situation as a net loss. I don't get it.

Sounds to me like you’ve got the wrong religion prevalent in your country, which I agree is a bad bad thing. There have been many atrocities committed by people claiming to adhere to some divine mandate (hello crusades!) and that is problematic.

But religion as a whole doesn’t have to be bad. Many religions preach tolerance for others—one of their primary missions is to eliminate the very hate and inequality you have spoken of. Just look at what some religions do to help people because they believe it’s something their God would like them to do: https://newsroom.churchofjesuschrist.org/article/humanitaria...

Again, I’m really sorry to hear that religion has been used to oppress rather than to bond in your country. But the right kind of religion (as this corrected study seems to show) actually improves communities. I hope you can find a solution. :-/

Psychology, sociology and theology (and more logys?) were never meant to be sciences. We can blame the enlightenment for that idea, let’s revert them back to renaissance activities

Psychology and sociology can be made as rigorous as individuals & institutions care to.

Theology, not so much.

Actually most (Christian) theology, when discussed by theologians, is very rigorous within its confines.

Theology can be rigorous in the sense of its models possessing internal validity (logical consistency, parsimony).

However, because it's basically speculation about unobservable entities, theology completely lacks external validity.

Social sciences can have both.

I hope you do realize the irony in just declaring that something was "never meant to be a science"...

Science is about constantly evolving via falsification of theories.

You might want to base your definition of what can be science on something different than an unsourced appeal to authority from 300 years ago?

There's too much power in them being considered sciences, which is why they'll remain that way.
