
What happens when scientists admit error - lelf
https://elemental.medium.com/when-science-needs-self-correcting-a130eacb4235
======
sn41
The greatest, and in some sense most heartbreaking, admission of error I know
of is Frege's letter to Bertrand Russell:

[https://sites.tufts.edu/histmath/files/2015/11/Frege-Letter-to-Russell.pdf](https://sites.tufts.edu/histmath/files/2015/11/Frege-Letter-to-Russell.pdf)

Russell himself says "As I think about acts of integrity and grace, there is
nothing to compare with Frege's dedication to truth."

~~~
Daub
This is beautiful and humbling. An example of a scientist who loved being
wrong was Fred Hoyle. Quote: ‘it is better to be interesting and wrong than
boring and right’.

~~~
amelius
> ‘it is better to be interesting and wrong than boring and right’

This is why fake news is such a hit ;)

------
rixed
> I’ve never heard of a comparable situation.

Yet it is hard to believe that scientists, of all people, would be the ones
shipping flawless software. It is easier to imagine that, most often, when one
realises a previous publication's result was indeed affected by an error and
faces the difficult choice between retracting and pretending not to have
noticed, they opt for the latter. I've found scientists in general to have
less ego than average, as expected of people trained to care only about the
facts, but they still operate in the larger framework of an individualistic
modern society. It probably takes a scientist and a woman, whose education
generally encourages a lesser ego to begin with, to admit such an error.

Or a software author, though for a software author the only alternative to
admitting fault is total ridicule, since we are drowning every day in a world
of errors.

~~~
searchableguy
> Yet it is hard to believe that scientists, of all people, would be the ones
> shipping flawless software.

I mean, most of them don't even pin their dependencies. I don't want to touch
a lot of the Python code published in the open by scientists.

Software reproducibility is the last thing I see considered in repositories
submitted alongside papers on arXiv and other publishing venues.

Often there isn't even a requirements.txt, or any mention of what they used.
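
Pinning is one command. A minimal sketch, assuming a virtualenv with the
analysis dependencies already installed (the package versions below are
purely illustrative):

    $ pip freeze > requirements.txt
    $ cat requirements.txt
    numpy==1.24.3
    pandas==2.0.1
    scipy==1.10.1

Anyone can then rebuild the same environment with
"pip install -r requirements.txt".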

~~~
dgoldstein0
I've spent years learning and practicing good software engineering. Whenever I
read about quality issues in scientific code, I wonder: how are scientists
expected to learn these things?

~~~
searchableguy
I don't think you need to worry about more than a few principles:

1. Reproducibility

This is very important for scientists, so I expect them to know about package
management and containers. You don't need to know _everything_ about them.

Virtual environments, Docker, and a good package manager like poetry (which
manages virtualenvs for you, among other things) shouldn't take more than a
few weeks to learn. Once you are done with the basics, you can learn more as
you build; see the Dockerfile sketch after this list.

2. Git and documentation

Add another few weeks.

They don't need to write good code. They just need to document it, which they
should already be good at. Optimization, scalability, distribution, and the
rest can be handled by engineers, but without documentation that work becomes
a lot harder.
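
On the containers point above: a minimal Dockerfile is often enough to freeze
both the interpreter version and the dependencies. A sketch, assuming a pinned
requirements.txt and a hypothetical analysis.py entry point:

    FROM python:3.11-slim
    WORKDIR /app
    # Install pinned dependencies first so this layer is cached.
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    # Copy in the analysis code and run it.
    COPY . .
    CMD ["python", "analysis.py"]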

~~~
sgt101
Every engineering team that has picked up a code base has at some point had a
meeting where someone starts ranting about how the code they are looking
after stinks and it would be better to rewrite it, or move to some
alternative.

This seems to be true whether the code is the product of one person working
for a few weeks or of 20,000 people working for 10 years.

A few years ago I had a conversation where the team insisted to me that the
extensive documentation and instructions for a package were worthless because
"good code is self documenting". The leader of the pack had shiny eyes and
banged the table several times.

Be cautious, be kind, and look for value in the code that you are using - not
just problems. Also, consider: if you can pick up a code base (written by
someone who knows the science), would it not be more profitable to repair,
document, and improve it rather than demanding that the scientists learn both
your skills and your way of doing things? If you conclude the latter, would
you (hand on heart) be able to empirically demonstrate that your way of doing
things is the right way?

------
tehlike
I think this can be considered a scientific version of a blameless postmortem.

There is no shame in making mistakes, and being honest about it should take us
further as a society.

------
rs23296008n1
The other side of this is that actual science happened. Admitting error is
science being done right. The search for the truth is fundamental and
detecting when it wasn't found is contributing to the greater truth.

Questions around methodology abound but at its core this is science walking
tall. Not hobbling along loaded down with sugar-coated lies about to collapse
into a coma. That this is seen as an exception or extraordinary is quite
illuminating in itself.

------
Nalta
I really hope those grad students she mentioned didn't get a couple of years
added on to their degrees as a result of this. I mean, good on her for finding
the error, but I can't imagine what it would be like to be told "cancel your
thesis, I made a mistake and your work is now invalid."

~~~
bpodgursky
Luckily, they were master's students, so the damage will be limited to a
semester or so at most (in the US, a master's thesis generally isn't the most
important part of the degree). If it had been a PhD thesis... I don't want to
think about that.

~~~
OkayPhysicist
Our professor loves to tell the story of how one of his friends in grad school
based his thesis on a series of experiments to be run by a Mars probe. One
morning my professor runs into him, and he's just completely panicked. My
professor asked what was wrong, and the guy replied, "My thesis is a debris
field on Mars."

So, it happens.

------
throwaway285524
It's great that she did the right thing and had the paper retracted, but this
is still terrible on so many levels.

Maybe, with such a strong effect for every single subject, a little more
skepticism would have been warranted in the first place? Some manual
spot-checking where possible, or a minimal independent implementation of the
analysis code?

Who knows if she'd have gotten her grant or her assistant professorship
_without_ the publication of this incorrect finding. Who knows who _didn't_
get any of that because they were a bit too careful in their work.

~~~
Thorrez
> Who knows if she'd have gotten her grant or her assistant professorship
> without the publication of this incorrect finding. Who knows who didn't get
> any of that because they were a bit too careful in their work.

On the other hand, if she hadn't wasted her time on this useless study she
might have done more useful studies and her career would be better than it is
right now. She might have gotten even better grants if she hadn't made this
error, and maybe fewer other people would have gotten grants. Maybe those
other people being more careful helped them rather than hurt them.

I don't see how speculating like this is very useful.

~~~
darkerside
I think that underplays the difficulty of the work being done here. Or maybe
it just underestimates the amount of luck involved. The downside is clearly
greater than the upside. Why would you even imply the opposite?

~~~
Thorrez
I don't fully understand the relation between the difficulty of the work, the
luck involved, and upside vs. downside. I don't even know what you mean by the
"downside": is that the downside of making the mistake, or the downside of not
making the mistake?

I'm saying the downside of making the mistake (for her) might be larger than
the upside of making the mistake. I don't think that's obviously wrong.

But my real main point is that speculation of this kind (in either direction)
isn't very productive.

~~~
darkerside
You're right that there is an opportunity cost here, and that it may be
underestimated by most people, but you just can't give people much imaginary
credit for things they might have done.

~~~
Thorrez
That was sort of my point: we shouldn't be giving imaginary credit to various
people like throwaway285524 was doing. Like I said, I'm not very happy
speculating in either direction.

------
peterlk
Remember when scientists in Italy were sent to prison for not predicting an
earthquake correctly [0]?

I know this isn't really what the article is about, but scientists are allowed
to be wrong unless and until politics is involved.

[0] [https://www.scientificamerican.com/article/italian-scientists-get/](https://www.scientificamerican.com/article/italian-scientists-get/)

~~~
Fragoel2
I live in the city struck by the earthquake, and I can tell you that what the
article states is not correct. The scientists went on trial not for a wrong
prediction but for downplaying the possible risk after thousands of smaller
earthquakes were recorded in the area.

------
jes5199
We still trust software too much. I suspect that most software-controlled
experiments are going to have errors like this! We should require that every
experiment have at least two clean-room implementations of its logic, and a
battery of smoke tests for common mistakes.
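
To make "smoke test" concrete: per the discussion below, the bug here was in
the code that ran the experiment itself, so even a trivial check of the
condition logic on known inputs might have caught it. A minimal sketch, where
assign_condition is a purely hypothetical stand-in for an experiment's real
routine:

    # Hypothetical smoke test: check that a condition-assignment routine
    # actually produces what the design intends. assign_condition() is a
    # stand-in for the experiment's real logic; in practice you would
    # import it from the experiment code.
    def assign_condition(participant_id: int) -> str:
        # Stand-in: alternate participants between control and treatment.
        return "control" if participant_id % 2 == 0 else "treatment"

    def test_conditions_are_balanced(n: int = 96) -> None:
        groups = [assign_condition(i) for i in range(n)]
        assert groups.count("control") == n // 2
        assert groups.count("treatment") == n // 2

    test_conditions_are_balanced()
    print("condition assignment OK")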

~~~
robertlagrant
We certainly trust scientists' software too much :-)

------
neonate
[https://archive.md/0ikco](https://archive.md/0ikco)

------
kortilla
The most shocking part of this article was the revelation that the original
article _was not retracted_. The entire conclusion and data were entirely
wrong, yet the paper still stands. That’s really damning for the journal at
least and likely speaks to a pretty bad culture in the wider field.

Kudos to the author for doing the right thing, but the fact that there seems
to be no way to remove a paper that is blatantly false, because retractions
are reserved for deliberate misconduct, is horrifying. Not only does this set
up fucked-up long-term incentives (no downside to fraud if you paint it as a
whoopsie), but it also harms all the work that cited the paper, and anyone
doing literature reviews who doesn't realize the ground other papers were
standing on has dissolved away.

~~~
dwighttk
> The editor and publisher were understanding and ultimately opted not to
> retract the paper but to instead publish a revised version of the article,
> linked to from the original paper, with the results section updated to
> reflect the true (opposite) results.

I don't see what the problem is.

~~~
kortilla
The problem is that the old copy is still floating around, and by replacing it
instead of retracting it, it's not clear which papers are referencing the
bullshit one and which are referencing the real results.

Not retracting hid the turd from bibliometric tools that could have easily
notified you of poisoned papers.

------
vinay_ys
If anyone is curious like I was to read the actual paper and look at the
source code, you can find them here:

[https://osf.io/b94yx/](https://osf.io/b94yx/)

The paper authors made a mistake, fine. But the scientific process and peer
review process should have caught it. They didn't. The author caught it
accidentally and then luckily decided to come forward (bravo!). This begs the
question of robustness of the whole scientific publishing process. I hope they
adopt the practice of doing a blameless RCA and improving the scientific and
academic peer review process.

~~~
asdkjh345fd
> This begs the question of robustness of the whole scientific publishing
> process

It raises the question. Begging the question is an unrelated logical fallacy.
Unfortunately, there have been a ton of examples of the peer-review process
being essentially useless: things like people deliberately putting things in
to test whether the paper is even being read, and none of the reviewers
noticing.

~~~
vinay_ys
"This begs the question" is same as "This raises the question" – What is the
logical fallacy in how it is phrased?

~~~
asdkjh345fd
>"This begs the question" is same as "This raises the question"

No it isn't, that is a common error in modern English. Raising the question
means bringing a question into focus. Begging the question means to assume the
conclusion is correct in the premise.

[https://en.wikipedia.org/wiki/Begging_the_question](https://en.wikipedia.org/wiki/Begging_the_question)

~~~
vinay_ys
Thanks for pointing out that wiki link. It led me to another link,

[https://www.merriam-webster.com/words-at-play/beg-the-question](https://www.merriam-webster.com/words-at-play/beg-the-question)

which seems to indicate my usage is quite acceptable in modern English.

~~~
asdkjh345fd
Normalizing errors in language is how language degrades. We no longer have a
word for literally, because people use literally to mean figuratively. We are
losing the ability to talk about begging the question, because people think it
means raising the question. The fact that English has degraded is not a reason
to give up and let it get worse.

~~~
renewiltord
As a prescriptivist, surely you do not wish to assign "begging the question"
that meaning, then, since it's wholly from a mistranslation from Latin:
[https://languagelog.ldc.upenn.edu/nll/?p=2290](https://languagelog.ldc.upenn.edu/nll/?p=2290)

That information is also present in the Wikipedia link.

~~~
asdkjh345fd
No, I do not wish to assign that meaning; that meaning was already assigned a
long time ago. And you don't need to be a prescriptivist to want a language
that is capable of communicating our thoughts to each other.

~~~
renewiltord
Haha, no, it's not a pejorative. It's literally one of two schools of thought:
descriptivism vs. prescriptivism. What you're describing is prescriptivism.

~~~
asdkjh345fd
I didn't suggest it was pejorative, and I know what it is. There are not two
schools of thought; that's a false dichotomy. That is like saying "there are
two schools of thought, Catholic and Protestant". You can want to prevent the
decline of language without insisting on trying to bring back already-lost
definitions like a prescriptivist.

------
jbj
Well done by the author, and thanks for sharing, especially given the immense
mental pressure. I think it is great that the journal was cooperative about
this. It would be interesting to see a single journal implement a very easy
undo button for peer-reviewed research and see how that plays out over a
series of years compared to the current model. Although very rigid, the
scientific ecosystem is quite robust, and we now see friction being removed
with efficient pre-print servers. I remember some early covid papers were
withdrawn from bioRxiv at the authors' request.

------
lexpar
What a nightmare. I bet this raises the hair on the back of the necks of a lot
of other researchers. So much time and momentum can be invested into a
research track like this.

~~~
Pfhreak
Is this a nightmare? I read a story of someone who had a very reasonable,
strong emotional response to the error, but ultimately got credit for coming
clean and republished their results with new data. (And a different
conclusion.)

This is exactly how I'd expect something like this to work: the author isn't
a bad person because they made an error. The co-authors aren't bad people
because they failed to catch it. Software and science are _hard_; mistakes
are _going to happen_.

If anything, I think the researchers learned valuable lessons, and are better
researchers as a result. They have an anecdote they can share with more junior
researchers about this frightening thing that happened to them, and use that
to grow more people.

We should celebrate people who take the time to handle their mistakes properly
and share the lessons openly.

~~~
lexpar
I don't think she's a bad person, and I appreciate her response to the
situation.

But building a year (years?) of work around a project, attending multiple
conferences for it, bringing up a couple of students on a research idea, and
then finding it's all a software bug? That's a nightmare.

------
dschuetz
Kudos for writing an article about that. Most scientists just don't care.
Reproducible results are still considered overrated.

------
unexaminedlife
For me this story illuminates pretty clearly the fact that free-market ideals
and academia don't mesh that well.

------
jzer0cool
Is there a convenient place to input a web URL to get at the article text? I
saw someone post one a few days ago ...

~~~
DonCopal
You're thinking of Outline.com

~~~
jzer0cool
Yes! This is what I was looking for.

------
squarefoot
Now if only politicians learned a thing or two from this researcher. Not
possible, however: scientists talk to brains; politicians talk to bellies and
appeal to instincts, which requires them to appear strong by showing
self-confidence and never admitting any errors.

------
nednar
I'm surprised if updating papers to correct mistakes is not the common path.
Especially with software being a crucial part of it. I mean, how many software
products do we know of that shipped flawlessly in v1.0?

------
Gatsky
So this error was in the code that actually ran the experiment, not in the
analysis. The experiment was effectively doomed from the beginning. The
mistake amounts to failing to calibrate and test your experimental apparatus,
which is a little hard to forgive, since doing so would have taken hardly any
time or effort. Perhaps I'm being harsh, but scientific enquiry is too
important to be satisfied with conducting unrigorous experiments and then
apologising, which of course one could do ad infinitum.

~~~
GiveOver
I agree. I write code all the time and I've never had a single bug. She was
careless.

------
seemslegit
> The data set was gorgeous — every single one of the 96 participants showed
> the effect.

Ideally, that would have been an uh-oh moment.
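
Quantitatively: even if a real effect showed up in each participant
independently 90% of the time (a figure chosen purely for illustration), a
clean sweep of 96 out of 96 would have probability 0.9^96, on the order of
4 in 100,000:

    # Back-of-envelope check: chance that all 96 participants show an
    # effect that each independently shows 90% of the time.
    p_all = 0.9 ** 96
    print(p_all)  # ~4e-05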

------
pdar4123
Pro tip - when "every single one of the 96 participants show[s] the effect",
you have a technical artifact. This is scientific common sense.
~~~
danieltillett
Not really. If you are counting fingers and toes you would not be surprised if
all 96 participants had 10 of each. It is all about context.

~~~
pdar4123
But you would be surprised if they all had 11; 10 is your null hypothesis.
It's not about context, it's about proper controls and the scientific method.
This was very, very careless.
~~~
jhanschoo
Your rebuttal stands, but not your point, because the comment you were
replying to didn't properly situate its argument in a scientific experiment.
It's more like serving participants canned vegetable soup, but warming it up
before serving it to the non-control group. Here you'd reasonably expect the
non-control group to be universally more favorable.

They did have a null hypothesis and a control group. The problem was that the
non-control group differed from the control group in a way they were careless
about, and in a way the scientific methodology does not catch (though peer
review and replication ought to have caught it): e.g., if the favorable
results were better explained by subjects seeing the soup taken out of a pot
rather than out of a can, rather than by the temperature itself.

------
chaps
Since when does medium require sign-in to read? Anyone have a different link
to the post?

~~~
jtvjan
Disable JavaScript or use a text browser like Lynx. I've put the article in
this paste:
[https://pastebin.com/raw/pe6dfUtJ](https://pastebin.com/raw/pe6dfUtJ)

------
z3t4
This could possibly be gamed/abused. Make a great discovery - get grants and
tenure - only to retract the finding - and keep the money.

------
gfrff
She said this never happens. It happens all the time. It's just that she was
the first scientist to take the honest route and do a retraction rather than
letting it slide. I'm impressed with her integrity, but surprised by her
naivete.

~~~
SubiculumCode
She is not the first scientist to fess up to a mistake. Heck, one of my
colleagues went through the same thing last year; he also did the right
thing, and other scientists appreciated his open honesty about the matter,
even though his mistake meant the retraction of a high-impact article.

------
danieltillett
While I applaud Julia for retracting the paper, I do wonder if she handed back
the NIH grant she was given off the back of this false result. Some researcher
just missed out on an NIH grant that would have been funded if not for this
error.

~~~
renewiltord
The purpose of this grant was to run follow-up studies and to work on
productionizing the intervention. The purpose of a follow-up study is to
detect whether the result extends. The follow-up work detected that the
result was due to an experiment-setup error. The grant worked.

Actually, encountering this, I think I suddenly understand why people get
upset about Kickstarter projects not delivering. It is a misunderstanding of
a probabilistic situation.

~~~
danieltillett
If the NIH grant scheme were not a zero-sum game, this would be reasonable. A
scientist with real results missed out on a grant that they would have got if
Julia had not made this error.

There is a fundamental issue here that is not being discussed: the science
funding system incentivises people to rush results out without checking
whether they are real. If Julia had been more careful she would have found
the results were false, not published, and not received a grant. So many of
the problems with the lack of reproducibility in science are the direct
result of people rushing to get out "novel" data without checking whether it
is real, because if they do check, they don't get funded.

~~~
renewiltord
This is just an optimization problem with where you place the incentives. The
follow-up study funding is not exclusively for false null hypothesis rejection
due to chance. It's also to allow for some amount of error in experimental
design or practice. No one wants 0% experimental design / practice error
because that will harm total experiments done per unit time which is also a
thing we want to go up. No one wants only certain results because we want
novel results / unit time to go up too. It's just multivariate non-linear
optimization.

Yes, someone whose conclusion was not due to error in programming may have
gone without a grant. That's perfectly all right. We optimize policy in the
aggregate.

You can make an argument that says that we're not correctly placed on the
manifold but no single situation will be convincing for that and you'll need
to make some case that a policy change to move to some other point in this
space will yield a positive total improvement.

~~~
danieltillett
True, but I would personally place more emphasis on avoiding the publication
of false data and less on novelty. If the consequences of publishing something
wrong were greater, and the costs of taking the time to do things right were
less, then we might see more trustworthy science published.

~~~
cycomanic
Not when people's livelihoods are on the line. You would simply get people not
reporting errors.

What you are asking for is the equivalent of saying "firing programmers when a
bug is found will result in better software".

In science it will result in much less interesting science being done, because
it will disincentivize risk.

The big problem in science funding is that it already incentivizes low-risk
projects (you should have preliminary results that prove it works...), so
let's not make it worse.

~~~
danieltillett
I am arguing that the current funding structure encourages poor-quality
science. Right now it is better to pump out papers as quickly as possible
than to take the time to check that the data is not flawed.

From a software perspective, it is like paying developers by the number of
lines of code delivered and telling them not to worry about bugs. Not an
approach that is likely to deliver quality software.

What we need in science is a way for people to have the time and incentives
to publish results that are as accurate as possible.

Somewhat off topic, I do wonder whether we would get better software if we
fired programmers when bugs were found. I suspect we would not get much code
written, but the code that was written would be very high quality.

