
GitHub issue calling for retraction of Imperial College study for codebase flaws - jMyles
https://github.com/mrc-ide/covid-sim/issues/165
======
jgrahamc
I wrote a paper with co-authors asking for open source code in science
([https://www.nature.com/articles/nature10836](https://www.nature.com/articles/nature10836))
but this GitHub issue is stupid. Calling for retraction because you don't like
the code smell is dumb. Work on the code and see what its flaws are before
asking for retraction.

Yes, I wish the original (apparently C) code had been released, but let's fix
the bugs in the code.

~~~
emerongi
This is the correct approach. We, the "software engineers", can jump in and
help. If it eventually comes out that the results were wrong then the papers
should be retracted. At the moment there is no proof of that.

That's the beauty of open-sourcing the code: people can help or verify.
Needlessly shitting on other people's work without any proof is just
disheartening to me.

~~~
einszwei
I'm reminded of the idiom "Don't throw the baby out with the bathwater". And I
agree that software engineers should more actively contribute their expertise
here.

With that said, if a research paper's main contribution is a model whose
results were evaluated using code with major bugs, then retracting the paper
is the only professional course.

~~~
acqq
> if a research paper's main contribution is a model whose results were
> evaluated using buggy code

There's no proof of that at all in this case. The person complaining didn't
find any scientifically valid problem, only that, for some not clearly stated
reason, he personally doesn't like the tests provided, which is clearly
stupid.

Since when do tests have to be written so that some random guy on GitHub likes
them? If the code was used by experts, and maybe even used for years, who says
those experts are in any way obliged to publish all the logs of their use of
that code (which may not even exist in publishable form)?

Even if everything they did iteratively, potentially over years, could be
condensed into some tests, who says that wouldn't itself take too long (as in,
months or years)? It's practically just a matter of goodwill that the code is
published at all, in any form.

~~~
einszwei
Apologies if my earlier comment came off as insinuating that the code used for
the research paper was buggy. That was not my intent.

------
ageitgey
Academic code has always been a mess (from the standpoint of professional
software developers). Anyone who has ever worked with code from academia knows
this. If you think this code is messy, check out any random Matlab simulation
strung together by any random series of grad students who aren't professional
developers in any random lab.

People in industry don't always do better, either. Look at the notebook code
shared on Github by Kevin Systrom for the rt.live site. It's just as messy.
But the point is that he shared it and that helps everyone see what is going
on and lets anyone improve on the work.

Publicly sharing model code on Github is a recent development and is amazing
for everyone. It gets the work out faster, allows for feedback on methodology
and coding style and generally lets everyone get smarter over time. We
shouldn't discourage this.

Posting retraction demands in Github issues based on coding style is not
helpful and shows a lack of communication skills. What is helpful is providing
useful, specific bug reports, submitting improvements, or sharing your own,
improved model. If you want to demand a retraction, do it via alternate means.

Remember, this crisis appeared suddenly on the world stage and whoever had a
model sitting around got thrust into the forefront of public discussion. But
scientists don't make policy. Policy makers make policy. A github issue is not
the place to blame someone for a policy you disagree with. That only
discourages people from sharing their work, which would be a loss for everyone.

~~~
toyg
_> shows a lack of communication skills_

Your reading is very charitable. The political debate is a complete gutter,
and everyone is ready to jump on absolutely anything that can advance their
argument in the slightest.

As you can see in this very thread, jMyles is hardly unbiased on the matter.
He's just playing politics; code standards are an excuse.

------
KenoFischer
I think this sets a bad example. In many fields the current standard is that
code isn't published at all. If bad coding practices will cause you to get
bullied into retracting your paper, that's just another argument for why
people won't publish their code. Of course, if somebody does find a bug that
invalidates the results of the paper given the stated assumptions, that's
another matter, but that doesn't seem to be what's alleged here.

~~~
nradov
The standard needs to change. Publication of code and data ought to be
required.

~~~
denzil_correa
Why exactly? We could also probably require a website to test the application,
a mobile app on all platforms, and then native OS apps on all platforms. I
apologise for being facetious, but writing code is a serious, specialized
task. It should be left to the people who specialise in it; scientists aren't
the specialists for it.

I think it's important to understand that we should only require the
information needed to reproduce the algorithms described in the paper. As
such, the assumptions and limitations should be listed in enough detail to
regenerate the material in the paper. Code and data are one way to do that;
they are certainly not the only way.

~~~
OJFord
Do reviewers of such papers write their own implementations, and verify they
get (roughly) the same results?

~~~
tastroder
Not generally, no. Even if they had access to the code at the time of review,
which is hardly ever the case in my experience, there's simply no time. I also
don't see much point in doing so: most papers only describe minute changes to
some well-established state-of-the-art method, so you trust the academic
integrity and experience of the authors to get that part right, and usually
focus on the parts that matter rather than the kinds of things this issue
brings up.

Even for real reproductions, which require work just as they do in any other
scientific domain, the points in that issue seem pretty irrelevant. There are
things like the FAIR standards on reproducibility, but as far as the status
quo goes, that repo doesn't look too bad. I could not care less if the tests
for some project are badly written; at least it's written in a non-obscure
language and shows a somewhat sane structure. What's next? Calling for
retractions because somebody didn't follow the same tabs-vs-spaces paradigm?

There's nothing in that issue that relates to whatever scientific finding this
code was used for, and this general phenomenon is really fascinating. Instead
of a constructive discussion with the technical or scientific folk who put
their work out there, or engagement with the politicians who drew conclusions
based on those scientific findings, you get GitHub issues and Twitter threads
mixing a bunch of unrelated concerns.

~~~
metreo
I think the matter at hand is much less trivial than tabs v. spaces.

------
emerongi
You're trying to apply software development techniques to a discipline where
testing is usually done in a much different way. Yes, we should all try to be
perfect and write the most magnificent code with the most bulletproof testing
methods, but reality is different.

How about you write tests that clearly prove that the results from this
simulation are absolutely wrong? The code is right there! And you're a
"software engineer"! Then start talking about retracting papers.

~~~
Gibbon1
> a discipline where testing is usually done in a much different way

This reminds me of the horror mainstream code monkeys express when they see
the source code for automotive firmware. Thing is, automotive people
functional-test the shit out of everything. And they don't let junior
developers refactor proven-good code just because they don't like the way it
looks.

------
rvz
Well, I would rather have the code available for transparency, so that we can
discuss the quality later, than have it not be open source at all.

I wouldn't expect research scientists to write an elegant, well-structured
program that ticks all the coding-practice boxes the first time, as they are
not "software engineers" unless that's their area of research or expertise.
They want to present their results, and "code quality" is secondary to them;
it can be improved later. Surely the Linux source code wasn't cleanly
structured in its first open-source release either.

However, from a quick skim of the source, one might suggest that the authors
were writing C++ in the style of a C programmer. Maybe one could run the clang
analyzer on the source to find all sorts of issues, I guess.

------
mikekchar
The last comment in the issue (as of this posting) linked to a comment from
John Carmack on Twitter:
[https://twitter.com/ID_AA_Carmack/status/1254872368763277313](https://twitter.com/ID_AA_Carmack/status/1254872368763277313)

I specialise in legacy code. I've seen my fair share of abysmal code that
works. This is pretty awful, but far from the worst I've ever seen in my
career -- even in systems where public safety was critical. I've not looked at
this code in any real detail, but I suspect John Carmack (who has doubtless
seen a fair amount of complex C code in his life) has it right.

On the plus side, if anyone wants to practice refactoring gnarly C code:
here's your opportunity.

~~~
acqq
The tweets point out that Carmack personally worked on that very code (to
prepare it for the GitHub release)!

He explicitly writes there:

"it turned out that it _fared a lot better_ going through the gauntlet of code
analysis tools I hit it with _than a lot of more modern code. There is
something to be said for straightforward C code._ Bugs were found and fixed,
but generally _in paths that weren 't enabled or hit._"

"the performance scaling using OpenMP was already pretty good, and this was
not the place for one of my dramatic system refactorings. Mostly, I was just a
code janitor for a few weeks, but I was happy to be able to help a little."

and

"I can’t vouch for the actual algorithms, but _the software engineering seems
fine._ "

As shown, he even points out that he believes "a lot of more modern code"
would have been worse than that "straightforward C", which matches my
experience.

------
andrepd
Of all the things wrong with this study, bad code is probably the least (bar
actual bugs which mess with the results). In fact, it's par for the course for
code in non-CS academia. There's such a thing as the "physicist code"
stereotype.

~~~
pnako
It's generally not used to push policy affecting billions of people, though.

But as I said in another comment, let's blame whoever listened to that garbage
fire of a model, not the model itself. We don't listen to Minecraft modders
for structural engineering advice either, and if we did, they would not be the
ones to blame for collapsing buildings.

~~~
metalliqaz
The policy is mostly a reaction to what happened in Wuhan, Iran, and Italy.
Especially Italy. The mass graves filling up are a pretty good motivator, as
it turns out.[1]

[1] [https://www.theguardian.com/world/2020/mar/12/coronavirus-ir...](https://www.theguardian.com/world/2020/mar/12/coronavirus-iran-mass-graves-qom)

~~~
Gibbon1
One could say the model was experimentally verified in Wuhan, Iran, Italy and
New York.

------
jspaetzel
This is such a dangerous precedent. This type of call-out will only discourage
the release of source code, making legitimate peer review more difficult.

------
yufeng66
The issue is crazy. It basically comes down to: there are no unit tests? Unit
testing traditionally is not done for numerical code in Fortran or C. We sent
people to the moon on code without unit tests. We built atomic bombs on code
without unit tests. It doesn’t mean the code is not tested. That kind of
numerical code is just tested in a different way.

Also remember, all models are wrong, some are useful. Is this particular model
useful? Probably.

------
metalliqaz
This github issue sounds a lot like it is politically motivated. He is clearly
trying to make the leap from "I don't like the tests in this project" to
"public policy is wrong." No actual evidence, just cranky poo-pooing on the
project.

------
nodamage
It's worth noting that this issue was posted by a moderator of the
/r/LockdownSkepticism/ subreddit to push a particular political agenda, and
now this repo is being brigaded by users from said subreddit.

------
fergonco
If only all the studies that do not publish their code got half this level of
criticism.

~~~
s9w
Well most of science doesn't matter. This was used to introduce cataclysmic
policy changes.

~~~
toyg
This was hardly the only element that UK policymakers had to take into
account. By the time this study was published, numbers were already
accelerating and most of Europe was already in lockdown.

~~~
s9w
Hard to know, really. I think it's likely the big question at the time was how
bad this would get. And they did run and use this simulation. Frankly, people
who are willing to use this for anything should have no say in anything more
important than their own papers.

There is the great (Carl Sagan?) quote: "extraordinary claims require
extraordinary evidence". If other parts of the puzzle are of the same quality,
I want to see heads rolling. The PCR test, for example.

------
sm_1024
The title is misleading, the GitHub issue doesn't actually point out any flaws
in the codebase.

------
pythondev64
Just because unit tests aren't provided in the codebase, it doesn't mean
testing hasn't been done. Quite often it's difficult to model certain things
in tests, e.g. Linux kernel drivers, embedded devices, etc.

However, it seems that the author of the GitHub issue has personal/political
problems with the Imperial researchers because of the lockdown:
[https://pastebin.com/LadbM3E1](https://pastebin.com/LadbM3E1).

~~~
tastroder
Thanks for posting that.

@OP: So let's get this straight, you created that GitHub issue within one hour
of seeing the repository, rallied whatever Discord server that is to join your
cause and put it on HN for reach?

As an academic and software engineer, please point me to where I can file an
issue for you to retract your issue.

------
boublepop
This is politically motivated and just plain dumb. Someone with the intention
of finding holes posted it on the assumption that the code must be wrong
because it’s built in a way that makes it hard for that individual to find
potential errors. That doesn’t mean there are critical errors, and this
“issue” should not have been submitted unless there are; but the author is
impatient and has a foregone conclusion, so the issue was submitted and pushed
on social media instead of doing the actual work of looking for critical
errors. And likely there are no critical errors, only minor ones, as is
typical of heavy simulation codebases.

You don’t do testing via unit and integration tests; you do testing by
simulating known systems and building confidence in code correctness over
time, and by reviewing the mathematics behind the models and checking that the
code matches the math.
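
To make that concrete, here is a minimal sketch of the kind of check that
builds that confidence (Python, with made-up parameters; nothing here is taken
from this repo): run a simplified case whose behaviour is known from the
maths, and assert that the simulation reproduces it.

    # Minimal sketch: validate a toy SIR simulation against known mathematical
    # properties, rather than line-by-line unit tests. All numbers illustrative.

    def run_sir(beta, gamma, s0, i0, steps=10000, dt=0.01):
        """Euler integration of the deterministic SIR model (population fractions)."""
        s, i, r = s0, i0, 0.0
        for _ in range(steps):
            new_inf = beta * s * i * dt   # new infections this step
            new_rec = gamma * i * dt      # new recoveries this step
            s -= new_inf
            i += new_inf - new_rec
            r += new_rec
        return s, i, r

    def test_population_is_conserved():
        s, i, r = run_sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01)
        # S + I + R must stay constant; drift here would flag a bookkeeping bug.
        assert abs((s + i + r) - 1.0) < 1e-9

    def test_outbreak_fades_when_r0_below_one():
        # With R0 = beta/gamma < 1 the outbreak must die out -- a known analytic result.
        _, i, _ = run_sir(beta=0.05, gamma=0.1, s0=0.99, i0=0.01)
        assert i < 1e-3

    test_population_is_conserved()
    test_outbreak_fades_when_r0_below_one()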

That’s harder for outsiders who want to discredit you to dig into and point
fingers at. But it's not at all impossible; in the scientific community,
however, we don’t start pointing fingers until we have actual criticism to
back our critique.

------
tgsovlerkhgsel
> The tests in this project, being limited to broad, "smoke test"-style
> assertions

Doesn't this put the code far above most academic code by _having_ some tests?

------
ubercow13
What's wrong with comparing the code results to a hash of a known result?

~~~
metreo
You aren't validating the result itself!

~~~
ubercow13
It is if you know that result is valid, for example if you have validated it
some other way and this is a regression test, or you calculated it by hand.
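
Something along these lines (a made-up sketch in Python; the file name and
checksum are placeholders, not this project's actual test):

    import hashlib

    # Placeholder: in a real regression test this would be the checksum of an
    # output that was validated separately (by hand, or against another tool).
    KNOWN_GOOD_SHA256 = "<checksum of a previously validated run>"

    def sha256_of_file(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    def test_output_unchanged(output_path="results.csv"):
        # Catches unintended changes to the model's output; it doesn't prove the
        # numbers are "right", only that they still match the trusted run.
        assert sha256_of_file(output_path) == KNOWN_GOOD_SHA256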

------
cjhopman
This is the only time I've ever actually been embarrassed to be part of this
profession. Maybe in other cases I can just say it's limited to some other
subgroup that doesn't include me...

------
chkaloon
I suppose it's only fair that if software engineers think they can play
epidemiologists, then epidemiologists can think they can play software
engineers.

------
tjmtjmtjm
Welp, this is the best argument for regulating the title of “Engineer” I’ve
ever seen.

------
pnako
Blame the politicians who listened to some random guy with some code instead
of the random guy with some code himself.

~~~
metalliqaz
"random guy"

lol

------
jMyles
The study in question is the Imperial College study projecting the spread of
COVID-19. It used (apparently an earlier version of) this code to generate the
data for its analysis.

This study was, of course, the basis for lockdowns around the world.

~~~
toyg
_> This study was, of course, the basis for lockdowns around the world._

[citation needed]

Afaik this was a lone study that might have been relevant only to the UK
debate (and very late at that). Italy had already been in full lockdown for
weeks before it was published, quite a few other European countries had gone
into shutdown or closed their borders, and obviously Asian countries had
already been taking countermeasures for months.

Being so cavalier with the truth is, of course, why intelligent people don't
take seriously a lot of anti-science attitudes.

~~~
ubercow13
The FT claim it was:
[https://www.ft.com/content/16764a22-69ca-11ea-a3c9-1fe6fedcc...](https://www.ft.com/content/16764a22-69ca-11ea-a3c9-1fe6fedcca75)

~~~
fceccon
The timeline doesn't match, though. The article is dated 19/03 and mentions a
study released the previous week (so 12/03 or later). The Italian lockdown
started on 22/02 [0] for a few towns, and a couple of weeks later was extended
to the entire peninsula.

[0] [https://metro.co.uk/2020/02/25/towns-italy-lockdown-coronavi...](https://metro.co.uk/2020/02/25/towns-italy-lockdown-coronavirus-12298246/)

~~~
ubercow13
Indeed, I forgot how late the UK/US policy changed. The claim that this study
affected policy around the world doesn't seem to hold up.

