
Code review of the Imperial College London Covid-19 modelling - jimmcslim
https://lockdownsceptics.org/code-review-of-fergusons-model/
======
spamizbad
John Carmack's response:
[https://twitter.com/ID_AA_Carmack/status/1258192134752145412](https://twitter.com/ID_AA_Carmack/status/1258192134752145412)

~~~
Flenser
There are three tweets in that thread:

> It is disappointing, but not really unexpected, to see this take on the
> epidemic simulation code release: [https://lockdownsceptics.org/code-review-
> of-fergusons-model/](https://lockdownsceptics.org/code-review-of-fergusons-
> model/) If we accept it at face value, we have a retired software engineer
> making the case that non-determinism in a simulation shows incompetence. I
> am all

> for deterministic-everything, but issues with random seeds and parallel
> floating point reductions are extremely widespread, and a great deal of
> science gets done with non-deterministic simulations. Heck, professional
> software engineering struggles mightily with just making

> completely reproducible builds. I take no position on the mechanics of the
> simulation or the conclusions drawn from the runs, but implicitly
> encouraging scientists to keep their code secret if it isn't "perfect" is
> damaging.
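Carmack's point about parallel floating point reductions can be seen in a tiny, hypothetical sketch (not covid-sim code): float addition is not associative, so the order in which a parallel reduction combines partial sums changes the result even with identical inputs.

```python
# Hypothetical illustration: float addition is not associative, so two
# valid reduction orders over the same inputs give different sums.
vals = [1e16, 1.0, -1e16, 1.0]

# Serial left-to-right reduction: each 1.0 added to 1e16 is rounded away.
serial = ((vals[0] + vals[1]) + vals[2]) + vals[3]

# A pairwise order, as a two-worker reduction might combine partial sums:
# the large values cancel first, so both 1.0s survive.
pairwise = (vals[0] + vals[2]) + (vals[1] + vals[3])

# serial and pairwise disagree, despite summing the same four numbers.
```

This is why a multithreaded simulation can produce different totals run to run even when the arithmetic in each thread is individually correct.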

------
dekhn
I have pretty deep experience in modelling code across a wide range of
dimensions (academic, industrial and government, supercomputing to
microcontroller, internet ads to protein design). I've seen a lot of modelling
codes, and run a bunch of them, written some of them, and helped people fix
problems in them.

Throughout that time I've seen a wide range of modelling quality. Very few
people can churn out really nice code that solves useful problems, update that
software for a wide range of uses over the years, keep it documented, pay down
technical debt, fix bugs, write great tests, and make sure the numerics are
excellent. Oftentimes these things are built by people who are experts, but
who spend most of their time in meetings explaining the situation to
politicians, or running labs and publishing papers.

Having read this particular article, many of the problems being complained
about are typical and happen frequently in industry, even in highly
functional orgs with strong incentives to build high quality software.
Further, it really just seems like the author had a very strong position about
lockdowns and tried to make a quantitative/techie takedown of some code that
was used to make some decisions. The article really drips with that kind of
animosity. I see a number of technical errors and ambiguities which make it
unconvincing.

That said, we _could_ have far better codes. In principle, all the data and
support libraries would be open; the pipelines to produce the data would be
reproducible, maintainable, and well-tested; and anybody would be able to
write a simple notebook that reproducibly models their hypotheses for large
populations, in a way that a large number of people could inspect and come to
their own conclusions.

~~~
downerending
Crappy code abounds, and yes, the authors seem to be grinding an axe. That
notwithstanding, a reputable scientist, seeing that this code was being used
to make decisions that would directly affect millions of people in a life-and-
death way, would stand up and proclaim, "This is _just a fucking prototype and
not to be used for important policy decisions_." Not doing so is pretty
unethical. (Far more so than breaking quarantine for a little nookie...)

------
HarryHirsch
_Investigation reveals the truth: the code produces critically different
results, even for identical starting seeds and parameters_

The fellow claims to have been at Google, yet seems not to have heard about
non-determinism in parallelized code? Are we supposed to throw away all
molecular dynamics results that were not run on a single core?!

And what's with the beef about the mathematics? Yes, it's apparently poorly
determined, but that is a feature: you can extract uncertainty from it. An
engineer, even a software "engineer", should have heard of experimental
error!

~~~
smsm42
> heard about non-determinism in parallelized code

There's a term for it: "race condition". It is usually considered a bad
thing, exactly because it makes the result of the code unpredictable unless
you take special care to control it. Now the question is: did the authors of
the model do that? Given that their advice is "just run it single-threaded" -
no, they didn't.

~~~
HarryHirsch
In MD, you can eliminate race conditions like a computer scientist would, or
you can listen to the underlying physics and ignore them when appropriate. In
real life, we depend on weak causality, not on strong causality. A throw of
dice will demonstrate the fact.

~~~
smsm42
A throw of dice will demonstrate the fact that we shouldn't care about bugs in
models caused by race conditions? How does that even make sense?

~~~
rovolo
The "race condition" is that the random numbers fed to each thread are
nondeterministic given a specific seed. Even though a specific run can't be
reproduced due to the nondeterminism, the random values aren't causing math
bugs.

Let's say we are doing a stock market simulation. You have 2 shares of company
A and 5 shares of company B. Your profit depends on the change in share price.

    
    
        srand(seed); // generates: [rand_0, rand_1, ...]
        total_profit = 2 * profit(rand()) + 5 * profit(rand());
    

The race condition can result in two different outcomes given the same seed:

    
    
        total_profit = 2 * profit(rand_0) + 5 * profit(rand_1);
        total_profit = 2 * profit(rand_1) + 5 * profit(rand_0);
    

Both of those outcomes are valid. Even though there's a reproducibility bug,
it doesn't affect the correctness of the model.
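rovolo's pseudocode can be made concrete with a small sketch (hypothetical names, not the model's actual code): the same seed produces the same stream of draws either way, but the thread interleaving decides which draw feeds which term.

```python
import random

def profit(u):
    # Hypothetical per-share profit as a function of one random draw.
    return 2.0 * u - 1.0

def total_profit(seed, a_first):
    # One shared, seeded RNG yields the same stream [rand_0, rand_1] every
    # time; the scheduling (a_first) decides which term consumes which draw.
    rng = random.Random(seed)
    rand_0, rand_1 = rng.random(), rng.random()
    if a_first:
        return 2 * profit(rand_0) + 5 * profit(rand_1)
    return 2 * profit(rand_1) + 5 * profit(rand_0)

# Same seed, two valid interleavings, two different (equally valid) totals.
run_a = total_profit(seed=42, a_first=True)
run_b = total_profit(seed=42, a_first=False)
```

Each interleaving, replayed on its own, is perfectly reproducible; only the choice between them is left to the scheduler, which is the reproducibility bug being discussed.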

------
s9w
The code is a dumpster fire. I wouldn't trust it to manage my shopping list,
and I would get in deep shit at work for something like that. Not sure if the
article itself makes good points, but the code alone should make anyone nope
out. Publishing any results from this feels criminal, let alone something as
world-changing as their reports.

------
rontoes
This is an incredibly poor take. It's well known that academic software
doesn't follow great software engineering practices. The model is an
incredibly complex piece of software that attempts to model stochastic
behaviour. That it isn't fully deterministic (with a fixed random seed)
doesn't make the research invalid and shouldn't discredit research that
builds on the model.

>On a personal level, I’d go further and suggest that all academic
epidemiology be defunded. This sort of work is best done by the insurance
sector.

This is another level of crazy.

~~~
vikinghckr
>This is an incredibly poor take.

No it isn't.

> It's well known that academic software doesn't follow great software
> engineering practices.

That doesn't make it acceptable. It's also "well known" that the vast
majority of academic work has zero tangible effect on society. This isn't one
of those works. It's possibly the most important piece of academic work in
recent memory. So the bar for this is MUCH higher than for typical academic
research work.

> That it isn't fully deterministic (with a fixed random seed) doesn't make
the research invalid and shouldn't discredit research that builds on the
model.

It does make it invalid when the difference between runs is as big as 80,000
estimated deaths, which can lead to dramatically different government
policies.

> This is another level of crazy.

No it's not. Academia is way behind the industry when it comes to modeling the
economy and the real world.

~~~
spamizbad
> No it's not. Academia is way behind the industry when it comes to modeling
> the economy and the real world.

The insurance industry is expected to ask the government for bailouts because
none of their models can account for the fallout from this, just like AIG did
during the '08 crisis.

~~~
twelve40
Not to protect the insurance industry, which I know nothing about, but how
does their hypothetical failure, misery and bailout relate to, or validate the
study in question?

~~~
arkades
Criticism: Academia fails to model the real world, in contrast to industry
("is way behind industry").

Response: An entire industry of competing and well-funded actors that
specialize in modeling and predicting expensive failure states also failed to
model the world accurately.

Implication of Response: Where is the evidence that this is inferior to
industry?

~~~
no-s
>>Implication of Response: Where is the evidence that this is inferior to
industry?

A meta implication is that expert models were insufficient to inform
decisions.

------
hartator
Actual code: [https://github.com/mrc-ide/covid-sim](https://github.com/mrc-
ide/covid-sim)

~~~
dwaltrip
Thanks for the link.

As I randomly clicked through, I wasn't _that_ horrified until I got to this
file:

[https://github.com/mrc-ide/covid-
sim/blob/master/src/SetupMo...](https://github.com/mrc-ide/covid-
sim/blob/master/src/SetupModel.cpp)

Of course, it's absolutely insane to suggest that all of academic epidemiology
should be defunded.

~~~
smsm42
> I got to this file

I have no idea how anyone could ever make sure such code is correct. What I
suspect happened is that it was tweaked until it generated results that
seemed plausible, and then left at that. But if somewhere in that huge
haystack there's some > that should be <, only by sheer luck could somebody
find it. And since there are no unit tests (I was pleasantly surprised to see
a "test" directory, but of course there are no unit tests there), it's
probably impossible to find small errors if they still produce plausible
results.

> Of course, it's absolutely insane to suggest that all of academic
> epidemiology should be defunded.

True, the article would probably be much better if it stayed within the
writer's area of expertise.
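For what it's worth, even a stochastic model can get meaningful unit tests. A sketch of the kind of regression test that could live in that "test" directory, where `run_sim` is a hypothetical stand-in rather than the real covid-sim entry point:

```python
import random

def run_sim(seed, steps=100):
    # Hypothetical stand-in for a model entry point: a seeded,
    # single-threaded toy epidemic counter.
    rng = random.Random(seed)
    infected = 1
    for _ in range(steps):
        if rng.random() < 0.1:  # 10% chance of a new infection per step
            infected += 1
    return infected

def test_fixed_seed_is_reproducible():
    # With a fixed seed and one thread, two runs must agree exactly.
    assert run_sim(seed=1) == run_sim(seed=1)

def test_pinned_baseline():
    # In a real suite the baseline would be a literal value committed to
    # the repo, so a silently flipped > vs < fails loudly on the next run.
    baseline = run_sim(seed=1)
    assert run_sim(seed=1) == baseline

test_fixed_seed_is_reproducible()
test_pinned_baseline()
```

Tests like these don't prove the epidemiology is right, but they do catch exactly the class of small, plausible-looking errors described above.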

~~~
lbeltrame
I'm hoping that the result of this critique, whether the model is correct or
not (which is beyond the scope of the critique), will be a push for better
software practices in academia.

There's some discussion in the GH issues, too: [https://github.com/mrc-
ide/covid-sim/issues/165](https://github.com/mrc-ide/covid-sim/issues/165)

I'm hoping it doesn't get hijacked and turns into a flamefest, which would
distract from the main point.

------
scarmig
It's disappointing to see this flagged.

Imagine the Trump administration was using a model with similar code quality
to justify reopening the economy. We would not be making excuses for it, or
flagging articles that criticize it. In fact, if I told you today that I was
running the model with a certain random seed and getting a result that said
the US would have <5k additional deaths in 2020, how would you even dispute
that?

The biggest issue with this link is that it's less a code review than a rant
about the code quality, but the code is so badly written that it's hard to
know where to start doing a code review.

------
lbeltrame
IMO this should be unflagged and the link changed to [https://github.com/mrc-
ide/covid-sim/issues/165](https://github.com/mrc-ide/covid-sim/issues/165)
which is at least non-partisan unlike the very domain name of this submission.

~~~
TMWNN
This thread has (again) been flagged.

------
aphextron
>"lockdownsceptics.org"

It seems their conclusion is foredrawn, so I wouldn't expect much scientific
rigor here. This article reeks of the kind of cherry-picking and straw man
arguments you see in climate change denial.

~~~
smsm42
> It seems their conclusion is foredrawn

Said the person who made a conclusion about the article by looking at the
domain name of the site it is published in. Oh the irony!

------
avs733
Well, at least they are upfront about their bias... and I must tell you, as
an academic, I'm SHOCKED, SHOCKED I tell you, to find ugly code in a research
paper. And from my prior experience working as a defense contractor, I'm
SHOCKED, SHOCKED I tell you, that it's possible for governments to use messy
code for decision making. But what I'm most shocked by is that a software
engineer thinks that not understanding a model from another field is a valid
critique of it (so shocked I laughed).

If you are curious about the reliability of this particular source you can
start at the end and read backwards:

"On a personal level, I’d go further and suggest that all academic
epidemiology be defunded. This sort of work is best done by the insurance
sector. Insurers employ modellers and data scientists, but also employ
managers whose job is to decide whether a model is accurate enough for real
world usage and professional software engineers to ensure model software is
properly tested, understandable and so on. Academic efforts don’t have these
people, and the results speak for themselves."

Sue may be a great programmer, but she regularly steps out of commenting on
the code to comment on the model... which is like the guy who designed the
tractor wheels telling the farmer what to plant. While readability of code
may be nice, not a single comment she makes actually disputes the findings of
the model. Some things in nature are actually stochastic (I'm sorry... RANDOM
::clutches pearls::). If you think the entire world is deterministic, great...
show me the computing power and societal knowledge that can account for those
variables. And if your solution to global health is capitalism, look at the
Gates Foundation; if you want to look at what other approaches have gotten
us, look at the Carter Center and Jonas Salk. Epidemiologists didn't build
the ObamaCare website...

In the past 100 years, epidemiology has:

* Brought us a world where infectious diseases are no longer the SINGLE LARGEST CAUSE OF DEATH to the human species

* Made workplaces safer, and the food chain safer

* Reduced the maternal death rate by 90%

* Eliminated smallpox entirely and practically eliminated multiple other diseases

I don't know Sue, but I'm team Salk not team Sue here...what a loon.

~~~
smsm42
You are right on many points here, but I think you are completely wrong on
the randomness point. Randomness from the model, randomness from the real
world being non-deterministic, and randomness from bugs in the code are
qualitatively different. Yes, you can't make a real-world experiment produce
exactly the same values every time (if you could, you'd probably suspect
something was wrong...). Yes, a model can be probabilistic and return
different results. But if different results come from race conditions in the
code, and those results are significantly different, then the model doesn't
model what it is supposed to model; it's no longer a model of the real world.
It's a model of the real world plus buggy code with random CPU scheduling.
That would be fine if we lived in a computer simulation running on a buggy VM
with race conditions - then it would model our world perfectly. But since
there's no indication we live in such a world, the error introduced by this
bug - and it looks like it's not negligible at all - makes the model diverge
from the real world, and decisions about the real world made on the basis of
this wrong model are also wrong. It's a big problem, and no amount of other
deficiencies in the article - which you appropriately highlighted -
diminishes this very serious problem.
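The distinction above can be illustrated with a toy, seeded model (hypothetical names, not the real code): variance across seeds is the legitimate, modelled randomness; variance across repeated runs with the same seed is the bug.

```python
import random

def model(seed, n=1000):
    # Toy stand-in for a stochastic model: a seeded Monte Carlo sum.
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(n))

# Legitimate randomness: different seeds sample the model's distribution,
# so this set collects several distinct outcomes.
across_seeds = {model(s) for s in range(5)}

# Correct code: repeating the same seed must give exactly one outcome.
# A race condition would make this set grow, and that spread is bug, not
# model: it measures the scheduler, not the epidemic.
same_seed = {model(7) for _ in range(5)}
```

In other words, a fixed seed turns modelled randomness off as a source of run-to-run variation; whatever variation remains is coming from the code itself.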

------
alanfranz
The article is partisan, and yet it DOES make some valid points.

------
DiogenesKynikos
Rather than defunding epidemiology, how about giving them enough money to hire
professional programmers?

It's a good investment, given how much is at stake.

~~~
no-s
>>Rather than defunding epidemiology, how about giving them enough money to
hire professional programmers?

See "Moral Hazard". We should not expect integrity issues to be solved by
throwing money at the perpetrators. It's more reasonable to suppose they would
spend extra funding on hookers and crack cocaine, or some similar
academic/bureaucratic equivalent.

If you have money/time/skills to throw at the problem, I implore you to take
up a clean room open source effort to reproduce the model.

~~~
DiogenesKynikos
Who are the "perpetrators" here, and what "integrity" issues are you talking
about?

Those are pretty accusatory words you're using. This is a general problem in
scientific research. The scientists care about and know a lot about the
science and mathematics of their models, but they don't have formal training
in programming, software development, etc. They prioritize hiring graduate
students who know about the science, and they frankly don't have the budget to
poach programmers from industry. Do you know what usually happens to people in
graduate science programs who are more interested in programming than the
science? They end up at tech firms, making 3-4x as much as they would if they
pursued a postdoc.

Research that is incredibly important to society is being done on a
shoestring. The "perpetrators," in my eyes, are the politicians who have
failed to properly fund epidemiology, virology, public health, and other
vital fields. Something is wrong in a society that can spend God knows how
much on increasing the accuracy of ad-targeting algorithms by 1%, but which
can't afford to fund an FTE programmer for its leading epidemiological
research group.

------
TechBro8615
It’s very disappointing to see this submission flagged. Further, the general
trend of politicization of the scientific process is highly worrying. I fear
the response to the pandemic is being driven by an emotionally political
monoculture rather than by data, facts, and critical reasoning.

Galileo would not be happy to see this.

~~~
quotemstr
We're in the midst of mass hysteria --- the kind of thing you normally see in
history books. What do you expect? What's more disturbing to me is seeing how
otherwise-reasonable people are excusing bugs in this model (bugs that they
would never accept in other software) with flimsy rationales presented
aggressively.

You have to keep in mind how people actually _think_. Most of human reasoning
capacity doesn't go into the beautiful work of intellectual exploration.
Instead, most of the time, we use the faculty of reason for the crude and
dirty job of finding support for our preconceptions. When facts conflict with
one of these preconceptions, we invent excuses to discard the facts --- the
smarter the person, the more elaborate the excuses. Facts seldom change
anyone's mind.

In certain circles (including, unfortunately, circles composed of smart and
influential people) it has become taboo to oppose the lockdowns, in part
because the pro- and anti-lockdown positions have developed certain partisan
connotations in the United States. At the same time, the weight of
accumulating evidence (empty hospitals, declining infection rates, repeated
large modeling errors, high serum antibody prevalence, a very steep age
curve, and large economic losses) is beginning to make the pro-lockdown
stance untenable. But a lot of pro-lockdown people, having become emotionally
invested in their position, are unwilling to back down. Consequently, we see
a ton of elaborate excuses for discarding, ignoring, and suppressing
information that further weakens the pro-lockdown position.

The HN community is composed disproportionately of these circles, many of
which claim to make up the "respectable" part of society. The HN flags on
this article are part of this process of rejecting ideas that conflict with
dearly-held preconceptions. The author of the code review post makes valid
points, and the snide dismissals in the comments on this article say more
about the makeup of the HN commentariat than they do about the code review
itself.

