Forecasting s-curves is hard (constancecrozier.com)
266 points by osipov 3 months ago | 150 comments

Leaving modeling up to the professionals is completely the wrong lesson. Time series forecasting from business, epidemiological, or economic data is hard, so you should always take the results with a large grain of salt. The professionals get it wrong too, and, just like the weather, nearby points from the model are more likely to be accurate than distant ones.

Smoothing helps. Estimating error helps. Domain knowledge helps. Experience in applying the model helps. There are techniques other than the one demonstrated in the article; sometimes those help. There are also professionals who create models which are used to support the bias of the modeler or the modeler's employer (i.e. cherry-picking).

Politicians and pundits more often than not take the results of these models and draw conclusions which are unwarranted or at least highly uncertain without mentioning the uncertainty or only giving it lip service.

Yes. On one hand, I tire of the numerous hot takes that start with "I'm not an epidemiologist, but..." followed by opinion un-anchored in practical experience and coupled with a disregard for what has come before (community context). There are also a number of people in adjacent fields who are taking advantage of the situation to advance their careers.

On the other hand, I realize most experts carry the baggage of unrecognized assumptions in their toolkit, which need to be challenged (at the first principles level) by people outside the domain who don't carry the same baggage.

The ideal situation would be to leverage the expertise of both epidemiologists and non-epidemiologists on a common platform (that is not Twitter) and have them check and build on top of each other's work.

This is maybe the only sane answer I have ever seen on these kinds of topics. I wish I could have come up with something this succinct.

Uncertainty is part of the business for something like this. Good faith, open communication is essential to public trust.

That’s one of the reasons why some of the governors in the US are getting accolades: their approach to communication contrasts with that of other political figures who take a more politically focused approach.

Capturing and interpreting risk appropriately is more important than being right. The various models scoped the problem, but it’s hard to model things like a complete breakdown of federal response.

> Time series forecasting […] is hard […]

> Leaving modeling up to the professionals is completely the wrong lesson.

I don’t understand what you’re saying. Nobody is advocating that amateurs be precluded from modeling to their heart’s content.

So what are you advocating? One should discard models from professionals? One should prefer models from non-experts? One should crowdsource models? One should make policy decisions based on gut instinct?

> I don’t understand what you’re saying.

I think what GP means is, leaving it up to professionals alone is a recipe for disaster because of the inherent corruption, greed, biases, and prejudices at play.

> So what are you advocating?

Not speaking for GP, but simpler models are more likely to be a good approximation and more than enough, anyway [0]? Simplicity helps because it stops the pundits and politicians from sort of running away with it?

[0] Newton's theory of gravitation has been supplanted by Einstein's theory of relativity and yet Newton's theory remains generally "empirically adequate". Indeed, Newton's theory generally has excellent predictive power. Yet Newton's theory is not an approximation of Einstein's theory. For illustration, consider an apple falling down from a tree. Under Newton's theory, the apple falls because Earth exerts a force on the apple—what is called "the force of gravity". Under Einstein's theory, Earth does not exert any force on the apple. Hence, Newton's theory might be regarded as being, in some sense, completely wrong but extremely useful (The usefulness of Newton's theory comes partly from being vastly simpler, both mathematically and computationally, than Einstein's theory). https://en.wikipedia.org/wiki/All_models_are_wrong
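To put a rough number on "completely wrong but extremely useful": for an apple at the Earth's surface, the leading relativistic correction to Newtonian gravity is of order GM/(rc^2), which is parts-per-billion. This is only a back-of-envelope sketch using standard textbook constants:

```python
# Rough scale of the Newton-vs-Einstein disagreement for a falling
# apple: the leading relativistic correction is of order GM/(r c^2).
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M = 5.972e24        # Earth's mass, kg
r = 6.371e6         # Earth's radius, m
c = 2.998e8         # speed of light, m/s

newton_g = G * M / r**2             # Newtonian surface gravity, ~9.8 m/s^2
correction = G * M / (r * c**2)     # relative GR correction, ~7e-10

print(round(newton_g, 1))   # 9.8
print(correction)           # parts-per-billion level
```

So "empirically adequate" here means the two theories disagree at the ninth decimal place for everyday purposes, which is why Newton remains the practical choice.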

> because of the inherent corruption, greed, biases, and prejudices at play.

You and I must be thinking of different professionals.

At any rate, here are two very very very simple models of this epidemic: a) cases will grow linearly, b) cases will grow exponentially.

Super simple, extremely different outcomes. Now, one still has to make policy decisions.
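Those two toy models can be put side by side in a few lines. All numbers here are made up for illustration: fit both models to the same two early observations, then look six weeks out.

```python
import math

# Two made-up early observations: day 0 -> 100 cases, day 7 -> 200 cases.
c0, c7 = 100, 200

# a) linear growth: cases(t) = c0 + slope * t
slope = (c7 - c0) / 7                 # ~14.3 cases/day

# b) exponential growth: cases(t) = c0 * exp(r * t)
r = math.log(c7 / c0) / 7             # ~0.099/day, i.e. a 7-day doubling time

def linear(t):
    return c0 + slope * t

def exponential(t):
    return c0 * math.exp(r * t)

# Both models reproduce the data they were fit to, but six weeks out
# they disagree by almost an order of magnitude -- and the gap keeps growing.
print(round(linear(42)))        # 700
print(round(exponential(42)))   # 6400
```

The policy implications of 700 versus 6400 cases are completely different, which is the commenter's point: the simplest possible model choice already dominates the forecast.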

I think the solution is not fewer professionals, but more and better professionals.

> You and I must be thinking of different professionals.

Not sure about you, but I suspect they're thinking of professional politicians and the professionals politicians hire.

> You and I must be thinking of different professionals.

That's the kind of bias he was probably thinking of.

Both of these are wrong, and the exponential one becomes more wrong faster, while being a better match for the beginning of the curve.

Exponential processes don't exist in nature; they always break into some other curve at some point.

This is a useless bit of pedantry. The exponential portion of the sigmoid curve is the only part that matters to human health, survival, and economic well-being.

No it's absolutely not!!

Epidemics are (more) properly modelled by the Gompertz function[1], and the position of the "transition" between the exponential portion and the non-exponential portion is the important point.

That is exactly what the person you are replying to is saying ("they always break into some other curve at some point.").

[1] https://en.wikipedia.org/wiki/Gompertz_function

Is the Gompertz function the function behind the standard "epi curve"? It appears from the CDC web site and other places that the epi curve is the standard way epidemics are modeled. The CDC even has a simple tutorial for creating them: https://www.cdc.gov/training/quicklearns/createepi/index.htm...


An epi curve is created from observations and shows new infections per day.

The Gompertz function is a mathematical formula and can be used to estimate total infections up to a date.

However, if you take the epi curve and plot the aggregate number of infections you'll find a curve that the Gompertz function approximates quite well.
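A minimal numeric sketch of that relationship, with made-up parameters (K, b, c below are illustrative, not fitted to any real outbreak):

```python
import math

# Gompertz cumulative-infections curve: N(t) = K * exp(-b * exp(-c*t)).
K, b, c = 100_000, 5.0, 0.08

def cumulative(t):
    """Total infections up to day t (what the Gompertz function models)."""
    return K * math.exp(-b * math.exp(-c * t))

def daily_new(t):
    """Day-by-day differences: the shape an epi curve approximates."""
    return cumulative(t) - cumulative(t - 1)

# The cumulative curve is S-shaped and approaches the ceiling K...
print(cumulative(200) / K)     # > 0.99

# ...while daily new infections rise then fall; the peak sits near the
# analytic inflection point ln(b)/c (about 20 days for these parameters).
print(max(range(1, 200), key=daily_new))
```

So the epi curve and the Gompertz function are two views of the same process: one is the daily increment, the other the running total.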

Then why, pray tell, is the title of the article we are discussing not "Forecasting exponential curves is hard"?

The entire point is that forecasting exponentials is easy, but they become badly wrong at some point, and that point is difficult to forecast.

If Einstein's theory roughly says that massive objects deform space proportional to their mass, the effect of which is that massive objects (all else held equal) will tend to move towards each other more so than they would otherwise; meanwhile Newton's theory says that massive objects attract each other proportional to the product of their masses divided by the square of their distance; and Newton's theory does in the vast majority of useful cases yield approximately the same numerical result as Einstein's theory, then how is Newton's theory not an approximation of Einstein's theory?

Consider the argument from the mathematics of it all, and the entire framework of relativity, which is a radical departure from classical physics.

Newton's model captures the essence but isn't the eternal truth. Relativity proved that. Today, singularities have been proven to exist that point to missing pieces in Einstein's theory.

The point is, simpler models are a good approximation even though they don't hold true under scrutiny, so at what point do we stop obsessing over the details and complex models that professionals come up with or policy-makers seek, because surely there's a law of diminishing returns at play here?

The law of diminishing returns only works in linear spaces. When you reach that point, the only progress you can make is to create a better model of reality that would include these warps.

yes, I do think that there is nothing wrong with fitting the data and asking questions

suggesting that only an "expert" can possibly fit the data correctly is the wrong conclusion

It's certainly true that forecasting using exponential fits is bound to have huge error bars, but it also seems foolish to "show" that it can't be accurately done early on a curve without noticing that it is fairly easy to fit (and relatively stable) once it becomes non-exponential (e.g. once the rate of change is constant).

Of course, in many cases there are external bounds that are similarly useful, for example when you know the maximum (e.g. full population size). Then the question for the early data is simply whether a fit to an exponential is "better" than successively higher-order polynomials.
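Here is a small sketch of why the early data can't settle that question. The "true" process below is a logistic curve with made-up parameters; both an exponential and a quadratic are fit to its first 15 days only:

```python
import numpy as np

# Early section of a logistic ("true") process, with illustrative
# parameters -- none of these numbers come from real case data.
K, r, t0 = 1000.0, 0.3, 30.0          # ceiling, growth rate, midpoint
t = np.arange(0, 15)                   # first 15 days: still near-exponential
y = K / (1 + np.exp(-r * (t - t0)))

# Candidate 1: exponential fit, done as a linear fit in log space.
a1, a0 = np.polyfit(t, np.log(y), 1)
exp_fit = np.exp(a0 + a1 * t)

# Candidate 2: a quadratic, standing in for "higher order polynomials".
quad = np.poly1d(np.polyfit(t, y, 2))

# The exponential hugs the early data (well under 5% relative error)...
print(np.max(np.abs(exp_fit - y) / y))

# ...but its day-60 extrapolation overshoots the true saturated value by
# orders of magnitude, while the quadratic extrapolates differently again.
print(np.exp(a0 + a1 * 60))              # exponential extrapolation
print(quad(60))                          # quadratic extrapolation
print(K / (1 + np.exp(-r * (60 - t0))))  # truth: ~K, the curve has saturated
```

Both candidates describe the early segment acceptably, which is exactly why the long-range forecasts they imply differ so wildly.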

Currently, this forecasting information seems not particularly helpful for fitting epidemiological data for Covid-19, since nowhere do we have data for a case where strict distancing hasn't come as an effect of high death and hospitalization rates, well before 50% of the population would provide any effective immunity.

True. Also, if you read Superforecasting[1] you'll learn about several experiments where "non-experts" beat experts by a lot by just being well informed. Even when the experts had access to confidential data.

[1] https://www.amazon.com/Superforecasting-Science-Prediction-P...

Superforecasters don't have a great record on the pandemic:


I’m very skeptical of evidentiary patterns laid out in NYT bestseller style books.

You are right to be. Tetlock spun the conclusions of the project far beyond what was warranted, especially because the project was heavily flawed.

I participated in the initial part of the project, figured out the winning strategy very quickly, and then dropped my participation immediately. How to be "super at forecasting" in Tetlock's eyes? Have lots of spare time in the middle of the day when professionals are at work, because that is when questions get posted and easy points can be made. (To win a prize that is less than the hourly rate of a professional.) Hmm, no wonder that the conclusion was that people with no expertise but plenty of spare time can beat the pros...

You mean beat the people that actually do that for a living?

His project[1] was part of a program created by IARPA[2], so I wouldn't be that skeptical.



Are these what the book is based off of? Or are these just the credentials of the author? In the former case I would find that more compelling to be worth looking into than just the link to the amazon book. In any case, I wasn’t trying to specifically dismiss the book (which I have not read) but just pointing out that linking to book of that sort doesn’t give me (personally) the feeling that there is much weight behind the citation. The other links you provided do the opposite!

Yes the book is based on that project.


Cool so does this mean climate models, with enormous look-ahead windows, are complete garbage?


(Sorry. I take it you would like them to be complete garbage.)

There are two main ways in which a model can give bad results.

1. It can be a fundamentally bad model. E.g., Kelvin had a perfectly reasonable model of the long-term evolution of the temperature of the earth from which he deduced that it couldn't be more than a few hundred million years old, probably less. The problem was that he didn't know about radioactivity, which turns out to make a tremendous difference.

2. It can be a reasonable model but the parameters can be hard to estimate accurately, and this can lead to inaccurate predictions. In some cases, small errors in the parameters lead to large errors in the predictions. For instance, if you try to forecast the weather a year ahead then you will get all the details wrong.

This article is about the second kind of problem, and that's the one that shows the most dramatic degradation as you look at longer and longer intervals.

Different properties of the model output can show very different sensitivities to the inputs. For instance, suppose you make a model of the solar system and try to predict where the planets will be ten million years from now. I expect (though I haven't actually done the relevant calculations) you will find that tiny errors in your measurements of the planets' present positions and momenta will lead to very large errors in estimating how far around their orbits the planets are, and probably also in estimating the precession of the orbits. But I bet that almost every simulation run will put the planets' distances from the sun at about the same values, which will be about the same as they are now, and I would bet very heavily that those distances will be a pretty good match for reality. (Barring utterly out-of-model events like the human race getting technologically advanced enough to move planets around and finding it useful to do so.)

Or suppose you make a model of the weather and try to predict things five years from now. The model will do very poorly at predicting which days will be rainy in (say) New York, and if you run it many times with tiny changes in its parameters it will give completely different predictions. But it will very consistently tell you that in New York it'll be warmer in July than in January, and it will almost certainly be right; it will do pretty well at predicting rainfall in the Sahara Desert, and it will almost certainly be right about that. Some things behave chaotically, some things not so much.

So, what about those climate models? Weather is chaotic: small variations quickly lead to huge changes. Climate is much less so: to whatever extent the models are fundamentally accurate, small changes to the parameters tend to produce small changes to outputs like global mean temperature ten years later.
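The weather-vs-climate sensitivity point can be illustrated with the textbook logistic map, a standard toy chaotic system (a stand-in for "weather-like" behavior, not a weather model):

```python
# The logistic map x -> 4x(1-x) is chaotic: tiny initial differences
# grow roughly exponentially until they saturate.
def trajectory(x0, steps):
    xs = [x0]
    for _ in range(steps):
        xs.append(4 * xs[-1] * (1 - xs[-1]))
    return xs

a = trajectory(0.300000, 50)
b = trajectory(0.300001, 50)   # initial condition perturbed by 1e-6

# A few steps in, the two runs still agree to several digits...
print(abs(a[5] - b[5]))        # still tiny

# ...but the error roughly doubles each step, so well before step 50 the
# runs are completely decorrelated despite starting 1e-6 apart.
print(max(abs(a[i] - b[i]) for i in range(30, 51)))  # order 0.1 or more
```

Weather forecasting lives in the first regime; the claim about climate is that aggregate quantities like decade-scale mean temperature behave more like the stable statistics of the system than like any individual trajectory.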

How fundamentally accurate are they? They seem to be pretty decent; when you go back and look at the predictions of old models for what are now past years, they do quite well. (To predict the climate you need to predict human behaviour, because human behaviour affects the climate. Some of the predictions of human behaviour that went into the models were wrong. So you can get a sort of halfway house by running an old model but using now-historical information about things like how much CO2 we would put into the atmosphere. Unsurprisingly, this improves their performance relative to just using the old models as they stand.) See e.g. https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2019... for some analysis.

So: no, the sort of effects described in the article don't mean that climate models are complete garbage. The other sort of problem (fundamental errors in the model itself as opposed to its parameters) could make them garbage, but the evidence available so far suggests that even decades-old models were pretty good in this respect.

You wrote so much, it’s a shame you insist on claiming the models are not wrong.

The right answer is that, just like COVID models, of course the climate models will turn out to be completely wrong in the exact details, in the scale, in the timeframe, and possibly even in the endpoints.

COVID is actually supremely easier to predict than climate. The two fundamental problems with most COVID models are that 1) the positive test counts don’t mean what you think they do, and 2) because of that, everything that is based on positive test counts being relatively meaningful is therefore totally wrong.

The climate models, since they were created by humans and not God, will likewise have almost innumerable errors which will cause all sorts of problems, both minor and major, with their predictions. This is universally true of all models of complex systems and therefore should be totally uncontroversial; unfortunately, the models have become like sacred cows.

Personally, just like I was with COVID, I am confident that paradigm-shifting events will intercede that will totally invalidate the model one way or another.

For COVID the paradigm shift was that the majority of cases are asymptomatic. Most people are still coming to terms with this. I heard it recently best described to think of SARS-CoV-2 as actually two viruses. One, much less prevalent, is serious and deadly. The other, much more prevalent, is mild to the point of being unremarkable. Getting one confers immunity to both. This is, I stress, a mental model, not a scientific explanation of what’s actually happening (which does not presently exist).

The point is that the things that make models wrong are often “external” to the model, the unknowables. For Kelvin it was radioactivity. His model was totally “sound”, just like the climate models are “sound” in terms of everything that is presently knowable. But complex systems, particularly ones with humans in the loop, have a way of surfacing curveballs the model did not, and could not, anticipate, which have outsized effects on the endpoints.

The lesson with climate, just as it is being shown with COVID, is that dramatic action to address the problem (“flatten the curve”) with almost incomprehensible direct costs does not pay off.

With climate, just like with COVID, a less blunt instrument will achieve a better outcome for orders of magnitude less cost.

The sad part is that the top 2% couldn’t give a fuck about the cost either way. It’s the bottom 75% which will suffer from terrible policy.

The lesson with climate, just as it is being shown with COVID, is that dramatic action to address the problem (“flatten the curve”) with almost incomprehensible direct costs does not pay off.

Surely the lesson from Covid-19 is "early intervention works". Compare, e.g., the number of deaths in Japan/Australia/South Korea with Italy/US/UK.

I guess I’m not sure what definition of “works” you are using?

Humans have a tendency to see the thing in front of them and claim, I did X, and then Y happened, so X must have helped cause Y. When in fact no such thing has been proven.

I think there are a lot of things different about those examples you gave, and absolutely no scientific way to draw a conclusion about a particular intervention having a particular effect.

Here’s an interesting comparison between Australia and Canada for example, who both responded similarly but have ended up in markedly different situations. [1]

So instead I would offer a different framework for the discussion.

In the end there are only two simple questions with COVID. 1) When we hit the endpoint, what is the total prevalence in the population, and is it lower than the natural saturation point?

So my guess for #1 is that at the endpoint, we will see that basically everyone who could have gotten it did get it. To put it another way, I suspect only the highest risk population taking the most extraordinary precautions will end up not having antibodies to SARS-CoV-2, everyone else at some point being naturally exposed.

In other words, I assume we will reach whatever the top of the S curve for COVID would naturally be based on the biological nature of the virus. Essentially, I am assuming that vaccines will not play a large role, because it will have played out before they become available, and/or because we don’t have any coronavirus vaccines and it’s probably harder to make one than we think.

2) How long do we stretch things out to get to that endpoint, and therefore, how many lives are saved through increased availability of medical care and, on the flip side, how much economic cost is incurred?

My hope here—based on some very preliminary data—is that we are actually much further along than the testing might indicate (e.g. 5% overall prevalence and much higher in certain metros, vs 0.2%)

From a health care perspective I think that the only effective treatments we have today are minimally invasive and extremely scalable — oxygen therapy and frequent repositioning. We are learning that ventilators have proven to be basically lethal to COVID patients, therefore increased availability and usage of ventilators presumably just increases the death rate. There have been no controlled studies either way at this point so no one can say for sure.

At this point because we have no effective cure or treatment, and because initial antibody testing seems to indicate COVID is extremely widespread despite the shutdown, I am very skeptical of the current approach paying off both in terms of delaying spread & saving lives through higher availability of care.

[1] - https://www.theglobeandmail.com/opinion/article-canada-and-a...

> Here’s an interesting comparison between Australia and Canada for example, who both responded similarly but have ended up in markedly different situations.

Well no - Australia required quarantine of all returning overseas travellers from March 15, Canada advised self isolation from March 25. That's a huge difference.

Again - correct, thoughtful interventions worked.

> Essentially, I am assuming that vaccines will not play a large role, because it will have played out before they become available, and/or because we don’t have any coronavirus vaccines and it’s probably harder to make one than we think.

I've seen the "we don’t have any coronavirus vaccines" meme too. People don't seem to realise we've only had one serious coronavirus before (SARS), and it wasn't anywhere near as contagious. There just hasn't been much work done on coronavirus vaccines before.

We do however have anti-viral treatments (e.g. Tamiflu), and it's likely we'll either get a working vaccine or an anti-viral drug that provides some relief in the next 12-18 months.

> My hope here—based on some very preliminary data—is that we are actually much further along than the testing might indicate (e.g. 5% overall prevalence and much higher in certain metros, vs 0.2%)

Either way, it's a long way off the required 80% for herd immunity.

> We are learning that ventilators have proven to be basically lethal to COVID patients

This is absolutely untrue. There have been people sharing data showing that the mortality rate of people on ventilators is higher than those who aren't, but that is because ventilators are only used on people who are closer to death.

It's true that in some cases oxygen is sufficient for some people.

Regarding Australia, yes, they discuss that specifically in the article I linked. That particular intervention is a lot easier to enforce when you don’t have a long land border with a country going through an outbreak, and a very large number of citizens presently in the neighboring country. Australia’s borders are closed, and they never really had a lot of community spread. Canada had significant community spread, and earlier, and a larger population returning from hotspot areas.

The point is a careful and thoughtful examination shows that a country’s geographic, social, and economic “natural affordances” are largely responsible for the differing COVID case loads, much less than the effective dollar amount “spent” on interventions. Spending on an intervention includes the economic hit as well as the direct cost. E.g. Australia was able to get a lot more bang for their buck because they’re an island and didn’t have a large segment of their population traveling in hotspots.

I think we agree on the vaccines. The point isn’t there’s something specific about coronavirus. It’s not a “meme” that we don’t have a coronavirus vaccine. We actually don’t have one. That just means we might not have as much experience ramping up a new one as we do, for example, with influenza, and there may be things we need to learn along the way.

A working vaccine or treatment in 12-18 months is a long way off.

If we are anywhere near 5% prevalence, that means we are a lot closer to 65% (when herd immunity starts being noticeably beneficial) than you might think, since the growth is non-linear.
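The "non-linear" point is just doubling-time arithmetic. A sketch, where the 3-day doubling time is an assumption for illustration (and real growth slows as susceptibles are depleted, so this is a lower bound on the time):

```python
import math

# With unchecked exponential growth, the gap between 5% and 65%
# prevalence is only a few doublings.
doubling_time_days = 3                     # assumed, not measured
doublings_needed = math.log2(65 / 5)       # ~3.7 doublings from 5% to 65%
days = doublings_needed * doubling_time_days

print(round(days, 1))   # ~11.1 days of unchecked growth
```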

Regarding ventilators, even in the general (non-COVID) case if you find yourself on one you’re about 50/50 not going to walk away from it. But with COVID it appears ventilators exacerbate the swelling in the lungs and in NY 80%+ of ventilated patients have died. [1]. That’s what I mean by “proven to be basically lethal”. We know the vast majority of COVID patients on ventilators die, far more than just the average ventilator death rate for pneumonia, using the same O2 saturation markers.

So doctors are moving away from ventilators and no longer using the typical O2 saturation cutoffs for ventilating because interestingly patients with lower O2 levels are not in the typical level of distress. It seems the CO2 is still getting out even though the oxygen is not getting in... more like hypoxia.

> It's true that in some cases oxygen is sufficient for some people.

We can agree on more than that. Luckily the fact is that oxygen therapy is sufficient for the vast majority of cases. Just do the math on hospitalizations and deaths in NY with the known variable that 80% that were ventilated died. QED the vast majority were not ventilated.

[1] - https://www.nytimes.com/aponline/2020/04/08/health/ap-us-med...

> It’s not a “meme” that we don’t have a coronavirus vaccine.

I was referring to the idea that is going around that "we don’t have any coronavirus vaccines" is somehow significant and means it is especially hard to make one.

> Regarding ventilators

I think the NY Times article is good, but it's important to note that this is evolving and treatments are changing. But saying "some doctors say they're trying to keep patients off ventilators as long as possible, and turning to other techniques instead" is very different to "ventilators have proven to be basically lethal to COVID patients".

There’s no way around the fact that an 80% fatality rate of ventilated COVID patients is absolutely staggering.

That makes ventilation an absolutely last ditch effort. It means if the ~60,000 ventilators already in US hospitals before this started were each used just once ever to ventilate a COVID patient, you would expect 48,000 of those patients to die.

I’m sure this also has a pretty significant effect on the ICU utilization rate as a percentage of hospitalizations.

And this is besides the fact that early serological testing points to the overall hospitalization rate being significantly lower than 10%.

I don't think there is much evidence as to the causality there.

You seem to believe that the ventilators are causing deaths. There's plenty of evidence to show that once Covid-19 takes hold it is particularly vicious because of the number of things that it does to the body.

Some of the things that go wrong in some cases include: lung damage (obviously), kidney damage, liver damage, strokes, encephalitis, seizures, heart damage, meningitis, bloody diarrhea etc[1].

I agree the death rate of people on ventilators is horrifying. But I don't think there is much evidence that ventilators are causing that rather than being administered to people who already have serious symptoms.

[1] https://www.sciencemag.org/news/2020/04/how-does-coronavirus...

I agree there is not nearly enough evidence to draw a conclusion. That's why I said, there have been no controlled studies either way at this point so no one can say for sure.

All we have is anecdotal reports, plus the fact that significantly more COVID patients die on ventilation than the average pneumonia patient who is [invasively] ventilated.

Since doctors were ventilating patients at the same O2 saturation points as usual, we can say at least that something is markedly different with COVID induced ARDS from typical ARDS that makes invasive ventilation particularly insufficient, if not actively harmful.

In case you haven't seen it, those studies indicating a higher rate of infection (the ~5% ones) are almost certainly wrong.


Yes, they are. So what would the correct policy be if we have no reliable model? Well, as all existential threats should be treated: with extreme risk aversion.

I wish more of society would apply this wisdom to climate change modelling.

This is accounted for in more professional methods which estimate error. During a period of exponential growth, the error 6 weeks out is very sensitive to tiny errors occurring in immediate measurement.

It's not so much that fitting exponential or S-curves is hard as that even a very good fit is likely to have very significant error bars.
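A sketch of why those error bars explode over the forecast horizon: two exponential fits whose growth rates differ by only 0.02/day (all numbers illustrative, not fitted to anything):

```python
import math

cases_now = 1000
r_low, r_high = 0.23, 0.25      # per-day growth-rate estimates
t = 6 * 7                       # forecast horizon: six weeks, in days

low = cases_now * math.exp(r_low * t)
high = cases_now * math.exp(r_high * t)

# A tiny disagreement in the fitted rate compounds into a large
# disagreement in the forecast: the ratio is exp(0.02 * 42) ~ 2.3x.
print(round(high / low, 1))
```

And the same arithmetic means the disagreement itself grows exponentially with the horizon, which is why near-term points are so much more trustworthy than distant ones.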

The hard parts about modeling the current situation involve the poor and limited data. The dynamics of the system are highly variable and unobserved and thus require clever tricks to reduce uncertainty by expert knowledge, clever alternative signals, and leveraging other models.

So ultimately, it's not just the forecasting but the response. People rarely handle uncertainty very well, and a situation with uncertain exponential growth ends up with exponential uncertainty. Making policy decisions where the predicted outcome spans orders of magnitude is tremendously challenging.

Exactly. And I have a very recent example, yesterday somebody responded to one of my replies with:

"Cuomo claimed to need 40k ventilators after the lockdown was put in place. He ended up needing only a fraction of that. Obviously, without the lockdown, there would likely be more needed. Of course, modeling has errors, but this is an extremely large error, and one that has direct policy implications." (1)

My response was that it wasn't "a large error" but an amount that only 6 more days of exponential growth would still have made insufficient:


Also as an illustration of the "uncertainty" one can see the shaded areas clearly here:


As I write this, the model on that page was last time updated based on the data up to the 15th of April, only 4 days ago, and its best estimate for number of deaths in USA on 19th of April was 37,056. Also as I check the https://www.worldometers.info/coronavirus/country/us/ it's already 40,423. But the 95% error area for 19th is 39,721-56,108. It's that hard. Also at this moment the projection for 1st of June is 60,262 (34,063-140,106) and we can compare all these values after some later updates, e.g. in 6 and 12 days.

1) Careful observers know that there is one more much-talked-about person who has used the same pseudo-argument before.

Rechecked the page: currently data up to the 20th of April, page updated on the 22nd of April:

June 01, 2020 projection: 67,444 (48,058-122,203)

Today's number of deaths in the U.S., according to worldometers: 47,808. Today is April 23.

Next recheck: currently death data up to 25th April.

June 01, 2020 projection: 73,621 (56,562-128,167)

Total number of deaths in the U.S., worldometers: 56,803. Today is April 28.

Next: currently last death non-projected data: May 01: 65,249.

June 01, 2020 projection: 112,073 (91,586-155,417)

Total number of deaths in the U.S., worldometers: 74,114. Today is May 06.

I was thinking the same thing. Exponentials look like almost nothing at the beginning, and then it is too late to react. The only solution to not be late in reacting is to over-react.

I was correct in my predictions of the disease coming to North America and the stages it would entail, but I was wrong about the timeline, because when it happened, it happened so fast.

Similar situation, and I knew it could happen very fast (3-day doubling periods do that), but without data I still wouldn't have predicted (and can't now) where it would get bad and where it would muddle along.

What we do now know is that it changes VERY fast so you can't wait for your ICUs to fill up or a lot of people will die... you have to act early. And if it came once this fast... it can again, if our guard goes down (and it will).

> In other words, data enthusiasts (such as myself) should leave the modelling up to the professionals.

This is really important to understand, and unfortunately often overlooked, especially by economists, who often work with good data and a solid mathematical background but no prior domain knowledge and no reference to the literature of the given field.

I have not seen a single useful model from professional epidemiologists so far -- at least nothing that would guide me better than "look at China and Italy and consider that it might also happen here."

I would argue that they're all useful, but it's very tough to be accurate when confounding actions are being applied to the problem. For example, in the US, the increase in social distancing measures and mask usage couldn't easily be baked into the original models. Forecasting accurately is very tough. Look at what's been said about Renaissance and its Medallion fund: they're right a little over 50% of the time (they just leverage very heavily). Think about that: the most consistent and dominant hedge fund might only be right about 51% of the time. The issue with just using historical data is that the new data might be different. But I'm sure all of their models were using data, or at least some domain knowledge, from the spread in Italy and China.

>> For example, in the US, the increase in social distancing measures and mask usage couldn't be easily baked into the original models.

The IHME model - one of the most cited models - explicitly factored this into their model.

"Factored in" covers a lot of ground.

Here, for reference is the IHME model preprint: https://www.medrxiv.org/content/10.1101/2020.03.27.20043752v...

The parameterization of social distancing is on pages 3-4. They considered four coarse categories of interventions, with the total effect given by an ad hoc score of 1, 0.67, 0.334, or 0.

It should be blindingly obvious that this can be improved upon with more data and work. Moreover, the effects of social distancing aren't even constant: compliance certainly varies from place to place and over time (enforcement changes, people get restless).

"Starting April 17, we began using mobile phone data to better assess the impact of social distancing across states and countries. These data revealed that social distancing was happening to a larger degree than previously understood, and even before social distancing mandates went into effect."

This suggests that social distancing happened 1-2 weeks sooner than their model anticipated. This will give you a much smaller number.

What is the policy you believe should have been followed instead? Are you saying that social distancing should have been abandoned? Why do you believe that this would result in fewer deaths?

No, I am saying they factored it into their models and also undershot their priors that were readily available. If they looked at OpenTable data, they would have realized people were voluntarily socially distancing before state authorities told them they should. In fact, OpenTable data showed that even on the day that the Mayor of NYC told people to go to sit-down restaurants and see movies in person (in early March), y/o/y data showed a net 30% reduction in reservations for restaurants in NYC.

People were voluntarily taking precaution well before our government bureaucracies enforced it, and it was available in public datasets. IHME just didn't look hard enough, apparently. Despite what the media keeps pushing by finding pockets of idiots, people aren't stupid. The most at-risk populations realize this and tend to shelter in place on their own and take precautions.

de Blasio on March 11th telling people to go out: https://ny.eater.com/2020/3/11/21175497/coronavirus-nyc-rest...

OpenTable data showing a voluntary reduction of restaurant activity despite de Blasio's encouragements: https://ibb.co/r2R9xnT

What policies are you saying were misguided and should not have been taken based on the predicted elevated death toll?

How does one evaluate the OpenTable data and feed it into the model they were using to estimate the amount of social distancing?

Extended lockdowns are positively correlated with deaths per million residents.


I'm saying that lengthy lockdowns, shelter-in-place, and shutting down non-storefront businesses did not provide much value, if any, and possibly had a negative effect.

The IHME model's predictions fell outside its 95% confidence intervals over half the time. So when posters here say "well, the variance was high," note that variance is accounted for in the confidence intervals, which the model also missed wildly.

The models justified harsh authoritarian action, and the actual data way, way undershot them. Popular sentiment is "oh well, it saved lives at least," but few, if any, are looking at the acute, and more importantly chronic, economic costs of these policies, which may or may not have even helped beyond just telling people what the risks were.

>> How does one evaluate the OpenTable data and feed it into the model they were using to estimate the amount of social distancing?

Pretty simple; it was clear that news of COVID-19 alone was enough to cause people to stop going to restaurants and start basic social distancing protocols on their own without government mandating it.

EDIT: Shockingly, a bunch of downvotes without explanation are forthcoming.

One thing that you and others arguing the same way seem to neglect is that there is a huge economic cost to a significant fraction of your society dying. This is not an either-or situation. I want to see your economic models showing that the economy would have been fine with 0.5-1% of the population dying within a short period (in fact, everywhere the health system got overloaded, the rate was significantly higher). Let's not even talk about the loss of confidence in the government because it failed to react.

This is a great viewpoint that I'd never thought of or seen put forward. Is there any good information on the economic impact of the Spanish Flu or even the Bubonic Plague?

Regarding the positive-correlation bit, I hate to bring it up because it's almost a meme, but this is a prime example of people not understanding correlation versus causation. This is why we need domain experts and not just people twiddling with numbers and saying "aha".

Present your data showing that authoritarian-led social distancing procedures helped, then. The null hypothesis is that an intervention does not work. I would like to see evidence that shows that these procedures had a meaningful effect in comparison to the trend of what was already occurring.

The number of people that think they understand the scientific method is astoundingly higher than the number that actually do.

Correct. This appeal to authority fallacy being touted has given us models like the UW-IHME model which is a complete disaster in both results and in methodology. And domestic policy is being based off these models that have two orders of magnitude (at least) of absolute error, and were outside their 95% CI over half of the time. This is unacceptable.



Remember that Greece's economy will not recover to its 2008 level for 20 years because of an Excel spreadsheet with errors in it.

Japan's Nikkei index is still down from the late 1980s, and likely to stay down for many more years to come.

The chronic effects of shutting down the economy are vast. People think money can be regenerated; it can. But what Americans have seemingly forgotten is the past where American autoworkers feared Japanese efficiency and where the Japanese economy looked to crush ours. Today, the Japanese salaryman works at least 50% more hours and his purchasing power is less than his counterpart in the United States (just look at patio11's posts on it).


It's not a popular take because of recency bias (the same thing happened after 9/11 with the PATRIOT Act, and it always does when outliers happen), but the data will be pretty clear in the end.

May I ask which models you have seen so far?

I'm interested in them.

What does useful mean? I would argue most models are useless without a good understanding of the parameters going into the model. That is where the domain-specific knowledge comes in. Fitting some curve to data is useless unless you know what the quantities correspond to and how they influence the parameters of the curve.

I panicked a bit in March, partly because I watched a video of a mathematics YouTuber, who I think is pretentious. Usually I don't watch his videos, because I think he's pretentious. But I watched his video about exponential growth and epidemics. At the end there is the text "The only thing to fear is the lack of fear itself." (the opposite of what Franklin D. Roosevelt said).

I would have panicked less if I had read the text of this hacker news story, instead.

no, I really don't think so. No epidemiologist has actually seen anything like what we are seeing today; they are not all that well prepared.

this disease defies the models: youth is unaffected, etc.

If they were well prepared, they would not insist on imposing the exact same rules for NYC as for rural NY State. The two places could not be more different.

Define "like". People haven't seen the exact scenario, but people - epidemiologists and others - have studied the Spanish Flu, and MERS, SARS, H1N1, seasonal flus, etc., which all have useful similarities.

Given how bad the data about the current pandemic is, why do you think data about previous pandemics was any better?

Because there has been more time to collect and analyze the data about the previous ones.

And because there's far less political pressure to obfuscate figures of past epidemics.

Do you know why the Spanish Flu was called that? Because Spain was one of the few countries not involved in WWI, and so it didn't have the strong censorship that was ubiquitous at the time.

The pressure to obfuscate the impact of the epidemic was way higher then.

Yes, but we know quite a lot about its spread and impact now. This is like the fog of war: it's hard to see what is happening while a battle is underway, but afterward many different accounts can be pieced together to gain an accurate understanding of what happened.

Agree, as I agreed with frank2's comment upthread too.

none of these diseases are anything like this,

no person under 18 has died of the disease in Italy, 3% or less of cases are in the population under 20, etc. Many people don't show symptoms but are quite infectious, etc.

which disease is like that? none actually.

"It may not be surprising that in the exponential growth phase the estimate is very bad, but even in the linear phase (when 40+ points are available) the correct curve has not been found. In fact, it is only once the data starts to level-off that the correct s-curve is found. This is especially unhelpful when you consider that it can be quite hard to tell which part of the curve you [are] on; hindsight is 20-20."
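The behaviour described in that quote is easy to reproduce. Below is a minimal sketch (not the article's code; the curve parameters, noise level, and prefix sizes are arbitrary illustrative choices) that fits a logistic s-curve to growing prefixes of noisy synthetic data and watches the estimated ceiling wander:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, L, k, t0):
    """Three-parameter s-curve: ceiling L, growth rate k, midpoint t0."""
    return L / (1.0 + np.exp(-k * (t - t0)))

rng = np.random.default_rng(0)
t = np.arange(100, dtype=float)
true = logistic(t, 1000.0, 0.15, 50.0)
noisy = true + rng.normal(0.0, 20.0, size=t.size)

# Fit using only the first n points and see how far off the estimated ceiling is.
for n in (30, 50, 70, 90):
    try:
        popt, _ = curve_fit(logistic, t[:n], noisy[:n],
                            p0=(noisy[:n].max(), 0.1, n / 2), maxfev=20000)
        print(f"n={n:3d}  estimated ceiling = {popt[0]:12.1f}  (true ceiling = 1000)")
    except RuntimeError:
        print(f"n={n:3d}  fit did not converge")
```

With only the exponential-phase prefix, the fitted ceiling is typically wildly off; it only settles near the true value once the data levels off.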

Does this perhaps explain why we are so bad at calling the top of economic bubbles and similar phenomena? Maybe there literally is not enough information until the very end. It's not that we're dumb. It's that we can't do the mathematically impossible.

I have a sense that this might be a very important and profound principle that might explain a lot of seemingly irrational behaviors.

I believe our supposed inability to predict economic bubbles (and other financial crises) is mostly just tautological: the crises we can predict never happen, because they are "self-falsifying prophecies": if enough people believe the stock or housing market is overheating, they will stop buying or might even speculate against further rising prices.

Thus, any bubble that does manage to grow to significant size before bursting is necessarily "unforeseen". (Of course, because of that thing with the monkeys, the keyboards, and online message boards, there will always be plenty of people who did see it coming, but I would wait to buy their book until they do it a second time.)

I know it's always trendy to hate on economists (some people have taken that idea all the way to creating cryptocurrencies and reinventing economics along the way). But comparing, say, the 2008 crisis with the 1920s or even the 1970s, I can't shake the feeling that maybe economists have become slightly better over time. The gold-standard fandom that was all the rage for a while essentially rests on the idea that interventions by central banks are worse than doing nothing, and the evidence now seems overwhelming that we can do better than this (admittedly low) benchmark.

I think Taleb makes a similar point in his books: if you can foresee an economic crisis, you take steps to avoid it, and therefore it doesn't happen. But then everyone asks why you were wrong in the first place, as the crisis didn't happen.

Economists are the analyzers, not the players. I’m not under the impression that it’s popular to dislike economists. I disagree that bubbles are unpredictable because the assumption that the market is self regulating is the exact opposite of the market’s actual nature.

Maybe. But there also seems to be a tendency not to admit that any exponential phenomenon in the real world (Moore's Law, economic growth, etc) is always the climbing part of an s-curve. Call it wishful thinking, or optimism, or self-delusion, according to your preferences :)

For Moore's Law it has always been clear that it was going to end due to the size of atoms, it just slowed down earlier and faster than expected. EUV is at least 10 years late and no elegant solution has been found. It is still very difficult, complicated and expensive. I know somebody who works on EUV sources. The optics are even more difficult than the light sources. For the whole system, you get milliwatts on the wafer for megawatts of electrical power to the EUV source. But it's starting to work just well enough that it makes sense to use it.

> However, in my experience “intuition” and “mathematics” can often be hard to reconcile.

They can be hard to reconcile, but you absolutely must reconcile them for your effort to be fruitful. While they remain separated, either your intuition or your mathematics is wrong, and you can't know which.

What's even harder is that you're usually trying to forecast for a reason other than pure academic curiosity. Like, should I be worried about the coronavirus? Should we stay inside? When can we reopen business? But the decisions people take determine the curve and the predicted curve determines people's decisions. If you decide not to social distance, you're changing the future. For that reason, forecasting is better done under different scenarios. For example, don't tell people approximately X (+/- a lot) will die. Tell them approximately X (+/- a little less) will die if we don't social distance and approximately Y (+/- a little less) will die if 95% of us self-quarantine.

It's harder than just fitting a curve, but that's kind of the point. If you want actionable predictions, it is harder.

It's hard because, to predict case load for example, at any point on the curve you could have policy changes, or a surprise increase in case load for various reasons (a sudden increase in testing). These are just a few of possibly hundreds of high-impact events that can change the curve's length and slope at any point.

Yes, exactly. It's tempting to try to compute coefficients X weeks in, so we can forecast X+26 weeks in the future.

But the coefficients aren't fixed. Coefficients change drastically due to public policy, individual actions in response to news and social media, culture of local communities, degree of compliance with public policy, travel between regions with different rates of infection, etc.

So you're not fitting a curve to the data, you're modeling dynamic human actions which are not nearly as easy to forecast.

Well, considering all s-curves are exponentials at the beginning, it naturally flows that you can only make an accurate model after the inflexion point.

Since we're on HN, I'll plug a question I've had for a long time:

Do IPOs/liquidity events/exits/etc all happen when the right price has been determined; ie when we can see what size the company will be, and it is no longer useful to capitalize it further? When all growth paths have been explored. Is that time the inflexion point?

What does the conversation look like with the VCs when they see the inflexion point?

> considering all s-curves are exponentials at the beginning

Sorry, but I hear a lot of people saying this and it's driving me crazy. S-curves are S-curves from the beginning, not exponentials. It can be useful to use an exponential growth model at the beginning of the curve for short-term forecasting, but these two models will diverge dramatically at the S-curve inflection point.

Not that we shouldn't plan as if exponential growth will occur in a crisis like the one we're in now, but many people I know don't understand these dynamics, and it has led to a lot of undue panic.
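The divergence at the inflection point is easy to see numerically. A small sketch (the parameter values are arbitrary illustrative choices) comparing a unit logistic curve against the exponential that matches its early tail:

```python
import numpy as np

k, t0 = 1.0, 10.0
t = np.array([1.0, 5.0, 10.0, 15.0])

# Logistic curve (ceiling 1) versus the exponential that matches its early tail.
logistic = 1.0 / (1.0 + np.exp(-k * (t - t0)))
exponential = np.exp(k * (t - t0))

for ti, lo, ex in zip(t, logistic, exponential):
    print(f"t={ti:4.1f}  logistic={lo:8.4f}  exponential={ex:10.4f}")
```

Far below the midpoint the two are numerically indistinguishable; at the midpoint they differ by a factor of two, and beyond it the exponential is off by orders of magnitude.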

There are different kinds of S-curves.

Some of them are basically exponential at the beginning (the logistic curve). This makes perfect sense in modelling an infectious disease - initially, all contacts are susceptible. Thus you have exponential growth.

Others, for example the CDF of a normal (which is often confused with the logistic), are not exponential at the beginning but decay much faster (the derivative is exp(-x^2)).

Arctan decays much slower (derivative is 1/(1+x^2)).

So, S-curves are sort of linear in the middle, but sort of exponential or exp(-x^2) or 1/(1+x^2) or hyperbolic at the ends, and which one of them it is tells us a lot about their behaviour.

Note also that without any social distancing or other measures and a sustained Rt of, say, 3, the inflection point will only occur once about a third of the population are infected, and thus it makes perfect sense to model it exponentially while still being below 5%, say.
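The different tail behaviours described above can be checked numerically. A sketch (normalising all three curves to run from 0 to 1 with midpoint at x = 0, evaluated deep in the left tail):

```python
import numpy as np
from scipy.stats import norm

# Three s-shaped curves, each normalised to run from 0 to 1 with midpoint at x=0,
# evaluated deep in the left tail.
x = np.array([-2.0, -5.0, -10.0])

logistic = 1.0 / (1.0 + np.exp(-x))    # tail decays like exp(x)
normal_cdf = norm.cdf(x)               # tail decays like exp(-x^2/2): much faster
arctan_s = np.arctan(x) / np.pi + 0.5  # tail decays like 1/|x|: much slower

for xi, lo, nc, at in zip(x, logistic, normal_cdf, arctan_s):
    print(f"x={xi:6.1f}  logistic={lo:.2e}  normal CDF={nc:.2e}  arctan={at:.2e}")
```

At x = -10 the three curves differ by many orders of magnitude, which is exactly why the choice of s-curve family matters so much for extrapolation.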

The thing is: a good fit at the tail end (with bad data) may very well be a very bad fit for the global (nonlinear) curve. Because you cannot linearize the data or the model.

Non-linear fitting is VERY hard to do well. And with pathetic data like now, it is doomed.

Some s-curves are never well-modeled by an exponential for any substantial part (of course the exponential will be approximately linear over a short enough range but then you might as well use a line), like for example arctan. Not all s-curves are logistic functions (although in the context of epidemiology, then I guess people usually just mean logistic by s-curve).

Thank you for this response. All smooth curves are also approximately linear at all points, but that doesn't mean that we can't usefully predict and model an appropriate non-linear fit.

Not disagreeing; what you say seems right to me. But the claim ("All smooth curves ...") also seems like the sort of result that might have a counterexample, like a smooth curve that at no scale has a linear approximation. Maybe a fractal with a smoothly varying generator?

For curves of infection rates I don't doubt you're right.

“Smooth” is a term of art in math that means continuously differentiable infinitely many times. These functions are a subset of those functions that are differentiable once.

Differentiable once means, by definition, approximable everywhere by a line at small enough scale.

Yes, I know that much.

Imagine a sine wave, except when you look at it at 1000x magnification it's a sine + cosine. It looks smooth, and at x = π you think that the derivative will be -1, but in fact it's 0, because you don't have a sine crossover there, you have a cosine trough.

Except at 1000000 times magnification (ie another 1000) the cosine curve that forms that apparent sine curve is itself a sine curve. So everything is switched again.

f(x) is something like sin(x) + cos(ax)/a + sin(a^2 x)/a^2 + cos(a^3 x)/a^3 + ... + sin(a^j x)/a^j + cos(a^(j+1) x)/a^(j+1) + ...

for a = some arbitrarily large number. Something like that; I'm a bit rusty, sorry.

At whatever scale you look at the curve the derivative is always wrong: you zoom in on the sine, at the peak it's got a cosine, so the d/dx is -1; but zoom in and the cosine has a sine at the crossing point, so the d/dx is 0; but zoom in and ...

The curve is provably smooth, it's sines all the way down, but nowhere can you tell the derivative, as it's fractal???

That's what I had in mind.

Anyway, I thought there might be clever curves of that type.

A curve with the properties you describe is not smooth, by definition, since it is not differentiable.

> At whatever scale you look at the curve the derivative is always wrong: you zoom in on the sine, at the peak it's got a cosine, so the d/dx is -1; but zoom in and the cosine has a sine at the crossing point, so the d/dx is 0; but zoom in and ...

This is pretty much the definition of something not being differentiable. "Differentiable" means that the approximations to the derivative (i.e., difference quotients) converge to some fixed value as the scale they're measured at approaches the infinitely small.

You might be interested in the Weierstrass function, which seems to be the sort of thing you're getting at with your idea: https://en.wikipedia.org/wiki/Weierstrass_function . Continuous everywhere, but differentiable nowhere.

Edit: the specific function you wrote down is not differentiable (at least not everywhere). For example, at x=0, its derivative, if it had one, should be cos(0) + cos (a * 0) + cos (a^2 * 0) + ... , but that series clearly diverges.

cos(0) being 1, sin(0) being 0; that series is 0 + 10e-3 + 0 + 10e-9 + 0 + 10e-15 + ... when x=0 (excuse my sloppy notation, I'm on a phone). Looks convergent to me, somewhere between 0.001000001000001 and 0.001000001000002?

Have you studied fractal dimensions formally? Might I ask what background you're speaking from?

Yes, I'm imagining, as a first example, something akin to a Weierstrass function but without the discontinuities.


Yes, the original function converges (not its derivative). The terms in the derivative are no longer divided by a^k. The a^k from the argument to cos/sin cancels then out (it becomes a multiplier, because of the chain rule).

So the series for the derivative is 0 + 1 + 0 + 1 + ..., which doesn’t converge.
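A quick numerical check of this divergence. The function and the constant a are taken from the construction above (a = 1000 is one of the "arbitrary large" choices): each finite partial sum is perfectly smooth, but its term-by-term derivative at 0 grows without bound as terms are added, matching the divergent 0 + 1 + 0 + 1 + ... series:

```python
import numpy as np

a = 1000.0

def derivative_at_zero(n_terms):
    """Term-by-term derivative, evaluated at x = 0, of the partial sum of
    f(x) = sin(x) + cos(a x)/a + sin(a^2 x)/a^2 + cos(a^3 x)/a^3 + ...
    Each sin(a^j x)/a^j differentiates to cos(a^j x) (the a^j cancels via the
    chain rule), and each cos(a^j x)/a^j differentiates to -sin(a^j x)."""
    total = 0.0
    for j in range(n_terms):
        if j % 2 == 0:
            total += np.cos(a**j * 0.0)   # contributes 1
        else:
            total -= np.sin(a**j * 0.0)   # contributes 0
    return total

# The partial sums are smooth, but their derivative at 0 grows like n/2:
# the series for the derivative diverges, so the limit is not differentiable there.
for n in (4, 8, 16, 32):
    print(n, derivative_at_zero(n))
```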

Not sure what this has to do with fractal dimensions. This is a simple question of definitions. The word “smooth”, in math, literally implies, by definition, that the function is everywhere approximable by lines.

If you don’t think it does, can you state the formal definition of “smooth” that you’re using?

The Weierstrass function doesn’t have discontinuities, btw.

> “Smooth” is a term of art in math that means continuously differentiable infinitely many times. These functions include, as a subset, functions which are differentiable once.

Of course you knew what you meant, but, for anyone who's confused, the containment goes the other way.

You’re right. Edited to fix the error.

the unstated condition there is that the window for approximating linearity must be flexible for a given error envelope - smaller windows for higher rates of change of the curve. so while technically correct, it isn’t always practically useful.

> Do IPOs/liquidity events/exits/etc all happen when the right price has been determined; ie when we can see what size the company will be, and it is no longer useful to capitalize it further? When all growth paths have been explored. Is that time the inflexion point?

To answer this question, you need to ask the question: what reason is the company going public?

For example, in the dot-com boom, companies went public with little more than a business plan. They were clearly looking for funding to expand their business and were nowhere near the inflection point.

In contrast, with WeWork, SoftBank was pretty clearly looking to unload radioactive garbage on the public.

So the answer is: it depends.

>In contrast, with WeWork, SoftBank was pretty clearly looking to unload radioactive garbage on the public.

So the answer is: it depends.

Ye olde pump n dump

I am bothered that the animation does not include confidence intervals or error bars for the fit. The way these confidence intervals shrink as more data points become available would tell just as important a part of the story.

yes, a better problem definition would in this case reveal that the estimate becomes quite good (by my definition) around the 50% time mark, when the transition from ^x to ^-x takes place. More specifically, although the stable point may still be off by e.g. 25%, the important thing is that the stable time of the curve is well estimated: you know it's no longer increasing exponentially...
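One way to produce the missing error bars is to read off the parameter covariance that scipy's curve_fit returns alongside the fit. A hedged sketch (all curve parameters and noise levels are made-up illustrative values; the standard errors are linearised approximations, not exact confidence intervals):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, L, k, t0):
    return L / (1.0 + np.exp(-k * (t - t0)))

rng = np.random.default_rng(1)
t = np.arange(80, dtype=float)
data = logistic(t, 1000.0, 0.15, 40.0) + rng.normal(0.0, 15.0, t.size)

# Fit on progressively longer prefixes; the covariance matrix returned by
# curve_fit gives approximate standard errors per parameter, which shrink
# as the data starts to level off.
for n in (30, 45, 60, 80):
    try:
        popt, pcov = curve_fit(logistic, t[:n], data[:n],
                               p0=(data[:n].max(), 0.1, n / 2), maxfev=20000)
        print(f"n={n:2d}  ceiling = {popt[0]:10.1f} +/- {np.sqrt(pcov[0, 0]):10.1f}")
    except RuntimeError:
        print(f"n={n:2d}  fit did not converge")
```

Watching the "+/-" column collapse as the inflection point passes tells the same story as the animation, but quantitatively.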

I suppose the s-curve with 3 parameters that the author is talking about is the logistic function. In general, if you take a differentiable function of three parameters and try to determine an interval for the values of the parameters of that model, then the length of that interval is bounded by the ratio of the error in the data to the derivative with respect to the parameter. For example, estimating the parameter k (Wikipedia: logistic growth rate) from points with x near x0 (Wikipedia: midpoint of the sigmoid) is hard, since the derivative of the function with respect to k at x=x0 is zero. So mathematically this seems to be a well-known fact when one tries to estimate parameters from data points.

> the length of that interval is bounded by the ratio of the error in the data to the derivative with respect to the parameter

This is interesting! Could you expand on this a bit? Why is the length of the interval bound by the ratio of the data error over the derivative?

The general case requires some work and conditions, but to give a hint, the case of only one parameter is an application of the mean value theorem (1). Suppose a model y = f(p, x) with a single parameter, an exact point (x0, y0) (that is, y0 = f(p0, x0)), and a data point (x0, y1) such that y1 - y0 is the error in the data. If there is a value p1 of the parameter such that f(p1, x0) = y1, then y1 - y0 = f(p1, x0) - f(p0, x0) = f'(sigma) * (p1 - p0) for some sigma between p0 and p1, so p1 - p0 = (y1 - y0) / f'(sigma); that is, (error in the parameter) = (error in the data) / (derivative with respect to the parameter). The general case is a generalization of this idea using the mean value inequality.

(1) https://en.wikipedia.org/wiki/Mean_value_theorem
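The zero-derivative point in the parent comment can be checked numerically. A sketch with illustrative parameter values (L, k, x0 are arbitrary choices, not from the article):

```python
import numpy as np

def logistic(x, L, k, x0):
    return L / (1.0 + np.exp(-k * (x - x0)))

L, k, x0 = 1000.0, 0.15, 50.0
eps = 1e-6

# Numerical derivative of the model with respect to the growth-rate
# parameter k, at several x values.  At the midpoint x = x0 it vanishes,
# so data near the midpoint pins down k very poorly (the error in k is
# the data error divided by this derivative).
for x in (50.0, 60.0, 80.0):
    dfdk = (logistic(x, L, k + eps, x0) - logistic(x, L, k - eps, x0)) / (2 * eps)
    print(f"x={x:5.1f}  df/dk = {dfdk:10.2f}")
```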

I made an R web app in Shiny which fits a Four Parameter Log-Logistic function (the s-curve discussed in the article) to the Johns Hopkins data:

https://joelonsql.shinyapps.io/coronalyzer/ https://github.com/joelonsql/coronalyzer

"Sweden FHM" is the default country, which uses a different data source: data from Folkhälsomyndigheten FHM (the Swedish Public Health Agency), which is organized by death date rather than by reporting date as the Johns Hopkins data is.

One thing to note (from looking at the graph) is that the noise seems to be additive with constant standard deviation (and presumably floored so that the sum doesn't go negative).

That means that there is huge relative error initially (we have 10 infections +/- 100), and very little relative error eventually (we have 1000000 infections +/- 100).

I assume the forecasts would be better if the error were multiplicative (in other words, with standard deviation proportional to the current value).

However, I think the main point stands: the forecasts get much better once one approaches the inflection point.
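One way to encode the multiplicative error model suggested above is to pass per-point standard deviations to the fit via the `sigma` argument of scipy's `curve_fit`, which weights the residuals accordingly. A sketch with made-up parameters (the 10% noise level and logistic parameters are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, L, k, t0):
    return L / (1.0 + np.exp(-k * (t - t0)))

rng = np.random.default_rng(2)
t = np.arange(100, dtype=float)
true = logistic(t, 1000.0, 0.15, 50.0)
# Multiplicative noise: standard deviation proportional to the signal.
data = true * (1.0 + rng.normal(0.0, 0.1, t.size))

# Passing sigma makes curve_fit weight residuals by the assumed per-point
# error; here that error is taken proportional to the observed value
# (floored at 1 so the weights stay finite near zero).
sigma = 0.1 * np.maximum(data, 1.0)
popt, _ = curve_fit(logistic, t, data, p0=(data.max(), 0.1, 50.0),
                    sigma=sigma, absolute_sigma=True, maxfev=20000)
print(f"weighted fit: L={popt[0]:.1f}  k={popt[1]:.3f}  t0={popt[2]:.1f}")
```

With weights matching the true noise model, the early (small-count) points no longer drown in absolute noise, and the fit stabilises sooner.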

I did the covid19 week 1 and week 2 Kaggle competitions, linked below (I think they had 4 of them). If you are interested, this is a fun way to play around with the data, and it shows how hard it is.

Things I tried:

- Weibull and Gamma distributions: it was impossible to find good parameters for the distributions without exploding. It would only work if I put in an additional parameter saying 99% of the population wouldn't get it. It would come up with good shapes of growth, but the predictions were far below actual future values.

- Logistic curves: usually great for countries that have already run up the curve, but terrible for ones still in the exponential phase (as the article states). Also kind of useless for countries that haven't even begun their journey up the curve.

- LightGBM: good for predicting the next day but terrible many days out. It seems other countries' curves do not help that much.

- SARIMAX: really good but later predictions would explode, like showing 4 million deaths in france, etc.

I tried to get around these problems by ensembling them together, but overall I did very poorly at predicting coronavirus. I still want to get better at this, so if anyone has good suggestions, please share. You can also check out what other Kagglers have done.

https://www.kaggle.com/c/covid19-global-forecasting-week-1 https://www.kaggle.com/c/covid19-global-forecasting-week-2

> Weibull, logistic, boosting, state-space

Could you write a model that will tell you how fast your car is going, just by using math? Or are there innumerable latent variables which define a state space so large that the only meaningful prediction you could make is a short-term linear extrapolation?

I mean, what would make anyone think that there is any constant parameter set for a model that would predict COVID-19 case evolution, when governments and the population are taking active measures to combat spread, such as perhaps the most dramatic shift in the physical interconnectivity of human society in history? It's fundamentally impossible to predict something that is changing in an unpredictable manner.

The reason why we can make meaningful inferences about the accuracy of weather models is that we have a good understanding of the underlying physics which describes the time evolution of the system. However, the combination of the chaotic dynamics of that system and the limitations of simulation yields accurate short-term probabilistic predictions and inaccurate long-term probabilistic predictions.

SARIMAX, in contrast, doesn't know if it's predicting shampoo sales, $GOOG, or epidemiology. Although I am curious what your exogenous regressors and lag/differencing/seasonal orders were, if you'd care to share. Did you fit them via an information-criterion approach?

The problem is just too ill-defined to fit directly with a logistic model or really any general-purpose model (such as gbm, etc). You need better priors, either in the form of good, specified dynamics, or in other terms.

You could do a regularized stratified model [0], for example, in the case where you have plenty of data in one country but not in others, or similar cases which incorporate dynamics (log-log models of growth are also pretty good). Overall, I think it's just too hard of a problem as it is very ill-posed—even if we knew the exact dynamics, any small perturbation in the data samples significantly changes the best fit (in other words, the dynamics of the system are 'chaotic').


[0] https://stanford.edu/~boyd/papers/pdf/eigen_strat.pdf

I briefly studied mathematical biology, and remember there being some debate about whether tumor growth was more accurately modeled by logistic growth or Gompertz growth [1]. I'd be curious to know whether your fits get better or worse if you replace your logistic-based model with a Gompertz-based model.

[1] https://en.wikipedia.org/wiki/Gompertz_function
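For reference, a sketch contrasting the two functional forms (the parameter values are arbitrary illustrative choices). The practical difference is asymmetry: the Gompertz curve's inflection sits at height L/e rather than the logistic's L/2, so it approaches its ceiling more gradually than it leaves zero:

```python
import numpy as np

def logistic(t, L, k, t0):
    return L / (1.0 + np.exp(-k * (t - t0)))

def gompertz(t, L, b, c):
    """Gompertz curve L * exp(-b * exp(-c t)): s-shaped but asymmetric,
    with its inflection at height L/e instead of the logistic's L/2."""
    return L * np.exp(-b * np.exp(-c * t))

for ti in (0.0, 25.0, 50.0, 75.0, 100.0):
    print(f"t={ti:5.1f}  logistic={logistic(ti, 1000, 0.15, 50):8.1f}"
          f"  gompertz={gompertz(ti, 1000, 20.0, 0.08):8.1f}")
```

Swapping `logistic` for `gompertz` in a curve_fit call is a one-line change, so comparing the two fits on the same data is cheap.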

Thanks, I'll take a look into it. It seems HN has no place for discussion these days and will downvote anything.

I think you're overthinking things.

The OP makes the basic point that epidemic logistic/s-curves like Covid's look like exponential growth for a good portion of their trajectory, and so will behave like an exponential for a reasonable period into the future. And the data is inherently noisy. Trying to get more than that out of it is counterproductive. Basically, the bulk of the data points to massive death later; the factors stopping that are external to the data we have now.

Basically, we know the epidemic is ramping now; we know there's some limit to how far it can go, but we don't really know that limit until we're close. Trying to get more clever than this results in really bad claims (like guessing we're peaking and abandoning caution).

I'd also add that testing results have been unreliable due to real-world limits on testing (limits on the number of tests, general screw-ups). Death data is more reliable than testing data, but delayed, since infections take time to kill people.

The "quick and dirty" modeling of Tomas Pueyo has been more successful than cleverer approaches.

See: https://medium.com/@tomaspueyo/coronavirus-act-today-or-peop...

In contrast, https://covid-19.direct has good data but too clever an algorithm for spotting peaks. If you browse their data, you get used to seeing garbage "forecasts" of where the peak is.


This article is about estimating an s-curve in the real world. The point is: You cannot get a sense of the end by observing the beginning... even in the super idealized case of the data coming from a noisy s-curve. This learning obviously transfers over to the real world, where the data is going to be strictly worse than the idealized case (i.e. it won't be a perfect s-curve). It's a great article, with applications to pandemic forecasting.

>S-curves have only three parameters, and so it is perhaps impressive that they fit a variety of systems so well

No, it does not. It just so happens that the solutions of certain differential equations produce an s-shape. It has nothing to do with having only 3 parameters; very sophisticated models with many parameters and conditions can produce this shape.

The author was probably talking about the logistic function being parameterized by 3 parameters, x0, L, and k [1]. The point the author is probably making is that if you are fitting a perfect logistic model to data, 3 data points should be sufficient to determine 3 parameters that unambiguously parameterize the curve.

A separate set of 3 parameters also parameterizes the SIR compartmental model (https://en.wikipedia.org/wiki/Compartmental_models_in_epidem...), the solution of which also looks like a logistic curve. But this is a model-based (dynamics-based) solution whereas one may be interested in just fitting the logistic model based on the assumption that it's going to be logistic.

[1] https://en.wikipedia.org/wiki/Logistic_function
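
To make that concrete: with three exact (noise-free) samples, a root finder recovers all three parameters. A sketch with invented true values, using scipy:

```python
import numpy as np
from scipy.optimize import fsolve

def logistic(x, L, k, x0):
    return L / (1 + np.exp(-k * (x - x0)))

true = (120.0, 0.8, 10.0)            # invented L, k, x0
xs = np.array([4.0, 9.0, 14.0])
ys = logistic(xs, *true)             # three exact samples

# three equations in three unknowns
est = fsolve(lambda p: logistic(xs, *p) - ys, x0=[100.0, 1.0, 8.0])
print(est)  # should recover ~(120, 0.8, 10)
```

With even a little noise, though, the recovered parameters, especially the ceiling L, move a lot, which is the article's point.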

A turning point in my understanding of S Curves came when I encountered an article that showed an S Curve as the cumulative area under a normal distribution.

Which is great if your velocity on the project retains a normal distribution over time. But missing requirements, bad risk analysis, and team dynamic changes can make for a long velocity tail.

Which means the last 10% of the project accounts for the other 90% of the project time.

If however you have managed to do the important work first, you drop less important features and ship 95% of the planned features on time.

It’s a common misconception, but the S curve generated by disease, for example, with exponential growth at the beginning, corresponds not to the normal distribution but to the logistic distribution, which has fatter tails than the normal.

The bell curve drops to zero with exp(-x^2), that is extremely fast.

The logistic distribution (derivative of the S-curve described here) drops to zero with exp(-|x|), that is exponentially, but not as fast as the normal distribution.
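
A quick numerical illustration of those tails (scaling the logistic to unit variance so the comparison is fair; the logistic with scale s has variance (pi*s)^2/3):

```python
import numpy as np
from scipy.stats import logistic, norm

s = np.sqrt(3) / np.pi          # scale giving the logistic unit variance
# ratio of upper-tail probabilities, logistic vs normal, at a few points
ratios = [logistic.sf(x, scale=s) / norm.sf(x) for x in (2, 4, 6)]
print(ratios)                    # grows rapidly with x: logistic tails are fatter
```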


Would be intrigued to see what the story is like with priors on the parameters and a credible interval on the output.

I've loved s-curves for decades. Way back in the lab, to analyze ligand-binding assays. And not that long ago, as a litigation consultant.

> For technological changes, can the final level-off be reasonably estimated?

It helps a lot if it's 0% or 100% :) Given that, you get a decent long-term fit, after about half the time to plateau.

Forecasting solutions to a differential equation is hard, especially when there are infinitely many solutions. It is a matter of keeping constants up to date in light of new information. If those constants are wrong, the whole model is essentially useless.

> Forecasting solutions to a differential equation is hard, especially when there are infinitely many solutions.

Not sure what you mean by infinitely many solutions, but this is not true in general. A silly example is

y'(t) = Ct,

where, for most distributions, you would only need a few points to get both C and the initial condition to reasonably high accuracy (error shrinking like O(1/sqrt(n)) in the number of samples). More complicated examples exist that have much more interesting dynamics, but their general trajectories are just not as sensitive/chaotic (w.r.t. the initial parameters).
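
For what it's worth, a sketch of that example (constants invented): the solution y(t) = y0 + C t^2 / 2 is linear in (y0, C), so fitting noisy samples is an ordinary, well-posed least-squares problem.

```python
import numpy as np

rng = np.random.default_rng(1)
C_true, y0_true = 2.0, 5.0
t = np.linspace(0, 10, 50)
y = y0_true + 0.5 * C_true * t**2 + rng.normal(0, 1, t.size)  # noisy samples

# design matrix for y = y0 * 1 + C * (t^2 / 2)
A = np.column_stack([np.ones_like(t), 0.5 * t**2])
(y0_est, C_est), *_ = np.linalg.lstsq(A, y, rcond=None)
print(y0_est, C_est)  # close to (5, 2)
```

Small perturbations of the data barely move the estimates here, which is exactly what the logistic ceiling does not do.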

> If those constants are wrong, the whole model is essentially useless.

I think what makes this hard is not if they're wrong, but rather, being even just a tiny bit wrong makes the whole future prediction change drastically. In other words, two possible inferred parameters which are statistically indistinguishable given our current observations will yield incredibly different outcomes under many of these models.

> It is a matter of keeping constants up to date in light of new information.

Indeed! :)

> It is a matter of keeping constants up to date in light of new information.

I'd argue it's a matter of calculating confidence intervals for all of your fitted parameters, displaying joint confidence intervals if they show a large degree of interdependence/nonlinearity, and showing the fitted residuals beneath the main plot, so that any issues with the fitted function, such as high leverage or overfitting, are easily apparent. The prediction made will be from the best-fit parameters, but the contributions of other possible models can be used to infer a distribution of potential outcomes. A parameter sensitivity analysis wouldn't hurt either.

Then you'll immediately know if you're looking at (or about to publish) junk.

To predict the s-curve for epidemic, you need to know R0.

The upper limit of the curve is 1 - 1/R0. If you use mitigation and effective reproductive number the number is 1 - 1/Rt, where Rt varies with mitigation effort.

One of the only good things to come out of this pandemic is the increased emphasis on S-Curves. Models tend to use the predictive power of exponential curves to estimate the steep part of the S-Curve but these predictions are short term and are best applied to planning scenarios.

What is missing from this article is the relationship between S-Curves and Bell Curves. We can use the Rules of Thumb associated with the Normal Distribution to think about peak growth rate and standard deviations.

The healthdata.org curve fitting is a decent Fermi estimate based on observed data. Models are always wrong but sometimes they are useful, and I hope we start to discuss the underlying key assumptions used in each case rather than focusing on their imperfect predictive power.

The article correctly points out that 3 parameters need to be estimated, but then jumps directly to modeling those 3 parameters with 3 points and concluding that this will always be wrong.

That’s not the right intuition. You could model a logistic curve with just 3 points if the error in those measurements tended toward zero. And the further apart the points are, the less tight the error bars need to be.

The problem with real-world modeling/curve-fitting is that measurements are super noisy and the errors in them are significant.
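
That's easy to demonstrate on the idealized problem: fit a logistic to repeated noisy samples of only the early, exponential-looking stretch, and the estimated ceiling scatters wildly. All parameters here are invented, and curve_fit is just a sketch of the fitting step.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, L, k, t0):
    return L / (1 + np.exp(-k * (t - t0)))

rng = np.random.default_rng(0)
true = (1000.0, 1.0, 10.0)           # ceiling 1000, inflection at t = 10
t = np.linspace(0, 7, 36)            # observe only the pre-inflection stretch

ceilings = []
for _ in range(200):
    y = logistic(t, *true) + rng.normal(0, 5, t.size)  # modest noise
    try:
        p, _ = curve_fit(logistic, t, y, p0=(500, 1, 8),
                         bounds=(0, [1e6, 10, 100]))
        ceilings.append(p[0])
    except RuntimeError:
        pass  # skip the occasional non-converged fit

print(np.percentile(ceilings, [5, 50, 95]))  # ceiling estimates scatter widely
```

Statistically indistinguishable data sets yield wildly different ceilings, even though every data set really did come from a logistic.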

I tried some exploration with the coronavirus data to get an idea of the final number. One of the best plots I found plots the percentage of cases in the next week against the cases so far. It looks like the negative half of a parabola, and the point where it meets the x-axis gives the final number per country.

This is the final plot: https://i.imgur.com/o54t0Ts.png. You can make a good guess from the data of how many people it will ultimately affect per country by continuing the same pattern until it reaches the x-axis.

Yes, as in the OP, S curves can be challenging to work with, in particular, to use, say, early in the history of smart phones, to make long term projections from the number of smart phones sold each day for each of the last 30 days.

But there is some good news: The data used can vary, and in some cases good projections can be easier to make. Can see, e.g., for COVID-19 the recent





In the third one of those we have that the projection from a FedEx case is the solution to the first order ordinary differential equation initial value problem

y'(t) = k y(t) (b - y(t))

There, for data, we used y(0) and b. Then we guessed at k. Had we used values of y for the past month, we could have picked a better, likely fairly good, value of k.

Lesson: Fitting an S curve does not have to be terribly bad.

The key here is the b: the S curve of the solution is the logistic curve, and it rises to be asymptotic to b from below. Knowing b helps a LOT! Once we have b, we are no longer doing a projection or extrapolation but nearly just an interpolation -- much better.

For FedEx, the b was the capacity of the fleet. For COVID-19 the b would be the population needed for herd immunity (from recovering from the virus, from therapeutics that confer immunity, and a vaccine that confers immunity).

Knowing b makes the fitting much easier/better. To know b, likely need to look at the real situation, e.g., population of candidate smart phone users, candidate TV set owners, market potential of FedEx (as it was planned at the time), or population needed for herd immunity for the people in some relatively isolated geographic area.

Then in TeX source code, the solution is

y(t) = { y(0) b e^{bkt} \over y(0) \big ( e^{bkt} - 1 \big ) + b}
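
As a sanity check, that closed form (constants below chosen arbitrarily) matches a numerical integration of the initial value problem above:

```python
import numpy as np
from scipy.integrate import solve_ivp

k, b, y0 = 0.002, 1000.0, 10.0       # arbitrary rate, capacity, initial value

def closed_form(t):
    # y(t) = y(0) b e^{bkt} / (y(0)(e^{bkt} - 1) + b)
    e = np.exp(b * k * t)
    return y0 * b * e / (y0 * (e - 1.0) + b)

sol = solve_ivp(lambda t, y: k * y * (b - y), (0.0, 5.0), [y0],
                t_eval=np.linspace(0.0, 5.0, 11), rtol=1e-9, atol=1e-9)
err = np.max(np.abs(sol.y[0] - closed_form(sol.t)))
print(err)  # tiny: the two agree, and y rises toward the asymptote b
```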

Can also use a continuous time discrete state space Markov process subordinated to a Poisson process. Here's how that works:

Have some states, right, they are discrete. For FedEx, that would be (i) the number of customers talking about the service and (ii) the number of target customers listening. Then the time to the next customer is much like the time to the next click of a Geiger counter, that is, has exponential distribution, that is, is the time of the next arrival in a Poisson arrival process (e.g., the time of the next arrival at the Google Web site). So at this arrival, the process moves to a new state where we have 1 more current customer and 1 less target customer. Then start again to get the next new customer.

The Markov assumption is that the past and future of the process are conditionally independent given the present state; so that justifies our getting to the next state using only the current state -- given the current state, for predicting the future, everything before that is irrelevant.

What is a Markov process, i.e., what satisfies the Markov assumption, can depend on what we select for the state -- roughly, the more we have in the state, the closer we are to Markov. In particular, if we take the whole past history of the process as the state, IIRC every process is Markov. But Markov helps in something like the FedEx application since that state is so simple.

We get to use continuous time since the time to the next change of state is from a Poisson process whose arrival times are the continuum -- that is, we don't have to make time discrete although it is true that the history of the process (one sample path) has state changes only at discrete times.

So, for state change, and for some positive integer n we have some n possible states, then for i, j = 1, 2, ..., n, we can have some p(i,j) which is the probability of jumping from state i to state j, that is, we have an n x n matrix of transition probabilities.

[p(i,j) is the conditional probability of entering state j given that the last state was i.]

For two jumps, square that matrix. Now there is a lot of pretty math -- get some limits and eigenvectors of states, etc. Actually fairly generally there is a closed form solution to the process. Alas, often in practice that closed form is useless because the n and the n x n are so large, maybe n^2 in the trillions.

E.g., in a problem I solved for war at sea, there were Red weapons and Blue weapons, on each side some number of types and some number of weapons of each type. The states were the combinatorial explosion. Then there were the one-on-one Red-Blue encounters where one died, the other died, both died, or neither died. The time to an encounter was the next arrival of Poisson processes, also Poisson.

Well, that was an example where there was a closed form solution but n and n x n were wildly too large for the closed form solution, yet running off, say, 500 sample paths via Monte Carlo was easy to program and fast for the computer. So, sure, the software reported the average of the 500 sample paths. On a PC today, my software would be done before you could get your finger off the mouse button or the Enter key.

This approach is fairly general. And since what I did included attack submarines, SSBN submarines, anti-submarine destroyer ships, long range airplanes, etc., there should be no difficulty building such a model for COVID-19 that included babies, grade school kids, ..., nursing home residents, people at home, people working nearly alone on farms, ....

Back to S curves, IIRC dropping out of the math for the n x n matrix and its powers is an S curve. So, in a broad range of cases, always get an S curve although a different curve depending on, yes, the p(i,j) and the initial state. Uh, when no one is left sick, the Markov process handles that as an absorbing state -- once get there, don't leave.
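
A minimal sketch of that sample-path averaging for the simple "customers talking/listening" state space (rate constant, population, and path count all invented): each path holds in state i for an exponential time with rate k i (b - i), then jumps to i + 1, and averaging the paths traces out the S curve.

```python
import numpy as np

rng = np.random.default_rng(0)
k, b, n_paths = 0.002, 200, 500      # invented rate constant, target population, path count
t_grid = np.linspace(0.0, 40.0, 81)
mean_path = np.zeros_like(t_grid)

for _ in range(n_paths):
    times, states = [0.0], [1]       # start with one customer talking
    t, i = 0.0, 1
    while i < b:
        # Markov: exponential holding time with rate k * i * (b - i)
        t += rng.exponential(1.0 / (k * i * (b - i)))
        i += 1
        times.append(t)
        states.append(i)
    # sample this step-function path on the common grid, then accumulate
    idx = np.searchsorted(times, t_grid, side="right") - 1
    mean_path += np.array(states)[idx]

mean_path /= n_paths                 # average of the sample paths: an S curve
```

State b (everyone a customer) is the absorbing state; once a path gets there, it stays.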

For the n in the billions, the n x n is really a biggie. So, for the submarine problem I did,

J. Keilson, Green's Function Methods in Probability Theory.

asked "How can you possibly fathom that enormous state space?". That is a good question, and my answer was: "After, say, 5 days, the number of SSBNs left is a random variable. It is bounded. So it has finite variance. So, both the strong and weak laws of large numbers apply. So, run off 500 sample paths, average them, and get the expectation within a gnat's ass nearly all the time. Intuitively, Monte Carlo puts the effort where the action is.". Keilson was offended by "gnat's ass" but liked the math and approved my work for the US Navy. That question and answer are good to keep in mind.

There is more in, say,

Erhan Çinlar, Introduction to Stochastic Processes, ISBN 0-13-498089-1, Prentice-Hall, Englewood Cliffs, NJ, 1975.

For why the arrival times have exponential distribution and why we get a Poisson process, Çinlar has a nice simple, intuitive, useful axiomatic derivation. There is more via the renewal theorem in

William Feller, An Introduction to Probability Theory and Its Applications, Second Edition, Volume II, ISBN 0-471-25709-5, John Wiley & Sons, New York, 1971.

There's quite a bit more to COVID-19 prediction than the standard derivation of a logistic curve found in introductory differential equations books. SIR models would perhaps be the simplest place to start.


Nice reference. The discussion is mostly about SIR (for susceptible, infected and recovered) models but also with several generalizations to handle more details, e.g., losing immunity and getting infected again. A lot of this material goes back, say, to 1927.

First cut, it appears that the differential equation I gave is an SIR model except with R = 0, i.e., once you are a customer of FedEx, you remain one.

At one point the article finds the logistic curve as I did -- maybe for the same situation.

The article does touch on stochastic models, but I didn't see discussion of a Markov assumption or Poisson process for time to the next infection.

There is an unclear reference to operator spectral radius, and maybe that is related to the eigenvalues, vectors I mentioned.

Whatever, especially with the Wikipedia reference, it appears that we are a step or two beyond the OP.

The work I did and described here I did decades ago and is not at all related to the math of my startup now. So, here I was able to describe some of my old work, but I'm doing other things now.

For the differential equation I gave, the solution as I derived it needed only ordinary calculus and not the additional techniques of differential equations. I just did the derivation in a hurry at FedEx just as a little calculus exercise. I didn't consult a differential equations book. I discovered that the solution was the logistic curve only by accident years later.

The Wikipedia article is nice.

Yes, that's a nice way to run into the logistic. Some introductory books use that as a motivating example, usually the second one. SIR is too simple to be accurate, but is OK for painting with broad brush strokes. Sometimes simple models are still useful, as long as one does not read too much into the output.

Statistical epidemiology is quite heavily invested in stochastic differential equations. Wikipedia would hardly suffice as an academic, up-to-date, and exhaustive bibliography.

Sounds like the next step up would be stochastic optimal control, the field of my Ph.D. dissertation.

Ah, been there, done that, got the T-shirt, and doing other things now!

Heh! That's why I said this: https://news.ycombinator.com/item?id=22937262

BTW, you mention Prof. Bertsekas a few times; did you ever run into Prof. Tsitsiklis? He has done some work in the area of epidemics around 2015. Mentioning just in case you know him.

Perhaps the optimal control problem would pique your interest more, once the differential equations have been nailed down in a form that also incorporates the interventions.

There is no such thing as an 'exponential curve' in our finite world. And the derivative of the sigmoid is a bell curveish thing.

Yes, that's why exponential curves taper off into an S-curve in the real world (e.g. population growth is exponential at first, then tapers off when hitting food limits).

This article is about estimating the more realistic s-curve in the real world. The point is: You cannot get a sense of the end by observing the beginning... even in the super idealized case of the data coming from a perfect s-curve, let alone the real world. It's a great article.

Piecewise. An exponential forecast may, for example, be a better model for policy making over some time period than a linear one.

It's a logistic function, which is different from the “bell curve”. They look pretty much the same, but their mathematical properties are really different so they should not be confused.

When you’re talking about a highly contagious disease it’s more or less a distinction without a difference. It’s exponential until very close to the point where a significant percentage of the vulnerable population is infected. It’s not super interesting to point out that it’s no longer growing exponentially, when half the population has it already.

It is interesting to think about how long the exponential trend will continue before becoming a sigmoid.

The left tail of a sigmoid looks a hell of a lot like an exponential.

The left tail of the derivative of a sigmoid also looks like an exponential.

The reason it is hard is that it is using a stateless model to approximate an inherently stateful and often chaotic process.

Take tech adoption, for example. Often incorrectly represented as s-curves, adoption curves are the result of the inherent cost/benefit of the technology combined with the stateful diffusion process of communication and the inherently human resistance to change. And being inherently stateful, you can have chaotic influences in that diffusion process. For example, the idea of microservice architecture had an absolutely massive diffusion jump the moment Amazon sent out that now-famous email mandating adoption. It wasn't linear, it wasn't exponential... it was a discrete step, and a very large one at that. These are everywhere too, because communication doesn't propagate the way bacteria grow; it propagates due to extremely non-linear levels of influence. Bill Smith, 45-year-old mid-level programmer for a tiny Midwest bank, will never have the tech-adoption influence of a Steve Jobs or Alan Kay or Linus Torvalds.

A better option for modeling would be to use Monte Carlo methods or systems methods. Something that acknowledges the inherent statefulness of the process.

> Often incorrectly represented as s-curves, they are the result of the inherent cost/benefit of the technology [...]

You're talking about two completely different levels of abstraction here. Tech adoption rather obviously happens in an S-shaped curve, at least sometimes. See the article for examples. And of course that shape, and its exact parameters, are the result of some underlying processes.

These two things aren't contradictory. Outside air temperature follows a roughly sinusoidal curve. It's the result of the earth turning and therefore alternating between night and day. But it's still sinusoidal.

And the S-curve does acknowledge state, or it would just be exponential.

Yes, there are different approaches to disease modelling, such as agent- and rule-based simulations. Unfortunately, they tend to be really bad, because we just don't have enough data to satisfactorily simulate societies at the level necessary for this application.

But having an s-shaped curve is not the same thing as being a sigmoid function, in the same way that having a bell-shaped density is not the same thing as having a Gaussian distribution. There are tons of processes out there that can be approximated well by either of those two while being extremely different in extrapolation.

One thing that can immediately disprove mathematical sigmoid modeling: curve symmetry. If the early adoption exponential growth is not exactly the same shape as the late adoption slowdown, then you don't have a sigmoid function.

And that's the problem: if you're using a single function to model the result of two (or more) separate processes with distinct mechanics and parameters, you're going to have the same exact pitfalls as you would trying to model a bimodal process with a single probability distribution. Namely that they might fit well with interpolative methods, but completely fail with extrapolative methods (like forecasting!).
