
A Twisted Path to Equation-Free Prediction - bpolania
https://www.quantamagazine.org/20151013-chaos-theory-and-ecology/
======
scottfr
When reading about any predictive technique, the first thing to focus on,
before trying to understand the predictive mechanism, is the predictive
accuracy of the model. A paper like this should have extensive discussions of
overfitting, underfitting and overall predictive accuracy. The entirety of
this paper's attempt to address this is:

"Last, to avoid arbitrary fitting and to obtain a robust measure of forecast
skill, we apply a fourfold cross-validation scheme for each model: the model
is fit to three-fourths of the data to predict the remaining one-fourth out-
of-sample, and the procedure is repeated for each one-fourth segment of the
time series."

They are dealing with time series data, which naturally has temporally
correlated observations, so a cross-validation like this will give a biased
error estimate that favors the more complex model (it encourages
overfitting). I haven't delved into the details, but compared to the Ricker
model their model is likely more complex, so I have very little confidence in
their claims of accuracy.
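
To make the concern concrete, here is a minimal sketch of one common
alternative, rolling-origin (forward-chaining) evaluation, in which every
training window ends before its test window begins. This is not what the
paper does; the function name `rolling_origin_splits` and the `gap` parameter
are made up for illustration.

```python
def rolling_origin_splits(n_samples, n_folds=4, gap=0):
    """Return (train_indices, test_indices) pairs for a time series.

    Training data always precedes test data, and `gap` observations
    between them are discarded to reduce leakage from autocorrelation.
    (Hypothetical helper, not taken from the paper.)
    """
    if n_folds < 1 or gap < 0:
        raise ValueError("n_folds must be >= 1 and gap must be >= 0")
    fold_size = n_samples // (n_folds + 1)
    if fold_size <= gap:
        raise ValueError("series too short for this many folds and this gap")

    splits = []
    for k in range(1, n_folds + 1):
        train_end = k * fold_size            # training window is [0, train_end)
        test_start = train_end + gap         # skip the most correlated points
        # The last fold absorbs any leftover observations at the end.
        test_end = n_samples if k == n_folds else (k + 1) * fold_size
        splits.append((list(range(train_end)),
                       list(range(test_start, test_end))))
    return splits
```

With 100 points, 4 folds, and a gap of 2, the first split trains on indices
0-19 and tests on 22-39; the gap drops the observations most strongly
correlated with the end of the training window.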

Of course, the choice of the Ricker model (developed in the 1950s) is really
a strawman in any case. There are much better competing techniques they
could have benchmarked against.

------
tansey
Actual paper:
[http://www.pnas.org/content/112/13/E1569.full](http://www.pnas.org/content/112/13/E1569.full)

The idea (from 5 minutes of skimming plus watching the little 3-minute
tutorial) seems to be that the time series of a single variable implicitly
contains the information needed to reconstruct the full multi-dimensional
system. The authors show an example of a 3-dimensional time series with a
butterfly-type pattern that you can reconstruct just by looking at the values
of X(t) and then using X(t-a) and X(t-2a) as the other dimensions. Not sure
how that leads to prediction. Can someone explain that part?
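
One way to see the step from reconstruction to prediction, as a rough sketch
rather than the paper's actual algorithm: treat each delay vector
[X(t), X(t-a), X(t-2a)] as a state, find past states that resemble the
current one, and average what came next after them. The names `delay_embed`
and `nn_forecast`, the default parameters, and the exponential weighting are
assumptions chosen for illustration.

```python
import numpy as np


def delay_embed(x, dim=3, lag=1):
    """Stack rows [x(t), x(t-lag), ..., x(t-(dim-1)*lag)] for every valid t."""
    x = np.asarray(x, dtype=float)
    start = (dim - 1) * lag                  # first t with a full history
    cols = [x[start - j * lag: len(x) - j * lag] for j in range(dim)]
    return np.column_stack(cols)


def nn_forecast(x, dim=3, lag=1, k=4):
    """Predict the next value of x from the k nearest past delay vectors."""
    x = np.asarray(x, dtype=float)
    states = delay_embed(x, dim, lag)
    library, query = states[:-1], states[-1]  # last state has no known successor
    successors = x[(dim - 1) * lag + 1:]      # value that followed each library state

    dist = np.linalg.norm(library - query, axis=1)
    nearest = np.argsort(dist)[:k]
    # Closer neighbors count more; the floor avoids dividing by zero.
    scale = max(dist[nearest][0], 1e-12)
    weights = np.exp(-dist[nearest] / scale)
    return float(np.sum(weights * successors[nearest]) / np.sum(weights))
```

Nothing here writes down an equation for the dynamics: the "model" is just
the library of past states, which is presumably why the approach gets
described as equation-free.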

~~~
powera
In complex analysis, something like this is true: by the identity theorem, a
real-analytic function on the real numbers has at most one differentiable
(holomorphic) extension to the complex plane that takes the same values.
[https://en.wikipedia.org/wiki/Holomorphic_function](https://en.wikipedia.org/wiki/Holomorphic_function)
You can also go forward/backward in time with such functions, and knowing the
value ahead of time is predicting it.

On the other hand, there are non-differentiable events all the time in nature.
It's madness to claim that salmon populations contain enough information to
predict earthquakes (that could dam a river and cause a massive population
change).

~~~
kragen
It's probably more accurate to say that predicting earthquakes from salmon
populations is far outside our current modeling capabilities. Undoubtedly the
seismic stresses and crustal movements leading up to earthquakes have numerous
weak causal links to the salmon population: most obviously, they alter the
slopes and flow rates of streams leading into the ocean, altering the nutrient
balance available to the algae there, but also they alter the distances flown
by migratory birds by centimeters per year, and those birds will consequently
eat very slightly different fish, etc. There might not even _be_ a way to
analyze the salmon population in such a way as to tell you how far the
continents have drifted — but declaring that postulating the existence of such
a thing is "madness" seems like being far too sure of yourself.

There's a science-fiction story that may be useful in gaining perspective on
the limits of prediction at
[http://lesswrong.com/lw/qk/that_alien_message/](http://lesswrong.com/lw/qk/that_alien_message/).

The problem with earthquakes, as I see it, is not so much that they are
non-differentiable as that they (like a bird eating or not eating a
particular fish) have very large local derivatives with respect to their
precipitating conditions.
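
A toy illustration of that point, using the logistic map as a stand-in (an
assumption chosen for simplicity; it is not a model of earthquakes or fish):
the map is perfectly smooth, yet two trajectories that start 1e-9 apart end
up completely different within a few dozen steps. The function names are
made up.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return all visited values."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def divergence(x0, eps=1e-9, steps=60):
    """Absolute gap, step by step, between trajectories from x0 and x0 + eps."""
    a = logistic_trajectory(x0, steps)
    b = logistic_trajectory(x0 + eps, steps)
    return [abs(p - q) for p, q in zip(a, b)]


if __name__ == "__main__":
    gaps = divergence(0.2)
    for step in (0, 10, 20, 30, 40):
        print(step, gaps[step])
```

Each step can stretch a small gap by up to a factor of 4, so the derivative
of the state at step n with respect to the starting value can grow
exponentially in n even though nothing is non-differentiable.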

~~~
powera
If you have a theory that invalidates the Heisenberg Uncertainty Principle, go
for it. Otherwise, I'm going to consider all of that pure science fiction.

~~~
kragen
Maybe you're not very familiar with the uncertainty principle, but continental
plates and even individual birds have enough mass that the uncertainty in
their position, for any reasonably large uncertainty in their momentum, is
insignificant.
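
As a back-of-the-envelope check using the textbook bound
Δx ≥ ħ / (2 m Δv): the bird mass and velocity uncertainty below are
hypothetical round numbers, not measurements.

```python
HBAR = 1.054571817e-34  # reduced Planck constant, in joule-seconds


def min_position_uncertainty(mass_kg, velocity_uncertainty_m_s):
    """Lower bound on position uncertainty (meters) from dx * dp >= hbar / 2."""
    momentum_uncertainty = mass_kg * velocity_uncertainty_m_s
    return HBAR / (2.0 * momentum_uncertainty)


if __name__ == "__main__":
    # Hypothetical 0.1 kg bird, velocity known to within 1 micrometer/second.
    print(min_position_uncertainty(0.1, 1e-6))
```

For that hypothetical bird the bound comes out to about 5e-28 m, far smaller
than an atomic nucleus.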

~~~
powera
I'm claiming that to measure the salmon population well enough to predict
earthquakes, you would have to violate the uncertainty principle.

~~~
kragen
Now _that_ sounds like pure science fiction — specifically, the kind of
technobabble that serves as filler in Star Trek scripts and New Age scam
artist patter. "Captain! We can't measure the salmon population because the
Heisenberg Uncertainty Principle is overwhelming the medium-range laser
scanner array!"

~~~
powera
... are you seriously claiming that you theoretically could predict
earthquakes based solely on the salmon population?

I'm not sure it's worth arguing with you either way.

~~~
kragen
No, I'm seriously claiming that _you have no idea whether_ you theoretically
could predict earthquakes based solely on the salmon population. We don't have
a good theory that bears on the question. We know we can't imagine how you
could _practically_ make such a prediction, but the only relevant theory that
we have is the theory of inductive inference founded by Solomonoff in the
1960s, which is very far indeed from ruling out such predictions. And you
making up bullshit about Heisenberg doesn't get us any closer.

~~~
powera
I'm claiming I do have an idea.

Let's say you have a model that will predict earthquakes based on the number
of salmon. I take that model, I see what will happen if I kill or don't kill a
salmon, and how that impacts the earthquake. Then I kill or don't kill a
salmon. Am I now causing an earthquake? If I am, am I changing the time of the
earthquake by an amount that we can actually measure?

------
mockery
This extended version of the associated video provides much more information /
better intuition about how delay embedding works:
[https://www.youtube.com/watch?v=6i57udsPKms](https://www.youtube.com/watch?v=6i57udsPKms)

~~~
contravariant
Thanks, the video in the article cut off before it made any interesting
points.

------
j2kun
There's a fine line between writing down a single equation that you claim
governs an entire system and using data to come up with some equations which
(based on the data) you claim describe the system.

Actually I take it back. There is no line. They are the same thing. It's just
that one is based on very small amounts of data, and the other has more
complicated-looking equations. It's all still math, folks.

> Complex natural systems defy standard mathematical analysis

So the punchline is that ecologists and the author of this article have a very
narrow view of what "standard" mathematical analysis entails. It's great that
ecology is making strides forward, but this isn't a groundbreaking departure,
it's just them catching up to modern mathematical analysis.

~~~
cafebeen
Yeah, I think it's a parametric vs. non-parametric modeling comparison they're
making, although they don't explicitly make that point. It does seem strange
to call what they're doing "equation-free", but maybe within their field this
kind of thing makes sense.

------
jostmey
The name of the method seems a little sensational. Many scientists have become
hung up on specific models when actually all they care about is making
predictions from their data. This paper is a reaction against that trend, but
I have to wonder if a boilerplate machine learning algorithm wouldn't perform
at least as well.

~~~
IndianAstronaut
> This paper is reaction against that trend, but I have to wonder if a boiler
> plate machine learning algorithm wouldn't perform at least as well.

The difference is simulations. Equation modeling lets us simulate and ask
what-if questions.

