
Science without Validation in a World without Meaning - nkurz
https://americanaffairsjournal.org/2020/05/science-without-validation-in-a-world-without-meaning/
======
cgiles
I work in molecular biology research, and I think this is a great article that
strikes at the heart of many problems in the field. I can't comment on the
climate change stuff, although I wish he hadn't included it because it was
almost certain to distract people from the overall point.

The problem is that there are no remotely comprehensive, predictive, and
mathematical models of what goes on inside of cells. It is pure empiricism:
you run an intervention, and see what happens. Write it up in a paper.

All well and good, except there are no viable models of what is happening
inside that are _predictive_ in the sense of telling you what an intervention
will do before you test it. We really need that if we want to develop
treatments for molecular diseases that are more than marginally better.

The Santa Fe Institute, systems biology people, and others were working hard
on this problem at the turn of the century, but progress has stalled. It's too
hard. We don't know how to do it. A new "mathematical epistemology" that could
handle this problem would be a huge step forward, if it is possible.

I can see why the author would extend this idea to things like economics or
climate science. The thought in systems research was that, perhaps, different
fields share similar underlying "complex systems" mechanisms, and if we can
solve the problem in one area, we may gain insights into how to do it
elsewhere.

~~~
memexy
What's missing from current mathematics to make predictive models for biology?

I did a search for "neural network cell simulation" and got a few hits, e.g.
[https://ieeexplore.ieee.org/document/8805421](https://ieeexplore.ieee.org/document/8805421).

So it seems that people are working on the problem of predictability (or at
least augmenting the researcher's/experimenter's ability to do some analysis
ahead of time based on simplified models).

~~~
Balgair
Flops.

Cells balance right on the edge of Maxwell's Demon. Even a few thousand ions
can change behavior radically. So, you are forced to track all the ions,
proteins, lipids, etc. Which means you have to do a lot of atom-by-atom
tracking. There are a few tricks here, but since the cell is not crystalline,
you can't do a lot of fun physicsy math to get the problem to be easier.

Also, most of the time, since this is 'research' to begin with, you don't know
what's in the cell. That's the point of looking. We have nearly no idea what
all the proteins are in any given cell. DNA gives some guide, but stochastic
switches from coding to non-coding happen constantly. So you don't know what
all the proteins in a cell are, where they are, what they do, what they don't
do, what the extracellular space is like, etc.

Cells are just _really_ complicated. So you need a lot of flops.
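
A rough back-of-envelope sketch of the scale (every number here is an
order-of-magnitude guess for illustration, not a measured value):

    # Order-of-magnitude estimate for brute-force, atom-by-atom molecular
    # dynamics of a single cell. All numbers below are assumptions.

    atoms_per_cell = 1e14      # rough atom count for a mammalian cell
    flops_per_atom_step = 1e2  # force evaluation per atom per timestep
    timestep_s = 1e-15         # ~1 femtosecond MD timestep
    simulated_time_s = 1.0     # one second of cell behavior

    steps = simulated_time_s / timestep_s
    total_flops = atoms_per_cell * flops_per_atom_step * steps
    print(f"~{total_flops:.0e} flops")  # ~1e+31 flops

Even an exaflop machine (1e18 flop/s) would need roughly 1e13 seconds,
i.e. hundreds of thousands of years, for that single run.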

~~~
memexy
How is "edge of Maxwell's Demon" related to "edge of chaos"?

Re: flops. I understand brute force is a good way to simulate dynamics but we
constantly solve hard problems by approximation and have gotten pretty far
with that approach. So what approximations have been tried and why have they
been considered failures?

Also
[https://mobile.twitter.com/SteveStuWill/status/1268111230020...](https://mobile.twitter.com/SteveStuWill/status/1268111230020882432):
> "Scientists created fully functional mini-livers out of human skin cells,
then successfully transplanted them into rats. The research is a proof-of-
concept for potentially revolutionary technology and provides a glimpse of an
organ donor-free future." Wow!

That's unrelated to the original points, but I see plenty of innovative
approaches to problems in biology. Simulating cells is just one way to figure
them out, and we don't need to figure them out completely through
computational means to put them to good use. Biology is already computronium,
and if we can understand how to "program" it, then we don't need to simulate
everything.

------
btrettel
This article meanders too much. The basic point is this:

> While stochastic models present us with numerous difficulties, an even more
> perplexing conundrum faces contemporary scientists and engineers who wish to
> model highly complex systems involving hundreds or thousands of variables
> and model parameters. Owing to their sheer number, many model parameters
> cannot be accurately estimated via experiment and are left uncertain. As a
> consequence of uncertainty, for each different set of possible values for
> the unknown parameters, there is a different model—possibly an infinite
> number of models.

> Confronting the problems of complexity, validation, and model uncertainty, I
> have previously identified four options for moving ahead: (1) dispense with
> modeling complex systems that cannot be validated; (2) model complex systems
> and pretend they are validated; (3) model complex systems, admit that the
> models are not validated, use them pragmatically where possible, and be
> extremely cautious when interpreting them; (4) strive to develop a new and
> perhaps weaker scientific epistemology.

At the moment I am in favor of option 3, though option 4 might be more
appealing in the future.
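
To make the quoted point concrete, here's a toy sketch of my own (not from
the article): one model form with an uncertain parameter is effectively an
ensemble of different models, and the ensemble's predictions can diverge by
orders of magnitude.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy model: exponential growth x(t) = exp(k*t), where the rate k is
    # only known to lie somewhere in [0.5, 1.5]. Each sampled k is, in
    # effect, a different model.
    def model(k, t):
        return np.exp(k * t)

    ks = rng.uniform(0.5, 1.5, size=1000)  # 1000 "different models"
    predictions = model(ks, t=5.0)

    print(predictions.min(), predictions.max())
    # Spread runs from roughly e^2.5 (~12) to e^7.5 (~1800): parameter
    # uncertainty alone changes the prediction by two orders of magnitude.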

This isn't an easy option to take. Recently I had an article accepted for
publication where I basically argued that no models, including my own, were
truly validated because none fit a non-naive data set well. (In another paper
I argued that most data sets used for validation are too easy to match because
they don't cover the parameter space well.) A reviewer recommended rejection,
basically saying that because the model isn't validated, it shouldn't be
published. So much for being intellectually honest!

The paper was eventually accepted after I made it more clear that none of the
popular models work that well (some are absolutely terrible in my view), and
that my model improves on the status quo in a few ways.

Note that this situation isn't exactly the same as that described in the link.
In the case of my article, I think we can get enough data to validate a model.
We just don't have that data at present.

~~~
mjburgess
I tend to come down on (1) in many cases, e.g., social psychology.

I'd be open to more of this sort of modelling if (4) were a realistic and
completed project.

The reality of (4), however, is saying to the public things like: there may
either be very limited climate change, or the world as we know it will be
destroyed.

I don't see the current public understanding of science, esp. via journalism,
as fit for such realities.

------
guscost
> Four conditions must be satisfied to have a valid scientific theory: (1)
> There is a mathematical model expressing the theory. (2) Precise
> relationships, known as “operational definitions,” are specified between
> terms in the theory and measurements of corresponding physical events. (3)
> There are validating data: there is a set of future quantitative predictions
> derived from the theory and measurements of corresponding physical events.
> (4) There is a statistical analysis that supports acceptance of the theory,
> that is, supports the concordance of the predictions with the physical
> measurements—including the mathematical theory justifying the application of
> the statistical methods.

This definition appears rigorous at a glance, but it is deficient. We cannot
properly test a theory if it only predicts things _which we already expect to
happen_. Popper said that scientific theories must instead make "risky
predictions":

"Confirmations should count only if they are the result of risky predictions;
that is to say, if, unenlightened by the theory in question, we should have
expected an event which was incompatible with the theory–an event which would
have refuted the theory."

------
snowwrestler
An article by an electrical engineer, published in a political journal, about
climate change... I had low hopes, which were met.

IMO few categories of professions have a harder time understanding cutting-
edge science than engineers. They think they know science because they use
similar mathematical and technical tools, when in fact the two professions do
the exact opposite of one another.

Engineers use what is known and understood to construct systems that can be
validated. Scientists investigate unknown systems to try to construct
knowledge and understanding.

The worst thing, professionally, for an engineer is to not know or understand
your work. The worst thing, for a scientist, is to spend too much time on
things that are already known and understood.

Engineers love to point out how scientists don't know, and can't prove,
whether their climate models are accurate. Scientists know that that is the
_whole point_ of building such models. Working without knowing whether you're
correct is not epistemological conflict. It's the fundamental condition of
being a scientist.

Yes, it would be better if we knew everything and only consulted systems that
are provably correct. But we don't. And the only way to expand what we know is
to spend a ton of time doing things that we don't know.

We haven't found any other way to do it. Writing hand-wringing articles about
the state of science from the sidelines does not advance human knowledge. You
can't expand the map by standing inside the border and complaining about how
hard it is to see past it.

~~~
pdonis
_> Working without knowing whether you're correct is not epistemological
conflict. It's the fundamental condition of being a scientist._

This is true, but it's also true that, when making public policy, we should be
thinking like engineers, not scientists. Public policy is not about
constructing new knowledge. It's about working out conflicts between different
values and priorities based on our best current knowledge. To the extent the
article is talking about public policy, I think it makes a valid point that
the limitations of our knowledge in areas highly relevant to public policy are
very often not recognized or taken into account when making public policy.

~~~
snowwrestler
There are no sidelines in public policy. There's nowhere to wait to see how
things turn out before making a decision.

So, there are no neutral decisions. "Waiting until we're more sure about
something" is not actually waiting, it is an affirmative decision to disregard
what we think we know right now. That can be a valid choice, and something to
debate, but it's fundamentally different from the feigned neutrality of the
concept of waiting to see.

To make such an argument requires attacking that knowledge directly, showing
specifically how it's wrong. Not just vaguely complaining that it's not good
enough yet.

This is actually well-understood in public policy when it comes to other areas
like economics or defense; leaders act to address needs in real time, making
the best decisions they can with the information available to them. "A good
plan violently executed now is better than a perfect plan executed at some
indefinite time in the future.” ― General Patton

Engineering, as a discipline, cannot cover all of life. It largely constrains
itself to situations that are understood, and can do so because there are
other complementary disciplines that create the knowledge it uses. But in
public policy there is no complementary Earth or society whose results we can
wait for.

~~~
pdonis
_> "Waiting until we're more sure about something" is not actually waiting, it
is an affirmative decision to disregard what we think we know right now._

No, it's an affirmative choice to _not impose a public policy on everyone_
based on what we think we know right now. And often that is the right choice.

 _> To make such an argument requires attacking that knowledge directly,
showing specifically how it's wrong. Not just vaguely complaining that it's
not good enough yet._

You have this backwards. Claimed knowledge doesn't get to be assumed to be
right until it's shown to be wrong. It needs to demonstrate that it's right
with a sufficient level of confidence before it even gets considered at all.
"Not good enough yet" just means "you haven't shown your claims to be right
with enough confidence to make them worth considering in this public policy
debate".

 _> leaders act to address needs in real time, making the best decisions they
can with the information available to them._

"Leaders" aren't the only ones who act to address needs in real time and make
decisions. Everybody does that. Often the best thing for "leaders" to do is to
not make any decisions at all as "leaders", but to simply let individual
people, who have far more accurate information about their individual
situations than any "leader" can possibly have, make their own decisions.

For a "leader" to make a decision and dictate a public policy, the policy
needs to be based on knowledge that is strong enough to justify overriding the
billions of individual decisions that people are making all the time about
their individual lives, with a top-down dictated decision that everybody must
follow. That's a much, much stricter requirement than most people appear to
think.

~~~
didibus
Hmm, I think you both make good points. Ultimately, though, I have to
agree with snowwrestler: public policy often needs to make a decision with
incomplete information. There's probably a middle ground here; I'd say having
70% of the information, something like that.

Otherwise, it is too slow to act to prevent future problems.

In that regard, I feel it's best paralleled with business. Business decisions
are often made with incomplete information, to gain a competitive advantage.
But you always need to weigh the risk/reward potential.

I feel it's the same for public policy. Weigh the potential risk of a
particular policy against the level of confidence in the information that
supports it, and you can arrive at a decision. I don't think it makes sense
to say we should always wait for the confidence level to approach 100%.
You've got to take calculated risks sometimes.

~~~
pdonis
_> public policy often needs to make a decision with incomplete information_

I disagree. Individual people and businesses often need to make decisions with
incomplete information, but individual people's or businesses' decisions are
only about their own actions, not about everybody else's. But public policy
decisions affect everybody, so the criterion needs to be a lot stricter for
how complete the information needs to be and how confident we need to be in
our knowledge before we impose a public policy on everybody.

 _> You've got to take calculated risks sometimes._

The idea that public "leaders" should be able to take calculated risks with
everybody else's money (and lives) is, IMO, pernicious. This is exactly the
mentality that has created so much mess in the world throughout history. No,
"leaders" should _not_ take calculated risks that affect everybody.

~~~
bobbydroptables
>But public policy decisions affect everybody, so the criterion needs to be a
lot stricter for how complete the information needs to be and how confident we
need to be in our knowledge before we impose a public policy on everybody.

Again, your own logic destroys your argument.

We have made public policy decisions that heavily subsidize oil, cars, lowered
air quality, etc.

These decisions affect _everyone_ not just car drivers. These decisions were
not based on complete information (in fact, we had very limited knowledge of
global warming, air pollution, etc. when we made public policy decisions to
favor air pollution).

>The idea that public "leaders" should be able to take calculated risks with
everybody else's money (and lives) is, IMO, pernicious. This is exactly the
mentality that has created so much mess in the world throughout history. No,
"leaders" should not take calculated risks that affect everybody.

So why should Cletus, who likes to roll coal on Tesla drivers, be able to take
uncalculated risks that affect everybody?

I seriously think you don't understand what externalities are.

~~~
pdonis
_> We have made public policy decisions that heavily subsidize oil, cars,
lowered air quality, etc._

Yes, and I have already said that I oppose those decisions. The government
should not be playing favorites.

 _> why should Cletus, who likes to roll coal on Tesla drivers, be able to
take uncalculated risks that affect everybody?_

Cletus' behavior doesn't affect everybody; it only affects the few people who
are within range of his coal rolling.

~~~
bobbydroptables
>Yes, and I have already said that I oppose those decisions. The government
should not be playing favorites.

So you agree that government should stop subsidizing roads and suburbs?

>Cletus' behavior doesn't affect everybody; it only affects the few people who
are within range of his coal rolling.

Actually air doesn't work this way. Pollution can and does carry for hundreds
of miles.

But we can proceed with your fictional conception of aerodynamics.

How are the people that Cletus rolled coal on supposed to get compensated for
their loss?

------
N1H1L
I read the full article. And it feels like much ado about nothing.

_One_, the author is for some reason uncomfortable with a purely
mathematical description of the world, without giving reasons beyond the fact
that humans cannot physically comprehend what the equations represent.

_Two_, the whole piece disregards the advances in emergent phenomena,
complex systems, and statistics. Yes, we do not understand what individual
electrons look like in copper, but statistical descriptions of copper
dimensions, purity, and grain size are enough to get an exceptionally accurate
idea of the electrical behavior of a copper wire. This extends to
exceptionally complex systems like lungs (smoking will significantly increase
your cancer/emphysema risk), or planetary systems (increasing CO2
concentrations will increase surface temperatures).

_Three_, there is also a weird bias against modeling. I am an experimental
scientist, but I work closely with modellers, as all experimentalists do
nowadays. Personally, I believe that this bias against modeling often has a
political component due to anthropogenic global warming. But in reality, there
is a very vibrant dialogue between modellers and experimentalists, and hybrid
scientists (people trained in both fields) are all the rage right now with
hiring committees. Models have also improved dramatically: density functional
theory is exceptionally accurate for smaller systems, while ReaxFF for dynamic
systems and molecular dynamics simulations for more complex systems get better
every year. The whole point about complex systems applies to modelling too. I
do not need to know the location of every molecule in the North Atlantic to
predict the path of a hurricane; high-pressure ridges and sea surface
temperatures will give me really good results.

------
ur-whale
Feynmann: "It is whether or not the theory gives predictions that agree with
experiment. It is not a question of whether a theory is philosophically
delightful, or easy to understand, or perfectly reasonable from the point of
view of common sense"

Then proceeds to invent string theory.

Don't get me wrong, I have a lot of admiration for Feynmann,but if he did
indeed say the above, that strikes me as rather inconsistent : string theory
is exactly what he preaches not to do : widely successful because of how
beautiful it is, but unfalsifiable (as of yet).

------
mturmon
A very nice article -- more pessimistic than I'd be -- about the limits of
outsider understanding when predictions come from complex models that few
understand.

~~~
webmaven
The pessimism seems to come from the proposition that _no one_ really
understands models beyond a certain complexity, even insiders. Even whoever
_created_ the model.

------
Gatsky
"Medicine is a form of engineering in which the physician takes some action
(drug, surgery, etc.) to alter the behavior of a physical system, the patient.
Its underlying science is biology. Hence our ability to characterize medical
knowledge depends on our ability to represent and validate biological
knowledge."

This is really very wrong. The 'physical system' is conscious and suffering,
which excludes Medicine from even the most vaporous definition of engineering.
Real progress in medicine is much more akin to exploration (in the Christopher
Columbus sense) than engineering.

Anyway, this is a typical engineer writing about biology. They always make the
same mistakes:

1. Ignoring evolution.

2. Treating mechanistic models as first-class citizens, with disdain for data
(even though the latter, in the form of DNA, is the sine qua non of all life).

3. Failing to understand that, as von Neumann noted, "The simplest complete
model of an organism is the organism itself."[1]

[1] [https://onezero.medium.com/the-future-of-computing-is-analog...](https://onezero.medium.com/the-future-of-computing-is-analog-e758471fbfe1)

~~~
Ma8ee
> "The simplest complete model of an organism is the organism itself."

But that is trivially true about everything, including golf balls. The
interesting question isn't whether the model is complete, but whether it is
sufficient (or complete enough) to make predictions.

And in the case of golf balls we can make quite good predictions with a very
simple model, mostly because we are only interested in whether it ends up in
the hole or not. But a complete model of a golf ball would need to include the
exact dynamics of the different layers and the turbulence around it, which we
just can't do. Luckily, we don't need to.

So what is a sufficient model of a living organism? That of course also
depends on what we want to know. I am quite certain that modern egg farmers
have, for their purposes, quite sufficient models of hens. They know how fast
the hens grow, the effects of different amounts and types of food, how much
heat they produce, and of course how many eggs they lay and how big they are,
etc.

And it is of course possible that we need to know the exact wave function of
the complete system that is the cell to model it well enough to, say,
synthesise new drugs. But I don't think so.

~~~
Gatsky
It is definitely not trivially true about everything. A golf ball can be
modelled easily. You are adding external complications to the system. An
electron can be impossible to model if you inject it into a complex system.

Anyway, the point is a golf ball will not evolve into a sentient being who
will then fly to Mars and write poetry, which is what cells have done. This
shows you can’t even put a meaningful bound on what you need to model for an
organism.

~~~
Ma8ee
But that is not a complete model of a golf ball. A complete model will include
the wave function of all the particles in the golf ball. It won’t be simpler
than the model of an organism of the same size. And in this case, by simpler I
mean that the number of bits required would be lower.

How does the capacity of the thing you try to model affect the complexity of
the model?

------
kdkdk
I think the system the author is looking for is probabilistic programming. It
allows you to write programs/models with uncertain factors. Then at the end
you can observe a few variables, and it automatically computes the expectation
and variance of all the input variables. Check out psisolver.org for an
example.
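
For readers unfamiliar with the idea, here's a minimal sketch of the workflow
in plain Python. This is a generic rejection-sampling illustration of
"observe, then get expectation and variance", not PSI's actual syntax:

    import random

    # Uncertain inputs: priors over two parameters a and b.
    # Observation: their noisy sum was measured to be about 3.0.
    # Inference by rejection sampling: keep samples consistent with data.

    accepted = []
    while len(accepted) < 10000:
        a = random.gauss(1.0, 1.0)          # prior on a
        b = random.gauss(1.0, 1.0)          # prior on b
        obs = a + b + random.gauss(0, 0.1)  # simulated measurement
        if abs(obs - 3.0) < 0.05:           # "observe" the variable
            accepted.append((a, b))

    mean_a = sum(a for a, _ in accepted) / len(accepted)
    var_a = sum((a - mean_a) ** 2 for a, _ in accepted) / len(accepted)
    print(mean_a, var_a)  # posterior expectation and variance of input a

Tools like PSI aim to do this symbolically and exactly rather than by
brute-force sampling.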

------
mensetmanusman
This is a great article, but I wish it had really dealt with the idea of
complexity.

Nature is not ‘unintelligible’; it is ‘complicated’, because of the large
number of discrete interactive constitutive units.

In fact, if you plot the number of constitutive quarks/atoms being worked with
versus uncertainty, you see an interesting layout of the sciences. E.g., take
this plot by xkcd ( [https://xkcd.com/435/](https://xkcd.com/435/) ) and think
about how much matter is under observation (i.e. increasing complexity to the
left). Somewhere on the left would be climate in this example (as in the
article).

------
mjparrott
I could use a TLDR on this one ...

~~~
guscost
We can't easily validate models of reality that are complex/stochastic,
because any experimental result could be a statistical anomaly. In particular
this is a problem for models of long-term and/or global phenomena like climate
or public health, since we only get one "experimental run", so to speak.
Therefore:

> Confronting the problems of complexity, validation, and model uncertainty, I
> have previously identified four options for moving ahead: (1) dispense with
> modeling complex systems that cannot be validated; (2) model complex systems
> and pretend they are validated; (3) model complex systems, admit that the
> models are not validated, use them pragmatically where possible, and be
> extremely cautious when interpreting them; (4) strive to develop a new and
> perhaps weaker scientific epistemology.

~~~
dang
I haven't read the article, but #4 is the interesting item on that list. #3 is
the Goldilocks option, the one that sounds between-the-extremes like the just-
right porridge that Goldilocks ate—but it's not realistic, because "be
extremely cautious when interpreting them" exceeds what the human mind is
capable of.

~~~
guscost
It may be possible for _a_ person to succeed with #3, but I have a hard time
believing it could work in a community of people who disagree.

For #4, the closest thing I can imagine is a utilitarian approach, which of
course I’m going to prefer as an engineer, and which (I think) ultimately
reduces to #1. That is a tough problem I don’t expect to see solved in my
lifetime, but I’d be happy to be proven wrong.

~~~
btrettel
On reflection, option 1 includes something like Taleb's philosophy. At first I
wrote it off as leading to arguments like "no one knows, so let's do what I,
an 'expert', say", which I hate. But it's written generically enough to
include rules like Minimax, where you don't need model estimates to make a
decision. This rule is _very_ conservative, though.
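
A toy illustration of the Minimax idea (numbers invented for the example):
rank actions by their worst-case payoff, with no probability estimates for
the scenarios at all.

    # Rows: actions; columns: payoff under each (unmodeled) scenario.
    payoffs = {
        "act_now": [-2, 5, 4],
        "wait":    [-9, 8, 6],
        "hedge":   [-1, 2, 2],
    }

    # Maximize the minimum payoff (the maximin form of the Minimax rule).
    best = max(payoffs, key=lambda a: min(payoffs[a]))
    print(best)  # "hedge": worst case -1 beats -2 (act_now) and -9 (wait)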

I think there are a fair number of people who succeed at option 3. I can think
of a few people who I believe do option 3 well, but as you indicated, they're
individuals. Of the organizations I'm aware of, the US Department of Energy is
probably the closest to succeeding at this. (Or at least some DOE labs; I can
say from personal experience that groups at both Los Alamos and Sandia take
this fairly seriously, though I don't think it has fully permeated their
culture.) I'll have to think more about this.

Option 3 is my basic approach, and I'd like to think that I succeed at it,
e.g.:
[https://news.ycombinator.com/item?id=23397785](https://news.ycombinator.com/item?id=23397785)

A major problem with option 3 is that it runs the risk of people claiming to
do option 3 but actually doing option 2. I think this happens regularly but
it's due to ignorance, not malice. Ultimately I think we need to change
scientific standards and the STEM curriculum before option 3 becomes tractable
on a large scale, but even then I'm not sure it'll work because it'll always
be easier to claim to do option 3 while actually doing option 2.

I agree with the writer that option 4 is more of a long-term goal. I wish the
comments on this article focused more on #4 than what was discussed...

~~~
webmaven
_> (Or at least some DOE labs; I can say from personal experience that groups
at both Los Alamos and Sandia take this fairly seriously, though I don't think
it has fully permeated their culture.)_

It has not. In the current funding environment it can't, given that the way
research grants and contracts are awarded will always constrain organizational
culture.

------
Hnrobert42
A little about the source:
[https://en.m.wikipedia.org/wiki/American_Affairs](https://en.m.wikipedia.org/wiki/American_Affairs)

~~~
Hnrobert42
I got downvoted for linking to a document about the source. WTF?

