
To Know, but Not Understand: David Weinberger on Science and Big Data (2012) - reedwolf
https://www.theatlantic.com/technology/archive/2012/01/to-know-but-not-understand-david-weinberger-on-science-and-big-data/250820/
======
redelbee
At what point do we shift our investment in time and energy from building
models like those mentioned in the article to the bigger picture? Maybe it’s
just my perception but it doesn’t seem like we have very many people thinking
deeply about what models we should build and to what ends. Instead we are just
building the models and hoping we can put them to good use afterwards.

For example, what’s the end game for the cellular signaling modeling outlined
in the article? It seems like the result isn’t valuable in and of itself, and
it can’t be much more than that because the scientist “doesn’t understand it,
and doesn’t think any person could.” So we now have an equation that expresses
constants within a cell and that’s it. We don’t understand it and we can’t put
it to good use. So was that time and effort well spent? Do we just put this
work in a drawer so we can pull it out if it could be useful at some point in
the future? Is that what we’re doing with all the similar advances in
modeling?

There’s nothing wrong with knowledge for knowledge’s sake, but I think we’ve
way over-indexed on the tools and predictions side of the system. If we
continue to constantly create new tools/models/predictions we might find a use
for them by chance. It just seems more efficient to focus on what outcomes we
really want and then put the models to work in pursuit of those outcomes.
Perhaps we focused more on the outcomes in the past because we didn’t have the
technological horsepower to constantly churn out new models.

Maybe I’m wrong and there are people working on the big picture. Are there
modern day philosophers doing this work? Do they make up a significant portion
of the work being done? If not, why?

~~~
throwawaygh
_> what’s the end game for the cellular signaling modeling outlined in the
article?_

Pharma.

Most of the modeling work that people do is fairly well motivated. Going from
models to working technology is indeed a huge leap, but everything starts with
the basic scientific understanding.

 _> Maybe I’m wrong and there are people working on the big picture._

You can usually find the "big picture" behind a paper by reading the recent
grant applications from the PI who funded the research (or the funding lines
explicitly mentioned in the paper, if any).

~~~
randcraw
Pharma may be the intended target for the signaling work, but as a data
scientist who works in pharma, I can say with certainty that no biologist or
chemist here would entertain for a minute any model that can't explain its
mechanisms of action. Nor would the FDA, who wants any model not only to
accurately predict the intended outcome but also reflect awareness of the
contextual circumstances that surround and lead to it.

No competent physician would be satisfied with a disembodied diagnosis. The
constituent symptoms and assay metrics that support that diagnosis are
essential to know, especially as disease is often complex and dynamic, and no
single diagnostic label should ever hope to supplant a deeper understanding of
each patient's unique mix of normality and abnormality. A diagnosis using ML
may be a useful starting point in treatment, but never should be the endpoint.

~~~
PaulDavisThe1st
>no biologist or chemist here would entertain for a minute any model that
can't explain its mechanisms of action.

Entertain? Who even really knows what that means at this point. But I'm fairly
convinced that you'd be quite happy to have a theory-free "intuition pump"
that could tell you "if you slow down binding with the following 3 membrane
proteins, you see roughly double that effect on overall energy use by the
cell".

The tool that generates this prediction may be completely unable to give you a
"theory" about why this should be so, but then neither will the experiment(s)
you do that confirm it to be true.

So, while indeed, ML-style stuff "should never be the endpoint", it can act
as an incredibly useful intuition pump/launchpad for ideas and approaches that
would otherwise remain inaccessible.

~~~
throwawaygh
That's the mode of use for ML in _most_ industries -- flagging stuff for
follow-up by humans. Basically anything that's not real-time works like this.

Most uses of ML in real-time settings look more like hybrid systems -- a
little dusting of ML on top of a whole heap of more traditional mathematical
modeling/software engineering.

Outside of a few very niche settings, we're still a long way off from
"trusting" ML in any meaningful sense.

------
trabant00
From my point of view there are 2 possible ways:

\- we simply acknowledge we don't understand something enough and keep looking
into it until we do. I mean everything we now understand (at an acceptable
level by our standards) has gone through an intermediary phase - see alchemy
for example.

\- we declare some things (prematurely?) as forever escaping our grasp and
accept we may never have a simple model of them.

What bothers me is the 3rd way:

\- we don't know why but the computer model gave this result so let's go ahead
with putting it into production. We make money, the user/consumer may have a
nice experience or die, fingers crossed.

~~~
elliekelly
I think there's another (perhaps worse?) way:

\- we think we understand a complex model completely and we actually don't

------
reedwolf
Bottom line:

"With the new database-based science, there is often no moment when the
complex becomes simple enough for us to understand it. The model does not
reduce to an equation that lets us then throw away the model. You have to run
the simulation to see what emerges. For example, a computer model of the
movement of people within a confined space who are fleeing from a threat--they
are in a panic--shows that putting a column about one meter in front of an
exit door, slightly to either side, actually increases the flow of people out
the door. Why? There may be a theory or it may simply be an emergent property.
We can climb the ladder of complexity from party games to humans with the
single intent of getting outside of a burning building, to phenomena with many
more people with much more diverse and changing motivations, such as markets.
We can model these and perhaps know how they work without understanding them.
They are so complex that only our artificial brains can manage the amount of
data and the number of interactions involved."

~~~
throwawaygh
_> The model does not reduce to an equation that lets us then throw away the
model. You have to run the simulation to see what emerges._

This is true of simulation in general, not just data-driven models. E.g., a lot
of applied mathematics uses PDE models that don't have closed-form solutions,
and so you just run a ton of simulations sweeping a parameter space.
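(To make that concrete, here's a toy sketch of what "sweep a parameter space" means in practice. The equation and all names here are illustrative, not from the article: an explicit finite-difference solver for the 1D heat equation, run once per candidate diffusion coefficient, because no single closed-form answer tells you how the profile depends on the parameter.)

```python
# Toy parameter sweep over a discretized PDE (1D heat equation u_t = D * u_xx).
# Instead of solving one equation symbolically, we run the model for each
# candidate parameter value and inspect what emerges.

def simulate(D, nx=50, nt=500, dx=0.02, dt=1e-4):
    """Return the final temperature profile for diffusion coefficient D,
    using an explicit finite-difference scheme (stable while D*dt/dx**2 <= 0.5)."""
    u = [0.0] * nx
    u[nx // 2] = 1.0  # initial heat spike in the middle of the rod
    for _ in range(nt):
        prev = u[:]
        for i in range(1, nx - 1):
            u[i] = prev[i] + D * dt / dx**2 * (prev[i + 1] - 2 * prev[i] + prev[i - 1])
    return u

# The "sweep": rerun the whole simulation for each parameter value.
results = {D: max(simulate(D)) for D in (0.1, 0.5, 1.0)}
for D, peak in results.items():
    print(f"D={D}: peak temperature {peak:.3f}")
```

Larger D spreads the spike out faster, so the peak drops as D grows; you learn that by running the sweep and looking, which is exactly the "run the simulation to see what emerges" mode the quote describes.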

 _> For example, a computer model of the movement of people within a confined
space who are fleeing from a threat--they are in a panic--shows that putting a
column about one meter in front of an exit door, slightly to either side,
actually increases the flow of people out the door._

The crux of this type of science is that you don't know whether the computer
simulations are telling you anything about reality. You just have to run real-
world experiments and see what happens. And even if the experiment turns out
to work, you still don't know for sure that your model was reasonable.

~~~
newqer
So you are suggesting we need to do a double-blind test, by means of throwing
a molotov cocktail at a few gatherings of people?

~~~
nxpnsv
Double-blind so that neither the thrower nor the recipients of said cocktail
knows whether it is real or placebo? It's the only way to know if they
are panicking because of a bottle or a fireball...

------
andrewla
Completely off topic -- one of my favorite stories from a friend at Google was
that they saw someone writing what looked like a giant AWK script, went over,
and told him "look over at that desk, that's Brian Kernighan, the 'K' in AWK",
only to be met with a scornful "I'm the W" from Weinberger.

