
A Philosopher Reviews Judea Pearl's “The Book of Why” - fluentmundo
https://bostonreview.net/science-nature/tim-maudlin-why-world
======
epistasis
I like the review, but the criticism of distinguishing causality from
counterfactual reasoning feels weak to me. Do we care most about
counterfactual reasoning? Of course. But establishing causality as its own
concept before counterfactuals is necessary in the way Pearl has structured
his math. Even the grammar of human languages enforces this separation of
concepts. Do you need causality to even have the concept of a counterfactual?
Of course. Is there value in describing that causality before going on to
counterfactuals? Yes, I think so, because of the difficulties of current
counterfactual algorithms.

Separating out causality opens the door to doing counterfactual reasoning in
better ways. Since the method presented in The Book of Why confused the
reviewer, it's probably not the best way to look at counterfactuals, so the
causality parts can be taken on board without treating the counterfactual
methods as the be-all and end-all.

For that matter, DAGs versus cyclic directed graphs (or time-based DAGs) are
still a big concern. Much, if not most, of our causal reasoning involves
loops when time is not accounted for, and DAG formulations make those loops
difficult to unroll. There may be big improvements to be had in causal
modeling without even considering counterfactuals.
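
The unrolling trick can be sketched in a few lines. The two-node feedback
loop below is a hypothetical toy example, not anything from Pearl's book:
each edge X -> Y of the cyclic graph becomes time-indexed edges X_t ->
Y_{t+1}, which is acyclic by construction.

```python
# Toy sketch: "unrolling" a cyclic causal graph into a time-indexed DAG.
# The two-node feedback loop here is a hypothetical example.

cyclic_edges = [("A", "B"), ("B", "A")]  # A -> B -> A: a loop, not a DAG

def unroll(edges, steps):
    """Replace each edge X -> Y with X_t -> Y_{t+1} for t = 0..steps-1.

    Every unrolled edge points from time t to time t+1, so the
    result is acyclic by construction.
    """
    return [(f"{x}_{t}", f"{y}_{t+1}") for t in range(steps) for x, y in edges]

dag_edges = unroll(cyclic_edges, steps=2)
print(dag_edges)  # [('A_0', 'B_1'), ('B_0', 'A_1'), ('A_1', 'B_2'), ('B_1', 'A_2')]
```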

Also, in Pearl's prior 2000 book on Causality, it's clear that Pearl gave
plenty of credit to Spirtes, and it always seemed that they were working in
parallel and on very similar problems; I'm not sure how much Spirtes took from
Pearl, but Pearl makes clear that his ideas are heavily informed by Spirtes'
work.

~~~
empath75
I am not sure how to define "cause" without introducing counterfactuals.

"This happened because I did that" seems to imply "Had I not done that, this
would not have happened" \-- a counterfactual.

~~~
entropicdrifter
I'm not sure that's always the case.

Couldn't you have a scenario where your action was the direct cause of an
outcome that might have come about some other way without your involvement at
all?

For instance, suppose a ball rolls down a hill because you pushed it, but a
breeze blows a moment after your push, and that breeze was strong enough to
push the ball down the hill on its own. Then you have a scenario where "that
ball went down the hill because I pushed it" is true, but "had I not pushed
the ball, it would not have rolled down this hill" is not true.

~~~
skybrian
Yes, what you're describing is basically an "or" gate. The counterfactual is
"what if neither cause happened?"

~~~
shkkmo
Right, but the potential existence of multiple causes (an "or" gate) is what
makes this reasoning a fallacy:

> "This happened because I did that" seems to imply "Had I not done that, this
> would not have happened" -- a counterfactual.

The point is that constructing the counterfactual requires more than knowledge
of a single causal link; it requires a broader body of causal knowledge
spanning many links.

Language is messy and ambiguous, and we do often colloquially use "X caused Y"
to imply the truth of the counterfactual "If X had not happened, Y would not
have happened". Interestingly, a different tense such as "X causes Y"
generally does not carry the same counterfactual implication.

------
adolph
See also a review by Andrew Gelman:

[https://statmodeling.stat.columbia.edu/2019/01/08/book-pearl-mackenzie/](https://statmodeling.stat.columbia.edu/2019/01/08/book-pearl-mackenzie/)

~~~
stewbrew
Well, his a priori is rather skewed, though.

~~~
ml_basics
I see what you did there

------
enjoylife
> “but given my own background I could not but wonder how much farther Pearl
> would have gotten had he had the training I did as a philosopher.”

The review had some good call-outs, but given the above quote and a few
surrounding criticisms, it comes across as more pretentious than anything.

~~~
Luc
The reviewer is Tim Maudlin, one of the foremost philosophers of physics. It's
kind of hard to find someone better placed than him to review a book about
causality, and his comment seems entirely appropriate.

------
ncmncm
This presents the RCT criterion uncritically, albeit citing a case where it
would be hard to apply.

But RCT fails -- gives a nonsense result -- when the hypothesis under test is
incoherent. This wouldn't matter, except that RCT is routinely used in such
circumstances, and the results treated as gospel by people in positions of
authority.

Consider: outcome X may have six causes, A through F. An RCT tests B and
finds that varying B affects the outcome in only one case in six. With
infinitely many trials the relationship resolves, but in a single trial the
difference is indistinguishable from noise.

Substitute a medical symptom for X, and a medical treatment that addresses one
of six causes, for B. After one RCT, B is "shown" to be ineffective.
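
The 1-in-6 scenario can be sketched as a quick simulation; the arm size and
the 30% baseline recovery rate are illustrative assumptions, not from any
real trial:

```python
import random

random.seed(0)
N = 100  # patients per arm: a modest but realistic trial size

def recoveries(treated: bool) -> int:
    """Count recoveries in one arm of a hypothetical trial."""
    count = 0
    for _ in range(N):
        cause = random.choice("ABCDEF")        # six equally likely causes of X
        spontaneous = random.random() < 0.30   # 30% recover regardless
        cured = treated and cause == "B"       # treatment B fixes cause B only
        count += spontaneous or cured
    return count

treat, control = recoveries(True), recoveries(False)
# True effect: roughly (1/6) * 70% ~= 12 extra recoveries per 100 patients,
# while the binomial noise per arm is ~5 recoveries, so a single trial of
# this size can easily fail to separate the treatment from placebo.
print(treat, control)
```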

The problem is not B. The problem is that X is ill-defined. X could be a
mental illness, or tumors in a given organ. How often do we read
"antidepressants shown ineffective"? The only way we have to distinguish one
variety of depression from the next is which treatment works.

The problem is not limited to medicine.

~~~
fluentmundo
A nice point. But this is not a criticism of the RCT so much as a criticism of
how its results are used or interpreted.

~~~
ncmncm
It is a criticism of the notion of RCT as the unimpeachable "gold standard" of
evidence. The limits of a tool are the most important thing to learn about it,
and, for RCT, few can be bothered.

~~~
fluentmundo
But it's not a limitation of the tool (the tool _does_ provide gold-standard
evidence); it's the stupidity of the researcher using the tool. (The tool
works perfectly well and does, in fact, constitute the gold standard of causal
evidence.) In your depression example, it's the stupidity of a researcher who
fails to consider why a 1-in-6 success might be significant, or fails to
consider that the umbrella disease "depression" can be multiply realized by
different physiological mechanisms, which in turn require different
treatments.

Your beef isn't with RCTs, or with the notion of RCT as evidence. Your beef is
with researchers who don't know how to think.

Your complaint is like saying, we should stop considering hammers to be the
gold-standard of nail-hitters because some idiots use them to try to turn
screws.

~~~
ncmncm
My beef is with the public and policymakers who are convinced they have no
need to understand the method's limitations, "because it's the gold standard,"
and who therefore treat its results as never misleading.

------
coldtea
> _Well, there are some caveats even here. The real gold standard is a double-
> blind experiment, in which neither the subjects nor the experimenters know
> who is in which group. In the case of car color, we would literally have to
> blind the drivers, which would of course raise the accident rate
> considerably._

Not really. It's enough that we don't tell the red-car drivers that they're
part of a special group. Just let them think we assigned an equal number of
different colors to different drivers (and don't let them see what the others
got). Then the fact that their assigned color happens to be red will hold no
significance for them in relation to the test.

~~~
tfowler
What if being assigned a red car caused drivers to drive more carefully
because they believe that red cars attract the attention of law enforcement
more so than other colors?

------
mistermann
Recent appearance on the Sam Harris podcast; I quite enjoyed it:

[https://samharris.org/podcasts/164-cause-effect/](https://samharris.org/podcasts/164-cause-effect/)

#164 - Cause & Effect - A Conversation with Judea Pearl

August 5, 2019

In this episode of the Making Sense podcast, Sam Harris speaks with Judea
Pearl about his work on the mathematics of causality and artificial
intelligence. They discuss how science has generally failed to understand
causation, different levels of causal inference, counterfactuals, the
foundations of knowledge, the nature of possibility, the illusion of free
will, artificial intelligence, the nature of consciousness, and other topics.

Judea Pearl is a computer scientist and philosopher, known for his work in AI
and the development of Bayesian networks, as well as his theory of causal and
counterfactual inference. He is a professor of computer science and statistics
and director of the Cognitive Systems Laboratory at UCLA. In 2011, he was
awarded the Turing Award, the highest distinction in computer science. He
is the author of The Book of Why: The New Science of Cause and Effect
(coauthored with Dana Mackenzie) among other titles.

Twitter: @yudapearl

~~~
tu7001
I would strongly recommend this podcast.

~~~
terminlvelocity
I really enjoyed hearing Judea Pearl being interviewed, as I am most of the
way through "The Book of Why" and have learned a lot from it. I did feel that
Sam steered the conversation a bit too much towards his favorite topics (like
free will) and wish there was a bit more discussion of philosophy/history of
science, but it was still a great listen.

I first learned of Judea Pearl by stumbling across the transcript of a talk he
gave while I was researching DAGs:
[http://singapore.cs.ucla.edu/LECTURE/lecture_sec1.htm](http://singapore.cs.ucla.edu/LECTURE/lecture_sec1.htm)
. The way he grounded his talk in the history of thought hooked me, and the
talk serves as a good general overview for those deciding if they want to pick
up the book.

------
anthony_doan
The post is sweet.

It points out the cliche that "correlation does not equal causation."

That's a real thing, but throwing the quote around without reading into the
research, or without the proper foundation to assess it, is just as bad.

~~~
guerrilla
Yes, indeed. It's also interesting to notice that people struggle when asked
"Why isn't it?" One often gets strange statistical answers that don't really
get to the root of anything.

------
conjectures
I mostly agree with Pearl.

However, the account of traditional statistics given in this article is
misleading. Randomisation is a major part of traditional stats, and it is
inherently a causal hypothesis: breaking the links between unobserved
covariates and treatment regimes.

An important alternative contemporary causal inference framework, by Rubin,
has origins in a 1923 thesis...

...but the _content_ of Pearl's approach seems superior, if you ignore the
academic spats.

~~~
fluentmundo
> Randomisation is a major part of traditional stats, and it is inherently a
> causal hypothesis

Yes, randomization is central to classical statistics, but no, it is not
inherently causal. Drawing a random sample from a bivariate distribution (X,Y)
is key to doing a lot (though not all) of classical statistical inference
(think of estimating slopes in regression), but the randomization does not
imply anything about the causal relationship between X and Y. When you speak
of randomization in the context of "treatment regimes," you are thinking about
randomized controlled trials, which the piece does analyze explicitly, in some
detail. So in this sense the account given in the essay is not misleading.

~~~
conjectures
I'm afraid you're mistaken. Randomisation allows one to make the strong causal
assumption that the treatment regime allocation is unrelated to any of the
other variables, observed or _unobserved_.

Anyway, the section you're pointing to agrees with me. It just happens to be
overlooked when they summarise...

~~~
fluentmundo
No, I am not mistaken; you are confused. And your confusion is very pervasive
in the technical community. You're not talking about randomization in general
when you speak of a "treatment regime." You're talking about randomization in
a _causal experiment_ such as a randomized controlled trial. But
"randomization" is a broader thing than randomization in a controlled
experiment. The assumption of randomization is made for almost all classical
statistical inference, which has nothing at all to do with causation.

Say you want to do basic linear regression: you want to estimate the slope for
Y regressed on X. The most stringent form of inference works like this: you
draw a _random sample_ (X_i, Y_i), modeled as n independent and identically
distributed realizations from the joint distribution (X,Y). Etc. This is
certainly a stochastic model; we require randomization (or some approximation
of it) to do inference. But it has nothing whatsoever to do with causality.
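
The point is easy to demonstrate: below, an i.i.d. sample is drawn from a
joint distribution and the OLS slope is estimated, with no intervention
anywhere. The data-generating numbers are made up for illustration.

```python
import random

random.seed(1)
n = 1000

# Draw an i.i.d. random sample from a joint distribution (X, Y).
xs = [random.gauss(0, 1) for _ in range(n)]
# Y co-varies with X, but nothing below distinguishes "X causes Y" from
# reverse causation or a common cause: the joint distribution is the same.
ys = [2.0 * x + random.gauss(0, 1) for x in xs]

# Ordinary least-squares slope estimate.
x_bar, y_bar = sum(xs) / n, sum(ys) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))

print(round(slope, 2))  # near 2.0; the randomization is purely about sampling
```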

------
mistrial9
I had a chance to see Judea Pearl speak at a high-end mathematics conference
a few years ago, and asked him personally afterwards about a few things in
his talk. I did not get much satisfaction in the admittedly brief exchange,
so unfortunately my impression was that there was some smoke-and-mirrors
aspect to his talk, I would guess for competitive reasons I don't know about.
However, the company on stage at that event was of the highest rigor, so take
this as you will...

------
kunkelast
This is the content I like to see at HN most of all. Good article, thanks for
sharing.

------
melling
I’m 30 pages into the book. It hasn’t grabbed me yet.

Worth finishing?

~~~
terminlvelocity
The writing style does not change much, and I do find that it's taking me
longer than usual to work my way through the book, even though I'm very
interested in the subject matter. That being said, there are some good
nuggets of information a little further in. Understanding some of the
patterns that pop up in the causal scenarios he lays out, and being able to
think about them with a shorthand or graphically, has changed the way I think
about complex situations.

------
ngcc_hk
Very good article. Highly recommended.

