
The Book of Why: The New Science of Cause and Effect [pdf] - elcritch
http://cdar.berkeley.edu/wp-content/uploads/2017/04/Lisa-Goldberg-reviews-The-Book-of-Why.pdf
======
wavegeek
Great book that clearly explains the perils of inferring causation and how you
can deal with them constructively.

For those who want more he has a more technical book "Causality: Models,
Reasoning and Inference" which is also excellent.

Pearl is a legend in the field, who wrote the seminal "Probabilistic Reasoning
in Intelligent Systems: Networks of Plausible Inference".

There is a school of thought that there is nothing new here. Pearl is very
honest and open about who invented what and of the mistakes in his earlier
work. While no one finding is entirely new here, the overall package adds up
to a lot IMHO. And this after having intensively studied statistics and
probability over many years. It really changed my approach to and
understanding of causality. And most importantly it gave me reliable
intuitions on the subject. After his books, things seem obvious to me that
others struggle with.

~~~
zwaps
I like Pearl's work a lot more than his opinions on who invented what, and who
is right about what. For that reason, I would wholeheartedly recommend his
books but I would also recommend skipping any sideshows about related
literature.

Fact is, Pearl selectively picks literature concerned with causality and, in
particular, literature not successfully tackling the subject. He does ignore
many other approaches to the issue, especially parallel developments to tackle
problems in fields that he specifically critiques. In other words: Pearl, next
to being a great researcher, is also a showman who knows how to build a
following.

The essence of the debate is this: neither Pearl's framework nor anyone
else's captures all valid approaches to causal inference. One can construct
cases in Rubin's framework that DAGs cannot solve, and vice versa. The downside
to Pearl's approach is that it is - right now - more difficult to implement.
The cases where DAGs undoubtedly succeed better than other approaches are, in
a sense, unlikely to succeed as practical research projects.

That being said, a great strength of such graphical models is that they allow
quite sophisticated reasoning in several well-known simple but non-intuitive
cases. Such reasoning otherwise requires an immense amount of experience and /
or education on the pitfalls of causal inference. That is also a reason why I
would like to see this framework taught more in schools.
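
One such simple but non-intuitive case is Simpson's paradox. The following is
a minimal sketch, not anything taken from Pearl's books: it assumes a
three-node DAG (stone size -> treatment, stone size -> outcome, treatment ->
outcome) and uses illustrative counts in the style of the often-quoted
kidney-stone example. The names `counts`, `naive_rate` and `adjusted_rate` are
hypothetical. Under that assumed graph, stone size is a confounder, so the
backdoor adjustment averages the per-stratum success rates over the overall
distribution of stone sizes.

```python
# Illustrative counts: (successes, patients) per (treatment, stone size).
counts = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}
treatments = ("A", "B")
sizes = ("small", "large")


def naive_rate(treatment):
    """Pooled success rate, ignoring stone size."""
    successes = sum(counts[(treatment, z)][0] for z in sizes)
    patients = sum(counts[(treatment, z)][1] for z in sizes)
    return successes / patients


def adjusted_rate(treatment):
    """Backdoor adjustment: sum over z of P(success | treatment, z) * P(z)."""
    total = sum(n for _, n in counts.values())
    rate = 0.0
    for z in sizes:
        # P(z): share of all patients with this stone size.
        p_z = sum(counts[(t, z)][1] for t in treatments) / total
        successes, patients = counts[(treatment, z)]
        rate += (successes / patients) * p_z
    return rate


for t in treatments:
    print(t, round(naive_rate(t), 3), round(adjusted_rate(t), 3))
```

With these counts the pooled comparison favours B, while the adjusted
comparison favours A. Which of the two answers is the causal one depends
entirely on the assumed graph; the data alone cannot settle it.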

All in all, as another great post in this thread has pointed out, much of the
debate is in violent agreement on base issues. Once this issue transcends the
egos involved, much progress will be made and that is, in my view, very
exciting.

~~~
ced
_One can construct cases in Rubin's framework that DAG can not solve and
vice-versa_

Example?

~~~
zwaps
I'll try to dig it up, saw it a while ago on a forum.

Another thing might be that in Rubin's framework it's immediately
straightforward to do semi-parametric estimation and get consistency and all
that. I'd say in practice that's probably not the first thing to do for DAGs,
where writings are focused on toy models (the question then would be: how do I
get to the correct DAG?).

Edit: This was posted itt, it has some examples of Rubin's framework
(potential outcomes) that can not be identified in the DAG framework
[https://arxiv.org/pdf/1907.07271.pdf](https://arxiv.org/pdf/1907.07271.pdf)

------
pjmorris
For balance and discussion, Andrew Gelman's critical review of 'The Book of
Why': [https://statmodeling.stat.columbia.edu/2019/01/08/book-
pearl...](https://statmodeling.stat.columbia.edu/2019/01/08/book-pearl-
mackenzie/)

~~~
epistasis
Thanks for sharing this! I knew that Gelman and Pearl have been going back and
forth for years, and it's great to see this.

I fall heavily on the Pearl side of this debate, because I'm interested in
scaling to thousands to millions of variables, and though graphs are not a
great tool, they are pretty much the only one that I think we have for scaling
that direction.

That said, I think their disagreements don't have much effect on practice, and
have as much to do with their different academic lineages as anything else. I
expect that any dispute will be resolvable as the science progresses.

~~~
elcritch
> I fall heavily on the Pearl side of this debate, because I'm interested in
> scaling to thousands to millions of variables, and though graphs are not a
> great tool, they are pretty much the only one that I think we have for
> scaling that direction.

Are you able to elaborate? It'd be really interesting to hear whether tooling
exists for these graphs, and what it is. I just found the book recently and am
listening to the Audible version. It also reaffirms my desire for a "Bayesian
inference" spreadsheet-like tool: something that'd help researchers/engineers
organize a few dozen thoughts and ideas.

~~~
canjobear
> a "Bayesian inference" spreadsheet-like tool

[http://probcomp.csail.mit.edu/software/bayesdb/](http://probcomp.csail.mit.edu/software/bayesdb/)

------
bachmeier
Recognizing that not many will care about my opinion on the subject, I will
link to this excellent (and IMO accurate) paper by Guido Imbens:

[https://arxiv.org/pdf/1907.07271.pdf](https://arxiv.org/pdf/1907.07271.pdf)

One of the problems in groups is that there is a belief that the loudest and
most self-confident individual is correct. I believe that is the case with
Pearl. He attracts a following based on his personality, but when it comes to
actually doing empirical research, he comes up short.

"Separate from the theoretical merits of the two approaches, another reason
for the lack of adoption in economics is that the DAG literature has not shown
much evidence of the alleged benefits for empirical practice in settings that
resonate with economists....In contrast in the DAG literature, TBOW, [Pearl,
2000], and [Peters, Janzing, and Schölkopf, 2017] have no substantive
empirical examples, focusing largely on identification questions in what TBOW
refers to as “toy” models. Compare the lack of impact of the DAG literature in
economics with the recent embrace of regression discontinuity designs imported
from the psychology literature, or with the current rapid spread of the
machine learning methods from computer science, or the recent quick adoption
of synthetic control methods developed in economics [Abadie and Gardeazabal,
2003, Abadie, Diamond, and Hainmueller, 2010]. All three came with multiple
concrete and detailed examples that highlighted their benefits over
traditional methods. In the absence of such concrete examples the toy models
in the DAG literature sometimes appear to be a set of solutions in search of
problems, rather than a set of clever solutions for substantive problems
previously posed in social sciences, bringing to mind the discussion of Leamer
on the Tobit model ([Leamer, 1997])."

~~~
currymj
i get the sense that back in the 90s, statisticians must have been really rude
and dismissive to Judea Pearl, and he might not have gotten over it.

it seems very plausible to me that at the time, many people outside CS would
have considered a graph to be a type of illustration not a "legitimate"
mathematical structure, and wouldn't have understood or cared about
computational complexity arguments about performing inference or computing
independence.

it does seem to be the case that for various reasons, for economics and
population-level social science, there's limited advantage to using DAGs.

I have also noticed as a causal inference outsider in machine learning land,
people use DAGs or SWIGs to write down assumptions, and then translate to
potential outcomes to derive estimators. And for people who do automated
causal discovery, I haven't heard of anyone trying to use a potential-
outcomes-based representation for their system.
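
The first workflow above (write the assumptions as a DAG, then derive an
estimator in potential-outcomes terms) can be sketched on simulated data. This
is a minimal illustration under stated assumptions, not a method from any of
the cited papers: the assumed DAG is Z -> T, Z -> Y, T -> Y with a binary
confounder Z, the potential outcomes satisfy Y(1) = Y(0) + 1 (so the true
average treatment effect is 1), and all names (`simulate`,
`naive_difference`, `stratified_ate`) are hypothetical.

```python
import random


def simulate(n=20000, seed=0):
    """Draw (z, t, y) rows from the assumed DAG Z -> T, Z -> Y, T -> Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                   # binary confounder
        t = rng.random() < (0.8 if z else 0.2)   # treatment depends on z
        y0 = 2.0 * z + rng.gauss(0.0, 1.0)       # potential outcome Y(0)
        y1 = y0 + 1.0                            # potential outcome Y(1)
        rows.append((z, t, y1 if t else y0))     # only one outcome is observed
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """E[Y | T=1] - E[Y | T=0], which mixes the effect with confounding."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def stratified_ate(rows):
    """Average of within-stratum differences, weighted by stratum size."""
    estimate = 0.0
    for z_value in (False, True):
        stratum = [(t, y) for z, t, y in rows if z == z_value]
        treated = [y for t, y in stratum if t]
        control = [y for t, y in stratum if not t]
        estimate += (mean(treated) - mean(control)) * len(stratum) / len(rows)
    return estimate


rows = simulate()
print("naive:", round(naive_difference(rows), 2))
print("stratified:", round(stratified_ate(rows), 2))
```

The DAG is what justifies conditioning on Z; the estimator itself is stated
purely in terms of potential outcomes. In this simulation the naive difference
is pulled well above 1, while the stratified estimate should land close to it.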

so i buy Pearl's argument that DAGs are the "right" data structure for
representing causality, both for humans and computers.

But many social scientists may not have any reason to care about this, because
automated causal discovery is hopeless for their applications, and for
methodological reasons any reasonable model has to be such that representing
things directly as potential outcomes is manageable anyway.

~~~
TeMPOraL
> _it does seem to be the case that for various reasons, for economics and
> population-level social science, there's limited advantage to using DAGs._

A tangent: on the one hand, I do think DAGs are the best thing since sliced
bread, and I wish more people would embrace them and add them to their toolbox
as a fundamental modelling tool. On the other hand, the next step is
realization that DAGs aren't sufficient. Lack of cycles is nice for analysis
and implementation, but it's also a drawback. The world is running on feedback
loops, and yet this seems to be a secret restricted only to specialists deep
inside their respective fields. I wish the public were more accustomed to
working with dynamic models.

(I don't think cyclic graphs help much with causality analysis, as presumably
cycles would imply time travel.)

~~~
currymj
yeah, i've wondered about this too.

the natural thing to do is to just unroll the models in time. the classic case
of this is a hidden Markov Model, but you can easily have a more complicated
time-indexed DAG.
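
A minimal sketch of that unrolling, assuming a hypothetical two-variable
feedback loop (price <-> demand) and assuming every influence takes exactly
one time step to act: each variable is copied once per step and each edge
v -> w becomes (v, t) -> (w, t + 1). The helper names `is_acyclic` and
`unroll` are made up for this example; `graphlib` is in the Python standard
library from 3.9 onward.

```python
from graphlib import CycleError, TopologicalSorter


def is_acyclic(edges):
    """Return True if the directed graph of (parent, child) pairs has no cycle."""
    predecessors = {}
    for parent, child in edges:
        predecessors.setdefault(child, set()).add(parent)
        predecessors.setdefault(parent, set())
    try:
        # static_order() raises CycleError while iterating if a cycle exists.
        list(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False


def unroll(edges, steps):
    """Build a time-indexed copy of the graph over the given number of steps."""
    nodes = {v for edge in edges for v in edge}
    unrolled = []
    for t in range(steps - 1):
        for v in nodes:
            # Each variable also depends on its own previous value.
            unrolled.append(((v, t), (v, t + 1)))
        for parent, child in edges:
            unrolled.append(((parent, t), (child, t + 1)))
    return unrolled


feedback_loop = [("price", "demand"), ("demand", "price")]
unrolled = unroll(feedback_loop, steps=3)

print(is_acyclic(feedback_loop))  # False: the loop is a cycle
print(is_acyclic(unrolled))       # True: every edge points forward in time
```

Because every edge in the unrolled graph goes from step t to step t + 1, no
cycle can form, which is why the time-indexed version is a DAG even though the
original graph is not.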

There are also some interesting connections to reinforcement learning (MDP can
be expressed as DAG e.g. [https://www.microsoft.com/en-us/research/wp-
content/uploads/...](https://www.microsoft.com/en-us/research/wp-
content/uploads/2013/11/bottou13a.pdf), and then many RL algorithms are
equivalent to causal inference estimators).

it seems like people deal with equilibria and feedback loops in a pretty ad-
hoc way though.

------
Anon84
If you’re interested in learning more about causality, you might be interested
in my blog series working through Pearl's “Causal Inference in Statistics: A
Primer”:
[https://github.com/DataForScience/Causality](https://github.com/DataForScience/Causality)
(Jupyter notebooks and links to blog posts)

------
kgwgk
Nice Causal Inference book from Hernan and Robins (pdf available for
download): [https://www.hsph.harvard.edu/miguel-hernan/causal-
inference-...](https://www.hsph.harvard.edu/miguel-hernan/causal-inference-
book/)

------
stewbrew
Hernan demonstrates how to put this to work in epidemiology. Quite
interesting. Pearl's work should be seen in context, though, as an effort to
create reasoning machines (AI), i.e., to automate reasoning.

------
rahimnathwani
If you're interested in learning more about causal inference and 'do
calculus', you might like this course and free textbook:

[https://www.bradyneal.com/causal-inference-
course](https://www.bradyneal.com/causal-inference-course)

------
petecooper
Meta question: what's the workflow for creating PDFs like this? Is the PDF
generated from markup-style text with a certain application or process?

I notice a lot of academic papers have a particular 'feel' to them, using a
subset of typography styles and presentational guidelines, especially where
mathematical formulae are presented to the reader.

Note: I do not have an especially academic background, nor any experience in
university (tertiary) education. Please excuse my unintended ignorance on this
subject. Thank you.

~~~
vatican_banker
This is probably done with LaTeX or one of its variants

See here:
[https://en.m.wikipedia.org/wiki/LaTeX](https://en.m.wikipedia.org/wiki/LaTeX)

------
yashevde
Thanks for sharing, this makes me want to move the book up my list of to-
reads. As a side-note, I’m blown away to learn that Judea Pearl is the father
of Daniel Pearl — can’t imagine the amount of composure it must take to
continue to operate at the highest levels in his field after experiencing
grief of that nature.

------
pavlov
It's a great book; an enjoyable read even for someone like me who's mostly
afraid of statistics. Pearl's enthusiasm for the topic is palpable, and he
seems to have the academic pedigree to match the occasionally lofty claims.

The book makes a very clear case in walking the layperson through various
causal techniques and underlining how they differ from traditional methods.
Few popular science books actually go into such detail.

~~~
tgb
It's a good book, but some may be turned off by the first chapter or so that
reads like he's trying to rub his victories into the faces of his academic
adversaries. It had more than a little air of gloating to it, in my opinion,
and the book would have been better without that. (Though he spent
considerable time on the earlier precedents for his work, which I had
previously thought was entirely novel.)

------
johndoe42377
There are, again, principles from real philosophy which state that there is
and always will be a fundamental gap between an observation (and language)
based model and What Is.

The classic example is the inability to tell why the sun will rise tomorrow
merely from observations of it doing so and _any kind_ of statistics.

The "why" is in understanding the nature of the process we call the Sun and
realizing that such a process cannot be stopped or even changed in 24
hours.

Another modern illustration of the same principle is the fundamental
inability to infer the actual wiring of a processor from the level of the
code it executes.

This, by the way, is the very same Upanishadic principle of inability to infer
the true nature of Brahman from the level of human intellect, conditioned by a
language and experience.

Some things will remain only guesses, models and "scientific", (or rather
sectarian) consensus.

------
mwexler
Just noting: paper (and book) came out in 2018. The book is well worth reading
if you missed it then.

------
transfire
"Why" is not a query of causality, but rather purpose. The proper word would
be "How".

~~~
kgwgk
[https://en.wikipedia.org/wiki/Four_causes](https://en.wikipedia.org/wiki/Four_causes)

Aristotle holds that there are four kinds of answers to "why" questions:

Matter (the material cause of a change or movement): The aspect of the change
or movement that is determined by the material that composes the moving or
changing things. For a table, such might be wood; for a statue, such might be
bronze or marble.

Form (the formal cause of a change or movement): A change or movement caused
by the arrangement, shape, or appearance of the thing changing or moving.
Aristotle says, for example, that the ratio 2:1, and number in general, is the
cause of the octave.

Agent (the efficient or moving cause of a change or movement): Consists of
things apart from the thing being changed or moved, which interact so as to be
an agency of the change or movement. For example, the efficient cause of a
table is a carpenter, or a person working as one, and according to Aristotle
the efficient cause of a boy is a father.

End or purpose (the final cause of a change or movement): A change or movement
for the sake of a thing to be what it is. For a seed, it might be an adult
plant; for a sailboat, it might be sailing; for a ball at the top of a ramp,
it might be coming to rest at the bottom.

