

The New Science of Causation - wglb
http://www.aaronsw.com/weblog/newcausation

======
bokonist
How is this different than typical multi-variable regressions with control
variables?

The problem with social science is that it tries to be a science but in
reality its closest relative is product management. Using even the best
statistical methods still runs into the keys/lamppost problem. There are too
many variables, the variables are not well defined, the variables may have
strange interactions, there may be too much noise in the data, etc. In almost
all social science situations, you cannot find truth solely through
regressions.

For instance, recently I was studying the issue of whether the crash in the
stock market caused the fall in consumer spending, or the slowdown in
consumer spending caused the fall in the stock market. If you run a straight
regression you'll have a lot of problems because a) even if the stock market
falls first, it could have been because of expectations of future consumption
b) the data on expenditures is not collected frequently enough, and c) there
are other potentially confounding variables.

But if you take the product management approach you can listen to what people
say. And when you listen, you find that people say stuff like, "After the
stock market fell, I had to cut back my plans to travel." or "Since the fall
in our endowment, we have had to suspend spending on all new projects." By
simply _asking people_ you can figure out the causality.

Note that this is how every company in the world makes most of its decisions
(except perhaps Google, because they actually have enough controlled data to
meaningfully do regressions). When we're trying to figure out what features to
build, we do customer interviews, follow-me-homes, surveys, feature requests,
talk to customer support, etc. We make very modest use of statistics, because
we do not have large amounts of controlled data. Did the customer not use this
part of the app because it was not useful, or because we got the label wrong,
or because of something else? We could do a bunch of statistics, but what's
the point? It won't answer those questions. Asking people may be subjective,
but it's the only way (keys/lamppost).

The problem is that social scientists do not get promoted for making accurate
judgments and being right. A product manager does. A social scientist/academic
can only be judged by his methods. Thus there is a preference within the
profession for methods that can be objectively evaluated, and for methods
that only a select elite can perform. Being right is not part of the selection
process for being an academic.

~~~
alan-crowe
> How is this different than typical multi-variable regressions with control
> variables?

As different as astrology and astronomy. Multi-variable regressions with
control variables are known to be incapable of supporting causal inference.
The network-based techniques of Pearl et al. do permit causal inferences.
Notice, though, that the network-based techniques will quite often
rigorously prove that the data you have are inadequate to make a causal
inference. They do this by discovering multiple causal networks, all
consistent with the patterns of correlation and conditional correlation in
the data, but with causal links going one way in one network and the other
way in another.
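
A minimal Python sketch of that point (mine, not from the article): it uses
the Verma-Pearl criterion -- two DAGs imply the same correlations and
conditional correlations iff they share a skeleton and the same unshielded
colliders -- to enumerate every DAG on three variables and group together
the ones that observational data can never tell apart:

    # Group all DAGs on {A,B,C} into observational-equivalence classes
    # via the Verma-Pearl criterion: same skeleton, same v-structures.
    from itertools import product, permutations

    NODES = ("A", "B", "C")
    PAIRS = [("A", "B"), ("A", "C"), ("B", "C")]

    def is_acyclic(edges):
        # On 3 nodes the only possible cycle is a directed triangle.
        return not any({(x, y), (y, z), (z, x)} <= edges
                       for x, y, z in permutations(NODES))

    def all_dags():
        # Each unordered pair is absent, or oriented one of two ways.
        for choice in product((None, 0, 1), repeat=3):
            edges = set()
            for (x, y), c in zip(PAIRS, choice):
                if c is not None:
                    edges.add((x, y) if c == 0 else (y, x))
            if is_acyclic(edges):
                yield frozenset(edges)

    def signature(edges):
        skeleton = frozenset(frozenset(e) for e in edges)
        # v-structure: x -> z <- y where x and y are not adjacent
        colliders = frozenset(
            (frozenset({x, y}), z) for x, y, z in permutations(NODES)
            if (x, z) in edges and (y, z) in edges
            and frozenset({x, y}) not in skeleton)
        return (skeleton, colliders)

    classes = {}
    for dag in all_dags():
        classes.setdefault(signature(dag), []).append(dag)

    for dags in classes.values():
        if len(dags) > 1:  # indistinguishable from observation alone
            print([sorted(f"{x}->{y}" for x, y in d) for d in dags])

On the A-B-C skeleton, the collider A->B<-C gets a class of its own, while
the two chains and the fork land in one three-member class -- exactly the
"links going one way in one network and the other way in another" situation.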

------
andreyf
The statisticians Aaron cites need an epistemology lesson. Mathematics is not
a science. Science is about creating models and testing them, models that
involve causality. Mathematics is simply the rigorous definition of things
(making it the perfect source of language for said models).

"You must learn to distinguish between what is true and what is real." See:
<http://www.edge.org/q2005/q05_8.html#kay>

~~~
jey
Mathematics is a science. Mathematicians form hypotheses (guessed theorems,
guessed properties, etc.), then perform experiments to test those hypotheses
(trying to construct a proof). If they end up with a proof, the hypothesis
has been confirmed.

I think most mathematicians would disagree with your assertion that math is
about "definitions". (I'm no math geek, just a programmer, so take with an
appropriately sized grain of salt.)

~~~
andreyf
_Mathematicians form hypotheses (guessed theorems, guessed properties, etc)
then perform experiments to test their hypothesis (trying to construct a
proof)._

There's certainly an abstract similarity, but epistemologically, constructing
a proof is _very_ different from running an experiment. A proof declares a
statement as True. An experiment (at best) shows that reality behaves as your
model hypothesized it would. Experiments can provide evidence of something,
not prove it as True. As Alan Kay says (following former link):

 _When we guess in science we are guessing about approximations and mappings
to languages, we are not guessing about "the truth" (and we are not in a good
state of mind for doing science if we think we are guessing "the truth" or
"finding the truth"). This is not at all well understood outside of science,
and there are unfortunately a few people with degrees in science who don't
seem to understand it either._

------
yaroslavvb
Pearl provides a formalism for expressing causality; however, it's
questionable (to me) whether that formalism corresponds to what we humans
consider "causality". Take 3 variables as an example: A, B, C. We observe
some instances of (A,B,C) triples and try to fit a probability distribution
P(A,B,C) to the data. If A, B, C are binary, there are 7 free parameters;
we can fit them directly to the training data to get a perfect fit, but
when we run the result on a separate set (validation data), it'll probably
not perform very well. To improve fit on new data, you need to reduce the
number of parameters, so you could make some simplifying assumptions and
write P(A,B,C) as P(A) * P(B|A) * P(C|B). In this form, there are 5
parameters to fit, and the resulting model could perform better on
validation data. Alternatively you could fit P(A) * P(B) * P(C|A,B),
6 parameters, or P(C) * P(B|C) * P(A|B), 5 parameters. Select the one that
performs best on your validation data, and is hence the "best" model.

Now comes the causality connection -- if the best model is
P(A) * P(B) * P(C|A,B), you read it as A->C<-B. If the best model is
P(A) * P(B|A) * P(C|B), you read it as A->B->C (i.e., A causes B,
B causes C).

I think the causality connection is questionable because the model that
corresponds to A->B->C will have the same fit as the model for A<-B<-C or
A<-B->C. In fact, you could take any causal network without loops and
"unshielded colliders" (connections of the form A->B<-C), pick any node as
a root, and re-orient the arrows to face away from the root to get a model
with a _different_ semantic causal structure but the _same_ mathematical
structure, meaning it'll give an identical fit to the data.
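
Here's a quick numpy check of that claim (my sketch, with made-up chain
parameters A -> B -> C generating the data): the maximum-likelihood fits of
P(A) * P(B|A) * P(C|B) ("A causes B causes C") and P(C) * P(B|C) * P(A|B)
("C causes B causes A") assign the same probability to held-out data:

    import numpy as np

    rng = np.random.default_rng(0)

    def sample(n):
        # Ground truth: the chain A -> B -> C (parameters are made up).
        a = rng.random(n) < 0.3
        b = np.where(a, rng.random(n) < 0.8, rng.random(n) < 0.2)
        c = np.where(b, rng.random(n) < 0.7, rng.random(n) < 0.1)
        return np.stack([a, b, c], axis=1).astype(int)

    train, test = sample(20000), sample(5000)

    joint = np.ones((2, 2, 2))  # add-one smoothing, so no zero cells
    for a, b, c in train:
        joint[a, b, c] += 1
    joint /= joint.sum()

    pa, pb, pc = joint.sum((1, 2)), joint.sum((0, 2)), joint.sum((0, 1))
    pab, pbc = joint.sum(2), joint.sum(0)

    # P(A) * P(B|A) * P(C|B), read as A -> B -> C
    chain = lambda a, b, c: pa[a] * (pab[a, b] / pa[a]) * (pbc[b, c] / pb[b])
    # P(C) * P(B|C) * P(A|B), read as C -> B -> A
    rev = lambda a, b, c: pc[c] * (pbc[b, c] / pc[c]) * (pab[a, b] / pb[b])

    def loglik(model, data):
        return sum(np.log(model(a, b, c)) for a, b, c in data)

    print(loglik(chain, test), loglik(rev, test))  # equal, up to rounding

Both factorizations cancel to P(A,B) * P(B,C) / P(B), so the two opposite
causal stories fit any dataset equally well; only an intervention can
separate them.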

What would be really interesting is if someone deduced causality using
Pearl's approach, then verified it with a direct experiment.

~~~
cousin_it
Wow. That's the kind of discussion we should be having. I didn't quite realize
that Pearl's causality cannot tell A->B->C from A<-B->C. On the other hand, no
other non-experimental statistical technique can do that either. Also, many
problems will likely have a very constrained set of causal graphs to choose
from, like smoking/tar/cancer.

------
dfranke
I haven't read any of the cited research, but formalization of causality seems
like it should be a straightforward endeavour. The notion of a controlled
experiment already embodies a fairly rigorous understanding of what it means:
if you have two starting conditions which are identical except for one
variable 'A', and the outcomes differ in terms of 'B', then you've established
a causal link from A to B. I don't envision any particular difficulty in
transcribing this notion into the language of formal logic.
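
That notion is easy to transcribe literally. A toy Python sketch (a
hypothetical system of my own, not from the article): pin down every
background condition with a seed, run the system twice differing only in A,
and declare a causal link if B ever differs:

    import random

    def run_system(a, seed):
        # `seed` fixes all background conditions, so two runs with the
        # same seed are "identical starting conditions except for A".
        background = random.Random(seed).random()
        return int(a == 1 and background > 0.2)  # the outcome B

    def a_causes_b(trials=1000):
        # A -> B if flipping A changes B under some background condition.
        return any(run_system(0, s) != run_system(1, s)
                   for s in range(trials))

    print(a_causes_b())  # True: changing only A changes B

Of course, real observational data never lets you replay the same
background conditions; that's where the difficulty starts.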

~~~
timf
His comment about statistics (not formal logic?) is mainly about usefulness.

It's not _useful_ to have a pure causality representation because, in the
real world and other complex systems, you don't know all the variables.
Your variable 'A' might in 10 years be shown to be a combination of
variables.

I, for example, only 99.99% believe in gravity. One day we might float off
the planet. The lack of 100% proof about observations was one of Hume's
basic arguments, and I think it's the point of the "do-calculus" for AI
discussed in this link. We all have an internal "do-calculus" for operating
in the world: work with currently held beliefs/rules until they're shown to
work differently, etc.

~~~
darkxanthos
I'm not convinced of this "truth". Do-calculus might help AI in its current
state, but really they may just be making up for fundamental flaws in their
neural simulations.

I normally hate to hear people say that no one knows how certain things
work, but in this instance it seems apt. No one has created an autonomous
consciousness yet, and so no one really knows exactly what goes into it.

I guess there's also a difference between advancing the study of robots and
advancing the study of artificial intelligence.

~~~
jey
> neural simulations

 _huh?_ I have no idea what you're talking about.

------
nazgulnarsil
<http://yudkowsky.net/rational/the-simple-truth>

tl;dr: for truth to make any sense you have to postulate how "X being true"
would cause a different sensory experience vs it being false.

------
MikeCapone
I know it's stupid, but I'm a bit scared of reading Judea Pearl.

Afraid it's going to be terribly brilliant with lots of maths and I won't
understand it.

I know, I should just give it a try...

~~~
jey
Yes, Pearl's Causality is pretty dense. The trick to reading any research
monograph is to realize that you're not supposed to read all of it, nor are
you necessarily supposed to read it front-to-back. Once you realize that,
Pearl's Causality is actually very accessible -- read the introduction to each
chapter, read the text sections in that chapter that look interesting, and
study the mathy parts that you find relevant.

~~~
ad
Agreed. MikeCapone, I'd also recommend that you start with the essay at the
very end of the book; it serves as a good intro/overview.

------
thunk
I'd love an explicit ban on pop-sci health articles in the guidelines. Most of
them are pure upvote-garbage. They can make HN look like a tabloid.

~~~
llimllib
While it mentions pop-sci health, I think this is not a pop-sci health
article.

(I would support a hypothetical ban on those too, FWIW)

~~~
thunk
Oh, I wasn't suggesting it was. I thought it was a good post.

------
leecho0
This is really interesting; does anyone have references for the material
discussed here? I'm trying to read through the material on his website:
<http://bayes.cs.ucla.edu/jp_home.html>

~~~
ible
Causation, Prediction, and Search by Spirtes and Glymour is much more
accessible than Pearl's Causality, and they cover a lot of similar material.

VideoLectures has a workshop from NIPS '08 with talks by Pearl, Spirtes,
et al. at <http://videolectures.net/coa08_whistler/>

The UAI and NIPS conferences are common venues for this research, if you're
interested in the latest results.

------
scott_s
I took a course which addressed exactly this stuff, and I was going to point
out material from that course, but he cites one of the professors who taught
the course! Clark Glymour. That doesn't happen to me every day.

------
gregwebs
The problem with many popular health science articles is not just that the
media claims causation from a correlational study, but that the scientists
they interview do it also!

