
Correlation implies Causation (2009) - Tomte
http://web.archive.org/web/20091113124223/http://dailymeh.tumblr.com/post/238289205
======
xyience
Simpler article (with a proof):
[http://oyhus.no/CorrelationAndCausation.html](http://oyhus.no/CorrelationAndCausation.html)

Edit: also regarding the last sentence, "...because all we have to help us
establish causal relationships is correlation". The work of Pearl et al. gives
us quite a bit more: [http://www.michaelnielsen.org/ddi/if-correlation-doesnt-
impl...](http://www.michaelnielsen.org/ddi/if-correlation-doesnt-imply-
causation-then-what-does/)

~~~
infinity0
I think the point of "correlation does not imply causation" refers to the
literal propositional logic sense of "=>".

Yes, correlation suggests causation, i.e. P(causation|correlation) >
P(causation) from a Bayes perspective. That doesn't mean you should discount
the possibility of ¬causation, merely that its probability is smaller. And
"how much smaller" could be very close to 0, so it would still be hasty to say
"implies", which linguistically implies "=>".

A better phrasing would be "correlation suggests but does not imply
causation". (edit: e.g. as per that xkcd comic, mentioned by other posters.
edit2: I mixed up the proof with the OP article. the proof uses "evidence of"
which is also good.)

But yes, nice proof nonetheless. I like how causation is basically defined as
P(c|a) = 1, showing how most complex philosophical issues are actually
irrelevant (for this particular result).
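
To put numbers on the Bayes step, here is a minimal Python sketch. It reads
`a` as "causation holds" and `c` as "correlation is observed", and assumes
P(c|a) = 1 as described above. The function name, the prior of 0.1, and the
values tried for P(c|not a) are illustrative assumptions, not taken from the
linked proof.

```python
def posterior_causation(prior_a: float, p_c_given_not_a: float) -> float:
    """Return P(a|c) by Bayes' theorem, assuming P(c|a) = 1."""
    p_c_given_a = 1.0  # the defining assumption discussed above
    # Law of total probability: P(c) = P(c|a)P(a) + P(c|not a)P(not a)
    p_c = p_c_given_a * prior_a + p_c_given_not_a * (1.0 - prior_a)
    return p_c_given_a * prior_a / p_c


prior = 0.1  # hypothetical prior P(a)
for p_c_given_not_a in (0.05, 0.5, 1.0):
    post = posterior_causation(prior, p_c_given_not_a)
    print(f"P(c|not a) = {p_c_given_not_a:.2f} -> P(a|c) = {post:.3f}")
```

For a prior strictly between 0 and 1, the posterior exceeds the prior whenever
P(c|not a) < 1, but it only reaches 1 if correlation never shows up without
causation, which is why "suggests" fits better than "implies".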

~~~
dragonwriter
Actually, as I see it, the _main_ point of "correlation does not imply
causation" is mostly "correlation does not imply _a particular causal
relationship_". While coincidence is possible as well, it's mostly about the
fact that you can't conclude A causes B from a correlation between A and B
alone, because the correlation may be due to the fact that B causes A, or the
fact that A and B are both caused by C. That's why you need the correlation,
plus an explanatory theory of the causation, plus evidence to reject
alternative causal relationships, to have a decently strong basis for
concluding a particular causal relationship.
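
The "A and B are both caused by C" case can be simulated in a few lines. This
is only a sketch under assumed conditions: C is standard normal, A and B are
each C plus independent noise, and the sample size and seed are arbitrary.
Subtracting C works as a control here only because the simulation knows the
exact dependence on C.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


rng = random.Random(0)  # fixed seed so the run is repeatable
c = [rng.gauss(0, 1) for _ in range(10_000)]  # hidden common cause
a = [ci + rng.gauss(0, 1) for ci in c]        # A depends only on C
b = [ci + rng.gauss(0, 1) for ci in c]        # B depends only on C

r_ab = pearson(a, b)
# Remove C's contribution; what is left of A and B is independent noise.
r_ab_given_c = pearson([ai - ci for ai, ci in zip(a, c)],
                       [bi - ci for bi, ci in zip(b, c)])
print(f"corr(A, B)            = {r_ab:.3f}")
print(f"corr(A, B) without C  = {r_ab_given_c:.3f}")
```

A and B come out clearly correlated even though neither influences the other,
and the correlation largely disappears once C is accounted for.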

~~~
infinity0
I think, what I said is the same as what you said, just with some more maths.

> you can't conclude A causes B

right, this is what I referred to as "=>"

> plus an explanatory theory of the causation, plus evidence

yes, this all works together to build up the "how much smaller". An
explanatory theory basically allows you to make predictions and run tests to
collect more data to pump into the application of Bayes' theorem as used by
that proof, improving your confidence of the difference between P(a|c) and
P(a).

~~~
JadeNB
> > you can't conclude A causes B

> right, this is what I referred to as "=>"

Except that it isn't quite, since you were careful to clarify that your
\implies (i.e., '=>' or '⇒') was the \implies of propositional logic, which
explicitly disclaims any causal relationship. \implies in that context says
precisely and only that the antecedent is false, or the consequent is true. In
this sense, 2 + 2 = 4 \implies Barack Obama is currently the president of the
US, and 2 + 2 = 5 \implies George Bush is currently the president of the US,
even though there is no causal relationship in either case.

~~~
infinity0
"is" in language can often mean "is a subset of"; I was using the term that
way. Whilst you are right that "=>" disclaims any philosophical relationship,
the proof covers all definitions of "cause" that one might reasonably come up
with. So "you can't conclude A => B" implies (with probability 1) that "you
can't conclude A causes B":

The proof only defines "causation" as some event "a" for which "P(c|a) = 1".
This is the same property that "=>" has in propositional logic, and there is
no implication of philosophical causation here either. But the proof still
works, as a consequence of its definitions.

So in other words, the proof says: if causation causes correlation, then
P(a|c) > P(a) (i.e. correlation is evidence of causation), but we can't say
causation is definitely true (P(a) = 1), however you want to define "causes",
as long as it has the property that P(c|a) = 1.

~~~
dragonwriter
That's not entirely true, at least, using the normal definition of "causes"
that is of interest in correlation vs. causation discussions, which certainly
includes "causes" which are contributors to the occurrence of an effect but do
not alone guarantee it (e.g., smoking causes cancer, but it is not true that
smoking _implies_ cancer in the propositional logic sense.)

------
joe_the_user
The actual title is "If correlation doesn’t imply causation, then what does?"
which I think is a _much_ more useful title and question.

I think that you _can_ answer that. Science is not a series of isolated
experiments that stand or fall based on their particular data. Instead, all of
our judgments of causation depend on a series of nested broad and narrow
assumptions about the world. The broadest assumption is perhaps that we have a
material world whose substance lacks the ability to intentionally sabotage our
experiments and from which we can generate uniformly distributed random
samples. But there is a whole range of assumptions below that.

From this view, "extraordinary claims require extraordinary evidence":
essentially, things that are consistent with our existing assumptions still
need evidence, but not huge amounts. Things that are sudden changes in our whole
understanding of the world require much more evidence. The faster-than-light
neutrino experiments, in isolation, were probably a lot more convincing in
just their statistics than a lot of experiments that get accepted without
comment. But because such other experiments didn't contradict very established
positions, their results weren't gone over with a fine-toothed comb. And
that's how it should be.

Edit: the thing with a "calculus of causal inference" is that it also would
have to include a way of taking into account the range of indirect assumptions
that a given causal deduction depends on, so one would need something like a
knowledge-database.

------
wai1234
This argument is absurd. It takes the obvious truth that causation will
certainly lead to correlation and blindly flips it around and claims it's
somehow profound. Just bad reasoning from beginning to end. The correct
reasoning - causation leads to correlation - is the basis of all science. When
scientific theories are evaluated, the razor is a check that the theory
correctly predicts observed results. You can't cook up a theory that fits a
set of already known data and claim you have formulated a causal relationship
unless you can then predict the NEXT results over a substantial set of trials.

~~~
rileymat2
I think the author is trying to say that correlation is the starting point in
the hunt for causation in many investigations.

~~~
wai1234
If so, the author says it poorly. Yes, you wouldn't look for a causal link
_without_ correlation, that would be insane, but the existence of correlation
tells you only that there might be causation, nothing more.

~~~
Barrin92
but there is no other way to actually show causation. You can't 'look at'
causation. You can look at a million chess games, you'll never be able to see
the rules themselves. All you see is correlation between how the pieces move
in every game and from this you can try to figure the rules out, if any exist
at all.

So while weak(!) correlation does not imply causation, strong correlation
pretty much does. It's how the scientific method works. We can't look behind
the curtain and take a look at the rule sheet.

------
ikeboy
[https://xkcd.com/552/](https://xkcd.com/552/)

>Correlation doesn't imply causation, but it does waggle its eyebrows
suggestively and gesture furtively while mouthing 'look over there'.

------
kijin
I would go one step further and argue that "causation" or "causality" is just
a label that we attach to a special kind of correlation. The very concept of
causality, it seems to me, is an attempt to impose human logical structures on
a messy world that is fundamentally probabilistic.

The universe doesn't guarantee that if P, then Q. At best we can observe that
if P at t1, then it is highly likely that Q at t2. We can often simplify that
as "if P, then Q", just as we can approximate Einstein's physics with
Newtonian physics for low-velocity applications. But at the end of the day,
both are only approximations. The clear rules of logic only exist in our head,
just as a perfect circle doesn't exist in reality.

If so, whether correlation implies causation is the wrong question to ask. A
more important question is _what kind of correlations_ we usually take to
imply causation. We're probably looking at correlations that hold exclusively
between two sets of events with an extremely high probability, with the right
sort of temporal relationship. We could then say that those kinds of
correlations simply _are_ what we mean by causation, because there really is
nothing else to say.

Once upon a time, most philosophers thought that the mind was some immaterial
substance separate from the brain. Now many of us believe that certain
functions of the brain _are_ the mind. Perhaps we could apply a similar
reductionism to the issue of correlations and causations, too.

------
dmvaldman
First, correlation is symmetric between effects and their causes, while
causation is not.

You may as well replace the phrase "correlation implies causation" with
"correlation implies effectation".

For a careful treatment of correlation and causation, you should read Judea
Pearl, one of the great living computer scientists. I highly recommend this
casual read:
[https://www.nyu.edu/classes/shrout/SEM06/pearl.pdf](https://www.nyu.edu/classes/shrout/SEM06/pearl.pdf)

------
shasta
This article seems to miss the role of theories in physical sciences. When we
talk about 'cause' we understand that to mean that some chain of events,
governed by the rules of physics, leads to the result. Yes, those rules of
physics were arrived at largely as a result of observation of correlations,
but no one is going to propose military coups leading orange harvests as a
fundamental physical law on the basis of observed correlation.

------
dcl
Correlation doesn't even imply correlation!

[http://andrewgelman.com/2014/08/04/correlation-even-imply-
co...](http://andrewgelman.com/2014/08/04/correlation-even-imply-correlation/)

"That is, correlation in the data you happen to have (even if it happens to be
“statistically significant”) does not necessarily imply correlation in the
population of interest."
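
A quick simulation illustrates the quoted point. This is a sketch under assumed
conditions, not the linked post's analysis: both variables are independent
standard normals (so the population correlation is zero), each sample has 10
points, and a sample counts as "significant" when |r| exceeds the two-sided 5%
threshold for 8 degrees of freedom.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def fraction_significant(n_samples, sample_size, r_critical, seed=0):
    """Share of small samples of independent data with |r| > r_critical."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]  # unrelated to xs
        if abs(pearson(xs, ys)) > r_critical:
            hits += 1
    return hits / n_samples


T_CRIT = 2.306                             # two-sided 5% t value, 8 d.o.f.
R_CRIT = T_CRIT / sqrt(T_CRIT ** 2 + 8)    # equivalent threshold on |r|
frac = fraction_significant(2000, 10, R_CRIT)
print(f"|r| threshold: {R_CRIT:.3f}, 'significant' samples: {frac:.1%}")
```

Roughly one sample in twenty should clear the threshold even though nothing
links the two variables in the population.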

~~~
golergka
Does not necessarily is not the same thing as does not.

------
jimothyhalpert7
> They had done the chemical reaction that blew up the lab 175 times before
> without incident; then, suddenly, something went wrong and the lab went boom
> and real, actual people died.

I really wish the author would expand more on this. Maybe due to my shallow
knowledge of statistics, I've recently become baffled by the fact that an
_arbitrary_ number is used as a confidence interval, to state that something
is true or false. And I'd guess most of today's world depends on these
confidence intervals. Why is it that we're OK with stating that something is
true, if it's true 95% of the time? Or is it the case of _good enough, if it
ain't broke, don't fix it (until it isn't)_?

~~~
Scea91
We are OK with it because almost always we don't know a better way. Every time
we use statistics and have certain finite sample size, these confidence
intervals will emerge.
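
As a rough illustration using the 175 incident-free runs quoted above, the
sketch below computes the largest per-run failure probability that is still
compatible, at 95% confidence, with seeing no failures at all. Treating the
runs as independent with a constant failure probability is an assumption made
here, and the function name is hypothetical.

```python
def upper_bound_zero_failures(n_trials: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on the failure rate after n_trials clean runs."""
    # Zero failures in n_trials has probability (1 - p) ** n_trials.
    # Solve (1 - p) ** n_trials = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


bound = upper_bound_zero_failures(175)
print(f"95% upper bound on failure rate: {bound:.4f}")
print(f"'rule of three' approximation:   {3 / 175:.4f}")
```

Under those assumptions the bound is about 1.7% per run, so 175 clean runs
cannot rule out a failure rate of roughly 1 in 60. More runs tighten the bound,
but a finite sample never drives it to zero.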

~~~
wai1234
The only place we are ok with this is in the social sciences where rigorous
causation can never be proven. No physics theory is considered 'valid' because
of a confidence threshold. Pharmacology and epidemiology flirt with this too,
because the underlying causal mechanisms can be fiendishly difficult to
determine, but we see 'new evidence' stories almost every day that flush
whatever last week's 'wisdom' was.

~~~
jimothyhalpert7
>we see 'new evidence' stories almost every day that flush whatever last
week's 'wisdom' was.

I guess the process you are talking about involves creating a new model which
is able to account for the causal relationship for a larger set of
observations, compared to the previous model. In this way, the previous model
can be either _flushed_ , or if the conditions under which it fails are known,
we can use it for specific cases. Isn't physics prone to the same effect, only
at a lower frequency? Which might be worrisome, because if the opportunity to
revise a model comes every 100 years, that might mean that you'll spend your
whole life interpreting the world through an inferior model.

------
j2kun
The distinction is obvious: correlation does not include an ordering, but
causation does. You can observe that two things both happened, and that is
correlation. You can observe that one thing happened, and then another thing
happened after, and stipulate causation. You can increase your certainty by a
controlled experiment.

It seems like all the author is really saying is that experiments aren't good
enough to produce 100% certainty of causation. Not all that shocking. But the
author also seems to conflate correlation with uncertainty, and this is
probably where the title comes from: increasing certainty from controlled
experiments implies causation.

~~~
dllthomas
> You can observe that one thing happened, and then another thing happened
> after, and stipulate causation.

Increased purchases of gifts causes Christmas.

------
linhchi
Correlation has a formula; it detects the _linear_ relation between two
variables. So a quadratic relation can actually have zero correlation.
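
A small check of this, assuming "correlation" means the Pearson coefficient and
using hypothetical data points: y = x^2 has zero correlation with x when the x
values are symmetric around 0, although y is fully determined by x.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


xs_symmetric = [-3, -2, -1, 0, 1, 2, 3]
r_symmetric = pearson(xs_symmetric, [x * x for x in xs_symmetric])

xs_one_sided = [0, 1, 2, 3]  # same relation, only the rising half
r_one_sided = pearson(xs_one_sided, [x * x for x in xs_one_sided])

print(f"symmetric x: r = {r_symmetric:.3f}")
print(f"one-sided x: r = {r_one_sided:.3f}")
```

The zero depends on the symmetry of the x values: on the one-sided range the
same quadratic relation gives a strong positive correlation.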

Second, in academic research, we mean 'correlation doesn't imply _direct_
causation'. Because we're talking science (what's significant) not astrology
(as above, so below).

For example, the octopus predicts the results of football match correctly most
of the time. But as a scientific person, would you say that there is any
_conceivable_ causation?

The important word is conceivable.

------
theoh
Just not sound, philosophically. I don't understand why the author wants to
make the claim.

See for example the irrefutable possibility of
[https://en.m.wikipedia.org/wiki/Occasionalism](https://en.m.wikipedia.org/wiki/Occasionalism)

~~~
talles
I'd say that God has been pretty consistent so far (or giving the illusion
that it is, if you prefer).

------
chris_wot
This is quoting Hume... I always find it hard to get to understand these
figures, because I feel I need to know who influences them, but then I need to
know who influences them... and so on.

What is the best way of getting summaries of philosophy from as close to the
beginning as possible?

~~~
JeffreyKaine
Study philosophy from its base principles. Start with pre-Socratic
philosophers and keep reading until you get to modern day.

The book "Sophie's World" does a great intro here and is a really great
weekend read.

[http://www.amazon.com/Sophies-World-History-Philosophy-
Class...](http://www.amazon.com/Sophies-World-History-Philosophy-
Classics/dp/0374530718)

~~~
chris_wot
Will do - thanks!

------
raould42
[http://tylervigen.com/spurious-correlations](http://tylervigen.com/spurious-
correlations)

