

Does Causation Imply Correlation? - Fomite
http://stats.stackexchange.com/questions/26300/does-causation-imply-correlation

======
anon4
My favourite counter-example is the one with a thermostat.

If someone controls the thermostat so that the temperature in the room stays
constant no matter the temperature outside, and you record the temperature
inside and the thermostat setting over many days, the two series will be
completely uncorrelated. But you will have a perfect correlation between
thermostat setting and outside temperature.

~~~
emiliobumachar
" But you will have a perfect correlation between thermostat setting and
outside temperature."

You meant a perfect _causation_ between thermostat setting and _inside_
temperature, right?

~~~
anon4
Well yes, but I also meant what I wrote.

When the temperature outside goes down, the thermostat goes up and the other
way around. They are perfectly correlated, even though you can't really say
one causes the other (outside temperature does in a very roundabout way cause
thermostat setting).

I like that example because it counters both "causation implies correlation"
and "correlation implies causation".

~~~
baddox
I'm confused. Don't you simply set a thermostat to the desired indoor
temperature and leave it? Assuming the desired indoor temperature is constant,
I would expect the thermostat setting to be constant, the internal temperature
to be nearly constant, the outdoor temperature to be uncorrelated, and the
energy usage of the heating/cooling system to be correlated with the outdoor
temperature (namely, the difference between outdoor temperature and desired
internal temperature).

~~~
ProblemFactory
I think the confusion arises from the term "thermostat". The example works
much better with a "heating power dial", not a "desired indoor temperature
dial".
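
A minimal simulation of the "heating power dial" reading, as a sketch only.
The linear heat model, the temperature range, the noise level and every name
below are assumptions made for illustration; none of them come from the
thread.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(days=5000, target=21.0, seed=0):
    """Assumed toy model: inside = outside + heating power + small noise."""
    rng = random.Random(seed)
    outside, power, inside = [], [], []
    for _ in range(days):
        t_out = rng.uniform(-10.0, 15.0)        # outside temperature, deg C
        p = target - t_out                      # dial offsets the outside exactly
        t_in = t_out + p + rng.gauss(0.0, 0.1)  # small independent disturbance
        outside.append(t_out)
        power.append(p)
        inside.append(t_in)
    return outside, power, inside

outside, power, inside = simulate()
r_power_inside = pearson(power, inside)    # causal link, yet close to zero
r_power_outside = pearson(power, outside)  # close to -1 in this model
print(round(r_power_inside, 3), round(r_power_outside, 3))
```

In this toy model the dial causes the inside temperature but is nearly
uncorrelated with it, while dial and outside temperature are almost perfectly
(negatively) correlated.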

------
computer
Here's a harder question: is it ever possible to deduce causation based on
data alone? We all know that correlation does not imply causation, but is
there anything that does?

If not: prove it. And if yes, under what conditions exactly?

~~~
rm999
In general, no. Simple counterexample: every day it rains my grass is wet, and
every day it doesn't rain the grass is dry. Does the rain cause the grass to be
wet? Well, probably. But maybe my grass is in a greenhouse, and I only turn on
my sprinkler when it rains so the grass gets a "natural" amount of water. In
this case the wet grass is not causally linked to the rain, even though it
looks exactly like the situation where it is. My action of turning on the
sprinkler is known as a "confounding factor".

You need to make assumptions to be able to infer causation from data. This is
what people are doing when they design controlled experiments.
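
The rain and sprinkler example can be sketched as a small simulation. The two
"worlds", the 30% chance of rain and all function names are hypothetical,
chosen only to show that passive observation cannot tell the two structures
apart while an intervention can.

```python
import random

def open_lawn(rain, sprinkler_on=None):
    """World A: the rain itself wets the grass; there is no sprinkler."""
    return rain

def greenhouse(rain, sprinkler_on=None):
    """World B: rain never reaches the grass; the owner runs the sprinkler
    exactly when it rains, unless an experimenter overrides the switch."""
    if sprinkler_on is None:
        sprinkler_on = rain
    return sprinkler_on

def observe(world, days=1000, seed=1):
    """Passively record (rain, wet grass) pairs."""
    rng = random.Random(seed)
    records = []
    for _ in range(days):
        rain = rng.random() < 0.3
        records.append((rain, world(rain)))
    return records

def wet_rainy_days_with_sprinkler_off(world, days=1000, seed=1):
    """Controlled experiment: force the sprinkler off, count wet rainy days."""
    rng = random.Random(seed)
    count = 0
    for _ in range(days):
        rain = rng.random() < 0.3
        if rain and world(rain, sprinkler_on=False):
            count += 1
    return count

same_observations = observe(open_lawn) == observe(greenhouse)
lawn_count = wet_rainy_days_with_sprinkler_off(open_lawn)
greenhouse_count = wet_rainy_days_with_sprinkler_off(greenhouse)
print(same_observations, lawn_count, greenhouse_count)
```

The observational records are identical in both worlds; only forcing the
sprinkler off separates them, and that step already assumes you know a
sprinkler might exist.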

~~~
tedsanders
I'm not an expert in algebraic statistics, but I think sometimes the answer is
yes:
[http://acritch.com/media/math/critch-causality-talk-slides.pdf](http://acritch.com/media/math/critch-causality-talk-slides.pdf)

You have to assume the data you have is all that there is, and sometimes the
math will only give a class of causal graphs instead of a single causal graph.

~~~
rm999
From what I can tell, that presentation is assuming a certain class of graph
structures in its DACB example - for example that there isn't an element E
that can causally link the other nodes, emulating a causal relationship
without there being one. Assuming a class of structures is not a general
approach; in my example this would be the equivalent of assuming there is no
sprinkler, but how can you make that assumption from the data alone?

There's another issue: their approach is statistical, so even if you assume a
class of graph structures you can only draw conclusions that the causal
structure "almost surely" exists, not that it does exist. What this means is
that even with infinite data, you can't prove that the other structures are
impossible. If I turn on my sprinkler when it rains the first {very large
number} times, you may conclude that there is almost surely a causal link
between the rain and wet grass, but there is no reason I have to do it. You
aren't left with a proof.

------
conjectures
This comes down to vocabulary.

If by correlation we mean Pearson correlation (linearly related) the answer is
no.

If by correlation we mean some hand-waving association between A and B then
yes. Since if A causes B there is some hand-waving association between them,
namely causation.

I suggest we reserve 'correlation' for linear relationships and stop using it
in the second sense. It's unhelpful and confusing.

~~~
mbq
Two problems: first, a causal association may be invisible even to
hand-waving criteria; think of good crypto PRNG output, which is fully
determined by the seed and its serial number, but the parameters are
practically unfittable from the data because the function is so chaotic.
Second, the more hand-waving you allow, the more totally incidental relations
will look "true". With modern numbers of hypotheses, even Pearson correlation
often becomes useless in this manner.
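
The second point can be illustrated with a sketch: test enough unrelated
series against a target and some will correlate strongly by chance. The
sample size, the number of hypotheses and the seed are arbitrary assumptions.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(42)
n_points = 10        # observations per series
n_hypotheses = 1000  # independent candidate "explanations"

target = [rng.gauss(0.0, 1.0) for _ in range(n_points)]

# Every candidate is pure noise, generated independently of the target.
best = max(
    abs(pearson(target, [rng.gauss(0.0, 1.0) for _ in range(n_points)]))
    for _ in range(n_hypotheses)
)
print(round(best, 2))  # strongest purely incidental correlation found
```

With so few points per series and so many candidates, the best match is
typically large even though nothing here is related to anything else.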

~~~
conjectures
mbq, I would not contradict your points.

Though first-sense 'correlation' could well be useless in many cases, at least
I know what it means.

The second sense is perfidious, as it's virtually impossible to deny there's a
'correlation' between any two things since the term is so poorly defined.

~~~
mbq
Note that I am criticising the idea of "general" or "intuitive" correlation;
actually useful correlation measures (like Pearson or Spearman) will always
be tied to some model of dependence. This way they cannot be directly tied
to the obviously absolute dependence.

------
jobigoud
Wouldn't a Random Number Generator be a case of causation not implying
correlation?

~~~
cheald
If you're talking about PRNGs, then no - there is a very specific correlation
between successive numbers in a PRNG.

~~~
ogrisel
But it's highly non-linear. They have an expected Pearson correlation of zero.

Note: there are simple counterexamples of variables that have a Pearson
correlation of 0 yet some non-linear dependency structure, as in the 3rd row of
the plot of:

[http://en.wikipedia.org/wiki/Correlation_and_dependence](http://en.wikipedia.org/wiki/Correlation_and_dependence)
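
One such counterexample, sketched with y = x² on a symmetric range. The
particular function and range are my choice for illustration, not taken from
the linked plot.

```python
def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

xs = list(range(-5, 6))    # symmetric around zero
ys = [x * x for x in xs]   # y is completely determined by x

r = pearson(xs, ys)
print(r)  # zero up to floating-point rounding
```

The positive and negative halves of the parabola cancel in the covariance, so
the linear measure sees nothing even though y is a function of x.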

~~~
mistercow
If the question is "does causation imply linear correlation", then the answer
is obviously and trivially "no".

But we don't need a PRNG to see that. Take a series of chutes whose entrances
are lined up in a row, and whose exits are arranged in a circle. Now start
dropping balls in the entrances and see where they end up.

Obviously there is a causal relationship between where we drop the ball and
where it lands, but it's not a linear correlation. But there's still a
correlation.
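
A rough sketch of the chute setup, assuming a hypothetical wiring in which
entrance i always leads to exit `exit_of[i]` and the wiring is a random
shuffle. The number of chutes, the number of drops and the seed are arbitrary.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

n_chutes = 1000
rng = random.Random(7)

# Assumed wiring: a fixed random permutation from entrance index to exit index.
exit_of = list(range(n_chutes))
rng.shuffle(exit_of)

drops = [rng.randrange(n_chutes) for _ in range(5000)]
landings = [exit_of[d] for d in drops]

r = pearson(drops, landings)  # linear correlation of entrance vs exit index

# Dependence check: each entrance is only ever seen with one exit.
seen = {}
deterministic = all(seen.setdefault(d, l) == l for d, l in zip(drops, landings))
print(round(r, 3), deterministic)
```

Under this wiring the linear correlation is small, yet knowing the entrance
pins down the exit exactly, which is the broader sense of "correlation" meant
above.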

------
holycow19
Wrong verbiage here - wouldn't "suggest" seem better?

~~~
Fomite
Probably. One of the answers on there also has a discussion of referring to it
as 'correlation', which has a specific statistical meaning that is often
ignored when people trot out "Correlation != Causation".

A better wordsmithed version of the question would probably be "Does
Association Suggest Causation?"

~~~
sesqu
Correlation does not always mean Pearson's r, which is the thing with a
specific statistical meaning. There are other specific correlations, and
general correlation.

~~~
Fomite
Nevertheless, I think 'association' is a better general term.

------
jeffcox
"Correlation doesn't imply causation, but it does waggle its eyebrows
suggestively and gesture furtively while mouthing 'look over there.'"

[http://xkcd.com/552/](http://xkcd.com/552/)

------
smoyer
Does a headline with two HN memes get extra points?

------
RamiK
Most of the time...

