
We know correlation does not imply causality. What does? - pascal_cuoq
http://rgrig.blogspot.com/2014/07/predictive-causality.html
======
QuantumChaos
You always need to make some assumption about the causal process before you
can infer causality from evidence.

For example, take a randomized trial, where half of patients are given a drug,
and half a placebo. If we define Y to be the outcome of interest, and X to be
0 when the patient received the placebo and 1 when they received the drug,
then a positive outcome would be a correlation between X and Y. A person might
object that "correlation does not imply causation". However, we know enough
about the causal process involved to infer that if Y and X are correlated,
this must be because X caused Y.

There are many other ways that causation can be inferred, such as instrumental
variables, regression discontinuity, or panel regressions. However, all these
methods require making an assumption about the causal process underlying the
statistical distribution. Whether these assumptions are correct is often a
matter for debate.

~~~
alok-g
Well, that's the point! Correlation is seen in the data, and then causation
needs to be tested for. You test for it, and continue to see correlation. Yet
the causation stays as an "assumption" that is a matter of debate. This brings
back the question, how _do_ you establish causation then?

Your argument is basically saying that if there is understandable physics
behind the correlation, then causation can be safely inferred. However, that
still does not ultimately answer the question, since how would physics have
established that explanation to begin with?

~~~
QuantumChaos
Causation is not a fundamental notion for me.

We happen to live in a world where sometimes things can be ascribed specific
causes, but that is due to the nature of the physical universe, not an
abstract truth.

Causality in our best theories of physics (general relativity, quantum field
theory) is extremely complex and subtle. But that isn't needed anyway: what we
use in ordinary reasoning is "naive" physics, an approximation of physical
reality. I don't think this "naive physics", which we could also call common
sense, can be derived from a statistical model, although maybe it can.

Going back to the randomized trial, presumably the choice to give a drug to a
patient vs a placebo was determined by a random number generator. This in turn
is a physical process, and there is no reason why the physical process that
generates this number could not also influence a person's health outcome. It
is only our naive physics model that tells us that there is no such process,
and the only way the random number generator affects the health outcome, is
through the choice of drug.

So to summarize, doing causal inference relies on a model of reality, and this
model is sufficiently complex that the model itself cannot (for now) be
derived through purely statistical means, but rather is arrived at through
ordinary human thought.

~~~
alok-g
One thing though that is a bit hard to explain, and I would like to hear your
thoughts/understanding on this.

Establishing causation includes at least establishing precedence in addition
to correlation. In other words, the very definition of time is intertwined
with the relationship between causes and effects. And physics does not yet
claim to have understood what time means (AFAIK): Is it really real? Is it
something just within our heads? ...

If you do not treat causation as a fundamental notion, do you treat time as
one? If yes, how do you explain the dependence between the two. If not, well,
that becomes a subject of even more basic discussion since then I would ask
what do you treat to be fundamental notions. :-) (There does not seem to be a
way for us to get around fundamental notions completely.)

~~~
QuantumChaos
When I studied physics as an undergrad, I was made to take a course in
experimental particle physics (in Germany they call this "Phenomenology", not
in English to my knowledge). I complained that I wanted to study the
fundamental laws of physics, not how they were tested. The university advisor
replied that in order to understand how the world works, first we have to know
what it looks like.

This is basically my view on this matter. We can only take for granted the
existence of the objective universe. Everything else must be derived from
experiment, including concepts that seem fundamental like time or causality.

One interesting thing about causality is that the fact that the macroscopic
world obeys cause-and-effect laws is usually attributed to the second law of
thermodynamics. But where does the second law of thermodynamics come from?
I've heard it stated (can't say if this is universally held by physicists)
that the second law of thermodynamics holds because the initial state of the
universe had very low entropy, while the final state of the universe will have
very high entropy. That is, the second law of thermodynamics is "caused" by
both the initial _and_ final states of the universe. So causality and time are
something that we don't fully understand yet.

------
brownbat
We keep falling back on Hume as if no one has thought critically about the
subject since science became more mature. For the curious, Mackie's "The
Cement of the Universe" from 1974 provides a more recent analysis of
causation.
[http://books.google.com/books/about/The_Cement_of_the_Univer...](http://books.google.com/books/about/The_Cement_of_the_Universe.html?id=KhJ99xSUkPAC)

For a sample, suppose a barn burns down right after someone chucks a lighted
match in the barn. You can't say that the match was "necessary," the barn
could have been struck by lightning or arson and still burned. And the match
wasn't "sufficient," because the match probably also needed to fall upon
something flammable, like dry hay; there had to be oxygen; etc.

Mackie comes up with a fancy acronym, INUS, but I'll leave unpacking all that
aside. One of the more provocative suggestions is that a "cause" is highly
contingent on context (that might only be relevant to human observers).

If people harmlessly throw matches into a concrete-floored barn every day, if
it's widely accepted as hot match disposal, but one day some joker uses it to
store a bunch of oily rags... well, the unusual factor tends to get the blame.

------
_delirium
The book _Causality_ by Judea Pearl is kind of an extended answer to this
question.

~~~
mjklin
Also much of the work of David Hume

[http://en.m.wikipedia.org/wiki/David_Hume#Causation](http://en.m.wikipedia.org/wiki/David_Hume#Causation)

~~~
scoofy
Yes, the author is describing the problem of induction. There are two major
works of Hume that take on the subject, but the most recent popular work I've
read is _The Black Swan_ by Nassim Taleb.

I'm on my phone now, but I could give a short summary of the problem, and of
why I think it is intractable, if people are interested.

------
contingencies
Err ... reliable experimental repetition of the hypothesized results against a
control (and/or variety of controls) in carefully isolated environments using
carefully controlled and documented methods that effectively remove all other
plausible explanations?

Yes, we will never know for sure because
[https://en.wikipedia.org/wiki/List_of_unsolved_problems_in_p...](https://en.wikipedia.org/wiki/List_of_unsolved_problems_in_physics)
... but we can be pretty certain.

------
bo1024
Hmm, this seems to assume that we only have access to collected data (e.g.
epidemiological studies) rather than being able to _generate_ the data
ourselves (many other types of scientific studies).

There is a key difference. A million surveys of smokers and cancer patients
will never definitively prove causality. But if we could (legally and
ethically) construct a randomized experiment in which we assigned a random
half of the participants to smoke, we could easily conclude with very high
confidence that smoking causes cancer.

(If this is not self-evident to you, the reason is straightforward
probability. Suppose smoking does not cause cancer. Then when you take any
given person and have them start smoking, their probability of getting cancer
does not go up. Then the rates of cancer in the smoking group are exactly what
they would have been had they not smoked. Then, since the people were placed
in groups at random, with very high probability the rate of cancer in the
smoking group will be almost the same as in the nonsmoking group. So if we
find in our experiment that the rates of cancer are much higher in the smoking
group, we must reject the hypothesis that smoking does not cause cancer.)
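
The probability argument above can be sketched as a small simulation. This is only a rough sketch under made-up assumptions: the 10% baseline cancer rate, the 20-point added risk, the group size, and the helper name `run_trial` are all hypothetical and chosen for illustration, not taken from any real study.

```python
import random

def run_trial(n, base_rate, smoking_effect, rng):
    """Randomly assign half of n people to smoke.

    Returns (cancer rate among smokers, cancer rate among nonsmokers).
    """
    people = list(range(n))
    rng.shuffle(people)
    smokers = set(people[: n // 2])  # random assignment, not self-selection
    cases = {True: 0, False: 0}
    for person in range(n):
        smokes = person in smokers
        risk = base_rate + (smoking_effect if smokes else 0.0)
        if rng.random() < risk:
            cases[smokes] += 1
    return cases[True] / (n // 2), cases[False] / (n - n // 2)

rng = random.Random(0)

# Hypothesis "smoking does not cause cancer": repeat the experiment many
# times and record how far apart the two groups end up by chance alone.
null_gaps = []
for _ in range(200):
    smokers_rate, nonsmokers_rate = run_trial(2000, 0.10, 0.0, rng)
    null_gaps.append(abs(smokers_rate - nonsmokers_rate))

# One experiment in a world where smoking adds 20 points of risk.
effect_smokers, effect_nonsmokers = run_trial(2000, 0.10, 0.20, rng)
effect_gap = effect_smokers - effect_nonsmokers

print("largest gap seen when smoking has no effect:", max(null_gaps))
print("gap seen when smoking has an effect:", effect_gap)
```

If the gap observed in a real experiment is far outside the range that random assignment alone produces, the "no effect" hypothesis is the thing that has to go.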

~~~
glesica
Agreed, this is also a problem in economics (though even in economics there
are some limited experiments that can be done). I think a key issue people
overlook in this debate is the presence of a theory that makes sense. If you
just plot data until you spot something that looks like a trend, or fit models
until your F-test comes back significant or your R^2 is big enough, then yeah,
you can't even make a good case for causation. But if you start with a
reasonable theory that fits with existing knowledge, then you can at least
make a case.

------
afafsd
> But, in the end, we'll never be absolutely sure that one thing causes
> another.

Maybe so, but in the same pedantic sense that we'll never be "absolutely sure"
that the universe didn't pop into existence five seconds ago with our memories
already programmed in. This level of pedantry is great for impressing your
fellow freshman philosophy students, but not actually useful for real life.

For actual practical day-to-day purposes knowing whether something causes
something else is _really useful_, so what can we do? Controlled experiments.
If you want to know whether cyanide is poisonous, get a hundred undergrads,
randomly select half of them, feed 'em cyanide and see which ones drop dead.
If you observe a strong correlation between the variable you're interested in
and the one you think you're controlling then you've established causation for
all practical purposes.

The pedant will point out there are two other possible explanations:

1\. A massive coincidence -- those fifty undergrads just _happened_ to drop
dead for unrelated reasons. You can compute the probability of this as
something like the average rate of spontaneous and inexplicable human death in
undergrads to the power of fifty and multiplied by 1/(100 choose 50) [or
something].

2\. A slightly less massive coincidence combined with a reverse-causal process. Fifty of
your undergrads just happened to be about to die anyway, and some aura
emanating from them affected your coin-flipping process. The probability of
this is harder to quantify, though since the scenario still requires fifty
cases of spontaneous undergraduate death in a short space of time we can put
an upper bound on it.
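
To get a feel for the size of the number in item 1, here is a rough sketch that just evaluates the back-of-the-envelope formula given there. The per-person death probability `p_death` is a made-up placeholder, and the formula itself is the "[or something]" estimate from above, not a careful derivation.

```python
import math

p_death = 1e-5  # hypothetical chance of a spontaneous death during the trial
n_total, n_treated = 100, 50

# Number of ways to pick which 50 of the 100 undergrads get the cyanide.
assignments = math.comb(n_total, n_treated)

# Work in log10 so the tiny probability stays readable:
# (fifty independent spontaneous deaths) * (they are exactly the treated fifty)
log10_coincidence = n_treated * math.log10(p_death) - math.log10(assignments)

print(f"ways to choose the treated group: {assignments}")
print(f"log10 of the coincidence probability: {log10_coincidence:.1f}")
```

Whatever placeholder you plug in for `p_death`, raising it to the fiftieth power dominates the result.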

------
kazinator
Oh, correlation does point to causality. What you don't immediately know is
where to put the causal arrows. A and B occur together with a high
probability: does A cause B, or B cause A? Or do they have a common cause?

Mere statistics cannot do that detective work for us, I'm afraid. But when a
pure cause-and-effect relationship is uncovered and reproduced, the
correlation will be 100%: remove the cause, the effect disappears. Restore the
cause, and the effect appears reliably.

For instance, if we stop the flow of electric current through a coil, the
magnetic field will collapse soon afterward. If we start the current again,
the field comes back. Moreover, if we place a coil into a similar magnetic
field, that field doesn't produce current. These things are 100% reproducible.
It's not some weak statistic like "in coils where current was present, the
magnetic field was observed to be 3.5% stronger on average". That kind of
situation shows that some effects which are not being controlled for are
masking the underlying causes and effects.
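
The common-cause case, and the "remove the cause" test, can be sketched in a few lines. This is a toy model under assumed numbers: `c` is a hypothetical hidden common cause, `a` and `b` each depend only on `c` plus a little noise, and the `pearson` helper is written out by hand rather than taken from any library.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
n = 5000

c = [rng.gauss(0, 1) for _ in range(n)]    # hidden common cause
a = [ci + rng.gauss(0, 0.3) for ci in c]   # A depends only on C
b = [ci + rng.gauss(0, 0.3) for ci in c]   # B depends only on C
observed = pearson(a, b)

# Intervention: we set A ourselves, ignoring C. B is still driven by C alone.
a_forced = [rng.gauss(0, 1) for _ in range(n)]
intervened = pearson(a_forced, b)

print("correlation of A and B when only observing:", round(observed, 3))
print("correlation of A and B when we control A:", round(intervened, 3))
```

Observation alone shows A and B moving together; once A is set by hand, B stops following it, which is what you would expect if neither causes the other.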

~~~
alok-g
Exactly, correlation does "point to" causality. The question is what
"establishes" it.

Also, agreed, the use of controlled experiments is probably the best we can do
to validate the hypothesis, and then assume its correctness beyond doubt until
counterexamples show up, if at all.

Correlation does not need to be 100% though. We live in the world of quantum
mechanics today where you can successfully predict the probability
distribution, but not beyond (at least as yet).

------
drakaal
"May not", not "does not".

Correlation can imply causation if you have a control, or if you can
rule out external factors. Saying it does not is not accurate.

The parrot wakes and sleeps with the sunrise and sunset. Does the parrot cause
the sunrise? No. This happened BEFORE the parrot was born. Does the sunset
cause the sleep? Throw a blanket over its cage and it goes to sleep. TADA!
Causation!

When you have two things which you can't control, and that have been going on
longer than you can observe, it gets harder. Proving that the Moon's phase
causes the tides, and not that the tides cause the Moon's phase, is a bit
trickier. Not in today's age, but at one point. There is a reason we thought
the Earth was flat, and that you could throw a stone into a pond and get a
turtle.

------
javert
> But, in the end, we'll never be absolutely sure that one thing causes
> another.

This is incorrect, because all knowledge is contextual.

For instance, it will always be true that gravity causes things to fall to the
Earth. Future knowledge can't invalidate that, it can only expand the context.

Another example: Suppose a primitive tribesman climbs a tall tree, looks all
around as far as he can see, and proclaims, "The land is flat." The eventual
discovery that Earth is round does not invalidate his claim; it expands the
context. The context of "the land [I can see]" is not the same as the context
of "the entire Earth."

~~~
alok-g
I am not sure actually that gravity "causes" things to fall to the earth.
Gravity is the "name" we give to an entity in a mathematical model that
somehow fits the observations of things falling very well. As for the "cause"
of why things fall down: AFAIK, that question is still _ultimately_
unanswerable beyond what would be just semantics. Having found the model
though, it would be now correct to say that if someone brings a meteor to the
field of the earth (cause), the meteor will fall down (effect), as far as the
model is correct.

I am not yet seeing how your second example is relevant to figuring how to
establish causation.

~~~
javert
> AFAIK, that question is still ultimately unanswerable beyond what would be
> just semantics.

Out of my range of expertise but I think this could be said about all
knowledge. For instance, about elementary particles, you can always say, "What
if there is one lower level that we can't detect?" And the right take-away is
not: "Therefore, we never know the cause of anything." The right take-away is
that "causation" has to take into account contextuality. You can make lots of
X causes Y statements that are valid, as long as you specify the context.

The second example is relevant because it's another illustration that
knowledge is contextual. Though it's not about causation. The connection is
that I'm trying to use the contextuality of knowledge to defend causation. So
I was giving a "pure" example of the contextuality of knowledge that doesn't
also involve the complexity of causation.

------
srinathh
Correlation provides a starting point to investigate causality. To establish
causality, you develop hypotheses for mechanisms of causality, identify
intermediate events that should be observable if the mechanism hypothesis is
valid, and look for those events. If these are observed, you have a much
higher likelihood of causality. You'll typically do this only if the return on
investment from establishing causality is high enough, since it's much more
work, e.g. you're trying to get a drug approved by the FDA or you're trying to
justify a major investment decision.

------
spikels
4\. It does not account for Selection Bias. Just because some particular
sample shows a correlation does not mean the true distribution is correlated.
This is almost certainly true if you look at lots of (say 10+) relationships.
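
As a rough illustration of how fast this bites, the sketch below computes the chance that at least one of `k` relationships looks "significant" purely by luck. It assumes the relationships are independent and that each is tested at a 5% threshold; both assumptions, and the function name `chance_of_false_hit`, are mine and only for illustration.

```python
def chance_of_false_hit(k, alpha=0.05):
    """Probability that at least one of k independent, truly unrelated
    relationships passes a significance threshold of alpha by chance."""
    return 1 - (1 - alpha) ** k

for k in (1, 10, 50, 100):
    print(k, "relationships ->", round(chance_of_false_hit(k), 3))
```

The more relationships you look at, the closer that chance gets to certainty, even when nothing real is going on.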

It's just too easy to fool yourself. For every time someone claimed no
causality there are dozens, perhaps hundreds, of cases of false claims of
causality. There is a huge bias.

The true test of causality is reproducible experiments. If X causes Y then
making X happen should make Y much more likely. Unfortunately reality is not
so easy to arrange.

------
firebrand39
I have been privately lumping causation and correlation together for decades
now. It freed my thinking. While the two are certainly different, it is also
true that causation does not have the set-in-stoneness/invariability that is
commonly taken for granted. Witness the recent discussion about NASA's
impossible-space-propulsion.

In terms of a human acquiring knowledge, correlation certainly is the first
step. It is also called observation. Causation just seems to be a subsequent
underpinning with context and concepts.

~~~
DanBC
How do you protect yourself against bias caused by coincidence?

~~~
firebrand39
Well, in the first place I cannot. I take the thing as it is. This is the
risk-taking part. But, I guess, that is why I have to move from correlation to
causation, or rather embed this one correlation into a broader context,
because that protects me against correlational flukes.

Thanks for this inspiring question.

------
StephenGL
Wants to be smart, but uses "begs the question" in the first couple of
paragraphs. What else didn't get caught in editing?

~~~
DanBC
"Begs the question" has an established second use.

Pedants may want to switch to _petitio principii_.

You received downvotes because your comment is a middlebrow dismissal. You
seize a single snippet and use that to dismiss the entire submission.

~~~
StephenGL
Have you read the rest of the submission? It is likewise a bit of a mess. I
believe that my assessment, informed partly by this item, was more or less
accurate.

------
danieltillett
You know correlation implies causality when you can make money from the
correlation.

~~~
alok-g
Well, who knows. You may be deriving money from a different correlation than
what you thought. In other words, the cause of money-generation may not be the
correlation you thought it was.

For example, you may be making money from your customers' lack of
understanding of correlation and causation. You may convince your customers
that your product leads to revenues for them when it doesn't. And now the
money you derive may be correlated with the strength of their beliefs in your
product being correlated to their revenues, but not with your product actually
being correlated to their revenues.

~~~
danieltillett
I must remind myself to not try and be subtle here on HN. The true cause(s) of
any effect can never be determined through observation (this is Hume’s problem
of induction).

When dealing with causation there is no way around the step of having to
propose a causative model that you believe to be true - the point I was trying
to make is there is nothing better than making money to convince people that
their own model is the correct one. Making money does not mean the model is
correct, just that the person making money will believe it to be correct.

