
Want to make good business decisions? Learn causality - yarapavan
https://multithreaded.stitchfix.com/blog/2019/12/19/good-marketing-decisions/
======
Anon84
For those of you interested in learning more about causality, I just announced
on my mailing list
([https://data4sci.com/latest](https://data4sci.com/latest)) that I'm working
on a series of blog posts covering the contents of Judea Pearl's Causal
Inference in Statistics - A Primer
([https://amzn.to/39G5lWl](https://amzn.to/39G5lWl), affiliate link) using
Python.

------
h3ctic
Super interesting article. I can recommend Pearl's Book of Why to learn more
about it.

Although to me it's still quite difficult to apply the theoretical knowledge
to day-to-day applications. Any references or ideas?

~~~
satvikpendem
I find that as well. I've also read Book of Why but it's difficult to know
when and how to use its ideas. Perhaps it's just more practice of writing causal
diagrams, as Pearl seems to do with ease, just due to the years of practice
he's had. Something like programming, I guess.

------
coldtea
Here's a relevant video for causality:

[https://www.youtube.com/watch?v=1dWjKkF0Zi4](https://www.youtube.com/watch?v=1dWjKkF0Zi4)

~~~
dhimes
Well, there goes the afternoon.

~~~
peterjussi
Causality.

------
runningmike
Nice article. But playing and learning, thinking and debating about causal
loops is better imho. So I created a loopy clone:
[https://nocomplexity.com/causalloopdiagram/](https://nocomplexity.com/causalloopdiagram/)

------
lonelappde
Note that the article subtly admits that you can subvert the objectivity of
your model-based decisions through your subjective choices of how to build
the graph.

~~~
sjg007
I think the graph itself is useful for that exact reason. It lets us see the
specific model assumptions and generate the testable conditions.
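
As a rough illustration of what a "testable condition" can look like, here is a minimal sketch under assumptions that are not from the article: a hypothetical chain graph X -> Z -> Y with linear effects and Gaussian noise. That graph implies X and Y are independent once you condition on Z, so the partial correlation of X and Y given Z should be close to zero even though the plain correlation is not. The function names and the simulated data are made up for the example.

```python
import math
import random


def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_chain(n=20000, seed=0):
    """Hypothetical data generated from the assumed graph X -> Z -> Y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]   # Z depends only on X
    y = [zi + rng.gauss(0, 1) for zi in z]   # Y depends only on Z
    return x, z, y


x, z, y = simulate_chain()
marginal = pearson(x, y)             # clearly non-zero: X and Y are associated
conditional = partial_corr(x, y, z)  # near zero if the assumed graph is right

print(f"corr(X, Y)     = {marginal:.3f}")
print(f"corr(X, Y | Z) = {conditional:.3f}")
```

If real data drawn up with this graph in mind showed a large partial correlation, that would be evidence against the assumed graph, which is the sense in which the diagram makes assumptions checkable.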

------
briefcomment
This is common sense. Nothing more than you would learn in an introduction to
regression analysis class.

Still helpful to those not familiar with it.

*edit: To those downvoting, is there anything especially insightful that I missed?

~~~
throw_14JAS
I didn't downvote, but your comment came across as dismissive.

Personally, I think there's a world of difference between the aphorism of
"correlation does not equal causation" and actually understanding how
causation works. I understand the former quite well, but haven't much of a clue
about the latter. (And I'm not alone!)

~~~
briefcomment
Got it.

I left my comment for the benefit of those who have learned regression. I read
the article expecting something new, seeing how highly upvoted the post was on
HN, but realized it was basic information.

If they removed the symbols from the article, I bet the article would be
accessible to way more people.

Regarding causation, I think the essential points are:

- there is never any true causation (e.g. even though it is always dark when
I close my eyelids, I can never be sure that closing my eyelids causes the
darkness. How can I know that the next time I close my eyelids, it won't be
something other than dark?)

- you can see if a correlation may have a causal relationship by applying
common sense to the chronology of events (if darkness correlates highly with
closed eyelids, and if I notice that, chronologically, darkness has always
followed closing my eyelids, I can be more confident that the relationship may
be causal. Stats help formalize this process.)

~~~
BoiledCabbage
Try not to make comments that an article is trivial and common sense when you
admittedly missed 50% of the article, and of that 50% you weren't familiar
with any of the concepts.

Essentially that's reading an introduction, saying you knew everything in the
introduction and then claiming the article has no content.

~~~
briefcomment
That's unfair. Most of the article is basic.

The parts that I am unfamiliar with are the second paragraph in the "What is
my data telling me?" section where they describe DAG, and the last paragraph
in the "Can’t I just use XGBoost?" section, where they introduce TMLE.

Even then, DAG seems like a very straightforward concept that would be useful
to have in your toolkit, but is probably something most thoughtful people do
without explicitly thinking about it when doing regressions.

And the TMLE paragraph is essentially jargon that you would need to do your
own research to actually understand. Why couldn't they spend the article
describing this, as it seems that this is what their value add as a service is?

Overall, this piece seemed more intended to market themselves ("Look! We do
more complicated regression than you know how to do, and we're not going to
explain it in any depth.") than to be well-meaning teaching. Of course, I'm
probably seeing bad intentions when there aren't any, so I am likely wrong.

~~~
BoiledCabbage
> That's unfair. Most of the article is basic.

> Even then, DAG seems like a very straightforward concept that would be
> useful to have in your toolkit, but is probably something most thoughtful
> people do without explicitly thinking about it when doing regressions.

I appreciate the honest response. To elaborate a bit, I think you may be
mixing up the point of the article with the fact it starts with a low baseline
of expected familiarity / knowledge to be able to bring more readers along.

My read is that the purpose of the article is to explain how to perform more
accurate blocking and be explicit about your model assumptions by using DAGs,
followed by why linear regression, the common tool everyone reaches for, is
the wrong tool in this case and why TMLE is better.
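
To make the "blocking" part concrete, here is a minimal sketch. It is not TMLE and not the article's method. It is plain stratification on a single confounder, using a hypothetical graph where C affects both the treatment T and the outcome Y, and simulated data in which the true effect of T on Y is 1.0 by construction. All names and numbers are assumptions for illustration.

```python
import random


def simulate(n=50000, seed=1):
    """Hypothetical data: C -> T, C -> Y, T -> Y with a true effect of 1.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = 1 if rng.random() < 0.5 else 0
        # Units with C = 1 are much more likely to be treated.
        t = 1 if rng.random() < (0.8 if c else 0.2) else 0
        y = 1.0 * t + 2.0 * c + rng.gauss(0, 1)
        rows.append((c, t, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_effect(rows):
    """Difference in mean outcome, ignoring the confounder."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def adjusted_effect(rows):
    """Compare treated vs. control within each level of C, then average
    the per-stratum differences weighted by how common each level is."""
    total = 0.0
    for level in (0, 1):
        stratum = [(t, y) for c, t, y in rows if c == level]
        weight = len(stratum) / len(rows)
        diff = (mean([y for t, y in stratum if t == 1])
                - mean([y for t, y in stratum if t == 0]))
        total += weight * diff
    return total


rows = simulate()
print(f"naive estimate:    {naive_effect(rows):.2f}")
print(f"adjusted estimate: {adjusted_effect(rows):.2f}")
```

In this setup the naive comparison overstates the effect because treated units are also more likely to have C = 1, while the stratified estimate should land close to 1.0. Which variables to stratify on is exactly what the DAG is supposed to tell you.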

That's the meat of what the article builds to, and is 9 out of the 30
paragraphs. Roughly 1/3rd (not 1/2 as guessed).

> Why couldn't they spend the article describing this, as it seems that this
> is what their value add as a service is?

Agreed, but I think this is a different type of article. It's (in my opinion)
an article saying "hey, there is a problem or shortcoming in what you're
doing today." It gives justification as to why many people do have this
problem, and then says "These are the tools to help you fix it."

A person who was unaware now knows what to look into to solve their problem.
Presumably a treatment of the solutions doesn't fit in a blog post - and it
doesn't have to. There is value in identifying a person's problem and telling
them how to learn about the solution, even without fully detailing the
solution. And I think this is your main criticism of the post: that it wasn't
a post that just presents the solution in full detail.

That said, I don't want this to seem like I'm beating you up - but rather to
show that there are many different ways to get value and information on a
topic without a detailed treatment of a solution.

And cycling back to your original point, I agree. I think the reason they
don't go into full detail on the solution is that the amount they'd need to
cover for a full treatment would make the cost outweigh the value in a blog
post that is meant to bring people to their company/service.

