Want to make good business decisions? Learn causality (187 points by yarapavan | 25 comments)

 For those of you interested in learning more about causality, I just announced on my mailing list (https://data4sci.com/latest) that I'm working on a series of blog posts covering the contents of Judea Pearl's Causal Inference in Statistics - A Primer (https://amzn.to/39G5lWl, affiliate link) using Python.
 Super interesting article. I can recommend Pearl's Book of Why to learn more about it. Although, to me it's still quite difficult to apply the theoretical knowledge to day-to-day applications. Any references or ideas?
 I find that as well. I've also read Book of Why, but it's difficult to know when and how to use these techniques. Perhaps it just takes more practice writing causal diagrams, which Pearl seems to do with ease simply due to his years of practice. Something like programming, I guess.
 Maybe this will help? https://news.ycombinator.com/item?id=21994976
 Here's a relevant video for causality:
 Well, there goes the afternoon.
 Causality.
 Nice article. But playing and learning, thinking and debating about causal loops is better, imho. So I created a loopy clone: https://nocomplexity.com/causalloopdiagram/
 Note that the article subtly admits that you can subvert the objectivity of your model-based decisions through your subjective choices of how to build the graph.
 I think the graph itself is useful for that exact reason. It lets us see the specific model assumptions and generate the testable conditions.
 This is always true: you're always making some strong assumptions when trying to make a causal claim. Using a causal framework, DAGs in this case, at least makes those assumptions explicit.
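To make that concrete, here's a toy simulation (my own invented numbers, not from the article) of the classic confounder DAG Z → X, Z → Y, X → Y. The naive regression of Y on X is biased by the back-door path X ← Z → Y, while adjusting for Z — which the DAG tells us blocks that path — recovers the true effect:

```python
import numpy as np

# Structural model (assumed for illustration): confounder Z drives both
# the "treatment" X and the outcome Y; the true causal effect of X on Y is 2.0.
rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = 2.0 * x + 3.0 * z + rng.normal(size=n)

# Naive regression of Y on X ignores the back-door path X <- Z -> Y.
naive = np.linalg.lstsq(np.column_stack([np.ones(n), x]), y, rcond=None)[0][1]

# Adjusting for Z (the DAG says Z is a sufficient adjustment set)
# recovers the true causal coefficient.
adjusted = np.linalg.lstsq(np.column_stack([np.ones(n), x, z]), y, rcond=None)[0][1]

print(naive, adjusted)  # naive is biased toward ~3.5; adjusted recovers ~2.0
```

The point is exactly the one above: the DAG is the explicit, inspectable assumption, and "adjust for Z" falls out of it mechanically rather than from intuition.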
 This is common sense. Nothing more than you would learn in an introduction to regression analysis class. Still helpful to those not familiar with it. Edit: To those downvoting, is there anything especially insightful that I missed?
 I didn't downvote, but your comment came across as dismissive. Personally, I think there's a world of difference between the aphorism "correlation does not equal causation" and actually understanding how causation works. I understand the former quite well, but haven't much of a clue about the latter. (And I'm not alone!)
 Got it. I left my comment for the benefit of those who have learned regression. I read the article expecting something new, seeing how highly upvoted the post was on HN, but realized it was basic information. If they removed the symbols from the article, I bet it would be accessible to way more people.

 Regarding causation, I think the essential points are:

 - There is never any true causation (e.g. even though it is always dark when I close my eyelids, I can never be sure that closing my eyelids causes the darkness. How can I know that the next time I close my eyelids, it won't be something other than dark?)
 - You can see whether a correlation may reflect a causal relationship by applying common sense to the chronology of events (if darkness correlates highly with closed eyelids, and if I notice that, chronologically, darkness has always followed closing my eyelids, I can be more confident that the relationship may be causal. Stats help formalize this process.)
 Try not to comment that an article is trivial common sense when, by your own admission, you missed 50% of it, and weren't familiar with any of the concepts in that 50%. Essentially, that's reading an introduction, saying you knew everything in the introduction, and then claiming the article has no content.
 That's unfair. Most of the article is basic. The parts that I am unfamiliar with are the second paragraph in the "What is my data telling me?" section, where they describe DAGs, and the last paragraph in the "Can’t I just use XGBoost?" section, where they introduce TMLE.

 Even then, a DAG seems like a very straightforward concept that would be useful to have in your toolkit, but is probably something most thoughtful people do without explicitly thinking about it when doing regressions. And the TMLE paragraph is essentially jargon that you would need to research on your own to actually understand. Why couldn't they spend the article describing this, since it seems to be their value add as a service?

 Overall, this piece seemed intended more to market themselves ("Look! We do more complicated regression than you know how to do, and we're not going to explain it in any depth.") than as well-meaning teaching. Of course, I'm probably seeing bad intentions where there aren't any, so I am likely wrong.
 Upvoted you, because why not throw around internet points! It's easy to forget that things that are basic to some are extremely advanced to others. As HN grows, it draws in more diverse backgrounds, many of whom don't have much statistics exposure. I know data scientists who work at FAANGs for whom this (the causality material) would be almost entirely new information.

 Thank you for taking the time to add more essential points. Do you think it's possible to make better decisions with partial causation, versus a more correlational approach? That is, do we really need true causation to improve our decision making, or can we make better, albeit imperfect, decisions with imperfect causality? Is that better than what most statisticians do with correlation?
 Chronology isn't enough though. Roosters crow before dawn.
 Correct. That's why I mentioned common sense.
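The rooster example can even be simulated (a toy structural model of my own; the numbers are invented): crowing reliably precedes dawn and correlates strongly with it, yet removing the rooster entirely — an intervention — leaves dawn untouched, because in the true model the sun drives both:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(rooster_present=True):
    # Assumed structural model: dawn time is driven by the sun alone;
    # the rooster reacts to the approaching dawn, crowing ~30 min earlier.
    dawn = 360 + rng.normal(0, 5, size=365)  # minutes after midnight, one year
    crow = dawn - 30 + rng.normal(0, 2, size=365) if rooster_present else None
    return dawn, crow

dawn, crow = simulate()
# Crowing precedes dawn every day and is strongly correlated with it...
print(np.corrcoef(crow, dawn)[0, 1])

# ...but "do(no rooster)" changes nothing about when dawn arrives.
dawn_no_rooster, _ = simulate(rooster_present=False)
print(abs(dawn.mean() - dawn_no_rooster.mean()))  # small: dawn is unaffected
```

This is the distinction the thread keeps circling: chronology plus correlation gives prediction, but only an intervention (real or modeled via the graph) distinguishes "precedes" from "causes".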
 Perhaps you had an exceptional intro to regression class. But the ones I’m familiar with haven’t gone into the Rubin causal model, DAGs, or combining linear treatment-effect models with non-parametric components.
 Fair. I didn't think too much about those points as they're basically footnotes, but I don't know about them either.
 On the contrary, I think they’re the main point!
 Maybe, but it's hard to agree with that when the information is restricted to two paragraphs and a footnote.
 There is no way in hell an intro to regression analysis class is going into things like TMLE.
