Causal Inference and Data-Fusion in Econometrics (2019) (arxiv.org)
61 points by viburnum 39 days ago | 14 comments

I suspect we'll see causal techniques start merging with more traditional AI/ML tools over the coming years.

Causal forests are one example, extending random forests, but I imagine much of the value in current pipelines would come from using causality as a regulariser. This could be a parameter that controls the weight of established causal links, or it could act as a scaffold: a first 'causal pass' establishes constraints (monotonicity, conditional variable selection, rejecting changes that produce predictions inconsistent with the initial causal model when that model is strong, etc.).
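A minimal sketch of the "reject inconsistent changes" idea, assuming the causal pass has concluded that the response should never decrease as the input grows. The model functions, the dose/response framing and the helper names are all hypothetical, made up for illustration:

```python
from typing import Callable, Sequence

Model = Callable[[float], float]

def is_monotone_increasing(predict: Model, grid: Sequence[float], tol: float = 1e-9) -> bool:
    """Return True if predictions never decrease along the sorted grid."""
    preds = [predict(x) for x in sorted(grid)]
    return all(b >= a - tol for a, b in zip(preds, preds[1:]))

def accept_update(current: Model, candidate: Model, grid: Sequence[float]) -> Model:
    """Keep the candidate only if it respects the assumed causal constraint."""
    return candidate if is_monotone_increasing(candidate, grid) else current

# Hypothetical models mapping a dose to a predicted response.
def current_model(dose: float) -> float:
    return 0.5 * dose

def bad_candidate(dose: float) -> float:
    return 0.5 * dose - 0.1 * dose ** 2  # bends downward past dose = 2.5

def good_candidate(dose: float) -> float:
    return 0.6 * dose

grid = [i / 10 for i in range(0, 51)]  # doses from 0.0 to 5.0
chosen_after_bad = accept_update(current_model, bad_candidate, grid)
chosen_after_good = accept_update(current_model, good_candidate, grid)
```

Here the constraint acts as a hard filter; a softer version could instead add a penalty proportional to the size of the violation.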

RL is likely more promising. If agents could be made to search for causality in an environment, the relationships they find could be made much harder to unlearn, which would then enable more efficient exploration and incremental learning. Framed this way, causality guides attention, limits the search space and locks in learning.

I've got some quarantine reading/experiments to try! :)

I totally agree about causation as a sort of generalization enhancer akin to regularization. In stats, there's the notion of some "true" parameter that you're trying to estimate, but all sorts of systematic errors creep in if you estimate it wrong. If you get a good estimate of it, though, you've learned something "true", and that generalizes much better than systematically wrong versions. Like, if you figure out F and m and that F=ma, you're going to make really good predictions, regardless of how far from your original training data you are. Other truths are still pretty limited (like the example of a social study finding the true treatment effect of something on affluent white 20 year olds in LA), but the scientific ideas of internal and external validity still apply quite nicely.
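A toy sketch of the F=ma point, under made-up assumptions: training data where mass barely varies, a purely associational line fit that ignores mass, and a structural fit of the form a = c * F / m. All numbers and function names are illustrative.

```python
import random

random.seed(0)

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx
    return my - b1 * mx, b1

# Training data: masses stay close to 1, so mass looks irrelevant.
train = []
for _ in range(200):
    force = random.uniform(1.0, 10.0)
    mass = random.uniform(0.9, 1.1)
    accel = force / mass + random.gauss(0.0, 0.05)
    train.append((force, mass, accel))

# Associational model: acceleration as a line in force, ignoring mass.
b0, b1 = fit_line([f for f, _, _ in train], [a for _, _, a in train])

def predict_line(force, mass):
    return b0 + b1 * force

# Structural model: a = c * F / m, with c fitted by least squares through the origin.
ratios = [f / m for f, m, _ in train]
accels = [a for _, _, a in train]
c = sum(r * a for r, a in zip(ratios, accels)) / sum(r * r for r in ratios)

def predict_structural(force, mass):
    return c * force / mass

# Evaluate far outside the training range: a mass ten times larger.
force, mass = 5.0, 10.0
truth = force / mass
err_line = abs(predict_line(force, mass) - truth)
err_structural = abs(predict_structural(force, mass) - truth)
```

Both models fit the training data about equally well; only the one with the right functional form keeps working once mass moves away from 1.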

If you want to read more about causal inference, I liked this flowchart on "Which causal inference book you should read".


The Book of Why by Judea Pearl may be a good starting point for anyone interested in this.

The topic is really interesting, and this book had a big impact on my understanding of the world, yet I found it pretty annoying to read. The grudge borne by J. Pearl against the statistics community who rejected his ideas is way too present in the book IMHO. He's almost like “I was right all along you fuckers, who's the boss now!” on every single page, and I feel it does his ideas a real disservice.

That was what put me off that book too. I was really excited to learn about his math, but the continuous hard sell combined with attacks on traditional statistical methods (which still have a lot of use) was pretty off-putting. I will probably pick it back up, but it was not a great way to hook readers or bring them around to your way of thinking.

I only made it halfway but my impression was, "maybe your critics are correct because I can't tell if any of this makes sense."

I think "Probabilistic Reasoning in Intelligent Systems" is a better starting point for Pearl. Not much easier but more familiar ground if you're coming from today's ML mindset.

As an aside, once while reading through Wasserman's "All of Statistics", he somehow hypnotized me into seeing the title of Ch.16 as "Casual Inference", so anyone who knows me knows that I can't help making a dad-joke about casual inference when the topic comes up.

Elias Bareinboim [1] mentions "we are beta-testing a tool called ‘Fusion’, which offers an easy-to-use way of doing causal inference from 1st principles" [2]. Tantalizing, but I've not yet seen anything else about 'Fusion'.

[1] https://causalai.net/ [2] https://twitter.com/eliasbareinboim/status/11916094504628838...

A tool from Bareinboim is mentioned in a Technology Review article [1] on causal reasoning in AI. Per the article, Pearl considers Bareinboim to be a protégé.

Quoting: "One of his systems, which is still in beta, can help scientists determine whether they have sufficient data to answer a causal question. Richard McElreath, an anthropologist at the Max Planck Institute for Evolutionary Anthropology, is using the software to guide research into why humans go through menopause (we are the only apes that do).

The hypothesis is that the decline of fertility in older women benefited early human societies because women who put more effort into caring for grandchildren ultimately had more descendants. But what evidence might exist today to support the claim that children do better with grandparents around? Anthropologists can’t just compare the educational or medical outcomes of children who have lived with grandparents and those who haven’t. There are what statisticians call confounding factors: grandmothers might be likelier to live with grandchildren who need the most help. Bareinboim’s software can help McElreath discern which studies about kids who grew up with their grandparents are least riddled with confounding factors and could be valuable in answering his causal query. “It’s a huge step forward,” McElreath says."

... looks "marginal" if you ask me. But I can't think of a better beta-tester than McElreath for this.

[1] https://www.technologyreview.com/2020/02/19/868178/what-ai-s...
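The confounding problem in the quoted passage can be sketched as a small simulation. Everything here is invented for illustration (the probabilities, the effect sizes, the variable names): children with more need are both more likely to live with a grandmother and likely to have worse outcomes, so a naive comparison gets the sign of the effect wrong, while stratifying on need recovers it.

```python
import random
from statistics import mean

random.seed(1)
TRUE_EFFECT = 5.0  # assumed benefit of living with a grandmother

rows = []
for _ in range(20000):
    need = random.random() < 0.5
    # Confounding: grandmothers are likelier to live with children who need help.
    grandmother = random.random() < (0.8 if need else 0.2)
    outcome = 50.0 - 20.0 * need + TRUE_EFFECT * grandmother + random.gauss(0.0, 5.0)
    rows.append((need, grandmother, outcome))

def diff_in_means(data):
    """Mean outcome with a grandmother minus mean outcome without."""
    with_g = [y for _, g, y in data if g]
    without_g = [y for _, g, y in data if not g]
    return mean(with_g) - mean(without_g)

# Naive comparison mixes the grandmother effect with the effect of need.
naive = diff_in_means(rows)

# Adjustment: compare within each level of need, then average over the strata.
adjusted = 0.0
for level in (False, True):
    stratum = [r for r in rows if r[0] == level]
    adjusted += diff_in_means(stratum) * len(stratum) / len(rows)

print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}, true: {TRUE_EFFECT}")
```

In real studies "need" is rarely measured this cleanly, which is presumably why a tool that tells you which studies are usable at all would matter.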

Looks like you can request to beta test it here - https://docs.google.com/forms/d/1wOal6PSRxXcMmmzXXHygDC5_rGR...

I have been working on a PhD about causal graphs for the last 6 years.

I wonder what career prospects that brings? I want to do no statistics, only programming.

Lol, don’t you think it has applications to every context? :-)

What’s your PhD work on?

I studied the standard models; they are very limited. Causes and effects are modelled as DAGs, so there cannot be any cycles or feedback loops. And some assumptions fail when there are deterministic relationships.
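The "no cycles" restriction is easy to see in code. A minimal sketch using Python's standard `graphlib` module (Python 3.9+); the two example graphs and the `causal_order` helper are hypothetical:

```python
from graphlib import CycleError, TopologicalSorter

def causal_order(parents):
    """Return a topological order of the variables, or None if the graph has a cycle.

    `parents` maps each variable to the set of its direct causes.
    """
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError:
        return None

# A chain of causes: a valid DAG.
acyclic = {"smoking": set(), "tar": {"smoking"}, "cancer": {"tar"}}

# A feedback loop: price affects demand and demand affects price.
feedback = {"price": {"demand"}, "demand": {"price"}}

order = causal_order(acyclic)   # causes always come before their effects
loop = causal_order(feedback)   # None: no causal ordering exists
```

A feedback loop like the second graph simply has no valid ordering, so anything that relies on processing causes before effects cannot be applied to it directly.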

I investigated adjustment sets and instrumental variables.
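For readers who haven't met instrumental variables, here is a rough sketch of the simplest case, the ratio (Wald) estimator cov(Z, Y) / cov(Z, X), on simulated linear data with an unobserved confounder. The data-generating process and all coefficients are assumptions chosen for illustration.

```python
import random

random.seed(2)
TRUE_EFFECT = 2.0  # assumed causal effect of x on y

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

z, x, y = [], [], []
for _ in range(20000):
    u = random.gauss(0.0, 1.0)    # unobserved confounder of x and y
    zi = random.gauss(0.0, 1.0)   # instrument: affects y only through x
    xi = zi + u + random.gauss(0.0, 1.0)
    yi = TRUE_EFFECT * xi + 3.0 * u + random.gauss(0.0, 1.0)
    z.append(zi)
    x.append(xi)
    y.append(yi)

# Regressing y on x absorbs the confounder's influence and is biased upward.
ols = cov(x, y) / cov(x, x)

# The instrument is independent of u, so this ratio isolates the effect of x.
iv = cov(z, y) / cov(z, x)

print(f"OLS: {ols:.2f}, IV: {iv:.2f}, true: {TRUE_EFFECT}")
```

The estimate is only as good as the assumption that the instrument influences the outcome solely through x, and that assumption cannot be checked from the data alone.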
