It's what drove me back to hard interventional experiments and eventually to adaptive clinical trial design.
(Asterisk): OK, more like Pearl started it and Robins made it more practical with doubly robust estimators, etc. I still find most of the mechanics rather shady.
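Since "doubly robust" may be unfamiliar: below is a minimal AIPW (augmented inverse-probability-weighting) sketch on simulated data. Everything here (the data-generating process, the sample size, the hand-rolled logistic fit) is my own illustrative assumption, not anything from Robins's papers; the point is just that the estimator combines an outcome model and a propensity model, and stays consistent if either one is correct.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                      # a single measured confounder
a = rng.binomial(1, 1 / (1 + np.exp(-x)))   # treatment depends on x
y = 2.0 * a + x + rng.normal(size=n)        # true average treatment effect = 2.0

# Outcome regressions E[Y | A=a, X] via least squares on [1, x]
X1 = np.column_stack([np.ones(n), x])
b1, *_ = np.linalg.lstsq(X1[a == 1], y[a == 1], rcond=None)
b0, *_ = np.linalg.lstsq(X1[a == 0], y[a == 0], rcond=None)
m1, m0 = X1 @ b1, X1 @ b0

# Propensity score P(A=1 | X) via logistic regression (plain gradient ascent)
w = np.zeros(2)
for _ in range(500):
    p = 1 / (1 + np.exp(-(X1 @ w)))
    w += 1.0 * X1.T @ (a - p) / n
ps = 1 / (1 + np.exp(-(X1 @ w)))

# Doubly robust (AIPW) estimate of the ATE
ate = (np.mean(a * (y - m1) / ps + m1)
       - np.mean((1 - a) * (y - m0) / (1 - ps) + m0))
```

Here both working models happen to be correctly specified, so `ate` lands near the true value of 2.0; the doubly robust property is that misspecifying one of the two models (but not both) would still leave the estimator consistent.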
Interesting, can you expand on this? I have no experience with causal inference and would like to learn more. Thanks!
I'm wondering whether their methodology is reasonable.
From the abstract: “Millions of times each year, judges must decide where defendants will await trial—at home or in jail. By law, this decision hinges on the judge’s prediction of what the defendant would do if released. … Yet comparing the algorithm to the judge proves complicated. … We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. … We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. … A policy simulation shows crime can be reduced by up to 24.8% with no change in jailing rates, or jail populations can be reduced by 42.0% with no increase in crime rates. Moreover, we see reductions in all categories of crime, including violent ones. Importantly, such gains can be had while also significantly reducing the percentage of African-Americans and Hispanics in jail. … While machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals.”
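The "quasi-random assignment of cases to judges" in the abstract is the judge-leniency instrument idea. Here is a minimal two-stage-least-squares sketch of how such an instrument recovers an effect that naive regression gets wrong; the simulation, variable names, and effect sizes are entirely my own illustrative assumptions, not the paper's actual estimation strategy.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
judges = 50
leniency = rng.uniform(0.2, 0.8, size=judges)  # each judge's base release rate
judge = rng.integers(0, judges, size=n)        # quasi-random case assignment
Z = leniency[judge]                            # instrument: assigned judge's leniency
u = rng.normal(size=n)                         # unobserved defendant riskiness

# Judges partly respond to u, so release is confounded with the outcome
release = (rng.uniform(size=n) < np.clip(Z - 0.3 * u, 0.0, 1.0)).astype(float)
y = 0.5 * release + u + rng.normal(size=n)     # true effect of release on y is 0.5

# OLS of y on release is biased: u lowers release and raises y
X = np.column_stack([np.ones(n), release])
ols = np.linalg.lstsq(X, y, rcond=None)[0][1]

# 2SLS: first stage release ~ Z, then regress y on the fitted values
Zm = np.column_stack([np.ones(n), Z])
rel_hat = Zm @ np.linalg.lstsq(Zm, release, rcond=None)[0]
X2 = np.column_stack([np.ones(n), rel_hat])
tsls = np.linalg.lstsq(X2, y, rcond=None)[0][1]
```

Because which judge you draw is (quasi-)random, `Z` is independent of the unobserved riskiness `u`, so the second-stage coefficient lands near the true 0.5 while plain OLS does not.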
I'm no economist, though. Perhaps this is novel at NBER. It's just odd to see someone act as if using an ensemble for data-driven model selection were something new.
N.B. I didn't read the entire 76-page paper (partly because it's obscenely verbose); I gave it a quick skim, so these are from-the-hip remarks. If they suck, I'll refund every cent you paid me ;-)
“Outcome regression and various versions of propensity score analyses are the most commonly used parametric methods for causal inference. You may rightly wonder why it took us so long to include a chapter that discusses these methods. So far we have described IP weighting, the g-formula, and g-estimation–the g-methods. Presenting the most commonly used methods after the least commonly used ones seems an odd choice on our part. Why didn’t we start with the simpler and widely used methods based on outcome regression and propensity scores? Because these methods do not work in general. More precisely, the simpler outcome regression and propensity score methods–as described in a zillion publications that this chapter cannot possibly summarize–work fine in simpler settings, but these methods are not designed to handle the complexities associated with causal inference for time-varying treatments.”
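To make the IP weighting mentioned in that quote concrete, here is a minimal single-time-point simulation; the data-generating process and numbers are my own illustrative assumptions, and the book's whole point is that this simple version is exactly what stops working once treatments vary over time.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
L = rng.normal(size=n)                      # measured confounder
p = 1 / (1 + np.exp(-L))                    # true P(A=1 | L)
A = rng.binomial(1, p)                      # treatment more likely when L is high
Y = 1.5 * A + L + rng.normal(size=n)        # true causal effect of A on Y = 1.5

# Naive comparison is confounded: L raises both treatment and outcome
naive = Y[A == 1].mean() - Y[A == 0].mean()

# IP weighting: weight each subject by 1 / P(A = observed value | L),
# using the true propensity here for clarity
w = np.where(A == 1, 1 / p, 1 / (1 - p))
ipw = (np.sum(w * A * Y) / np.sum(w * A)
       - np.sum(w * (1 - A) * Y) / np.sum(w * (1 - A)))
```

The weighting creates a pseudo-population in which `A` is independent of `L`, so `ipw` recovers the true effect (about 1.5) while `naive` is biased upward. With time-varying treatments, past treatment affects future confounders, which is where this one-shot recipe breaks and the g-methods come in.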