
Check this one out, it is the classic book on causal inference: https://www.amazon.com/Experimental-Quasi-Experimental-Desig...

Nah, Robins started (asterisk) all this causal/propensity-scored/pretend-experimental mania.

It's what drove me back to hard interventional experiments and eventually to adaptive clinical trial design.

(Asterisk): ok more like Pearl started it and Robins made it more practical with doubly robust designs etc. I still find most of the mechanics rather shady.

> I still find most of the mechanics rather shady.

Interesting, can you expand on this? I have no experience with causal inference and would like to learn more. Thanks!

Look up confounding by indication and some of the power/sensitivity studies for so-called doubly robust estimators. Counterfactuals are reasonable. A lot of "let's turn this observational study into a designed experiment with math" approaches turn out not to be. That's the gist of it; if you want rigor, you should read the papers. I don't have a reference to hand at the moment (on phone), but it shouldn't take more than a few minutes of searching Google Scholar to hit the appropriate vein. The bottom line is simply TANSTAAFL.
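For readers new to the term, here is a minimal sketch of confounding by indication on simulated data. Everything in it is an assumption made for illustration, not taken from any paper mentioned in this thread: the `severity` variable, the linear treatment probability, the outcome model, and the true effect of -1.0. Sicker patients are more likely to be treated, so the naive treated-vs-untreated comparison understates the benefit, while inverse-probability weighting recovers it only because the sketch hands it the true propensity.

```python
import random


def simulate(n=200_000, true_effect=-1.0, seed=0):
    """Simulate (treated, propensity, outcome) rows; all parameters are hypothetical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severity = rng.random()              # confounder, uniform on [0, 1)
        p_treat = 0.1 + 0.8 * severity       # sicker patients get treated more often
        treated = 1 if rng.random() < p_treat else 0
        # Outcome: higher is worse. Severity raises it, treatment lowers it.
        outcome = 2.0 * severity + true_effect * treated + rng.gauss(0.0, 1.0)
        rows.append((treated, p_treat, outcome))
    return rows


def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for t, _, y in rows if t == 1]
    control = [y for t, _, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_difference(rows):
    """Normalized (Hajek) inverse-probability-weighted difference in means.

    Uses the TRUE propensity from the simulation, which is never
    available in a real observational study.
    """
    num1 = den1 = num0 = den0 = 0.0
    for t, p, y in rows:
        if t == 1:
            w = 1.0 / p
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - p)
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


rows = simulate()
naive = naive_difference(rows)
ipw = ipw_difference(rows)
print(f"true effect: -1.00, naive: {naive:.2f}, IPW: {ipw:.2f}")
```

The weighting works here only because the sole confounder is measured and the propensity is known exactly. With an estimated propensity model or an unmeasured confounder, the adjusted estimate can be just as wrong as the naive one, which is the TANSTAAFL point.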

Have you seen the new paper, “Human Decisions and Machine Predictions”? http://scholar.harvard.edu/files/sendhil/files/w23180.pdf

I'm wondering if their methodology is reasonable?

From the abstract: “Millions of times each year, judges must decide where defendants will await trial—at home or in jail. By law, this decision hinges on the judge’s prediction of what the defendant would do if released. … Yet comparing the algorithm to the judge proves complicated. … We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. … We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. … A policy simulation shows crime can be reduced by up to 24.8% with no change in jailing rates, or jail populations can be reduced by 42.0% with no increase in crime rates. Moreover, we see reductions in all categories of crime, including violent ones. Importantly, such gains can be had while also significantly reducing the percentage of African-Americans and Hispanics in jail. … While machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals.”

They seem to be fixated on how shiny and new L1 penalties are, in 2014. Greg Ridgeway started using gradient boosting machines (GBM) for propensity scoring in the early 2000s, and I didn't see them cite him, so I kind of hate them already. On the other hand, at least GBM works well.

I'm no economist, though. Perhaps this is novel at NBER. It's just odd to see someone acting like using an ensemble to enable data-driven model selection is something new.

NB: I didn't read the entire 76-page paper (partly because it's obscenely verbose). These are my from-the-hip remarks after a quick skim. If they suck, I'll refund every cent you paid me ;-)

Looks like Chapter 15 in the Causal Inference Book agrees with you:

“Outcome regression and various versions of propensity score analyses are the most commonly used parametric methods for causal inference. You may rightly wonder why it took us so long to include a chapter that discusses these methods. So far we have described IP weighting, the g-formula, and g-estimation–the g-methods. Presenting the most commonly used methods after the least commonly used ones seems an odd choice on our part. Why didn’t we start with the simpler and widely used methods based on outcome regression and propensity scores? Because these methods do not work in general. More precisely, the simpler outcome regression and propensity score methods–as described in a zillion publications that this chapter cannot possibly summarize–work fine in simpler settings, but these methods are not designed to handle the complexities associated with causal inference for time-varying treatments.”

Interesting. It was my impression that Donald Rubin was the main instigator of propensity-score methods. Do you have some paper references in mind to give me a better sense of the timeline?

Quite possible -- Robins was pointed out to me as one of the pioneers by Sander Greenland, around 2008 or so? I'm more of an experimental/clinical statistician at this point. The work I did with propensity scoring and doubly robust estimation convinced me that shrinkage was a less-bad approach for the things I needed to do. YMMV.

Thank you, I was not aware of this book. I am currently working my way through Counterfactuals and Causal Inference (2nd ed.), and I would definitely recommend it (arguably over Judea Pearl's Causality) to those who are looking for a comprehensive, introductory text.
