Just to reiterate the practical issue here: We, as people, are just exceedingly bad at having AND putting the right data in the right place in any model. It's really not the model's fault, and the contribution of ML is marginal in that regard.
Whether it is subsidies of farmers, education, tax reduction, minimum wage, austerity measures... history is full of deliciously wrong predictions and policy measures.
Almost all of them can be reduced to the simple fact that the DGP (data-generating process) is not stable when you vary the policy, and that simple fact is due to people being deliberately reactive.
In other words, you are missing data. Data about human behavior that is simply not observed, because it didn't happen, or because it happens inside people! And then, no matter how well you fit your conditional expectation (or other moment, or whatever you fit), the errors are simply not predictable.
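The point about an unstable DGP can be made concrete with a toy simulation. This is a minimal sketch, not any specific published model: the variable names (`tax_rate`, `ability`, `effort`, `income`) and the behavioral response are all hypothetical assumptions, chosen only to show that a model fitted under one policy mispredicts once agents react to a new one.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(tax_rate, n=10_000):
    # Hypothetical DGP: effort responds to the policy itself,
    # but that behavioral channel is never observed in the data.
    ability = rng.normal(0.0, 1.0, n)
    effort = (1 - tax_rate) + 0.5 * ability   # agents react to the tax rate
    income = 2 * effort + rng.normal(0.0, 0.1, n)
    return ability, income

# Fit a purely predictive model on data generated under the current policy.
ability, income = simulate(tax_rate=0.2)
slope, intercept = np.polyfit(ability, income, 1)

# Use the same fitted model after the policy changes: the DGP has shifted.
ability_new, income_new = simulate(tax_rate=0.5)
pred = slope * ability_new + intercept

err_old = np.mean(income - (slope * ability + intercept))
err_new = np.mean(income_new - pred)
print("mean error under old policy:", err_old)  # near zero
print("mean error under new policy:", err_new)  # large systematic bias
```

Under the old policy the fit is essentially unbiased; under the new policy the model systematically over-predicts income, because the behavioral reaction to the policy was never in the training data. No amount of extra data drawn under the old policy fixes this.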
We miss the counterfactual data, AND we aren't even smart enough to use all the data that we have. The less we theorize, the less we use prior logic, the more we run into these paradoxes where our policy does the exact opposite of what we intended.
This is pretty much the only real constant you can find in the last 100 years of social science.
It is therefore entirely correct that social science focuses more and more on causality, and on where it can actually be identified. Yes, it is much harder, and the opportunities to do it correctly are scarce, but it is necessary. In this, trusting in more data and AI is precisely the wrong approach.