Using predictive models for policy is not new, in fact it was the standard appro...

zwaps · on June 13, 2019

Just to reiterate the practical issue here: We, as people, are just exceedingly bad in having AND putting the right data at the right place in any model. It's really not the models fault, and the contribution of ML is marginal in that regard.

Whether it is subsidies of farmers, education, tax reduction, minimum wage, austerity measures... history is full of deliciously wrong predictions and policy measures.

Almost all of them can be reduced to the simple fact that the DPG is not stable when varying the policy, and that simple fact is due to people being deliberatively reactive. In other words, you are missing data. Data about human behavior that is simply not observed, because it didn't happen, or because it happens inside people! And then, no matter how well you fit your conditional expectation (or other moment, or whatever you fit), the errors are simply not predictable.

We miss the counterfactual data, AND we aren't even smart enough to use all the data that we have. The less we theorize, the less we use prior logic, the more we run into these paradoxes where our policy does the exact opposite of what we intended.

This is pretty much the only real constant you can find in the last 100 years of social science.

It is therefore entirely correct that social science focuses more and more on causality - and where it can be identified. Yes, it is much harder, and the opportunities to do it correctly are scarce, but necessary. In this, trusting in more data and AI is precisely the wrong approach.

westurner · on June 13, 2019

Yes, some combination of variables/features grouped and connected with operators that correlate to an optima (some of which are parameters we can specify) that occurs immediately or after a period of lag during which other variables of the given complex system are dangerously assumed to remain constant.

> In fact, this is exactly the blindness that led to people missing the financial crisis

ML was not necessary to recognize the yield curve inversion as a strongly predictive signal correlating to subsequent contraction.

An NN can certainly learn to predict according to the presence or magnitude of a yield curve inversion and which combinations of other features.

- [ ] Exercise: Learning this and other predictive signals by cherry-picking data and hand-optimizing features may be an extremely appropriate exercise.

"This field is different because it's nonlinear, very complex, there are unquantified and/or uncollected human factors, and temporal"

Maybe we're not in agreement about whether AI and ML can do causal inference just as well if not better than humans manipulating symbols with human cognition and physical world intuition. The time is nigh!

In general, while skepticism and caution are appropriate, many fields suffer from a degree of hubris which prevents them from truly embracing stronger AI in their problem domain. (A human person cannot mutate symbol trees and validate with shuffled and split test data all night long)

> Anyone trying to understand economic phenomena needs to be keenly aware of how inference can be done, which requires an understanding (or an approach to) - that is, a theory - of the underlying mechanisms.

I read this as "must be biased by the literature and willing to disregard an unacceptable error term"; but also caution against rationalizing blind findings which can easily be rationalized as logical due to any number of cognitive biases.

Compared to AI, we're not too rigorous about inductive or deductive inference; we simply store generalizations about human behavior and predict according to syntheses of activations in our human NNs.

If you're suggesting that the information theory that underlies AI and ML is insufficient to learn what we humans have learned in a few hundred years of observing and attempting to optimize, I must disagree (regardless of the hardness or softness of the given complex field). Beyond a few combinations/scenarios, our puny little brains are no match for our department's new willing AI scientist.

zwaps · on June 13, 2019

> ML was not necessary to recognize the yield curve inversion as a strongly predictive signal correlating to subsequent contraction.

> An NN can certainly learn to predict according to the presence or magnitude of a yield curve inversion and which combinations of other features.

> - [ ] Exercise: Learning this and other predictive signals by cherry-picking data and hand-optimizing features may be an extremely appropriate exercise.

If the financial crisis has not yet occurred, how will the NN learn a relationship that does not exist in the data?

The exercise of cherry picking data and hand-optimizing is equivalent to applying theory to your statistical problem. It is what is required if you lack data points - using ML or otherwise. Nevertheless, we (as in humans) are bad at it. Speaking of the financial crisis. It was not AI's that picked up on it, it was some guys applying sophisticated and deep understanding of causal relationships. And that so few people did this, shows how bad we humans are at doing this implicitly and automatically by just looking at data!

> Maybe we're not in agreement about whether AI and ML can do causal inference just as well if not better than humans manipulating symbols with human cognition and physical world intuition. The time is nigh! In general, while skepticism and caution are appropriate, many fields suffer from a degree of hubris which prevents them from truly embracing stronger AI in their problem domain. (A human person cannot mutate symbol trees and validate with shuffled and split test data all night long)

ML and AI certainly can do causal inference. But then you have to do causal inference. Again, prediction on historical data is not equivalent to causal analysis, and neither is backtesting or validation. At the end of the day, AI and ML improves on predictions, but the distinction of causal analysis is a qualitative one.

> I read this as "must be biased by the literature and willing to disregard an unacceptable error term"; but also caution against rationalizing blind findings which can easily be rationalized as logical due to any number of cognitive biases.

No. My point is that for causal analysis, you have to leverage assumptions that are beyond your data set. Where these come from is besides the point. You will always employ a theory, implicitly or explicitly.

The major issue is not the we use theories, but rather that we might do it implicitly, hiding the assumptions about the DGP that allows causal inference. This is where humans are bad. Theories are just theories. With precise assumptions giving us causal identification, we are in a good position to argue where we stand.

If we just run algorithms without really understand what is going on, we are just repeating the mistakes from the last forty years!

> If you're suggesting that the information theory that underlies AI and ML is insufficient to learn what we humans have learned in a few hundred years of observing and attempting to optimize, I must disagree (regardless of the hardness or softness of the given complex field). Beyond a few combinations/scenarios, our puny little brains are no match for our department's new willing AI scientist.

All the information theory I have seen in any of the Machine Learning textbooks I have picked up is methodologically equivalent to statistics. In particular, the standard textbooks (Elements, Murhpy etc.) treatment of information theory would only allow causal identification under the exact same conditions that the statistics literature treats.

I fail to see the difference, or what AI in particular adds. The issue of causal inference is a "hot topic" in many fields, including AI, but the underlying philosophical issues are not exactly new. This includes information theory.

You seem to think that ML has somehow solved this problem. From my reading of these books, I certainly disagree. Causal inference is certainly POSSIBLE - just as in statistics, but ML doesn't give it to you for free!

In particular, note the following issue: To show causal identification, you need to make assumptions on your DGP (exogenous variation, timing, graphical causal relations ... whatever). Even if these assumptions are very implicit, they do exist. Just by looking at data, and running a model, you do not get causal inference. It can not be done "within" the system/model.

If you bake these things into your AI, then it, too makes these assumptions. There really is no difference. For example, I could imagine an AI that can identify likely exogenous variations in the data and use them to predict counterfactuals. That's probably not too far off, if it doesn't exist already. But, this is still based on the assumption that these variations are, indeed exogenous, which can never be proven within the DGP.

In contrast, I find that most "AI scientists" care very much about prediction, and very little about causal inference. I don't mean this subfield doesn't exist. But it is a subfield. In contrast, for many non-AI scientists, causal inference IS the fundamental question, and prediction is only an afterthought. ML in practice involved doing correct experiments (AB testing), at best. It will sooner or later also adopt all other causal inference techniques. But, my point stands, I have yet to see what ML adds. Enlighten me!

AI, ML and stats will merge, if they haven't already. The distinction will disappear. I believe the issues will not. I employ a lot of AI/ML techniques in my scientific work. Never have they solved the underlying issue of causal inference for me!

westurner · on June 13, 2019

> AI, ML and stats will merge, if they haven't already. The distinction will disappear. I believe the issues will not.

All tools are misapplied; including economics professionals and their advice.

Here's a beautiful Venn diagram of "Colliding Web Sciences" which includes economics as a partially independent category: https://www.google.com/search?q=colliding+web+sciences&tbm=i...

westurner · on June 13, 2019

A causal model is a predictive model. We must validate the error of a causal model.

Why are theoretic models hand-wavy? "That's just because noise, the model is correct." No, such a model is insufficient to predict changes in dependent variables when in the presence of noise; which is always the case. How does validating a causal model differ from validating a predictive model with historical and future data?

Yield-curve inversion as a signal can be learned by human and artificial NNs. Period. There are a few false positives in historical data: indeed, describe the variance due to "noise" by searching for additional causal and correlative relations in additional datasets.

I searched for "python causal inference" and found a few resources on the first page of search results: https://www.google.com/search?q=python+causal+inference

CausalInference: https://pypi.org/project/CausalInference/

DoWhy: https://github.com/microsoft/dowhy

CausalImpact (Python port of the R package): https://github.com/dafiti/causalimpact

"What is the best Python package for causal inference?" https://www.quora.com/What-is-the-best-Python-package-for-ca...

Search: graphical model "information theory" [causal] https://www.google.com/search?q=graphical+model+%22informati...

Search: opencog causal inference https://www.google.com/search?q=opencog+causal+inference (MOSES, PLN,)

If you were to write a pseudocode algorithm for an econometric researcher's process of causal inference (and also their cognitive processes (as executed in a NN with a topology)), how would that read?

(Edit) Something about the sufficiency of RL (Reinforcement Learning) for controlling cybernetic systems. https://en.wikipedia.org/wiki/Cybernetics

em500 · on June 14, 2019

What's the point of dumping a bunch of Google results here? At least half the results are about implementations of pretty traditional etatistical / econometric inference techniques. The Rudin causal inference framework requires either randomized controlled trials or for propensity score models an essentially unverifiable separate model step.

Google's CausalImpact model, despite having been featured on Google's AI blog, is a statistical/econometric model (essentially the same as https://www.jstor.org/stable/2981553). It leaves it up to the user to find and designate a set of control variables, which has to be designated by the user to be unaffected by the treatment. This is not done algorithmically, and has very little to do with RNNs, Random Forests or regression regularization.

> If you were to write a pseudocode algorithm for an econometric researcher's process of causal inference (and also their cognitive processes (as executed in a NN with a topology)), how would that read?

[1] Set up a proper RCT, that is randomly assign the treatment to different subjects [2] Calculate the outcome diffences between the treated and untreated

For A/B testing your website, the work division between [1] and [2] might be 50-50, or at least at similar order of magnitudes.

For the questions that academic economists wrstle with, say, estimate the effect of increasing school funding / decreasing class size, the effect of shifts between tax deductions vs tax credits vs changing tax rates or bands, or of the different outcome on GDP growth and unemployment of monetary vs fiscal expansion [1] would be 99.9999% of the work, or completely impossible.

Faced with the impracticallity/impossiblility of proper experiments, academic micro-economists have typically resorted to Instrumental Variable regressions. AFAICT finding (or rather, convincing the audience that you have) a proper instrument is not very amendable to automation or data mining.

In academic macro-economics (and hence at Serious Institutions such as central banks and the IMF), the most popular approaches to building causal models in the last 3 or 4 decades have probably been 1) making a bunch of unrealistic assumpsions of the behaviour individual agents (microfoundations/DSGE models) 2) making a bunch of uninterpretable and unverifyable technical assumptions on the parameters in a generic dynamic stochastic vector process fitted to macro-aggregates (Structural VAR with "identifying restrictions") 3) manually grouping different events in different countries from different periods in history as "similar enough" to support your pet theory: lowering interest rates can lead to a) high inflation, high unemployment (USA 1970s), b) high inflation, low unemployment (Japan 1970s), b) low inflation, high unemployment (EU 2010s) c) low inflation, low unemployment (USA, Japan past 2010s)

I really don't see how a RL would help with any of this. Care to come up with something concrete?

westurner · on June 14, 2019

> What's the point of dumping a bunch of Google results here? At least half the results are about implementations of pretty traditional etatistical / econometric inference techniques.

Here are some tools for causal inference (and a process for finding projects to contribute to instead of arguing about insufficiency of AI/ML for our very special problem domain here). At least one AGI implementation doesn't need to do causal inference in order to predict the outcomes of actions in a noisy field.

Weather forecasting models don't / don't need to do causal inference.

> A/B testing

Is multi-armed bandit feasible for the domain? Or, in practice, are there too many concurrent changes in variables to have any sort of a controlled experiment. Then, aren't you trying to do causal inference with mostly observational data.

> I really don't see how a RL would help with any of this. Care to come up with something concrete?

The practice of developing models and continuing on with them when they seem to fit and citations or impact reinforce is very much entirely an exercise in RL. This is a control system with a feedback loop. A "Cybernetic system". It's not unique. It's not too hard for symbolic or neural AI/ML. Stronger AI can or could do [causal] inference.

zwaps · on June 13, 2019

I am at loss at what you want to say to me, but let me reiterate:

Any learning model by itself is a statistical model. Statistical models are never automatically causal models, albeit causal models are statistical models.

Several causal models can be observationally equivalent to a single statistical model, but the substantive (inferential) implications on doing "an intervention" on the DGP differ.

It is therefore not enough to validate and run on a model on data. Several causal models WILL validate on the same data, but their implications are drastically different. The data ALONE provides you no way to differentiate (we say, identify) the correct causal model without any further restrictions.

By extension, it is impossible for any ML mechanism to predict unobserved interventions without being a causal model.

ML and AI models CAN be causal models, which is the case if they are based on further assumptions about the DGP. For example, they may be graphical models, SCM/SEM etc. These restrictions can be derived algorithmically, based on all sorts of data, tuning, coding and whatever. It really doesn't change the distinction between causal and statistical analysis.

The way these models become causal is based on assumptions that constitute a theory in the scientific sense. These theories can then of course also be validated. But this is not based on learning from historical data alone. You always have to impose sufficient restrictions on your model (e.g. the DGP) to make such causal inference.

This is not new, but for your benefit, I basically transferred the above from an AI/ML book on causal analysis.

AI/ML can do causal analysis, because its statistics. AI/ML are not separate from these issues, do not solve these issues ex-ante, are not "better" than other techniques except on the dimensions that they are better as statistical techniques, AND, most importantly, causal application necessarily implies a theory.

Whether this is implicit or explicit is up to the researcher, but there are dangers associated with implicit causal reasoning.

And as Pearl wrote (who is not a fan of econometrics by any means!), the issue of causal inference was FIRST raised by econometricians BASED on combining the structure of economic models with statistical inference. In the 1940's.

I mean I get the appeal to trash talk social sciences, but when it comes to causal inference, you probably picked exactly the wrong one.

You are free to disregard economic theory. But you can not claim to do causal analysis without any theory. Doing so implicitly is dangerous. Furthermore, you are wrong in the sense that economic theory has put causal inference issues at the forefront of econometric research, and is therefore good for science even if you dislike those theories.

zwaps · on June 13, 2019

And by the way, I can come up with a good number of (drastic, hypothetical) policy interventions that would break your inference about a market crash - an inference you only were able to make once you saw such a market crash at least once.

If this dependence is broken, your non-causal model will no longer work, because the relationship between yield curve and market crash is not a physical constant fact. What you did to make it a causal inference is implicitly assume a theory about how markets work (e.g. - as they do right now -) and that it will stay this way. Actually, you did a lot more, but that's enough.

Now, you and me, we can both agree that your model with yield curves is good enough. We could even agree that you would have found it before the financial crashes, and are a billionaire. But the commonality we agree upon is a context that defines a theory.

Some alien that has been analyzing financial systems all across the universe may disagree, saying that your statistical model is in fact highly sensitive to Earth's political, societal and natural context.

Such is the difficulty of causal analysis.

westurner · on June 14, 2019

> By extension, it is impossible for any ML mechanism to predict unobserved interventions without being a causal model.

In lieu of a causal model, when I ask an economist what they think is going to happen and they aren't aware of any historical data - there is no observational data collected following the given combination of variables we'd call an event or an intervention - is it causal inference that they're doing in their head? (With their NN)

> Now, you and me, we can both agree that your model with yield curves is good enough.

Yield curves alone are insufficient due to the rate of false positives. (See: ROC curves for model evalutation just like everyone else)

> We could even agree that you would have found it before the financial crashes,

The given signal was disregarded as a false positive by the appointed individuals at the time; why?

> Some alien that has been analyzing financial systems all across the universe may disagree,

You're going to run out of clean water and energy, and people will be willing to pay for unhealthy sugar water and energy-inefficient transaction networks with a perception of greater security.

That we need Martian scientist as an approach is, IMHO, necessary because of our learned biases; where we've inferred relations that have been reinforced which cloud our assessment of new and novel solutions.

> Such is the difficulty of causal analysis.

What a helpful discussion. Thanks for explaining all of this to me.

Now, I need to go write my own definitions for counterfactual and DGP and include graphical models in there somewhere.

zwaps · on June 15, 2019

A further hint, here is a great book about causal analysis from a ML/AI perspective

https://mitpress.mit.edu/books/elements-causal-inference

I feel like you will benefit from reading this!

It's free!

zwaps · on June 15, 2019

> In lieu of a causal model, when I ask an economist what they think is going to happen and they aren't aware of any historical data - there is no observational data collected following the given combination of variables we'd call an event or an intervention - is it causal inference that they're doing in their head? (With their NN)

It's up for debate if NN's represent what is going on in our heads. But let's for a moment assume it is so.

Then indeed, an economist leverages a big set of data and assumptions about causal connections to speculate how this intervention would change the DGP (the modules in the causal model) and therefore how the result would change.

An AI could potentially do the same (if that is really what we humans do), but so far, we certainly lack the ability to program such a general AI. The reason is, in part, because we have difficulty creating causal AI models even for specialized problems. In that sense, humans are much more sophisticated right now.

It is important to note that such a hypothetical AI would create a theory, based on all sorts of data, analogies, prior research and so forth, just like economists do.

It does not really matter if a scientist, or an AI, does the theorizing. The distinction is between causal and non-causal analysis.

The value of formal theory is to lay down assumptions and tautological statements that leave no doubt about what the theory is. If we see that the theory is wrong, because we disagree on the assumptions, this is actually very good and speaks for the theory. Lot's of social sciences is plagued by "general theories" that can never really shown to be false ex ante. And given that theories can never be empirically "proven", only validated in the statistical sense, this leads to a many parallel theories of doubtful value. Take a gander into sociology if you want to see this in action.

Secondly, and this is very important, is that we learn from models. This is not often recognized. What we learn from writing down models is how mechanics or modules interact. These interactions, highly logical, are USUALLY much less doubtful than the prior assumptions. For example, if price and revenues are equilibrium phenomena, we LEARN from the model that we CAN NOT estimate them with a standard regression model!

This is exactly what lead to causal analysis in this case, because earlier we would literally regress price on quantity or production on price etc. and be happy about it. But the results were often even in the entirely wrong direction!

Instead, looking at the theory, we understood the mechanical intricacies of the process we supposedly modeled, and saw that we estimated something completely different than what we interpreted. Causal analysis, among other things, tackles this issue by asking "what it is really that we estimate here?".