
BlackRock shelves unexplainable AI liquidity models - gsanghera
https://www.risk.net/asset-management/6119616/blackrock-shelves-unexplainable-ai-liquidity-models
======
michaelbuckbee
I learned a new term in the context of AI recently: "Specification Gaming".

There's a big list here: [https://t.co/OqoYN8MvMN](https://t.co/OqoYN8MvMN)

But it's stuff like:

- Evolved algorithm for landing aircraft exploited overflow errors in the
physics simulator by creating large forces that were estimated to be zero,
resulting in a perfect score

- A cooperative GAN architecture for converting images from one genre to
another (e.g. horses <-> zebras) has a loss function that rewards accurate
reconstruction of images from its transformed version; CycleGAN turns out to
partially solve the task by, in addition to the cross-domain analogies it
learns, steganographically hiding autoencoder-style data about the original
image invisibly inside the transformed image to assist the reconstruction of
details.

- Simulated pancake-making robot learned to throw the pancake as high in the
air as possible in order to maximize time away from the ground

- Robot hand pretending to grasp an object by moving between the camera and
the object

- Self-driving car rewarded for speed learns to spin in circles

All of which leads me to think that if you can't, at some level, explain
how/what/why it's reaching a certain conclusion, it may be reaching a
radically different end than you're anticipating.
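
A toy sketch of that failure mode (my own illustration, not from the linked
list): score a simulated "car" only by the proxy reward "total speed", and a
policy that spins in place beats the policy that actually drives to the goal.

```python
import numpy as np

def rollout(policy, steps=200, dt=0.1):
    """Simulate a toy car; return (proxy reward = accumulated speed, final position)."""
    pos, heading, speed_reward = np.zeros(2), 0.0, 0.0
    for _ in range(steps):
        speed, turn_rate = policy()
        heading += turn_rate * dt
        pos += speed * dt * np.array([np.cos(heading), np.sin(heading)])
        speed_reward += speed * dt
    return speed_reward, pos

drive_to_goal = lambda: (1.0, 0.0)    # modest speed, straight toward the goal
spin_in_place = lambda: (5.0, 50.0)   # floors it while turning hard

for name, policy in [("drive", drive_to_goal), ("spin", spin_in_place)]:
    reward, final = rollout(policy)
    print(f"{name}: proxy reward = {reward:.0f}, final position = {final.round(1)}")
# The spinning policy "wins" on the proxy reward while going nowhere.
```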

~~~
renjimen
Not quite the same, but equally disastrous, is data leakage. This comes in
some fun and unexpected forms. One example is from the fisheries monitoring
competition on Kaggle, where NNs were learning to predict the fish caught using
the background image of the boats rather than the fish shown on camera, i.e.
different boats tend to catch different fish.
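
A hedged sketch of one standard way to expose that kind of leak (my own toy
tabular stand-in, not the competition's actual image pipeline): group the
cross-validation splits by boat, so a model that has merely memorized boats
can't carry that into the validation folds.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, KFold, cross_val_score

# Hypothetical features: "boat_id" leaks the label because each boat
# mostly catches one species; the other columns are pure noise.
rng = np.random.default_rng(0)
n = 2000
boat_id = rng.integers(0, 20, size=n)
species = boat_id % 4                            # label is tied to the boat
X = np.c_[boat_id, rng.normal(size=(n, 5))]
clf = RandomForestClassifier(n_estimators=100, random_state=0)

naive = cross_val_score(clf, X, species, cv=KFold(n_splits=5, shuffle=True, random_state=0))
grouped = cross_val_score(clf, X, species, groups=boat_id, cv=GroupKFold(n_splits=5))
print("naive CV accuracy:  ", naive.mean())    # looks great: the leak does the work
print("grouped CV accuracy:", grouped.mean())  # drops sharply once boats don't repeat across folds
```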

~~~
thisisit
I have recently started learning ML and trying competitions on Kaggle, and I'm
seeing that in many of the competitions I have tried: the biggest predictor
turns out to be a feature which won't be present when the data is created.

------
agentofoblivion
I hear this a lot. In my opinion, people overestimate their ability to
“understand” non-neural net models.

For instance, take the go-to classification model: Logistic Regression. Many
people think they can draw insight by looking at the coefficients on the
variables. If it’s 2.0 for variable A and 1.0 for variable B, then A must move
the needle twice as much.

But not so fast. B, for instance, might be correlated with A. In this case,
the coefficients are also correlated and interpretability becomes much more
nuanced. And this isn’t the exception, it’s the rule. If you have a lot of
features, chances are many of them are correlated.
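
A small illustration of that point (my own hypothetical data, nothing from the
article): the coefficient you read off for A is only meaningful relative to
what else is in the model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
a = rng.normal(size=n)
b = 0.9 * a + 0.1 * rng.normal(size=n)                 # B is highly correlated with A
logit = 2.0 * a + 1.0 * b                              # "true" effects: A = 2, B = 1
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

both = LogisticRegression().fit(np.c_[a, b], y)
only_a = LogisticRegression().fit(a.reshape(-1, 1), y)
print("coef on A with B in the model:", both.coef_[0][0])
print("coef on A with B dropped:     ", only_a.coef_[0][0])
# The second coefficient absorbs most of B's effect, so "A moves the needle
# twice as much as B" is a statement about this specification, not the world.
```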

In addition, your variables likely operate at different scales, so you'll have
needed to normalize and scale everything, which adds another layer of
abstraction between you and interpretation. This becomes even more complicated
when you consider encoded categorical variables. Are you trying to interpret
each category independently, or assess their importance as a group? It's not
obvious how to make these aggregations. The story only gets more complicated
for e.g. Random Forests.

I think it’s best to accept that you can’t interpret these models very well in
general. At least in the case of some models (like neural nets), they
approximate a Bayesian posterior, which has some nice properties.

~~~
disgruntledphd2
How do neural nets approximate a Bayesian posterior? Not snark, would really
like some references if possible.

On the major point, while I agree with you, it's much nicer to be able to show
the "top" variables from a model, which is doable for logreg and forests but
much, much, much more difficult from a neural net perspective.

Additionally, as they tend to take longer to train, it's harder to iterate with
them, and as they fit so very many parameters, I'm generally pretty sceptical
as to their generalisability. That being said, in some tests I've run I've
been pleasantly surprised at their performance.

~~~
cosmic_ape
> How do neural nets approximate a Bayesian posterior?

Not sure what the GP had in mind, but if a feature x appears in a dataset n
times, pn times with a positive label and (1-p)n times with a negative one,
and your classifier f(x) is trained with the "cross-entropy" cost, then the
ideal value that minimizes the cost is f(x) = p. In this sense, f(x) is the
probability of a positive label given the feature.

Whether neural nets really realize this, and how reliable that is, is another
question. But that's the intention of the cross-entropy cost.
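
A quick numerical check of that claim (my own sketch): for a feature value
with positive-label fraction p, the per-example cross-entropy as a function of
the predicted value f is L(f) = -(p log f + (1-p) log(1-f)), and a grid search
confirms it bottoms out at f = p.

```python
import numpy as np

p = 0.3                                     # fraction of positive labels for this feature value
f = np.linspace(0.001, 0.999, 999)          # candidate predictions
loss = -(p * np.log(f) + (1 - p) * np.log(1 - f))
print("argmin f ≈", f[loss.argmin()])       # ≈ 0.3, i.e. f(x) = p minimizes the cost
```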

~~~
beta_binomial
This does not make any sense to me, and neither did the OP's comment about NNs
approximating the posterior. In fact, if p were the solution, then that would
simply be the maximum likelihood estimate, which would not include p(theta),
the prior, and hence would not be Bayesian.

~~~
cosmic_ape
Well, p definitely _is_ the solution in the case I mentioned. It is indeed the
maximum likelihood solution. You could incorporate prior information about
theta via a regularization term, if so inclined. What does not make sense
about this?

Not sure what the OP meant, but I thought it might be useful to mention how
estimators can be interpreted as anything probabilistic at all. Often,
arbitrary numbers between 0 and 1 are termed "probabilities", but in this case
there actually is some proportion or probability to which f(x) should ideally
correspond.

------
d--b
I applaud this decision.

If you can't explain the model, it means you don't know the assumptions that
went into the model's output, which means you won't see it coming when the
model stops working. And you don't want to look like a moron saying "oh, but
the model said...", or to get sued for mismanaging investors' money.

Honestly, it's probably the investors asking questions that led them to this
decision, but nonetheless, this is reason talking.

~~~
fjp
> If you can't explain the model, it means you don't know the assumptions that
> went into the model's output

This is true, but there are many, many kinds of models that have basically
zero explanatory power but have higher predictive capabilities than models
that are easier to explain. They have been around a long time and are used for
many different practical applications.

Unfortunately, the draw of that seemingly infallible, super-high predictive
capability means such models will almost certainly be heavily involved in
financial markets before long. I have no problem if some people want to risk a
bunch of money in a hedge fund that uses neural net models or whatever else,
but having enough money controlled by these models could pose a serious
systemic risk.

~~~
wpietri
> having enough money controlled by these models could pose a serious systemic
> risk

This is the part that worries me. For a decade before the 2008 financial
collapse, people were quietly saying, "Gosh, there's a lot of activity in
derivatives and we don't really know where the risk is going."

One of many factors there was the way rating agencies gave very generous
ratings to mortgage securities. Critics note that it was in their short-term
financial interest to do that. If people can screw up that badly with models
they supposedly understand, it seems to me to be even more risky when working
with models where people have just given up understanding and put their faith
in the AI oracle. As long as they get the answers that maximize their end-of-
year bonus checks, they have a strong incentive not to dig deeper.

~~~
zeroname
In my estimation, existing models are just oracles with fewer variables.

> One of many factors there was the way rating agencies gave very generous
> ratings to mortgage securities.

Garbage in, garbage out.

------
resters
Here's the scenario that makes it sensible to shelve the superior AI models:

Premise 1: a financial crisis hits, requiring some firms to accept immediate
loans (or off-books loans, aka QE) to maintain solvency (the classic 2008
scenario).

Premise 2: firms will not have equivalent exposure, so some firms fail worse
than others, but as the risk is viewed as "systemic", all get the bailout.

If some firms have AI that finds risks hidden in investments that traditional
(explainable) models ignore, then those firms will sit out of markets that
will in the meantime be profitable for the firms that are unaware of the
actual risk. Metaphorically: why ruin the 70s with an accurate HIV test?

If the same models could be used to identify and securitize (and make a market
in) the invisible risk, it's possible that the market price of the risk would
similarly lead many firms to sit out of otherwise profitable markets, as the
yields of many of the traditional investments would (after the cost of
hedging) be poor.

All this would result in a shrinking of the pie without an analytical
explanation. "What do you mean the pie is smaller than we thought it was and
we have to grow at a slower rate than we thought?", the CEO might ask.

In most scenarios where quantitative approaches give better insight into the
future, the first firm to develop the approach makes a fortune until others
can catch up.

But what we have today is a financial system where keeping the overall system
running hot is government policy, and so all participants have the incentive
to ignore information that would lead to rational reallocation of investments.

Once the system's _normal_ is leveraged/hot enough, the system becomes
resistant to certain kinds of true information.

------
lawlessone
They're right.

Why would we put an NN in charge of anything important if we can't explain how
a particular model works?

Would you want your car, or an aircraft you're on, piloted by a neural net
whose actions can't be explained?

What if it encounters an unforeseen event that causes a flash crash, or worse,
an actual crash that kills people?

Do you want to trust something built from incomplete data and simulated
annealing with your life and livelihood?

~~~
bitL
Why do you trust moody, inconsistent people with their own biases as your
leaders in life-or-death situations then?

~~~
jerf
We have centuries/millennia/choose-a-time-frame of experience with what they
do, how they tend to perform, and while it's not necessarily perfect, we know
how to engineer around their limitations, again, with centuries/millennia/etc.
of experience in that field.

Computer models are much younger, and we know they tend to have weird
pathological corners, but unlike humans and their weird pathological corners,
we have a much less firm grasp on what they are.

In many cases, humans have skin in the game, too. No computer model is yet
sophisticated enough for us to say that about it.

There is also some irrationality in having someone to blame, etc. Certainly.
But it's not the only part of the story.

~~~
samstave
Your comment made me think of something, perhaps relevant to the larger
assessment of the soundness/comfort level we will/do have with AI decision
making:

With the moody, inconsistent _people_, we can understand and evaluate their
incentive reasoning, positive or negative, and empathize with that
understanding to assign a level of trust.

With any AI, especially to those who didn't design/create it, any and all of
its motivations and incentives are completely (emotionally) opaque to the
recipient of the outcomes of its behavior.

I know we are saying the same thing, but unless a "receipt" can be given for a
decision, explaining _why_ an AI/NN made it, people will not learn to build an
understanding of trust.

Otherwise, AI will always be hated by the humans who feel slighted by the cold
decisions of a machine against them.

And that is clearly going to be the real future.

There will be anti-AI terrorism/activism and violence against systems,
companies and countries that make life-bending decisions by AI against groups
of people.

The seed of this future is the gatekeeping of capital/resources through FICO-
like systems.

~~~
cr0sh
> There will be anti-AI terrorism/activism and violence

There have already been bomb attacks in Mexico against nanotechnology
researchers, so it's not a big leap from there to AI/ML and other similar
research (especially into AGI).

------
georgeek
David Freedman has the following dialogue in his book Statistical Models:
Theory and Practice:

_Philosophers' stones in the early twenty-first century:_ Correlation, partial
correlation, cross-lagged correlation, principal components, factor analysis,
OLS, GLS, PLS, IISLS, IIISLS, IVLS, LIML, SEM, HLM, HMM, GMM, ANOVA, MANOVA,
meta-analysis, logits, probits, ridits, tobits, RESET, DFITS, AIC, BIC,
MAXNET, MDL, VAR, AR, ARIMA, ARFIMA, ARCH, GARCH, LISREL [...]

_The modeler's response:_ We know all this. Nothing is perfect. Linearity has
to be a good first approximation. Log linearity has to be a good second
approximation. The assumptions are reasonable. The assumptions don't matter.
The assumptions are conservative. You can't prove the assumptions are wrong.
The biases will cancel. We can model the biases. We're only doing what
everybody else does. Now we use more sophisticated techniques. If we don't do
it, someone else will. What would you do? The decision-maker has to be better
off with us than without us. We all have mental models. Not using a model is
still a model. The models aren't totally useless. You have to do the best you
can with the data. You have to make assumptions in order to make progress. You
have to give the models the benefit of the doubt. Where's the harm?

------
chatmasta
Are the non-AI models any more “explainable?” Models built on multivariate
statistics, processing terabytes of data a day, spitting out numbers might be
“understandable” in the sense that there is some discrete representation of
how their inputs map to outputs. But can anyone really look at those
algorithms and explain _why_ they work? What’s really the difference between
NN and advanced statistical regression, beyond differing levels of
familiarity/comfort?

~~~
bitL
Buzzword bingo is more difficult to play with NNs than throwing some
(misunderstood) p-values around for people that understand a bit about
stats/optimization/etc. but aren't at the bleeding edge (i.e. a typical MBA
with an outdated tech degree).

~~~
raxxorrax
Came here for fluid mechanics. Don't know if my tech degree is outdated
already, but I certainly skipped getting an MBA. At least I wondered a bit as
to why BlackRock is into fluidity simulations too.

~~~
bitL
Black-Scholes computed on NVidia Teslas, right?

If your tech degree is not in computer science, it's likely not outdated. If
it is in CS, then it most likely is.

------
parallel_item
I think a key factor in this decision may be the perceived risk of putting
huge capital behind a single black-box model. I would assume this differs from
more ML-heavy quant firms like Two Sigma, because BlackRock's products
generally operate at a huge scale with some central idea behind them. Two
Sigma can probably spread the same amount of assets across many different
black-box models, diversifying and reducing risk that way. In this case,
perhaps one model dictating such a huge chunk of capital was just too much
uncertainty?

I have no evidence about the scale and diversification of either firm, so
evidence refuting the above would be helpful!

~~~
beta_binomial
I think so. The ultimate question is: who are you going to sue, and who is
going to sue you, if something goes wrong? Imagine having to put your ML
researcher on the stand and having him say, "I can't say for sure that this or
that didn't affect the outcome in a meaningful way."

------
reallymental
Who's to blame when the model sees 'red'? Management needs a head, and a model
isn't one yet.

------
rq1
Quite natural. AI in market finance is a fraud for the moment.

AI models totally fail to do what classical (and parsimonious, explainable,
cheap...) methods/algos/models achieve quite easily (BS, Hawkes, RFSV,
uncertainty zones, Almgren-Chriss/Cartea-Jaimungal... etc.). Actually, I'm
tempted to say that AIs don't work at all.
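
For what it's worth, the "BS" in that list is Black-Scholes; a minimal sketch
(the standard closed form, with toy parameters of my own choosing) of why such
models count as parsimonious and explainable: five inputs, and every term has
a textbook interpretation.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call on spot S, strike K,
    maturity T (years), risk-free rate r, volatility sigma."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

print(black_scholes_call(S=100, K=105, T=1.0, r=0.02, sigma=0.2))  # ≈ 6.7
```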

What I've seen so far is funds leveraging "big data" with AI (e.g. real-time
processing of satellite imagery, cameras, (more) news...) to get more/better
information (than the others) and finally calibrate and use these
(parsimonious) models; nothing (interesting) else.

Do not get fooled. Lots of banks have announced that they use AI, simply to
surf the hype: today, if you don't do AI, you're not in, because today
everyone is a Data Scientist. That's all.

------
Invictus0
Anyone have a mirror link?

------
fipple
Corporations exist in a world with governments and politics. It's entirely
reasonable for senior management to require a methodology that they can defend
in a televised Senate hearing, even at the expense of some predictive power.

------
yters
Maybe just shelve ML and go back to traditional statistics, which focuses a
lot more on being explainable.

------
ZeroCool2u
This seems like a management problem, not a model issue.

------
zzzeek
content not available without a paid subscription?

------
fiveFeet
Is there a free version of the article available? The link requires paid
subscription.

------
bluetwo
Isn't it their choice to make?

------
m3kw9
The manager probably saw the model as a threat to his job security, looked for
a way out, and there it is: the ever-present problem of AI models.

~~~
lucozade
> The manager probably saw the model as a threat to his job security

Indeed, because if the manager doesn't understand the model well enough to
either mitigate its weaknesses or reserve sufficiently against them, they'll
probably get fired some point down the line.

~~~
5minbreak
This is a fault both of the employees who worked on this and of the manager.

Your deliverable should be an interpretable model. You can (and probably
should) make neural network models interpretable. If upper management does not
trust your performance evaluation enough to bet on it, either the evaluation
was weak (and no model should be deployed, however simple and interpretable)
or upper management doesn't know enough about modern ML to be making these
decisions.

I have sympathy for the manager in charge of making a decision on a complex
model (when all they ever knew was simple survival models and basic
statistical models). But you've got to move with the times. Your competitors
will use the most powerful models available (and some may go under due to
improper risk management). Your employees don't want to build logistic
regression models until eternity.

~~~
typeformer
Not even Google has come anywhere close to being able to make complex NN
models that have a human-interpretable "receipt" for decisions. In fact, for
certain classes of problem solving it's likely impossible, and that is already
a huge problem.

~~~
5minbreak
I posit that complex NN models can achieve the same level of interpretability
as logistic regression. In part, because some interpretability methods use
logistic regression as a white box model to explain black boxes.
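
A minimal sketch of that idea, a global surrogate (LIME-style methods do the
same thing locally around a single prediction); the gradient-boosted model and
synthetic data below are stand-ins of my own, not anything BlackRock uses.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Stand-in for the opaque model ("the complex NN").
X, y = make_classification(n_samples=5000, n_features=8, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

# White-box surrogate: a logistic regression trained to mimic the black box's
# own predictions, so its coefficients read exactly like those of a plain
# logistic regression model.
Xs = StandardScaler().fit_transform(X)
surrogate = LogisticRegression(max_iter=1000).fit(Xs, black_box.predict(X))

fidelity = (surrogate.predict(Xs) == black_box.predict(X)).mean()
print(f"surrogate agrees with the black box on {100 * fidelity:.1f}% of cases")
print("surrogate coefficients:", surrogate.coef_[0].round(2))
```

The fidelity number tells you how far to trust the coefficient readout; if the
surrogate can't mimic the black box, its explanation isn't worth much.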

In other words: If you are comfortable OK'ing a logistic regression model
(because you looked at the coefficients and they made sense), you should be
comfortable OK'ing a complex NN model (because the evaluation and
interpretability modeling makes sense).

Nitpicking, but significant: Most models don't output decisions, they output
predictions. Decision scientists then build a policy on top of the model. Key
issue here is that the policy makers don't trust the predictions. But I posit
they have no reason to trust the predictions of a logistic regression model
any more than the predictions of a complex black box. Provided, of course, you
deliver interpretability UX, confidence estimates, and strong statistical
guarantees and tests. Which is possible for even the blackest of boxes.

If automatic justification is impossible for computers/black boxes, I believe
it is impossible for humans too (as per Church-Turing). But let's say it is
impossible. Do you think Google would use a white box model to optimize
AdSense because they can't interpret powerful deep learning solutions (which
are, like risk management for BlackRock, a very critical part of their
business)?

I'd say Google came pretty close with
[https://distill.pub/2018/building-blocks/](https://distill.pub/2018/building-blocks/)
(they are not the only players in the interpretability field, and plenty of
methods are becoming available, in large part driven by academia rather than
business: interpretability and fairness are not too important for the bottom
line).

