
Game Theory Reveals the Future of Deep Learning - curiousgal
https://medium.com/intuitionmachine/game-theory-maps-the-future-of-deep-learning-21e193b0e33a#.7v4y09kf4
======
Houshalter
Adversarial networks can be viewed as different agents competing against each
other. But beyond that, I don't see how game theory is relevant or what
insights it provides. This article is super vague and confusing to me.

>Where do these regularizations come from and how can we select a good
regularization? How do we handle impartial information? This is where a game
theoretic viewpoint becomes important. Generalization is sometimes referred to
as ‘structural risk minimization’.

Ok... and what useful insight do the game theorists have here? He doesn't
elaborate at all. Regularization is already a very well-studied concept, and I
don't see how it's connected to game theory at all. Regularization is all
about priors and Occam's razor.
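
For what it's worth, the priors connection has a standard formal statement (a
textbook result, not something from the article): with a Gaussian prior on the
weights, MAP estimation is exactly L2-regularized loss minimization,

    \hat{w} = \arg\max_w \, p(w \mid D)
            = \arg\min_w \Big[ -\log p(D \mid w)
              + \tfrac{1}{2\sigma^2} \lVert w \rVert_2^2 \Big],
    \quad \text{for } w \sim \mathcal{N}(0, \sigma^2 I)

i.e. the familiar loss plus lambda times squared weight norm, with lambda =
1/(2*sigma^2).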

~~~
0binbrain
I believe the key point here is that as adversarial networks compete against
each other repeatedly, they will eventually converge on a Nash equilibrium.

Nash Equilibrium from Wikipedia: "Amy is making the best decision she can,
taking into account Phil's decision while Phil's decision remains unchanged,
and Phil is making the best decision he can, taking into account Amy's
decision while Amy's decision remains unchanged."

At the point this happens, a stable solution exists, which is almost like
saying that we have made the most rational decision given the information we
had. Nash's novel contribution was to show that if the participants are
rational, they will converge on an equilibrium and not get lost in endless
recursion. Endless recursion is something like the participants endlessly
guessing each other's moves: if Amy knows Phil knows Amy knows Phil knows Amy
knows Phil... Obviously it would be problematic for DL to plunge into infinite
recursion.

I like to think of the Nash equilibrium as a balanced teeter-totter. When the
teeter-totter becomes unbalanced, rational systems will relearn to balance the
odds based on the properties of risk (what Nash referred to as utility, I
believe). Once they do so, they once again converge on an equilibrium.
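
Formally (the standard two-player definition, not something specific to the
article), a strategy pair (s1*, s2*) is a Nash equilibrium when neither player
can improve their utility by deviating unilaterally:

    u_1(s_1^*, s_2^*) \ge u_1(s_1, s_2^*) \quad \forall s_1, \qquad
    u_2(s_1^*, s_2^*) \ge u_2(s_1^*, s_2) \quad \forall s_2

The "Amy and Phil" quote above is exactly this condition in words.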

~~~
PeterisP
The problem with the statement "as adversarial networks compete against each
other repeatedly, they will eventually converge on a Nash equilibrium" is that
they often won't, and a big problem with training GANs is ensuring that they
do converge.

Intuitively, what can happen is that one of the problems is simpler to learn
than the other, and when one of the "players" becomes overwhelmingly good, the
other part of the network stops receiving useful feedback, cannot find a
direction for improvement, and learning stops. A "sufficiently smart" method
would be able to reach a Nash equilibrium even in this case, but current GAN
methods cannot, so you need to take steps to ensure that it doesn't happen,
e.g. extra normalization or less training for the "simpler to learn"
component.

It's not enough to train the "adversarial" networks to compete during each
inference; on a meta-level you must ensure a form of cooperation, so that they
are effective _teachers_ for each other during the learning process. For a
real-life analogy, there's a difference between effective behavior during
combat and effective behavior during sparring.
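
To make the "less training for the simpler-to-learn component" idea concrete,
here's a minimal sketch assuming PyTorch; the toy models, the stand-in data,
and the 2:1 update ratio are all hypothetical placeholders, not a tuned
recipe:

    import torch
    import torch.nn as nn

    # Toy generator and discriminator; both are hypothetical stand-ins.
    G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
    D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    for step in range(10000):
        real = torch.randn(64, 2) * 0.5 + 1.0      # stand-in "real" data
        fake = G(torch.randn(64, 16))

        # Update the discriminator only every other step, so the
        # "simpler to learn" player never becomes overwhelmingly good
        # and starves the generator of gradient signal.
        if step % 2 == 0:
            opt_d.zero_grad()
            loss_d = (bce(D(real), torch.ones(64, 1)) +
                      bce(D(fake.detach()), torch.zeros(64, 1)))
            loss_d.backward()
            opt_d.step()

        # Update the generator every step.
        opt_g.zero_grad()
        loss_g = bce(D(fake), torch.ones(64, 1))   # non-saturating G loss
        loss_g.backward()
        opt_g.step()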

~~~
0binbrain
Yeah, makes sense. It seems like when a network stops providing useful
adversarial feedback, that in and of itself is a learning experience for the
smarter network. While it didn't reach an equilibrium, it now knows it's at
least smarter than the other guy. I feel like it should be able to use that
experience of winning to beat the next guy.

It seems like a perfect equilibrium wouldn't be easy for an AI to reach; it's
often hard for the human brain to reach one, after all. It's the fact that an
equilibrium exists, though, and that the system is trained to find it, that
generates the knowledge along the way. Kind of a journey-not-the-destination
learning method. I'm just theorizing though.

------
graycat
My background in game theory is from:

(1) careful study, teaching, and application of the detailed mathematics of
linear programming and, in particular, its approach to proving the basic von
Neumann saddle point theorem of game theory (a minimal LP sketch of this
construction appears at the end of this comment),

(2) applications of game theory to US strategic systems analysis, e.g., the
_Blue_ side positioning submarines and targeting multi-warhead nuclear
missiles and the _Red_ side allocating submarine search resources so that, if
Red shoots first, Blue gets maximum expected target value destroyed,

(3) study of game theory from

Guillermo Owen, _Game Theory_ , W. B. Saunders.

(4) study of game theory, including Nash's result (with a nice proof by Lemke)
and Sion's result, in

T. Parthasarathy and T. E. S. Raghavan, _Some Topics in Two-Person Games_ ,
ISBN 0-444-00059-3, American Elsevier.

With this background, I read the OP and found no clear connections with
anything in _game theory_.
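
For anyone curious about item (1), here is a minimal sketch of that LP
construction, assuming numpy and scipy are available; the matching-pennies
payoff matrix is just a stand-in example:

    import numpy as np
    from scipy.optimize import linprog

    # Row player's payoffs: A[i, j] is what row wins when row plays i
    # and column plays j (matching pennies, a stand-in example).
    A = np.array([[1.0, -1.0],
                  [-1.0, 1.0]])
    m, n = A.shape

    # Variables z = [x_1..x_m, v]: maximize the game value v subject to
    # (A^T x)_j >= v for every column j, sum(x) = 1, x >= 0.
    # linprog minimizes, so the objective is -v.
    c = np.concatenate([np.zeros(m), [-1.0]])
    A_ub = np.hstack([-A.T, np.ones((n, 1))])   # v - (A^T x)_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]   # x >= 0, v free

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds)
    x, v = res.x[:m], res.x[m]
    print("optimal mixed strategy:", x)         # expect [0.5, 0.5]
    print("game value:", v)                     # expect 0.0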

~~~
westoncb
So is [1], for example, misleading?

[1] [https://arxiv.org/abs/1406.2661](https://arxiv.org/abs/1406.2661)

~~~
nl
That paper doesn't mention game theory or the Nash equilibrium at all.

Perhaps you mean [2], which does. However, the article seems to misinterpret
this: the fact that the system approximates a Nash equilibrium should be
expected. Any system (including humans) that performs well in those types of
games will, almost by definition.

[2]
[https://arxiv.org/pdf/1603.01121.pdf](https://arxiv.org/pdf/1603.01121.pdf)

~~~
westoncb
It doesn't mention Nash equilibrium, but the whole approach they are
discussing comes from the idea of pitting two networks against each other in a
"minimax two-player game".

"We propose a new framework for estimating generative models via an
adversarial process, in which we simultaneously train two models ... This
framework corresponds to a minimax two-player game."

Edit: regarding the paper you linked to: the fact that a Nash equilibrium is
expected is an important part of why it's interesting to structure things this
way. You get an optimization process that is guaranteed to stabilize, whereas
it's problematic that other approaches diverge.

"When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a
Nash equilibrium, whereas common reinforcement learning methods diverged"

~~~
nl
I think we agree?

The article claims (based on the paper I linked, and others) that:

 _What we see in these 3 players are 3 different ways game theory plays in
Deep Learning. (1) As a means of describing and analyzing new DL
architectures. (2) As a way to construct a learning strategy and (3) A way to
predict behavior of human participants. The last application can make your
skin crawl!_

Of these claims, I think all are wrong.

One can certainly make the case that it is interesting to measure the
performance of a system compared to the Nash equilibrium (when possible), but
the author seems to think that the designers of the system are somehow using
game theory to design the system.

~~~
westoncb
They are in one of the cases pointed out, not the other two. The novel idea is
to have two networks compete against each other in a two-player minimax game,
developing progressively better strategies until something like a Nash
equilibrium is reached--at which point you have one well-trained classifying
network and one well-trained generating network.
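
For reference, the value function from the Goodfellow et al. GAN paper linked
upthread is exactly this two-player minimax game:

    \min_G \max_D V(D, G)
        = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)]
        + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

The paper shows that at the global optimum of this game the generator
reproduces the data distribution and the discriminator outputs 1/2 everywhere.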

------
aub3bhat
The author is wildly extrapolating limited results ("Nash Equilibrium outcomes
achieved by agents trained using deep reinforcement learning") to the entire
field. Just because DRL agents approach a Nash equilibrium in a limited set of
games does not mean that using Nash equilibria will somehow help design future
algorithms. E.g., for several games such as Pong and Go, on which we know that
DRL outperforms traditional methods, there is no tractable Nash equilibrium.

That leaves aside the many other problems that come under the umbrella of deep
learning, such as recognition, detection, and sequence-to-sequence prediction,
where there is no "game theory" component at all.

~~~
westoncb
My understanding is that the use of game theory in this context has nothing to
do with the domain in which the network is being trained. It's not about
creating game-playing AIs; it's about a 'game' between two neural networks,
one that competes by getting better at classifying, and the other by getting
better at generating. That basic architecture is quite general, which I think
is why you also see e.g. LeCun being excited about this direction.

To be fair, the article does a kind of weird thing in discussing three
different ways game theory has shown up in the context of machine learning
recently, but they're all separate ideas really.

Edit: a more precise description in the abstract here:
[https://arxiv.org/abs/1406.2661](https://arxiv.org/abs/1406.2661)

------
arcanus
Game theory is a fun way to tell stories... but it is seldom predictive.

Conversely, machine learning seldom provides a story (at least in a
human-readable form), but it is predictive by nature.

~~~
sgt101
Well, here are a few predictions made by game theory:

1. There wasn't a nuclear war today.

2. Every telco in the world gives large chunks of money to governments due to
auctions.

3. eBay sellers get higher prices than they should.

Also, I can read association rules, decision trees, and Lisp programs OK...
and yet my machine learning programs (OK, OK, nowadays the ones I get out of
various APIs and packages) have written them...

~~~
taeric
1 and 3 aren't predictions, but post-hoc explanations.

There is an argument that game theory helped lead to 1, I suppose. But, in
general, I think that is giving them more credit than they would even ask for.
Mutually assured destruction is far from their central theorem.

2 could just as easily be explained by standard supply/demand scenarios. There
is a very limited supply of spectrum that is auctioned off to telcos, and they
have a very vested interest in buying as much of it as they can, even if they
are not going to utilize it.

~~~
sgt101
What!!!

1) You can allocate credit how you like, but there's a dominant strategy for
that too.

2) OK, for the UK case see

[http://www.nuff.ox.ac.uk/users/klemperer/biggestpaper.pdf](http://www.nuff.ox.ac.uk/users/klemperer/biggestpaper.pdf)

And look them up.

Interestingly, your position makes me wonder if you are actually an exec
working in a senior finance position at a major telco?

~~~
taeric
I'm looking forward to reading that in more depth, but I am curious what you
think my claim was.

I gladly admit I am using a layperson's view that the spectrum auctioned off
to telcos is a) likely to just get bought up by entrenched actors, and b)
unlikely to lead to any innovations in the marketplace.

Evidence that I am wrong on either of those points would be fun to read. My
second claim, in particular, is one that I suspect is ill-founded.

~~~
oli5679
EU economies auctioned off 3G spectrum at the same time. The ones with
sloppily designed allocation rules raised far less money (controlling for
GDP).

~~~
taeric
Yeah, that is what the linked paper was about. However, I'm not sure I care
much for the goal of raising more money. In some ways, I view the money raised
as heat generated by a mechanism: a necessary byproduct that is certain to
correlate heavily with any outcome, but is ultimately waste that should be
minimized.

The cynic in me sees this as yet another way economists have found to increase
the wealth-generation capabilities of valuable things while losing sight of
the utility that was there.

~~~
oli5679
So the money goes either to the government or to the telcos' employees and
shareholders. It's a valid goal for the government to try not to undersell the
asset, but as you say, it's primarily a transfer. If you want an example that
changes more than market prices, have a look at Roth's work on matching for
hospitals, schools, and organs:

[http://www.nobelprize.org/nobel_prizes/economic-sciences/laureates/2012/popular-economicsciences2012.pdf](http://www.nobelprize.org/nobel_prizes/economic-sciences/laureates/2012/popular-economicsciences2012.pdf)

------
lngnmn
Game theory has nothing to say about partially observable, stochastic,
continuous environments, and it has no concept of the future.

~~~
nabla9
> Game theory has nothing to say about partially observable, stochastic,
> continuous environments, and it has no concept of the future.

This is not true. What makes you think this way?

Game theory studies games with stochastic outcomes, adversaries, and partial
observability. Bayesian Nash equilibrium, trembling-hand equilibrium, and
correlated equilibrium are generalizations of the Nash equilibrium into these
domains. Game theory also extends to continuous environments and to sequential
games that have a concept of past and future.

~~~
aub3bhat
What is the Nash equilibrium of Pong? Or Go? Or chess?

All the generalizations you cite are fancy economics methods which lose
applicability at the first contact with complex real-world scenarios.

Game theory is like asking questions such as "Is chess a win for white?" [1]
Sure, answering this question might be important theoretically, but it's of no
use when designing an agent that plays and learns.

[1]
[https://en.wikipedia.org/wiki/Solving_chess](https://en.wikipedia.org/wiki/Solving_chess)

~~~
mafribe

    lose applicability at the first contact

When thinking about complex domains, having a simplistic model is helpful for
a variety of reasons: one is to establish a shared vocabulary to communicate
with others; another is to _study_ where, why, and how the simplistic model
fails to deal with the complex domain. This gives an indication of how to go
about dealing with the complex domain, namely by selectively making the
simplistic model more complicated until it is sufficiently rich.

One example of the power of simplicity is the habit of theoretical computer
scientists of using simplistic devices such as Turing machines as models of
computing, and of conflating "effective computability" with the complexity
class P and polynomial reducibility (in LOGSPACE). We know that this 'loses
applicability at the first contact with reality', but it's currently our only
approach to a meaningful theory of computational complexity, and it has
already led to many deep insights, such as interactive proof systems and
zero-knowledge proofs.

~~~
aub3bhat
Hahahahaha, this reminds me of the saying about spherical cows.

If game theory were being used as an "explanation" of why deep learning
outperforms other methods, then maybe it would be fine. But we already have
statistical learning theory and PAC learning, which try to explain and prove
bounds on machine learning models.

There is nothing wrong with having a theory, but these days deep learning has
become a bandwagon that anything and everything gets related to. And I just
don't see how game theory, of all mathematical tools, is applicable to deep
learning.

Finally, the misguided theory-driven approach was exactly why deep learning
was so controversial initially. It turns out that building a theory like SLT
and deriving ML models like SVMs from it is a really bad idea. Deep learning
succeeded because the ML and vision communities adopted empiricism over the
fanciest or longest proof with a convex loss. So when someone goes around
claiming game theory as the savior/future of deep learning, I find it
perplexing.

------
pizza
I'm guessing a lot of you have read Neuromancer; I think of generative-
adversarial networks as a lot like Neuromancer-Wintermute pairs :)

Imagine Robert Axelrod's iterated artificial-intelligence-agent tournaments
[0], _reloaded_: let's pit GANs with LSTM cells, deep Q-networks,
style-transfer networks, random forest algos, MCMC algorithms, word2vec
architectures, etc. against one another, if only to see possible new failure
modes of tit-for-tat.

[0]
[https://en.wikipedia.org/wiki/Tit_for_tat#Background](https://en.wikipedia.org/wiki/Tit_for_tat#Background)

~~~
hiddencost
The second half of your post reads like word salad made from headlines about
ML. They're all words, but the fact that you thought they formed a thought
with semantic content worries me. We need fewer MBAs coked out of their minds
meddling with this stuff.

The interesting advancements in ML mostly come from deep statistics,
information theory, and very sophisticated analysis, not from combining the
most popular black boxes you read a Medium post about.

~~~
pizza
I think you might be over-fitting a bit with your model there

------
rdlecler1
One way to avoid overfitting is to trim spurious interactions, i.e., drive
weights W_ij -> 0. This constraint forces the network to economize on its
neural circuitry, thereby driving the system to find an optimal network
topology. Hypothetically this should favor certain circuit-design principles.
I don't know how much work has been done on this in artificial neural
networks, but quite a bit of work has been done with artificial gene
regulatory networks, which use the same mathematical model of integrating
weighted inputs and passing the result through some kind of sigmoidal/logistic
function.
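
A minimal sketch of the trimming idea, assuming numpy; the function names, the
L1 penalty as the trimming mechanism, and the threshold are my own
illustrative choices, not from the comment above:

    import numpy as np

    def l1_sgd_step(W, grad_loss, lr=0.01, lam=1e-3):
        """One SGD step on loss(W) + lam * ||W||_1; the subgradient of
        the L1 term pushes spurious weights toward zero."""
        return W - lr * (grad_loss + lam * np.sign(W))

    def prune(W, threshold=1e-2):
        """Hard-threshold: set W_ij -> 0 wherever |W_ij| is tiny,
        economizing on the circuitry."""
        return np.where(np.abs(W) < threshold, 0.0, W)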

------
westoncb
If you're curious what this is all about, I'd recommend jumping to this point
in a video linked to in the article:
[https://youtu.be/IbjF5VjniVE?t=42m11s](https://youtu.be/IbjF5VjniVE?t=42m11s)
—Yann LeCun will begin saying, "the best idea ever ... is adversarial
training."

The article is kind of confusing because it's discussing three different ways
game theory has shown up in the context of machine learning recently—but
they're all totally separate ideas.

------
EternalData
A lot of advances in deep learning have come from providing more data and
semi-structured information, but there is a finite number of fields to which
this can apply. Using game-theoretic dynamics could move AI toward transfer of
domain knowledge and create more adaptable deep learning architectures. It
could also be a pathway to "stronger" forms of AI.

------
nl
_and (3) A way to predict behavior of human participants. The last application
can make your skin crawl!_

A lot of machine learning is about predicting human participants. Things like
"winning" at ImageNet have devolved into predicting the exact labels humans
will put on a subset of the ImageNet dataset: the "easy" labels are solved,
and it now comes down to working out which labels most humans gave to the
confusing images.

If that makes the author's skin crawl, one has to wonder what they think ML
actually is.

------
visarga
Supervised learning is learning from pairs of input -> desired output.

Unsupervised learning is learning based on predicting future, present, or
past data from current data.

Both use immediate feedback.

Reinforcement learning is learning from dynamic datasets, as opposed to
static datasets.

Games are like dynamic datasets. Of course game theory has a lot to do with
deep learning. If the signal you are learning is dynamic, it's related to game
theory.
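
A toy sketch of the static-vs-dynamic distinction, in Python; the numbers and
the two-armed bandit are made up purely for illustration:

    import random

    # Static dataset: fixed before learning starts, immediate feedback.
    data = [(x, 2 * x) for x in range(10)]   # input -> desired output
    w = 0.0
    for x, y in data * 50:                   # repeated passes, data unchanged
        w -= 0.01 * (w * x - y) * x          # gradient step on squared error

    # Dynamic "dataset": the feedback depends on the learner's own actions,
    # which is what makes the setting game-like.
    value = [0.0, 0.0]                       # value estimates for two arms
    for t in range(500):
        if random.random() < 0.1:            # explore
            a = random.randrange(2)
        else:                                # exploit the better arm
            a = 0 if value[0] >= value[1] else 1
        reward = random.gauss(0.3 if a == 0 else 0.7, 0.1)
        value[a] += 0.1 * (reward - value[a])

    print("learned w:", w)                   # approaches 2.0
    print("arm values:", value)              # arm 1 wins out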

~~~
mcintyre1994
> If the signal you are learning is dynamic, it's related to game theory.

I'm not really familiar with the literature in this area, so sorry if this
sounds dumb. But what you're saying here seems to make sense and sounds almost
profound - like there's a vast amount of research (game theory) that's
immediately related in some way. Is there somewhere I can read more about the
relationship?

~~~
dandermotj
The references made in the article are probably the best place to start.

------
jonmc12
I think the argument for using physics models as constraints in a DL system
was made more clearly here:
[https://www.youtube.com/watch?v=5MdSE-N0bxs](https://www.youtube.com/watch?v=5MdSE-N0bxs)

------
jonbaer
Computing Equilibria in Multi-Player Games -
[http://theory.stanford.edu/~tim/papers/sym.pdf](http://theory.stanford.edu/~tim/papers/sym.pdf)

------
caycep
The best explanations of this are Ian Goodfellow's talks (I think there are
video transcripts online somewhere), including at NIPS this past year. He and
a few others also have a new deep learning book that just came out this past
month from MIT Press.

------
jondubois
This adversarial approach sounds a lot like evolutionary computation.

The idea of making artificial neural networks compete with each other based on
a selection function has been around for many years. I remember learning about
it during my undergrad studies 6 years ago.

This just sounds like a more refined version of that, using RNNs instead of
ANNs.

