
The Emerging Revolution in Game Theory - Anon84
http://www.technologyreview.com/view/428920/the-emerging-revolution-in-game-theory/
======
forrestthewoods
This is roughly the 5th article on a "Revolution in Game Theory" that I've
seen and not a single one of them has successfully described what the so
called revolutionary strategy is.

Here is a direct link to a PDF of a paper defining the Zero Determinant
Strategy. <http://arxiv.org/pdf/1208.2666v1.pdf> I don't really understand it.
Neither do journalists it seems. Can someone explain?

~~~
jorleif
Actually, the original paper is this one:
[http://www.pnas.org/content/early/2012/05/16/1206569109.full...](http://www.pnas.org/content/early/2012/05/16/1206569109.full.pdf+html)

Here is an description of the result and its implications
[http://golem.ph.utexas.edu/category/2012/07/zerodeterminant_...](http://golem.ph.utexas.edu/category/2012/07/zerodeterminant_strategies_in.html)

~~~
DennisP
Wow. Not long ago I was trying to test whether an idea for a certain kind of
online interaction would work out the way I hoped. I built a simulation with
agents that evolved their strategies, trying to optimize their own outcomes.

Maybe that's not such a good test after all.

~~~
AnIrishDuck
I don't believe this outcome is optimal from an evolutionary standpoint. It
requires the other player to know what your strategy is and accept the
ultimatum (I will do x% better than you). The other player still has the
option to reject the ultimatum, screwing over both parties.

In other words, for this strategy to work it needs "pragmatic" other players
that are (a) intelligent enough to know what the other player is doing and (b)
willing to accept the unequal outcome. This strategy isn't symmetric: if you
turned two of these bots against each other the total outcome would be much
worse than if two TFT bots played each other.

Thus it stands to reason that TFT or some small variant is still the
evolutionary winner.

~~~
DennisP
What I'm thinking is, it appears that players with a theory of mind may not
actually use the optimal evolutionary strategy. Since humans have a theory of
mind, my simulation may be a poor predictor of what my users would actually
do.

------
ced
It's inspiring to note that Bill Press is 64 years old, and Freeman Dyson is
88! Here's a more technical explanation, gleaned from
<http://arxiv.org/abs/1208.2666>.

Each prisoner dilemma round has four possible outcomes: CC, CD, DC and DD
(C=Cooperate,D=Defect). We study probabilistic strategies which depend only on
the previous round, so for instance, P(player 1 plays C at round i+1|CD was
played at round i)=0.3. From the paper:

 _Such games can be described by a Markov process defined by the four
probabilities that characterize each of the two player’s strategies [9]
(because this is an infinitely repeated game, the probability to engage in the
first move–which is unconditional–does not play a role here). Each Markov
process has a stationary state given by the left eigenvector of the Markov
matrix, which in this case describes the equilibrium of the process. The
expected payoff is given by the dot product of the stationary state and the
payoff vector of the strategy. But while the stationary state is the same for
either player, the payoff vector–given by the score received for each of the
four possible plays CC, CD, DC, and DD–is different for the two players for
the asymmetric plays CD and DC. Because the expected payoff is a linear
function of the payoffs, it is possible for one strategy to enforce the payoff
of the opponent by a judiciously chosen set of probabilities that makes the
linear combination of determinants vanish (hence the name ZD strategies). Note
that this enforcement is asymmetric because of the asymmetry in the payoff
vectors introduced earlier: while the ZD player can choose the opponent’s
payoff to depend only on their own probabilities, the payoff to the ZD player
depends on both the ZD player’s as well as the opponent’s probabilities. This
is the mathematical surprise: the expected payoff is usually a very
complicated function of six probabilities (and four payoff values, for the
four possible plays). When playing against the ZD strategy, the payoff that
the opponent reaps is defined by the payoffs and only two remaining
probabilities that characterize the ZD strategies._

------
gizzlon
" _Nevertheless, in a single game, the best strategy is to snitch because it
guarantees that you don't get the maximum jail term._ "

Huh? That isn't correct is it? Snitching is best because it gives the highest
payout regardless of what the other person does (snitching strictly dominates
not snitching).

If the other does not snitch and you do, you go free. If the other person does
snitch, you should as well, because it gives you 3 months instead of 6.

~~~
rubidium
You describe exactly what that sentence says...

~~~
masterzora
Not really. The sentence says that the reason you defect is specifically to
avoid the worst possible result. This is bad logic. The reason defect is
considered dominant is because, regardless of what the other player does, your
outcome is better if you defect. It is a very different statement.

------
mindcrime
Hrrm... I'm no game theory expert, but I thought that it had been known for
some time that there were strategies which could beat "tit for tat" in IPD.
But this article makes it sound like "tit for tat" was the state of the art
until this particular discovery.

Guess I need to go back and do some more reading on the subject. This reminds
me, I've been meaning to read _The Evolution of Cooperation_ [1] forever, and
haven't found time yet. _sigh_

[1]: <http://en.wikipedia.org/wiki/Evolution_of_cooperation>

------
cheshirecat
A separate example of very advanced strategies in the simplest game: Rock
Paper Scissors Programming Competition

<http://www.rpscontest.com>

Personally I think it is somehow ahead of these academic researches in some
sense.

~~~
adrianN
Except it is not difficult to prove that you can't win in expectation against
a random RPS player. The programming competition is only fun because the
players are nonrandom.

~~~
tripzilch
Nope, the random player ends up in the middle, on average.

If you'd join a tournament with _only_ random players then your optimal
strategy, actually doesn't matter because you're going to randomly win/lose
(because your opponents always play random) and on average, end up in the
middle rank, just like everybody else.

If there are other non-random players, then you can aim for the middle rank by
playing random (and one of the other non-random players will do better than
the other, and you will do worse than that one and better than the other). Or
you can try to beat the non-random players (while still doing near 50/50 odds
on the random ones).

So if you play random, and there is more than one non-random player, you will
lose.

Also, check out some of the strategies. There's a couple of really well-
performing ones that are incredibly clever, but not all that hard to
understand.

One goes like this:

It's based on trying to predict the opponent's strategy and playing against
that. But what if the opponent _expects_ that and plays _you_ against that?
A-ha but what if _you_ play against your opponent trying to play _you_?

You'd think that this reasoning can go on forever, but since there are only
three moves in RPS, and they circularly defeat eachother, you only need to
consider these three steps as the next one brings you to the move of square
one.

The algorithm then basically keeps a running score of these three different
depths of pre-emptive strategy, and picks the one that works best based on
historical moves "what would have won" (and runs random while collecting
data).

This algo did extremely well in the first couple of tournaments held many
years ago. I don't know what the most popular strategy is currently though.

------
gizzlon
A little OT, but I'm currently watchin Benjamin Polak's game theory elctures
and they are great!

<http://www.academicearth.org/courses/game-theory/> (you don't need an
account)

------
chubbard
I'm getting pretty tired of technology review extremely poor explanations for
science discoveries. I only know vaguely more than I did when I started
reading the article. When I got to the bottom I didn't know if I should be
excited or not.

------
emmelaich
Sort of related in that it pertains to Prisoner's Dilemma, and suggested to
some that there is a 'solution' to PD:

[http://www.schneier.com/blog/archives/2012/04/amazing_round_...](http://www.schneier.com/blog/archives/2012/04/amazing_round_o.html)

The relevant vid link within:

<http://www.youtube.com/watch?v=S0qjK3TWZE8>

------
dohko
So, this paper is not really demonstrating anything against the standard game
theory applied to the prisoner's dilemma.

In a single game, the proposition of the dilemma still holds true. The best
solution is if both don't talk.

In an iterative game, then things change.. but that's pretty obvious. So not
sure what is so revoluationary about this paper.

~~~
baak
I thought it said in a single game your best strategy is to talk? This
guarantees you will not take the maximum penalty.

------
ramblerman
"This and similar "quant" nonsense is precisely what has led to the biggest
uncollectable pile of derivative securities and assorted options of the too
big to fail banking disaster we are currently confronting. All these gambling
scenarios are counter productive to improvements necessary to increasing the
world's real physical wealth. These studies are worse than useless, they are
parasitic cancers on society."

This comment below the article made me smile. I kind of feel inclined to
agree, even though it is of course too simplistic a view.

------
novaleaf
Not linking to original sources. Is this poor journalism, or market strategy?

(see @jorleif's post here for the links)

------
Steko
Good old rock, nothing beats rock. Doh!

~~~
geuis
Do you actually think this comment is intelligent or otherwise improves the
discussion others are having about this topic? For one, I am pretty much
completely lost in terms of what this article is saying, and inane commentary
does not help in the process of understanding.

~~~
Steko
The gist of the article is that a long held strategy has been found inferior.
This has been hailed as a revolution by some (the author and no doubt some of
the community he's writing about) while the longer view, which can be seen in
the post's comments is that it's expected:

 _"This just seems to be an extension of the folk theorem to the family of
probabilistic strategies. 1. Any level payoff level can be supported in
equilbrium (Press & Dyson), 2. None of those equilbria survive evolutionary
refinements (this paper). Interesting stuff, but not quite revolutionary as it
is suggested. "_

The reference I was making was a famous line from the Simpsons where Bart's
strategy of "good old rock, nothing beats that" is adapted to by Lisa who
picks paper. Yeah I misquoted, season 4 was a long time ago :)

<http://www.youtube.com/watch?v=NMxzU6hxrNA>

No it probably wasn't the most intelligent or informed comment I could have
made. Sometimes the not most intelligent comments can be useful. Maybe this
one wasn't one of those but, /shrug I would probably make it again.

