
Resilient cooperators stabilize long-run cooperation in Prisoner’s Dilemma - mizzao
http://www.nature.com/articles/ncomms13800
======
mizzao
Lead author of the paper here; AMA.

The main novelty of this work is that we had 100 people play repeated PD
for a _month_ instead of <1 hour, which is typical in behavioral experiments,
in an attempt to make rather stylized cooperation experiments in the lab
closer to real life (admittedly, it is still only a small step.)

The findings can be summarized in a sentence as "a minority of nice people can
make everyone better off."

~~~
espeed
What if you modify the game a bit...?

Give players the ability to remember players they have engaged with and add a
third option: 1. Cooperate 2. Defect 3. Wait (defer, don't engage, don't
play). New players will seek out players and offer to engage with them. When
you defect, your reputation history declines, and when you die, your
reputation history resets to zero and you lose the option to defer.

HYPOTHESIS: Cooperators will wait/defer, Defectors will wait/defer, new
players will choose to cooperate or defect, and since defecting against defer
is an all-but-certain lose, new players will choose to cooperate hoping the
other player will too. Cooperators will naturally organize into groups of
cooperators and flourish. Players in cooperator groups will cooperate
tit-for-tat with known cooperators and will wait/defer with known defectors or
new/unknown players no one in the cooperator group has engaged with (call it
the no-fools strategy). New players and players who have been struggling in
the wilderness outside of cooperative groups will want to get into a
cooperative group and so they will seek to engage with players using the
tit-for-tat strategy until they find a cooperative group (call it the
pay-it-forward strategy). Fools who don't learn will stumble around in the wilderness
until they eventually die off.

~~~
mizzao
Really cool suggestions - they actually encompass a couple of ideas that we've
had.

First, we discovered in the experiment that PD actually has 3 actions.
Cooperate, defect, and ... wait as long as possible to defect to punish your
opponent who is not cooperating. This happened because participants are
playing in a real-time web app, and they have some time to make a decision.
Although there is a lot of literature on how costly punishment can be used to
enforce cooperation, it was pretty amazing how it emerged organically in our
study after a few days. Someone could certainly grab our dataset and get a
free publication out of just that.

Also, given the month-long design of the study, we were thinking of something
where players get to stay on or "die" based on their daily score, and
therefore the population evolves based on who gets the highest payoffs.
Evolutionary game theory has some ideas about this, but they are all
theoretical or simulation, and not conducted with real people. I'm actually
not sure if cooperation would be sustained or not in this design, but I agree
that it would go a long way toward pinning down the origins of cooperation. I
also agree with you that groups or network structure would affect the result a
lot, allowing cooperative bands to flourish even when defectors are running
around.
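
A toy sketch of the "stay on or die" design, in Python. Everything specific here is an assumption made for illustration and not something from the study: the three strategies, the 5/3/1/0 payoffs, 10 rounds per match, a daily round-robin, and the rule that the day's lowest scorer is replaced by a copy of the highest scorer.

```python
# Points per round (higher is better). Assumed conventional values.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# Each strategy sees only the opponent's past moves in the current match.
STRATEGIES = {
    "allc": lambda opponent_moves: "C",
    "alld": lambda opponent_moves: "D",
    "tft": lambda opponent_moves: opponent_moves[-1] if opponent_moves else "C",
}


def play(name_a, name_b, rounds=10):
    """Play one match and return both total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = STRATEGIES[name_a](moves_b)
        b = STRATEGIES[name_b](moves_a)
        gain_a, gain_b = PAYOFF[(a, b)]
        score_a += gain_a
        score_b += gain_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b


def day_scores(population):
    """Round-robin: every player meets every other player once per day."""
    scores = [0] * len(population)
    for i in range(len(population)):
        for j in range(i + 1, len(population)):
            gain_i, gain_j = play(population[i], population[j])
            scores[i] += gain_i
            scores[j] += gain_j
    return scores


def evolve(population, days):
    """Each day the lowest scorer "dies" and is replaced by a copy of the
    highest scorer (ties go to the earliest player in the list)."""
    population = list(population)
    for _ in range(days):
        scores = day_scores(population)
        worst = scores.index(min(scores))
        best = scores.index(max(scores))
        population[worst] = population[best]
    return population


start = ["allc", "allc", "alld", "alld", "tft", "tft"]
print(evolve(start, 2))   # ['alld', 'alld', 'alld', 'alld', 'tft', 'tft']
print(evolve(start, 10))  # ['tft', 'tft', 'tft', 'tft', 'tft', 'tft']
```

In this particular setup the defectors first wipe out the unconditional cooperators and then lose to tit-for-tat once there is nobody left to exploit. That outcome depends heavily on the assumed rules, so it says nothing about what real participants would do.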

~~~
espeed
Are you familiar with Jonathan Haidt's work on the psychology of morality and
the formation of superorganisms?

[https://en.wikipedia.org/wiki/Jonathan_Haidt](https://en.wikipedia.org/wiki/Jonathan_Haidt)

NB: Also see this timely post on "information theory and the foundations of
life"
[https://news.ycombinator.com/item?id=13496133](https://news.ycombinator.com/item?id=13496133)

~~~
mizzao
I've read some of Haidt's work but not that one in particular. Thanks for the
pointer.

~~~
espeed
Here is one of Haidt's TED talks that touches on these ideas, cooperation,
group selection, and the free-rider problem:

[http://www.ted.com/talks/jonathan_haidt_humanity_s_stairway_...](http://www.ted.com/talks/jonathan_haidt_humanity_s_stairway_to_self_transcendence)

Also this Edge article on "Contingent Superorganism"
[https://www.edge.org/response-detail/10386](https://www.edge.org/response-detail/10386)

------
oli5679
In case anyone wants a primer on the prisoner's dilemma:

Suppose both you and a partner get arrested whilst trying to rob a bank.
Without a confession from one of you, the police can only convict you of a
minor offense (say trespassing).

The police place you in different cells, and offer you both the following
bargain. If you confess to the robbery, and your partner does not, you get a
complete pardon, and your partner gets a full sentence (say 120 months in
jail). If you both confess, you get a slightly reduced 100 months in jail.
Finally, if you both remain silent then each gets a 3 month sentence for
trespassing.

You both now have this tension between individual and collectively optimal
action. Your personal sentence is smaller (independent of what your partner
does) after confessing, but you both benefit from coordinating to not confess.
You can draw the sentences (in months) in a matrix to visualise this, where c
means confess and c' means stay silent (left is row player, right is column
player).

    
    
                 c          c'
        c     100,100     0,120
        c'    120,0       3,3
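
A minimal Python sketch of the dominance argument, using the sentences from the matrix above. The names `SENTENCES`, `row_sentence` and `is_dominant` are made up for this illustration.

```python
# Sentences in months (lower is better), copied from the matrix above.
# Keys are (row_action, column_action); values are (row_months, column_months).
CONFESS, SILENT = "confess", "silent"
SENTENCES = {
    (CONFESS, CONFESS): (100, 100),
    (CONFESS, SILENT): (0, 120),
    (SILENT, CONFESS): (120, 0),
    (SILENT, SILENT): (3, 3),
}


def row_sentence(mine, theirs):
    """Months in jail for the row player."""
    return SENTENCES[(mine, theirs)][0]


def is_dominant(action):
    """True if `action` gives the row player a strictly shorter sentence
    than the alternative, whatever the partner does."""
    other = SILENT if action == CONFESS else CONFESS
    return all(
        row_sentence(action, theirs) < row_sentence(other, theirs)
        for theirs in (CONFESS, SILENT)
    )


print(is_dominant(CONFESS))  # True
print(is_dominant(SILENT))   # False
# Combined months served: both confess versus both stay silent.
print(sum(SENTENCES[(CONFESS, CONFESS)]), sum(SENTENCES[(SILENT, SILENT)]))  # 200 6
```

Confessing is the better choice for each prisoner individually, yet the pair serves 200 months in total instead of 6.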
    

This tension applies to lots of other interactions (e.g. firms colluding to
restrict output, communities collaborating to maintain public goods,
neighbours not pissing each other off by throwing massive parties).

Even if there is a repeated interaction for a finite amount of time, you can
expect rational self-interested agents to be unable to collaborate in the
final period. This means they have no way of punishing in the penultimate
round, and so by induction you would expect 'unravelling' to non-cooperation
right from the beginning.
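
The unravelling argument can be sketched as a small backward-induction loop in Python. It reuses the sentences from the primer; the five-round horizon and the function names are assumptions for illustration only.

```python
# Stage-game sentences in months for "me" (lower is better).
# Keys are (my_action, partner_action).
SENTENCE = {
    ("confess", "confess"): 100,
    ("confess", "silent"): 0,
    ("silent", "confess"): 120,
    ("silent", "silent"): 3,
}
ACTIONS = ("confess", "silent")


def best_reply(theirs, future_cost):
    """Action minimising my total sentence, given the partner's action now.
    future_cost is added to every option alike: later play is already fixed,
    so it can neither reward nor punish what I do today."""
    return min(ACTIONS, key=lambda mine: SENTENCE[(mine, theirs)] + future_cost)


def backward_induction(rounds):
    """Solve from the last round backwards; returns (plan, total_sentence)."""
    plan = []
    future_cost = 0  # months accumulated in later rounds
    for _ in range(rounds):
        # A symmetric equilibrium action is a best reply to itself.
        stage = next(a for a in ACTIONS if best_reply(a, future_cost) == a)
        plan.append(stage)
        future_cost += SENTENCE[(stage, stage)]
    plan.reverse()  # the plan was built last round first
    return plan, future_cost


plan, total = backward_induction(5)
print(plan)   # ['confess', 'confess', 'confess', 'confess', 'confess']
print(total)  # 500
print(5 * SENTENCE[("silent", "silent")])  # 15, the cost of five rounds of mutual silence
```

Because the continuation cost is the same for every option, each round reduces to the one-shot game, which is exactly why the cooperation unravels.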

This paper found that a subset of the population appearing not to be
self-interested was sufficient to prevent this complete unravelling. Even a selfish
player may wish to collaborate in early interactions if there is a high enough
chance that they won't be immediately punished.

[https://en.wikipedia.org/wiki/Prisoner's_dilemma](https://en.wikipedia.org/wiki/Prisoner's_dilemma)

~~~
thilmo
When I was a kid and I read about the Prisoner's dilemma in the context of
morality, I was always confused. Surely the moral thing is to _confess if you're
guilty_. If you're guilty, and your accomplice is guilty, the right thing
to do is confess to the police so that you can be rightfully punished.

As an adult I understand that _this isn't the point_ of the story, and the
police-and-prisoners aspect of the story is completely extraneous to the point
trying to be made. Still, I can't help but think that the police-and-prisoners
is a bad example of the broader point, since there's always a third set of
interests (that of the police, and of society in general) which is callously
(and immorally) tossed aside in the phrasing of the problem.

~~~
Nomentatus
But the problem has to explicitly exclude altruistic behavior; since if both
subjects view the consequences to the other subject as being roughly as
serious as consequences to themselves, there's no dilemma at all. They both
shut up and help each other. The whole point is to show a situation exists in
which two purely rational, purely selfish actors get an outcome that's
actually worse for them from a selfish point of view than the outcome that two
altruistic or irrational players would get. That they are criminals
"establishes", to use the story-telling term, that these subjects are
(unusually) not altruistic at all - the punchline is that being self-serving
turns out not so well for them.

------
MarkMc
It seems to me that in countries with systemic corruption there is a huge game
of prisoner's dilemma being played, with everyone making the 'wrong' decision.

Here's a fascinating quote from a book about Kenyan graft [1]:

\--------------------

Where does each individual draw the limits of his or her compassion, beyond
which duties of kindness, generosity and personal obligation no longer apply?
I was raised in a household where my parents drew them in totally different
places, according to their very different characters and backgrounds.

As an Italian, my mother grew up in a country whose government had given birth
to Fascism, formed a discreditable pact with Hitler, and launched itself on a
series of unnecessary wars which left Italy occupied and battle-scarred. There
then followed a seemingly endless series of short-lived, sleaze-ridden
administrations. The experience left her utterly cynical about officialdom.
Although she dutifully voted in every election, the malevolence of the system
was taken for granted, and she would happily have lied and cheated in any
encounter with the state had she believed she could get away with it. But no
one worked harder for her fellow man, for in the place of the state she
maintained her own support network. An instinctive practitioner of what
sociologists call 'the economics of affection', my mother had a circle of
compassion drawn to include a collection of lonely acquaintances. She visited
their council flats bearing cakes, sent amusing press cuttings to their prison
cells, queued at the gates of their psychiatric hospitals. Hers was a world of
one-on-one interactions, in which obligations, duties, morality itself, took
strictly personal form, and were no less onerous for it. The glow she radiated
was life-enhancing, but its light only stretched so far, and beyond lay utter
darkness. Protecting one's own was vital, for life had taught her that the
world outside would show no mercy. She was not alone in her ability to get
things done without the state's involvement. 'Il mio sistema' Italians call
it: 'my system'. Italy is, after all, the birthplace of the Mafia, the
ultimate of personal 'sistema', and my mother's mindset was instinctively
mafioso.

My father, in contrast, was typical of a certain sort of law-abiding,
diffident Englishman for whom a set of impartial, lucid rules represented
civilisation at its most advanced. He was raised in a country which pluckily
held out against the Germans during the Second World War and then set up the
National Health Service in which he spent his career, and his trust in the
essential decency of his duly elected representatives was so profound that he
was shocked to the core by British perfidy during the Suez crisis, and
believed Tony Blair when he said there were weapons of mass destruction in
Iraq. When, as an eleven-year-old schoolgirl, I mentioned - with a certain
pride - that I usually managed to get home without paying my bus fare, he
explained disapprovingly that if everyone behaved that way, London Transport
would grind to a halt. Remove the civic ethos, and anarchy descended. A
logical man, he saw this as the only practical way of running a complex
society. It also, conveniently for an Englishman awkward with personal
intimacy, enabled him to engage with his fellow man at a completely impersonal
level. Not for him my mother's instinctive charm, the immediate eye contact,
the hand on arm. He felt no obligation to provide for nieces and nephews, and
had a cousin come up for a job before one of the many appointment boards on
which he sat, he would have immediately excused himself. Nothing could be more
repugnant to him than asking a friend to bend the rules as a personal favour.
What need was there for a rival, alternative sistema, if the existing
arrangement of rights and duties already delivered?

My father's world view was typically northern European. My mother's
characteristically Mediterranean approach would have made perfect sense to any
Kenyan. In an 'us-against-the-rest' universe, the put-upon pine to belong to a
form of Masonic lodge whose advantages are labelled 'Members Only'. In the
industrialised world, that 'us' is usually defined by class, religion, or
profession. In Kenya, it was inevitably defined by tribe.

[1] [https://www.amazon.com/Its-Our-Turn-Eat-Whistle-Blower/dp/00...](https://www.amazon.com/Its-Our-Turn-Eat-Whistle-Blower/dp/0061346594)

~~~
visarga
Great post and wonderfully illustrated!

The prisoner's dilemma also ties into the tragedy of the commons: when multiple
players decide to defect and exploit the commons to the fullest, the commons
lose their value for all.

It is also related to temporal discounting - we prefer to pollute today and
ignore the price tag we will face in the future.

We know so much about game theory yet we can't apply it in politics, because
it requires long-term thinking and optimizing for the success of everyone as
opposed to just a few. And we're at a point where words don't mean as much:
we're flooded with propaganda from all sides and can't speak reason over the
cacophony.

~~~
mizzao
The n-player PD is also known as the public goods game, and is used to study
cooperation as well.
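
A rough Python sketch of a public goods round. The endowment of 10, the multiplier of 2 and the four players are illustrative assumptions, not parameters from the paper.

```python
def public_goods_payoffs(contributions, endowment=10, multiplier=2):
    """Each player keeps whatever they did not contribute, plus an equal
    share of the multiplied pot. The dilemma needs 1 < multiplier < number
    of players, which holds for the assumed values used below."""
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


print(public_goods_payoffs([10, 10, 10, 10]))  # [20.0, 20.0, 20.0, 20.0]
print(public_goods_payoffs([0, 10, 10, 10]))   # [25.0, 15.0, 15.0, 15.0]
print(public_goods_payoffs([0, 0, 0, 0]))      # [10.0, 10.0, 10.0, 10.0]
```

The free rider in the second line does best individually, but if everyone follows suit the group ends up at the third line, which is the same tension as in the two-player game.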

Another meta-conclusion I would draw from our paper is that current game
theory is insufficient to explain what we observe in real life, including
politics, negotiation, and so on. It is folly to apply hyperrationality
(strict economic modeling) to real human behavior, and part of the goal of our
work is to stimulate more realistic models of what people do in these
situations.

~~~
Nomentatus
The PD is important because it's mostly the rational selfish actors who we
need to worry about. Irrationally altruistic actors are rare, and irrational
selfish actors tend to self-immolate or get arrested quickly. Many things go
on during church socials that don't require any problem-solving on the part of
game theorists; because widespread altruism is already a solution, just the
hardest solution to actually get to.

------
jonathankoren
The great thing about these experiments and other experiments in the realm of
behavioral economics is that they gather real data about how humans
actually behave, rather than the traditional underpinnings of economics and
philosophy which are basically, "I really want to believe this is true, so
I'll write a book about it, and then in the 20th century gussy it up with
some straight line graphs."

So, people aren't strictly rational actors, and altruism is inherent and
beneficial. Which isn't exactly surprising. However I don't hold out hope for
evidence-based science to be central to socio-economic policy, since it would
force wide swaths of the population to give up their cherished beliefs about
how they wish the world works, and instead embrace how the world actually
does.

------
sinan_si
Similar research for anyone interested in these topics:
[https://www.researchgate.net/publication/303656078_The_Dose_...](https://www.researchgate.net/publication/303656078_The_Dose_Makes_The_Cooperation)

~~~
saycheese
Resilient cooperation is intentional, but cooperating based on a lack of memory
is not. I guess the question is whether the source of constant behavior
matters, and if so, why?

~~~
Nomentatus
I certainly presume that it matters because the other player must have a
theory of mind, and see a string of non-cooperative plays as an intentional
penalty, and therefore predictive; that is something said other player can
control by behaving better themselves in future.

------
saycheese
Resilient players, regardless of whether they are cooperators or not, would
obviously add long-term stability; to me, this reads as adding stability
creates stability.

The majority of players do not express resilient behaviors, regardless of the
impact on them; that is, they are random, unable to create a consistent valid
strategy, etc. It is not that they are just greedy, since if that were the
case, the system dynamics would be much more stable.

~~~
jakeargent
I'd wager that resilient defectors would not create long-term stable
cooperation. They might create stable defection, so in that way adding
stability would create stability. But only one of those stable outcomes is
actually desirable.

~~~
saycheese
My guess is there's a relationship between the time it takes to reach
stability and how soon the "fake" players are removed.

------
Ninjalicious
I didn't read the article but the headline makes me postulate that this is
because the optimal individual strategy in Prisoner's Dilemma is tit-for-tat.

Respond in kind to every action taken against you. If someone offends, attack;
if they offer peace, be peaceful.

Nice people will more often offer the olive branch (sub-optimally), so anyone
else playing the optimal strategy will offer it back. Over time this probably
results in a lower violence rate. Even mean people (also sub-optimal) will get
dragged into niceness over time since nice/mean probably changes dynamically.
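
A small Python sketch of tit-for-tat against two fixed strategies. The 5/3/1/0 payoffs and the 10-round match length are conventional assumptions, not taken from the paper.

```python
# Points per round (higher is better). Assumed conventional values.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_moves):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_moves[-1] if opponent_moves else "C"


def always_defect(opponent_moves):
    return "D"


def always_cooperate(opponent_moves):
    return "C"


def play(strategy_a, strategy_b, rounds=10):
    """Play one match and return both total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each side sees only the other's history
        b = strategy_b(moves_a)
        gain_a, gain_b = PAYOFF[(a, b)]
        score_a += gain_a
        score_b += gain_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))         # (30, 30)
print(play(tit_for_tat, always_cooperate))    # (30, 30)
print(play(tit_for_tat, always_defect))       # (9, 14)
print(play(always_defect, always_cooperate))  # (50, 0)
```

Note that tit-for-tat never outscores its opponent within a single match. Its appeal is that it does well with other cooperators while losing little to defectors, which is a weaker claim than being the optimal individual strategy.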

~~~
andybak
A strategy that could beat tit-for-tat consistently was discovered* but
tit-for-tat is still fascinating because it is incredibly simple and is _almost_
the optimal strategy.

( * I think it's this: [https://sciencehouse.wordpress.com/2012/09/04/a-new-strategy...](https://sciencehouse.wordpress.com/2012/09/04/a-new-strategy-for-the-iterated-prisoners-dilemma-game/) )

~~~
Ninjalicious
People get mad when you don't read the article.

~~~
saycheese
(People get mad when you say you didn't read the article, especially in the
first sentence; that is they read that, downvote, and don't read the comment.)

