
DeepStack: Expert-Level Artiﬁcial Intelligence In Heads-Up No-Limit Poker - apetresc
http://fermatslibrary.com/s/deepstack-expert-level-artificial-intelligence-in-heads-up-no-limit-poker#email-newsletter
======
LeanderK
How can one create visualisations like Figure 3 (assuming it's latex)[0]? I
have to write my Bachelor thesis soon and need to improve my latex-skills. I
only know to to use the math-mode and write text.

[0] the paper:
[https://arxiv.org/pdf/1701.01724.pdf](https://arxiv.org/pdf/1701.01724.pdf)

~~~
mikebjohanson
Hi - I'm one of the authors on this paper. We had professional designers do
the figures for this paper; Fig3 in particular was far better than what we
could do ourselves.

~~~
andr3w321
Just working my way through the pseudocode to try and gain a little insight. I
think there may be a small typo. Line 29 VALUE should be VALUES?

------
mastazi
With regards to Texas Hold'em, Conterfactual Regret Optimisation [1][2] was
already able to beat humans in limit games, while it seems that this new
approach is able to do the same in no-limit games. My naive understanding is
that no-limit games are harder to crack because there are more possible
outcomes, thus the available information (poker is an "imperfect information"
game) becomes "even more imperfect".

[1] [https://arxiv.org/abs/1407.5042](https://arxiv.org/abs/1407.5042) (yes,
the author is the same Oskari Tammelin who created Jeskola Buzz, a popular
piece of music software).

[2] [http://www.nature.com/news/game-theorists-crack-
poker-1.1668...](http://www.nature.com/news/game-theorists-crack-
poker-1.16683)

------
PaulHoule
One thing I haven't seem come up in discussions or media coverage about this
is that the heads up game is entirely a different thing than playing with 5
other players.

~~~
Tenoke
For what is worth, heads-up is a very popular game, and from reading the
paper, the approach should more or less be applicable to 6max or full ring,
except it will be way more computationally expensive (but not unfeasibly so
like before).

~~~
jumpCastle
In my understanding the approach is based on game theoretic principles that do
not extend naturally to more than two players. If that's correct then it's not
only a computational issue. See [http://www.nature.com/news/how-rival-bots-
battled-their-way-...](http://www.nature.com/news/how-rival-bots-battled-
their-way-to-poker-supremacy-1.21580)

~~~
peternilson
Yip, exactly this. The CFR algorithm they speak of is based on finding a nash
equilibrium for each strategy-pair in the game tree. Something that I think
can only be computed for 2 players, and no more.

~~~
Tenoke
Surely you can adapt it to compute an approximation of a mixed Nash
equilibria. My game theory isnt at a high level, but there are similar
computations you can perform, and when I used to play poker, you'd have common
calculations based on it for more than 2 players when you are playing with a
limited stack. I dont see how that wouldnt be adaptable (abeit more
computationally expensive) for more players, but I might be missing something.

~~~
jumpCastle
I'm not sure Nash equilibrium exists for more than two players in this game.

~~~
mikebjohanson
Nash equilibria are still guaranteed to exist. But it's only the 2p zero-sum
perfect recall case where an equilibrium has useful properties, like being
robust against any opponent strategy, including a worst-case opponent who
knows your strategy. In a multiplayer (> 2 players) game, opponents can
collude against you and playing your part of a Nash gives no useful guarantees
on performance. Even if they aren't colluding, in poker games, the presence of
a bad player just before you in turn order can hurt your EV worse than their
own. And even if all players independently compute their own Nash equilibria
(there can be many) and use the piece for their position, then that
combination of strategies may not _itself_ be a Nash equilibria.

~~~
jumpCastle
Thanks for the explanation. I will also appreciate a reference to a good
survey paper if you are in the mood :)

~~~
mikebjohanson
I don't know of a survey paper on CFR for multiplayer, but it's been showing
up in conference papers and theses.

Here's a link to a shorter conference paper where CFR _does_ converge to a
Nash, in 3p Kuhn poker. It describes a family of equilibria, where one player
(the second to act, IIRC) has a parameter that can't affect their own EV
(...or else it wouldn't be a Nash), but _does_ determine how much the other
two players win/lose from each other. This illustrates the problem in
equilibria for multiplayer games: if you're players 1 or 3, then even if you
are playing a Nash, and everyone else does too (albeit different equilibria),
then you can still lose.
[https://webdocs.cs.ualberta.ca/~games/poker/publications/AAM...](https://webdocs.cs.ualberta.ca/~games/poker/publications/AAMAS13-3pkuhn.pdf)

For a longer read, the best I know of is probably Rich Gibson's PhD thesis. He
focussed on CFR for multiplayer games.

[https://webdocs.cs.ualberta.ca/~games/poker/publications/gib...](https://webdocs.cs.ualberta.ca/~games/poker/publications/gibson.phd.pdf)

~~~
jumpCastle
Thanks again. So I guess for multiplayer games a better approach would be
reinforcement learning with no game theoretic heuristics?

~~~
xapata
Game theory is always applicable to games :-)

Nash equilibrium is just one aspect of some games.

~~~
jumpCastle
Game theory is domain specific. Generic methods in AI tend to dominate domain
knowledge over time. Although I agree that other game-theoretic techniques
might help here.

~~~
xapata
Game theory is specific to the domain of agents optimizing outcome in
adversarial, cooperative, or (rarely) solitary systems. That's a pretty big
domain.

------
andr3w321
Anyone know where to find the pseudo code? It references item 10 in the
References and Notes which states See Supplementary Materials.

~~~
foob
They're included in the arXiv version of the paper:
[https://arxiv.org/abs/1701.01724](https://arxiv.org/abs/1701.01724)

------
maaaats
Pretty helpful with the annotation along the way, explaining mbb/g and what it
means, for instance.

------
grumpopotamus
Any hints on how to implement CFR-D on the GPU?

------
IMTDb
This seems to be a major threat to online casinos. Is there any study on how
they plan on fighting the AI's that will likely be flooding the market soon ?

~~~
gdulli
Heads-up is a contrivedly simple scenario compared to solving poker in
general. There have been bots for a long time that have co-existed with real
players, and we're likely not close to a scenario where bots can reliably beat
real players in general for real world scenarios.

However, in theory even a break-even bot could be a gold mine because it can
earn loyalty points. That's how it was the last time I was playing online, I
don't know if loyalty programs still work the way they used to. Of those
programs would just go away if abused enough.

~~~
Tenoke
Loyalty points are paid out of (and lower than) the rake, so the bot still
needs to be better than break even.

~~~
alasdair_
Online poker has the concept of "bonus whoring" where one takes advantage of
various loss-leader promotions offered by different sites.

A typical promotion would be offering a 20% bonus on a $1000 deposit, with the
caveat that you must play in, say, 1000 raked games before cashing out. It's
possible to play in games where the total expected rake if the user breaks
even is, say, $100, leaving $100 profit.

