
DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker - maurycy
https://arxiv.org/abs/1701.01724
======
MikeTV
> DeepStack becomes the first computer program to beat professional poker
> players in heads-up no-limit Texas hold'em

Whether any others have been made before now is anyone's guess. Botting is a
known problem in online poker. If there's a golden goose out there, I'm sure
it's being kept under wraps.

~~~
zitterbewegung
Online bots can collude with each other or use other tricks, so the botting
problem in online Texas Hold'em is not equivalent to the achievement presented
in the paper.

~~~
femto113
I believe the primary botting "problem" is not rule-breaking activity like
collusion but the farming of lower-skill players at lower limits than a
professional would be willing to play at. A bot will happily rake in a 1x
big-blind/hour advantage that a comparably skilled human would consider a complete
waste of time. It's my understanding that the real state of the art here is
not in the play algorithms (existing bots are more than good enough to beat
weaker players) but in avoiding detection by both human and automated
monitors.

~~~
bcassedy
Correct. When I last played for income in 2012, the site I played on had 1-2
bots at virtually every table from the $10 buy-in cash games all the way up to
the $200 buy-in games. Around the time I stopped playing it came out that there
was a botting ring that had been winning at $1000 buy-in games for some time.

Most of their income does come from the weaker players at the table, but many
of these bots were good enough to break even or do slightly better against the
pros at the table too.

~~~
valarauca1
Most of these bots are likely using standard, well-established techniques.

Strategic play, card counting, etc. A handful of heuristics make you an above
average player and will get you thrown out of a casino. They'd be trivial to
program.

I guess all that is _novel_ here is a bot learned these techniques on its own.

~~~
LewisJEllis
With all due respect, do you have any idea what you're talking about? This is
poker, not blackjack. Card counting is useless and there's no strategy that
will get you thrown out of a casino because it's not the casino's money you're
playing for.

I'm not sure what sort of "standard well-established techniques" that "would
be trivial to program" you're talking about. Optimal play in no-limit hold'em
is a tremendously complicated mixed strategy with a massive decision tree. To
make things even more interesting, in almost any given situation there will be
multiple "correct" plays that, to avoid being exploitable, should each be made
some percentage of the time at random.
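
As a rough illustration of what "each made some percentage of the time at random" means, here is a minimal sketch of sampling an action from a mixed strategy. The action names and the 60/40 split are made-up numbers for illustration, not anything taken from the paper.

```python
import random

def choose_action(mixed_strategy, rng=random):
    """Sample one action from a mixed strategy given as {action: probability}."""
    actions = list(mixed_strategy)
    weights = [mixed_strategy[a] for a in actions]
    # random.choices draws with replacement according to the given weights.
    return rng.choices(actions, weights=weights, k=1)[0]

# Hypothetical mixed strategy for one specific situation (illustrative only).
strategy = {"fold": 0.0, "call": 0.6, "raise": 0.4}

if __name__ == "__main__":
    rng = random.Random(0)
    draws = [choose_action(strategy, rng) for _ in range(10)]
    print(draws)
```

Because the choice is sampled rather than fixed, an opponent who sees the same situation repeatedly cannot tell from any single action which hands are behind it.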

This AI is novel because it achieved a result that (as far as anybody knows)
has never been achieved before, not because it figured out how to do something
on its own.

~~~
pokearl
I think it's pretty clear he either legitimately thinks this is about
blackjack or is completely clueless.

------
osti
To be fair, none of the so-called pros are considered big names in today's
no-limit heads-up games. They should probably challenge people like WCGRider,
Jungleman etc. next.

On another point, CMU just can't seem to catch a break, their thunder
continuously being stolen by UofAlberta in poker research, first in limit, now
no limit. UofA clearly tried to publish this before the CMU poker challenge
that's supposed to begin soon.

To read more about the CMU challenge:
[http://www.cmu.edu/news/stories/archives/2017/january/poker-...](http://www.cmu.edu/news/stories/archives/2017/january/poker-pros-vs-AI.html)

~~~
grizzles
Doug Polk is WCGRider.

~~~
dsp1234
Doug Polk is not one of the professionals who was part of this study. The list
is in Table 1 of the paper.

------
natecarroll
The players they recruited were incentivized by an $8,000 prize pool up for
grabs among the 34 of them, an average EV of about $235. They have to play 3000 hands to
get a shot at that money, which is probably around 10 hours of multitabling.
So that's ~$24/hr in expectation.
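
The back-of-the-envelope numbers above can be reproduced as follows. This sketch assumes every player has an equal shot at the prize pool, which is a simplification; the 10-hour figure is the estimate given above.

```python
# Figures quoted in the comment above.
prize_pool = 8000   # dollars
players = 34
hours = 10          # rough estimate for 3000 hands of multitabling

# Assumption: each player is equally likely to win, so EV is an even split.
ev_per_player = prize_pool / players
hourly_ev = ev_per_player / hours

print(f"EV per player: ${ev_per_player:.2f}")
print(f"EV per hour:   ${hourly_ev:.2f}")
```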

And then of course you don't get anything unless you're one of the top three
winners against the bot, so there's likely nothing to be gained from grinding
out a marginal victory. You should just go ahead and play kinda stupid/aggro
and hope you win some of the big flips and whatnot. There's literally nothing
at stake for you except time value, so you might as well flame out early and
then quit or run up a big stake to give yourself a shot at top 3.

Basically, the study design ensures the bot faces off against weak players
playing in a way that would be sub-optimal in any other situation. Not
surprised the bot won by a decent margin, nor that they are trying to spin
this real hard in advance of the CMU poker bot matchup next week, which will
be much more rigorous.

------
lawn
I think I can wrap my head around neural nets being superior at games with
perfect information like chess or go. But how would you teach bluffing and
randomness to a neural net?

~~~
falcolas
I imagine it's mostly just playing the percentages. Bet when it has a high
percentage of winning, fold when it doesn't. It doesn't need to read its
opponents if it can play the percentages perfectly.

~~~
reverend_gonzo
This is true for Limit Hold'em, but very much not true for No Limit. Limit
Hold'em is a solved game, because as long as you're playing the odds, you can
play perfectly. No Limit changes things because the bets can vary wildly. If
you play a tight game (just play the odds), an opponent will get out whenever
you're in, and will bluff just to see if you call or fold.

Bluffing is a major component in No Limit, and there are very different
profitable playing strategies.

~~~
jdmichal
I fell to this when I randomly decided to play some limit hold'em one night.
Kept losing to a guy that chased every chance at odds he could, because I
couldn't make bets big enough to scare him out. Lesson learned!

------
ChuckMcM
I love it, research that pays for itself :-) I think of poker and other card
games as imperfect but predictable information. So while you don't know what
cards the other players have you can certainly estimate the likelihood of what
they have and prune your choices that way. Think single deck card counting in
Blackjack.

------
esseti
the fact that they used hearts and spades instead of numbers for affiliation is
just lovely.

------
philosopheer
most people (including here on HN) are complete n00bs when it comes to
understanding how poker is played and how computers can play it, so just to
straighten y'all out at the git-go here:

computers are _better_ at bluffing and randomness than humans are. Bluffing is
an important optimizing strategy in playing poker well, and it entails
tracking the expected value of a pot (which includes cost expectations, don't
forget) and it entails randomness, necessary to obfuscate patterns of betting
that could give away evidence of your bluffing strategy. Like chess and go, we
may not be "there" yet with computers, but n00bs need to understand the
theory.

What computers _can't_ do is read "tells", so if you are a master poker
player via tells (whether it's unconscious or conscious thinking on your part)
then you will beat other humans better than a computer will; but, by the same
token, the computer will not give you tells to read nor be fooled by your fake
tells. I think the mistake in thinking newbies (even highly experienced ones)
make is mixing together "the psychology" of the game with the mathematics of
the game.

So to give an oversimplified concrete example of a poker bluffing strategy
(inspired by Nesmith Ankeny's book), if odds of you drawing one of the cards
you need to win a showdown are 1 out of 4 but the expected payoff is 20x then
you not only need to stay in purely on expected value, but it is also an
optimal time to bluff if you don't get your card. It is informationally better
to have a bluffing strategy that masquerades as an "I have good cards"
strategy _and gives random information after the showdown_ rather than
"bluffing" being something you do sheerly when you have shit cards. And to
enforce a _random_ strategy on yourself, he recommends using a system of the
cards in your hand as the random number generator to tell you whether to bluff
or not: as you can see, his strategy designed for human players is more
perfectly implemented by a computer.
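
A minimal sketch of the two ideas in that example. It assumes "20x" means the payoff is 20 times the cost of staying in, and the "bluff when the card is a spade" rule is a hypothetical stand-in, since the exact scheme from the book is not given here.

```python
from fractions import Fraction

def call_ev(p_win, payoff, cost):
    """Expected value of paying `cost` to win `payoff` with probability `p_win`."""
    return p_win * payoff - (1 - p_win) * cost

# Assumption: 1-in-4 chance to hit, payoff of 20 units for a cost of 1 unit.
ev = call_ev(Fraction(1, 4), payoff=20, cost=1)

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]

def should_bluff(card, bluff_suits=("s",)):
    """Hypothetical rule: bluff when a designated card's suit is in bluff_suits."""
    return card[1] in bluff_suits

# Fraction of the deck for which the rule says "bluff".
bluff_frequency = Fraction(sum(should_bluff(c) for c in DECK), len(DECK))

print("EV of staying in:", ev)
print("Bluff frequency:", bluff_frequency)
```

The point of keying the decision to a card is that the player cannot bias it; a program gets the same effect from any random number generator, and changing `bluff_suits` changes the bluffing frequency.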

~~~
feral
No - If the only thing computers couldn't beat humans at was reading tells,
they'd win online poker.

But they don't yet do that: this paper is about beating humans at heads up,
which is a much more limited domain than a full table.

If you want to learn about why to bluff I'd recommend reading about using game
theory to solve Kuhn poker.
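
To give a flavor of the Kuhn poker argument, here is a small sketch assuming the standard rules (three cards J, Q, K, a 1-chip ante each, a single 1-chip bet). It looks at one spot only: the first player holds Q, has checked, and now faces a bet from an opponent who always bets K and bluffs J with some probability. The function and variable names are my own.

```python
from fractions import Fraction

FOLD_EV = Fraction(-1)  # folding forfeits the 1-chip ante

def q_call_ev(bluff_prob):
    """EV of calling with Q after checking and facing a bet.

    The opponent holds J or K with equal prior probability, always bets K,
    and bets J (a bluff) with probability `bluff_prob`.
    """
    p_bet_with_k = Fraction(1)
    p_bet_with_j = Fraction(bluff_prob)
    total = p_bet_with_k + p_bet_with_j
    p_j = p_bet_with_j / total   # posterior that the bet is a bluff
    p_k = p_bet_with_k / total   # posterior that the bet is for value
    # Calling wins 2 chips against J and loses 2 chips against K.
    return p_j * 2 + p_k * (-2)

for b in (Fraction(0), Fraction(1, 3), Fraction(1)):
    print(f"bluff_prob={b}: call EV={q_call_ev(b)}, fold EV={FOLD_EV}")
```

If the opponent never bluffs, calling is strictly worse than folding, so their K never gets paid off. At a bluffing frequency of 1/3 the Q is indifferent between calling and folding, which is why bluffing shows up in the game-theoretic solution at all.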

~~~
philosopheer
online poker has the tremendous flaw that collusion between players is the
_most optimum_ strategy, and there is just _noooo way_ to stop it.
Collaborating poker-bots who outsource their peppy poker chatter to Bangalore
(your feedback is important to them!) will soon be running all the tables if
they aren't already. No, they'll never be the champions, because that
suboptimal strategy would lead to discovery, but as a giant grist milling farm
grinding out profit, seems irresistible.

I did a quick google review of Kuhn poker and I don't see how any of that
would not benefit from the understanding I was attempting to convey in my
initial post.

~~~
6nf
Collusion is avoided entirely by playing heads up only.

------
brador
Heads-up is solvable by just crunching the known probabilities, so I'm not
sure what the achievement is here. Maybe the complexity of work involved to
build the program is worthy of merit? Not sure.

