
Experimenting with a Neural Network-based Poker Bot  - matt1
http://www.mattmazur.com/2009/10/experimenting-with-a-neural-network-based-poker-bot/
======
jpwagner
Certainly a fun thing to ruminate on, but the model Matt uses here is way too
simplistic to draw any conclusions about whether a NN is sufficient.

A short stack raising >11% when he shouldn't be is a pretty severe leak ;)

Just a thought: I don't see the implementation of the algorithm here, but if
you're using the values of individual cards scaled to 1, then you are saying
that v(A) - v(K) = v(8) - v(7), which seems absurd. Imagine (fantasize for the
sake of argument) that there were a few shoves in front of you; then it should
be that v(AK) - v(KQ) >> v(87) - v(76).

I'd need to give more thought to this problem in general, but I have to ask
Matt...why did you choose short stacking full ring NLHE? It seems mechanical
for someone very familiar with the game, but it's so situational. The first
area I'd think you'd be most effective would be LHE-HU-SNG's.

~~~
bravura
_Just a thought: I don't see the implementation of the algorithm here, but if
you're using the values of individual cards scaled to 1, then you are saying
that v(A) - v(K) = v(8) - v(7) which seems absurd_

Instantiating the features correctly is crucial to getting good neural network
accuracy.

It might be the case that a thermometer representation of the card value is
more appropriate: there are thirteen bits, and the first n bits are on for a
card of value n.

With this sort of input representation, the network can model a non-linear
relationship between different card values instead of assuming a strictly
linear one.
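A minimal sketch of that encoding in Python (the function name and the 1..13
rank convention are mine, not from the post):

```python
def thermometer(rank, n_ranks=13):
    """Encode a card rank (1 = deuce ... 13 = ace) as a thermometer
    vector: the first `rank` bits are on, the rest are off."""
    return [1.0 if i < rank else 0.0 for i in range(n_ranks)]

# A king (rank 12) and an eight (rank 7) now differ in five positions,
# so the network can weight each threshold separately.
king, eight = thermometer(12), thermometer(7)
```

With one weight per bit, the learned gap between A and K no longer has to
equal the gap between 8 and 7.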

~~~
jpwagner
It's a neat idea, but I doubt it would work well because the hand value in
this context is so qualitative (whereas in HU LHE it would be more
quantitative.)

------
Flemlord
His tests are so simple that he's not really testing anything. For some real
analysis of how neural nets stand up to other player prediction models, check
this out:

<http://spaz.ca/poker/>

In particular, read the paper titled "Opponent Modeling in Poker: Learning and
Acting in a Hostile Environment". Section 4 outlines several prediction
methods, one of which is a neural net.

------
johnwatson11218
Does anybody know about these 2 ideas? 1) A site that lets people upload their
bots and play bot vs. bot 24/7. The idea is to try and find the best
algorithm.

2) A site where a human player plays against the computer. The computer also
tries to model the player using a NN and tells the player how "big" a NN was
needed to model their play, and perhaps how quickly the computer was able to
start predicting their play. The idea being that you want your play to be very
hard to model with a NN, so that you are more unpredictable in real life.

~~~
hazzen
Based on personal experience, I feel like (2) would not work very well (nor
would using a similar idea to test your bot). When real money is on the line,
my entire state of play changes. There is a large psychological difference
between playing a bot for fun and playing a bot for money. Playing for fun, I
am more likely to place large bets, bluff, test its logic, and in general play
loosely. For real money? No thanks.

Note that this is all based on personal experience and may not be the common
case, but I feel like the average hacker will play to test the bot, not to
win.

------
mattmaroon
You really don't need anything that complicated to beat low-stakes no-limit
poker short stacking. I used to know a few people who made decent hourly rates
just mechanically following a very small spreadsheet. I did it myself for a
while and it seemed like an easy win, though not as profitable as simply
being good at the limit games I was playing and nowhere near as
challenging/fun. My personal sample size was small, but I still suspect a
small set of if/then statements would have been a winner.
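To illustrate, a toy version of such an if/then chart in Python; the hand
groups and thresholds here are invented placeholders, not a real winning
strategy:

```python
# Illustration only: a toy version of the "small spreadsheet" strategy.
# The hand groups below are made up, not an actual profitable chart.
PREMIUM = {"AA", "KK", "QQ", "AKs", "AKo"}
STRONG = {"JJ", "TT", "AQs", "AQo"}

def shortstack_action(hand, facing_raise):
    """Preflop decision for a ~20BB short stack: shove, or fold."""
    if hand in PREMIUM:
        return "shove"  # always get it in with the top of the range
    if hand in STRONG:
        return "shove" if not facing_raise else "fold"
    return "fold"  # everything else is an automatic fold
```

The point is just how small the decision surface is: two sets and two
branches cover the whole preflop strategy.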

The real challenge is finding enough tables to play. The bad thing about short
stack is that sometimes someone calls you and you win, and now you've got at
least a medium stack. Most sites don't let you come back to a no-limit table
with less money than you left with until some period of time (for instance 2
hours) has passed. Party Poker did not have this feature (maybe still doesn't)
so it would have been easy back then, but for an American now I don't know of
any site of substantial volume where you wouldn't quickly burn through all
available tables and find yourself waiting.

Even for a bot, there might not be much profit in it.

~~~
matt1
It seems easy, right? That's what I thought too.

I purchased 7M+ NL200 hand histories from a guy who mined them all day, every
day. I filtered that down to hands where someone was playing with a 15-25BB
stack and then, for every person who played more than 5,000 hands in that
range, calculated their profits and winrates.

Here's the breakdown: <http://www.mattmazur.com/images/ShortstackingNL200.png>

It's important to remember when looking at these numbers that the vast
majority of people who try shortstacking never make it to 5,000 hands, so this
is heavily skewed toward the survivors. It would be like analyzing the revenue
of only those startups that have survived more than five years and then using
that average as the average revenue of all startups. There are only 80 people
in the full ring group and 19 in the 6-max group. Just looking at these folks,
only about 1 in 5 win more than 2bb/100, and in my experience that's not easy
to do. The people who are profitable shortstackers would likely win a lot more
playing with a normal stack.

I don't think there are many long term profitable shortstackers following
rules. Again, I think the majority of the profits come from a few strategic
plays that would be impossible for a spreadsheet, or a bot, to emulate.

~~~
mattmaroon
Idk, those numbers seem pretty reasonable to me and don't indicate any real
level of difficulty. 58% of people in it are even or worse, which means if
there is any sample bias at all, it's very small. The fact that losers tend to
drop out over time is probably mitigated at that level by the fact that
winners will quickly rise from low stakes to at least middle ones.

2-4 bb/100 is great even for a human since you can do huge volume that way.
The spreadsheet method can easily be applied to 8 tables at once, maybe more.
2 bb/100 (that's $4 as calculated by PT, right?) could be $30/hr. 4bb is obv $60,
which is more in line with what people I knew were making.

Also, when short stacking, opponents at higher-level games don't necessarily
respond any more appropriately. In fact, a lot of the time they play worse
against it because their skill is geared toward playing normal opponents.
Low-level players like to call big bets too much, which accidentally makes
them better vs. a short stacker. My friends claimed they were getting higher
bb/100 at the $3/$6 and $5/$10 games than at $1/$2 because of that.

~~~
matt1
I just realized that I have the hand histories for all cash No Limit hands
played on PokerStars from Dec 1, 2007 to Mar 12, 2008. I've got all the tools
built to do the analysis, so I can pick apart the results from the lower and
higher stakes as well. Any recommendations on how to do it?

Not PTBB/100, just BB/100. PokerTracker couldn't handle all the data.

~~~
mattmaroon
Wow, how many hands is that? That's really cool.

I suppose you could possibly control for play style, things like what Poker
Tracker tracks: VPIP, etc., and of course buy-ins if you do some technomagic
to figure them out. (I'm assuming someone who short stacks all the time plays
differently than a normal player who just lost half their stack.) Not sure how
good your findings would be, though.

------
tom_b
Have you checked out "The Mathematics of Poker" by Bill Chen and Jerrod
Ankenman? The book does some neat game theory-based analysis of poker and
comes up with some surprising results. I've pondered what some sharp hacker
type might do given that text and some time to grok on a bot.

I think the result most applicable in this scenario is the section with
jam-or-fold tables based on hand and stack size. Basically, they show that
there are completely defined hand ranges for when you should jam or fold when
stacked at a certain number of BB in no-limit.
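To show how mechanical that makes the decision, here's a toy lookup in Python.
The ranges below are placeholders, not the actual Chen/Ankenman tables:

```python
# Jam-or-fold as a pure lookup. The hands listed per stack size are
# invented for illustration; consult the book for the real tables.
JAM_RANGES = {
    5:  {"AA", "KK", "QQ", "JJ", "TT", "99", "AKs", "AKo", "AQs", "A5s"},
    10: {"AA", "KK", "QQ", "JJ", "TT", "AKs", "AKo", "AQs"},
    15: {"AA", "KK", "QQ", "JJ", "AKs", "AKo"},
}

def jam_or_fold(hand, stack_bb):
    """Jam if `hand` is in the range for the smallest listed stack
    size at or above `stack_bb`; otherwise fold."""
    for limit in sorted(JAM_RANGES):
        if stack_bb <= limit:
            return "jam" if hand in JAM_RANGES[limit] else "fold"
    return "fold"  # deeper stacks: jam-or-fold is no longer the right model
```

Note that the shallower the stack, the wider the jamming range, which matches
the shape of the real tables.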

I've pondered doing a neural network, but for limit rather than no-limit. I
still may, mainly because I think it would be fun to experiment, via the
neural net, with a bunch of inputs that are hard to quantify on the fly when
playing.

Thanks for your posts on poker bots - although I have to admit I've basically
quit playing online over the last year because of the proliferation of HUDs,
data mining automation, and bots.

Oh, and for anybody interested in game theory and/or poker, you simply have to
grab the book. It's a bunch of fun and not scary on the math for those on this
forum.

~~~
matt1
I've read The Mathematics of Poker; the binding is broken from being propped
open too much. That said, it was not very helpful for actually building a bot.
They work through a lot of nice little game theory examples, but most of it is
not applicable to real-life game play.

The push/fold tables are another story, and I'll write about that soon.

------
chasingsparks
For those interested in the PokerBot aspect, check out:
<http://pokerai.org/openholdem.html>

It allows for bots in Perl and has a DLL interface. I've been meaning to make
a V8 based DLL to write bots in Javascript for a while, but it's not a high
priority for me.

P.S. Did anyone notice this guy has a Tetrinet bot?! That game made high
school tolerable.

~~~
matt1
The TetriNET bot was a blast. You can download the full source code on the
site as well.

------
prat
One of the issues with NN is that there is no way to find out the relative
significance of the input variables and how they affect the output response.
At least, I couldn't find a way when I tried using them a couple of years back.
Secondly, I guess there isn't a way to cluster similar cases together to
analyze what rules bind different decisions together. Also, one cannot find
the partial dependence of the output on a given input variable. All these
problems are solved by the random forest classifier which I have been working
with and really like.

~~~
clevercode
I agree that analysis of neural networks is not easy, but there are certainly
ways to do it:

 _there is no way to find out the relative significance of the input variables
and how they affect the output response_

Calculate partial derivatives of the outputs w.r.t. the inputs.
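For instance, a rough numerical sketch in numpy (the tiny network and its
random weights are made up for illustration) that estimates each input's
influence by finite-difference partial derivatives of the output with respect
to that input:

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """A tiny one-hidden-layer network: tanh hidden units, linear output."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

def input_sensitivities(f, x, eps=1e-5):
    """Central finite-difference estimate of d f / d x_i at the point x."""
    grads = np.zeros_like(x)
    for i in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        grads[i] = (f(xp) - f(xm)) / (2 * eps)
    return grads

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=3)
W2, b2 = rng.normal(size=3), rng.normal()
x = rng.normal(size=4)
f = lambda z: mlp(z, W1, b1, W2, b2)
print(input_sensitivities(f, x))  # larger |value| = more influential input
```

Averaging the absolute sensitivities over many inputs gives a crude ranking
of feature importance.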

 _there isn't a way to cluster similar cases together to analyze what rules
bind different decisions together_

Run PCA on the activations of the hidden layer. You could also do k-means
clustering of the hidden-layer activations.
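A minimal numpy-only sketch of the PCA half of that (the random matrix below
stands in for real hidden-layer activations):

```python
import numpy as np

def pca_project(H, k=2):
    """Project hidden-layer activations H (n_samples x n_units) onto
    their top-k principal components via SVD."""
    Hc = H - H.mean(axis=0)       # center each unit's activations
    U, S, Vt = np.linalg.svd(Hc, full_matrices=False)
    return Hc @ Vt[:k].T          # coordinates in the top-k PC space

# Fake "activations": 200 hands' worth of a 10-unit hidden layer.
rng = np.random.default_rng(1)
H = rng.normal(size=(200, 10))
coords = pca_project(H, k=2)      # 200 x 2, ready to scatter-plot
```

Feeding `coords` to a scatter plot (or to k-means) then lets you look for
clusters of hands the network treats similarly.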

The random forest classifier looks interesting, I'll have to investigate that.
Any suggestions for papers/tutorials?

~~~
bravura
_there is no way to find out the relative significance of the input variables
and how they affect the output response_

"Calculate partial derivatives of the outputs w.r.t. the inputs."

Yes. There are other ways to interpret the output. See, for example,
"Visualizing Higher Layer Features of a Deep Network" by Erhan et al. (2009):
[http://www.iro.umontreal.ca/~lisa/publications/?page=publica...](http://www.iro.umontreal.ca/~lisa/publications/?page=publication&kind=single&ID=247)

It is true, though, that more powerful models are less explainable. But in
return you get more compact modeling and training via gradient descent, which
is far faster than the combinatorial optimization involved in ensembles of
trees. (Trust me, I've implemented both.)

 _there isn't a way to cluster similar cases together to analyze what rules
bind different decisions together_

"PCA analysis of activations of the hidden layer. Could also do k-means
clustering of activations of hidden layer."

I concur. However, for 2-D or 3-D visualization you should use the more
recently developed t-SNE algorithm instead of PCA or other alternatives; t-SNE
does way better. Software (in Matlab and Python) is available at:
<http://ict.ewi.tudelft.nl/~lvandermaaten/t-SNE.html>

Also, you should look at the JMLR paper
([http://ict.ewi.tudelft.nl/~lvandermaaten/t-SNE_files/vanderm...](http://ict.ewi.tudelft.nl/~lvandermaaten/t-SNE_files/vandermaaten08a.pdf))
and supplemental material
([http://ict.ewi.tudelft.nl/~lvandermaaten/t-SNE_files/Supplem...](http://ict.ewi.tudelft.nl/~lvandermaaten/t-SNE_files/SupplementalMaterial.pdf)),
to see the visualizations produced by t-SNE and competing methods. This
qualitative evaluation by looking at pictures speaks for itself.

 _Also, one cannot find the partial dependence of the output on a given input
variable._

If your model assumes that the output is a non-linear combination of inputs,
then yes, it is hard to express the output in terms of a linear decomposition.
But that was your choice of modelling assumption, presumably because linear
models are insufficiently powerful to fit the underlying variations.

"The random forest classifier looks interesting, I'll have to investigate
that. Any suggestions for papers/tutorials?"

Random forests were developed by Leo Breiman (RIP).

"Ensemble methods in machine learning" by Dietterich (2000) compares different
tree ensemble methods. He concludes that boosting an ensemble of decision
trees is better, except when the data are very noisy, in which case randomized
trees are better. (Boosting focuses on the examples the model is currently
doing worst on; randomization instead works on random subsets of examples.)
The main reason boosting is worse than randomized trees in the noisy case is
that the AdaBoost exponential loss is sensitive to outliers. Which is to say,
AdaBoost boosts the wrong loss function. Boosting an appropriate choice of
loss function (perhaps a regularized log-loss) is probably superior to
randomized trees in most circumstances.
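For the curious, the reweighting step that makes boosting "focus on the worst
examples" can be sketched from scratch like this (a simplified AdaBoost round,
not code from either paper):

```python
import math

def adaboost_weight_update(weights, correct, error):
    """One round of AdaBoost reweighting: examples the current weak
    learner got wrong are up-weighted, correct ones down-weighted.
    `correct` is a list of bools, `error` the learner's weighted error."""
    alpha = 0.5 * math.log((1 - error) / error)  # learner's vote strength
    new = [w * math.exp(-alpha if c else alpha)
           for w, c in zip(weights, correct)]
    z = sum(new)                                 # renormalize to sum to 1
    return [w / z for w in new], alpha
```

The exponential up-weighting of mistakes is exactly the outlier sensitivity
mentioned above: a few noisy mislabeled examples keep absorbing weight.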

"Improved boosting algorithms using confidence-rated predictions" by Schapire
and Singer (2000) is a great introduction to boosting.

Around the same time, Llew Mason and Jerome Friedman independently
demonstrated that boosting essentially fits an additive model using
gradient-based methods, selecting the features with the steepest loss
gradient. So you should follow up by looking at their work.

------
LiveTheDream
This bot attacked Texas Hold'em. Presumably most of the others do as well,
given that it is far and away the most commonly played game online
(disclaimer: complete assumption).

What about attacking 7-card stud, stud8, Omaha, and O8? These games give more
information to analyze than hold'em because more cards are visible. It seems
to me that would give more of an edge to a computer, which can calculate
drawing odds faster and more accurately than a person.

