
Open source game-theoretic poker player - adamsmith
https://github.com/adamsmith/game-theory-poker
======
moccajoghurt
There is actually a business behind bot-poker-player.

I have played about 20k tables of Texas Hold'em Double or Nothing and stopped
playing after 6 months with a profit of $1000.

I have met quite a few bots and an experienced player will most likely
recognize such bots. However it took a while to realize it and I had to look
up the stats in order to see it.

The algorithm the bot used was really simple. In poker terms you'd call the
bot-player a rock. He bets when he has good cards and will always go all-in
whatever happens after the flop.

You'd think that this algorithm is too simple to be successful but that's
wrong. There are two factors that make this strategy profitable:

1\. If you play on low limits the players usually play incredibly aggressively
and will nearly always lose a lot of money whenever your bot bets.

2\. Even if you have a win rate of only 55% (which is necessary just to break
even, because you never play for free: there is a fee for each table), you
will make a profit because of the cashback your online poker provider gives
you at the end of each month. This is also why you have to play a lot of
tables. The bot played about 800 tables each day, which is insane. However, it
does increase the cashback: the more you play, the more money you will get
each month.
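
A rough sketch of the arithmetic behind that 55% figure. It assumes a Double or Nothing table pays each winner twice the buy-in and that every player pays the buy-in plus a fee to sit down. The 10% fee and the 30% cashback fraction below are illustrative assumptions, not numbers from any particular site.

```python
def break_even_win_rate(buy_in: float, fee: float) -> float:
    """Win rate at which the expected profit per table is zero.

    Assumes a winner receives 2 * buy_in, a loser receives nothing,
    and every player pays buy_in + fee to enter.
    """
    return (buy_in + fee) / (2 * buy_in)


def expected_profit(win_rate: float, buy_in: float, fee: float,
                    cashback_fraction: float = 0.0) -> float:
    """Expected profit per table when a share of the fee comes back as cashback."""
    payout = win_rate * 2 * buy_in
    cost = buy_in + fee
    return payout - cost + cashback_fraction * fee


if __name__ == "__main__":
    # Hypothetical $10 table with a $1 fee.
    print(break_even_win_rate(10.0, 1.0))
    # At exactly break-even, the only profit left is the cashback.
    print(expected_profit(0.55, 10.0, 1.0, cashback_fraction=0.3))
```

Under these assumptions a player who only breaks even at the tables still earns the cashback share of every fee, which is why volume matters so much.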

~~~
aneth4
Seems like a rather easily detectable bot, playing 800 tables a day and
playing predictably.

I thought all poker sites banned such bots. Is that just a lie?

~~~
relix
Ever since bots appeared, poker sites have been working to detect them. If a player
played every day exactly 800 tables (or exactly e.g. 8 hours), it would be
instantly banned for being a bot.

Bots have a lot of randomization now to make it appear human: sleep randomly
to emulate "thinking", click on different parts of the button, randomly move
the cursor around but still humanly, click next to the button, move the cursor
on one button before moving it to another button to actually click, and so
on...

------
unreal37
OK now that Chess and Jeopardy have been conquered.

I wonder if no-limit texas hold'em poker is something that massive computing
power can consistently conquer as well. Imagine if you had tens of thousands
of instances of EC2 churning at playing one hand of poker against the world's
best opponents...

Is it possible?

~~~
dminor
There's a group at the University of Alberta that does research in poker
strategy. They have a program that plays heads-up poker (2 players) as well as
the best in the world. Adding more players into the mix significantly
complicates things though, so it will be a little while yet.

There are people that run bots on online poker sites, to varying degrees of
success.

~~~
wilfra
Those bots have gotten significantly better the last few years. I would say
the best bots today are beating nearly all of the recreational players and
even the lower rungs of professional players - but the very best players can
still beat them, should they be paying attention to their tendencies. The bots
have a big advantage over them though in that they are robots: they don't get
tired or hungry and their judgment never gets clouded on runs of bad luck.

PokerStars and FullTilt have worked hard to rid their sites of bots but many
other networks have turned a blind eye to them, some even explicitly allowing
them - since the bots pay rake like any other player. Every so often the
players will revolt and the sites will crack down but the bots always find
their way back again.

~~~
jcampbell1
I agree with everything you have said, but it is worth pointing out that the
best online players are assisted by bots. The top players use software that
can immediately show your last 10 bets and the outcomes as soon as you raise.

My point is that if you play low stakes, you get crushed by patient bots. If
you play high stakes, you get crushed by cyborgs.

~~~
andyakb
I'm not positive that the current HUDs do that, and even if they did, I don't
think it's a huge advantage. Players use statistical tools which could be seen
as unfair, except they are so common now that a significant percentage of
players from the lowest stakes to the highest use these tools. Just having
this data is not enough to make somebody a winning player, as it does not
suggest a specific move; the player still needs to figure that out. High
stakes players are just much, much better than lower stakes players, with or
without the HUD.

------
philh
> But we can compute the optimal strategy for an abstract version of poker
> that, for example, during pre-flop betting treats pairs of aces the same way
> as pairs of kings.

Not a big deal, but the wording of this seems off. It doesn't sound like
you're computing the optimal strategy for poker-prime, where poker-prime has
the property that in pre-flop betting (but nowhere else?) pocket aces are no
more valuable than pocket kings.

Rather it sounds like you're computing a sub-optimal strategy for poker, by
taking an optimal strategy and making it computationally simpler at the
expense of some correctness.

~~~
querulous
It's still optimal because your preflop play with KK is going to be exactly
the same as with AA. Similarly, you are going to play 9s8c exactly the same as
9h8d.

~~~
andyakb
This is not true. Suit simplifications can be made without any loss to the
value of the solution, but saying AA=KK is not a lossless simplification.
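
To illustrate the lossless part, here is a minimal sketch of preflop suit isomorphism. Before any board cards are dealt, only the two ranks and whether the cards share a suit matter, so the 1326 possible starting hands collapse into 169 classes. The helper name `preflop_class` and the card notation (rank then suit, e.g. `9s`) are my own choices for the example.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]


def preflop_class(card_a: str, card_b: str) -> str:
    """Map two hole cards to a suit-independent label such as 'AKs' or '98o'."""
    (rank_a, suit_a), (rank_b, suit_b) = card_a, card_b
    hi, lo = sorted((rank_a, rank_b), key=RANKS.index, reverse=True)
    if hi == lo:
        return hi + lo  # pocket pair, e.g. 'AA'
    return hi + lo + ("s" if suit_a == suit_b else "o")


all_hands = list(combinations(DECK, 2))
classes = {preflop_class(a, b) for a, b in all_hands}

if __name__ == "__main__":
    print(len(all_hands), len(classes))
    print(preflop_class("9s", "8c"), preflop_class("9h", "8d"))
    print(preflop_class("As", "Ah"), preflop_class("Ks", "Kh"))
```

Note that 9s8c and 9h8d land in the same class, while AA and KK stay separate. Merging those two would be an extra abstraction step on top of this one, and that step is the one that costs accuracy.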

~~~
querulous
EV is different, but the optimal course of action at each branching point is
identical.

~~~
andyakb
You have no way of knowing this. If the EV is different, how can you say with
any certainty that the action at each branching point is identical? What if
there is an ace or a king on the board? These hands are very similar and
likely have similar strategies, but they are absolutely not the same and any
strategy that treats them that way is not optimal.

------
orensol
Will be interesting to see how/if it can scale on cheap cloud based cpu
oriented machines, such as Amazon High-CPU instances.

~~~
unreal37
From what I have read today, the current poker bot programs do a lot of work
in advance, and just follow decision trees when to raise, call and fold along
that tree. They already know how they will bet with J10 suited before the flop
when they are the first to act before the game starts.

The real challenge is the number of permutations of that, which raise memory
requirements into the petabytes range. Not to mention multi-player games, and
the no-limit version where betting gets more complex.

Not sure if having high-cpu instances at your disposal helps during game play.

------
IsaacL
I wrote a pokerbot for my university third year project:
<https://github.com/IsaacLewis/FYP>. I haven't been able to spend any more
time on that project since finishing it (though I wanted to), but I still find
the space fascinating.

Unlike the linked bot, which is an "equilibrium" (or "game-theoretic") player,
mine followed an "exploitative" strategy. What's the difference? Equilibrium
strategies find (or attempt to find) a Nash equilibrium, and follow that. As
the OP said, this minimises their losses, but also prevents them exploiting
weaknesses in an opponent's playing style. Whereas an exploitative player
adapts its strategy to take advantage of its opponent, but that leaves it open
to being exploited itself.

The OP used RPS as an example - it's clear that the Nash equilibrium is
picking each move with 1/3 probability. No matter what your opponent does,
your expected value is 0. But what if your opponent decides that they will
always pick rock? The EV of the equilibrium strategy is still 0, but you could
switch to an exploitative strategy of always picking paper, in which case your
EV is 1. For this reason, exploitative strategies will almost always win
multiplayer RPS tournaments, because they can consistently beat the weaker
players, whereas the equilibrium players will stay in the middle of the pack.
It might seem like a surprising result that playing an exploitative strategy
_always_ leaves you open to exploitation yourself, but the maths works out.
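
A small sketch of those RPS numbers, assuming a payoff of +1 for a win, -1 for a loss and 0 for a tie. The function and variable names are just for illustration.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine: str, theirs: str) -> int:
    """+1 if my move wins, -1 if it loses, 0 on a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_value(my_strategy: dict, their_strategy: dict) -> Fraction:
    """EV of one round, given mixed strategies as move -> probability."""
    return sum(
        (my_strategy[m] * their_strategy[t] * payoff(m, t)
         for m in MOVES for t in MOVES),
        Fraction(0),
    )


def pure(move: str) -> dict:
    """Strategy that always plays the given move."""
    return {m: Fraction(1 if m == move else 0) for m in MOVES}


uniform = {m: Fraction(1, 3) for m in MOVES}

if __name__ == "__main__":
    print(expected_value(uniform, pure("rock")))          # equilibrium vs. always-rock
    print(expected_value(pure("paper"), pure("rock")))    # exploiting always-rock
    print(expected_value(pure("paper"), pure("scissors")))  # the exploiter gets exploited
```

The uniform strategy scores 0 against anything, always-paper scores 1 against always-rock, and always-paper scores -1 once the opponent switches to scissors.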

If you want an intuitive grasp of this idea, consider that to exploit your
opponent's strategy, your play must be adapted based on observations of their
play. But this means they can play with style X, leading you to play style X'
which is dominant, before they catch you out by switching to style X'', which
dominates X'. If you have experience playing poker with competent humans, they
do the same thing.

In computer poker, AFAIK equilibrium players generally perform better. I think
this is because poker is a more complicated game than RPS, so both humans and
bots consistently make mistakes, so just playing solidly gives equilibrium
bots the edge. But writing an exploitative bot is still pretty interesting,
because it seems closer to human poker, which is more about bluffing and
outthinking your opponents than mathematically optimising your play.

My bot wasn't especially interesting - it was based on an existing algorithm
called Miximix, and I used Weka to try and machine learn a model of the
opponent's strategy. Still, it could do interesting stuff - eg, if it played
against an opponent that could be intimidated out of hands by large bets, it
would realise that it could bet large without having good hands - ie, it
successfully taught itself to bluff. What I thought would be really
interesting was a bot with multiple-level opponent modelling - "what does my
opponent think I have?" or "what does my opponent think I think he has?". Good
human players think this way, and "recursively modelling other minds" seems
integral to conscious thought, so it'd be cool to look into in more depth.

The other thing that would be cool to look into is "explanation-based
learning". Normal machine learning approaches require large amounts of data to
draw inferences, but human poker players seem capable of forming conclusions
about their opponent based on very limited information. Explanation-based
learning uses a domain model to help this.

Hmm, writing this comment has reignited my interest in this space - I really
should dig out my old code and work on this again some time.

~~~
adamsmith
With the advent of Deep Belief Nets, which can learn from lots of unlabeled
data, it seems reasonable to believe you could duplicate or exceed the human
ability to quickly understand a player's behavior based on past hands played.

If only we had access to the backend histories of an online poker site!

~~~
IsaacL
Oh, but you can... <http://www.pokertableratings.com/buy-hand-histories>

------
mbell
Interesting, how much did you try to optimize this code? Some of it is a bit
fishy.

------
martinced
It's Limit Texas Hold'em. (of course)

Limit Texas Hold'em and No-Limit Texas Hold'em are two entirely different
games. They happen to share a few things in common but from a game theory
perspective they're two wholly different beasts.

They're so different that, basically, limit Texas Hold'em is a solved problem:
good bots can rival with the best professional players (playing Limit Hold'em
for money online is risky: you can be playing vs a bot or vs someone entering
the moves of a bot).

But No-Limit Texas Hold'em? There are players who've won several major
tournaments. The psychological element is very, very important.

And unless we make amazing AI discoveries, it's going to be very difficult to
write bots able to beat good players at No-Limit Texas Hold'em.

But you can find bots online, even for NLHE, able to beat beginners and the
rake at very low limits (called the nanostakes and the micro-stakes, but not
above).

Another thing: there's so much money to be made (as in millions of $) by
writing a bot able to beat mid-stakes and high-stakes online no-limit Texas
Hold'em that the last thing someone who'd written such a bot would do is
publish it online.

Major sites like PokerStars do pro-actively look out for bots: the EULA states
that they have the right to scan the entire memory of your computer and your
entire hard disk. And you can't install such a software without giving the
root/admin password of your system. And you cannot legally use a VPN: if they
detect one you're out (you still technically can if you manage to fly 100%
below the radar). And you can't use remote desktops. It's overall very
restrictive.

They're regularly busting bot rings and Chinese colluder rings and
confiscating their money (and redistributing it to other players).

And if they suspect an account of multi-accounting, they'll do tricky things
like moving and resizing all the poker tables at once, while simultaneously
showing a captcha.

If you fail to enter it, you'll have a hard time convincing the site to not
confiscate your money...

But back in the wild wild west days, it was amazing: some people had "war
rooms" made of tens of PCs, all playing online poker and making very very big
money. It was a big business.

But games got tougher, poker "black friday" hit the US hard, bot detection has
vastly improved, etc.

So the "gold rush" is over for most botters.

~~~
icelancer
As someone who worked for one of the online poker sites doing bot detection
and collusion investigation, I find your optimism amusing. It is certainly not
over.

