Artificial Intelligence, Poker, and Regret (medium.com)
120 points by matco11 179 days ago | 21 comments

Well written and very interesting. There is a small typo in your article (there is both positive regret and positive regret).

What's the difference between positive regret and positive regret?

There are many typos, in fact. I personally find it really distracting and wish authors of articles with interesting content would get someone to proofread before publishing them.

I'm grateful authors take the immense time required to write something like this and then publish it free for all to read. I personally could care less if there are typos.

*couldn't care less

Not to start an argument, but relevant: https://xkcd.com/1576/

I'd like to see an xkcd where one character ponders if another character is doing something just to get a certain reaction that they have just the right online comic to counter with.

Since web content is dynamic, I like to think that we're the proofreaders (that is, if the author makes themselves available for contact and makes said correction when someone points out a typo).

The world is a better place for having this post in it, typos and all. I'm glad they shipped it instead of waiting for it to be perfect.

Despite the title, it's a tutorial on how to build an AI for Rock, Paper, Scissors.

Isn't this game solved by game theory?

Yes, you can solve it "by hand" in a few lines of algebra.

You can't solve Texas Hold'em by hand, though, because there are too many parameters.

The same technique they used there for RPS can be used on poker to create strong strategies.

Sadly, the blog is only about Rock, Paper, Scissors.
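For anyone curious what "summing up regret" looks like in practice, here is a minimal sketch of regret matching on Rock, Paper, Scissors. Regret matching is the building block that CFR applies at every decision point, so this is not full CFR and not necessarily the article's exact code. It uses expected payoffs instead of sampled moves so the run is deterministic, and the small starting bias toward rock for player 0 is my own assumption, added only so the dynamics are not trivially uniform from the first step. All names are hypothetical.

```python
# Payoff to a player choosing row action a against column action b.
# Order of actions: rock, paper, scissors.
ACTIONS = ["rock", "paper", "scissors"]
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret anywhere: fall back to uniform play.
    return [1.0 / len(regrets)] * len(regrets)


def expected_payoffs(opponent_strategy):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return [sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(3))
            for a in range(3)]


def train(iterations):
    """Self-play regret matching; returns each player's average strategy."""
    # Assumption: player 0 starts with a little regret for not playing rock.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3 for _ in range(2)]
    for _ in range(iterations):
        # Both strategies are fixed before either player updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            values = expected_payoffs(strategies[1 - player])
            baseline = sum(p * v for p, v in zip(strategies[player], values))
            for a in range(3):
                # Regret: how much better action a would have done.
                regrets[player][a] += values[a] - baseline
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(train(100_000)):
        print(player, dict(zip(ACTIONS, (round(p, 3) for p in avg))))
```

The current strategy keeps cycling, but the average strategy is the part that matters: for both players it should come out close to 1/3 on each action, which is the equilibrium for RPS.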

I too wish it were about Poker, as the title suggests.

> "Unlike many recent important breakthroughs in A.I Research, like Deepmind's AlphaGo, CFR does not rely on Neural Networks to calculate probabilities or the value of a particular move. Instead by playing itself over millions if not billions of games, it can begin to sum up the total amount of regret for each action it has taken in particular positions."

If the goal is to build a general-purpose AI, this approach seems like a dead end. The distinguishing feature of a general-purpose AI is knowing what to do when it encounters novel situations. In contrast, the CFR algorithm above sounds more like a training program where the "AI" teaches itself, using empirical results, what to do in every single scenario.

Such an empirical approach may work well for scenarios that have been frequently encountered in the past, but when dealing with novel scenarios, it seems to me that a deductive approach is what's truly needed.

True, but search algorithms are the foundation of most game-playing AIs. What I would take out of this is that there is a missing component that would bridge specialized AI to general-purpose AI that mimics humans. For example, a trained professional player is aware of the history of plays prevalent in a game, and I doubt most professionals can/eventually invent new plays, and this affords bypassing the brute-force millions of games needed to train an AI.

For those who want to read more about CFR, I'd start with some of Michael Johanson's papers. I think his thesis was specifically on CFR and poker, but a reasonable amount of searching on the UAlberta site will probably find you the right papers. You can also look at his Quora answer here for a (much more readable, IMO) overview:


In the same vein, you might want to look up "fictitious play" as a related topic for finding Nash equilibria in two player games by iterating through best-response strategies.
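To make fictitious play concrete, here is a rough sketch on Rock, Paper, Scissors: each round, every player best-responds to the opponent's empirical action counts so far. The function names and the tie-breaking rule (lowest action index wins) are my own assumptions for illustration.

```python
# Payoff to a player choosing row action a against column action b.
# Order of actions: rock, paper, scissors.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def best_response(opponent_counts):
    """Pure action with the highest total payoff against the observed counts."""
    values = [sum(PAYOFF[a][b] * opponent_counts[b] for b in range(3))
              for a in range(3)]
    # max() returns the first maximal item, so ties go to the lowest index.
    return max(range(3), key=lambda a: values[a])


def fictitious_play(rounds):
    """Returns each player's empirical action frequencies after `rounds`."""
    counts = [[0] * 3, [0] * 3]  # counts[p][a]: times player p played a
    for _ in range(rounds):
        # Both players choose simultaneously from the history so far.
        moves = [best_response(counts[1]), best_response(counts[0])]
        for player in (0, 1):
            counts[player][moves[player]] += 1
    return [[c / rounds for c in player_counts] for player_counts in counts]


if __name__ == "__main__":
    print(fictitious_play(30_000))
```

The actual play never settles (it cycles through rock, paper, and scissors in ever longer runs), but the empirical frequencies drift toward 1/3 each, which is the sense in which fictitious play "finds" the equilibrium in a two-player zero-sum game.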

Does it manage betting as well? Betting is obviously a large part of poker and requires analyzing other players' bets.

Generally you "abstract" betting to a few sizes (2-4 usually), and create a mapping back to real sizes when using the bot you created.

In no-limit poker the abstraction is a few fixed multiples of the blinds if you're leading out from a reasonably sized stack (1x big blind through maybe 6x big blind) or a few fractions of the pot if you're raising or betting in a later round (1/5, 1/4, 1/3, 1/2, 2/3, 3/4 or 1x the pot, or all-in).
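A minimal sketch of the mapping described above, assuming the simplest possible rule: snap a real bet to the nearest pot fraction from that list, and map back by multiplying out. Real bots may use a different translation scheme; the function names and the integer chip amounts are hypothetical.

```python
from fractions import Fraction

# Pot fractions taken from the list above; "all-in" is handled separately.
POT_FRACTIONS = [Fraction(1, 5), Fraction(1, 4), Fraction(1, 3),
                 Fraction(1, 2), Fraction(2, 3), Fraction(3, 4), Fraction(1)]


def abstract_bet(bet, pot, stack):
    """Map a real bet (in whole chips) to the nearest abstract bet size."""
    if bet >= stack:
        return "all-in"
    ratio = Fraction(bet, pot)
    # Assumption: nearest fraction by absolute difference.
    return min(POT_FRACTIONS, key=lambda f: abs(f - ratio))


def real_bet(action, pot, stack):
    """Map an abstract action back to a chip amount for the current pot."""
    if action == "all-in":
        return stack
    return min(stack, round(action * pot))


if __name__ == "__main__":
    print(abstract_bet(45, 100, 1000))           # treated as a half-pot bet
    print(real_bet(Fraction(2, 3), 150, 1000))   # two thirds of a 150 pot
```

The point of the abstraction is that the strategy only has to be computed for seven sizes plus all-in, while the bot can still interpret whatever size an opponent actually bets.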

That's really interesting and something I never knew. Is that just something to simplify writing a decent bot, or is that something human players think in terms of?

Humans need to simplify the game more than bots do, really. Bots can calculate exact odds ratios. Humans do rough estimates, because that's faster. Also, you don't need exact odds as it's common to bet such that your opponent doesn't have the correct odds to call.
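The "correct odds to call" arithmetic is short enough to sketch. This assumes a single bet into the pot, a simple call-or-fold decision, and no implied odds or future betting; the function names are hypothetical.

```python
def required_equity(pot, bet):
    """Break-even win probability for a player facing `bet` into `pot`.

    The caller risks `bet` to win a final pot of pot + bet + bet.
    """
    return bet / (pot + 2 * bet)


def min_bet_to_deny_odds(pot, opponent_equity):
    """Smallest bet size above which calling loses money for the opponent.

    Solves bet / (pot + 2 * bet) = opponent_equity for bet.
    Only meaningful when opponent_equity is below 0.5.
    """
    if not 0 <= opponent_equity < 0.5:
        raise ValueError("opponent_equity must be in [0, 0.5)")
    return opponent_equity * pot / (1 - 2 * opponent_equity)


if __name__ == "__main__":
    print(required_equity(100, 50))         # half-pot bet
    print(min_bet_to_deny_odds(100, 0.2))   # opponent wins 20% of the time
```

So a half-pot bet asks the caller to win at least 25% of the time, and against an opponent with 20% equity any bet larger than a third of the pot makes calling unprofitable under these assumptions. That is why a rough estimate is usually enough at the table.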
