
Truth from Zero? - carlchenet
https://rjlipton.wordpress.com/2017/12/17/truth-from-zero/
======
alasdair_
". It announces an algorithm called AlphaZero that, given the rules of any
two-player game of strategy and copious hardware, trains a deep neural network
to play the game at skill levels approaching perfection."

I have a particular interest in having a computer play the game Magic: the
Gathering at an expert level. In some ways, this seems like an ideal fit for
AlphaZero - it's a two-player strategy game with a finite number of valid
plays at each state, yet I'm still not sure how to deal with four key
elements:

1\. The game is random - it has a deck of cards that are shuffled.

2\. The game has hidden information - cards in hand are hidden.

3\. The cards can change the rules of the game itself. For example, a card
could change the win conditions of the game. There are over 20,000 unique
cards with more regularly being printed.

4\. A major part of the game (perhaps the most important) is that each player
chooses their own combination of sixty cards from the ever-increasing pool.
This makes understanding the "metagame" extremely important - a "deck" that was
good a week ago may be outclassed the following week as players react.

In addition, there are many decks that win with odd combinations of cards and
often the only way to beat these decks is to know in advance how to stop the
combo.

Any ideas on how best to approach this problem?

Right now I am working on building a very simple version of the game with both
players playing the same three cards with no randomness. This will eventually
feed a simple genetic algorithm to test different quantities of each of the
three cards to find an optimal "build".
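
A minimal sketch of what that search could look like, assuming a stripped-down genetic algorithm (truncation selection plus mutation, no crossover). The card names `A`/`B`/`C` and `toy_fitness` are made up for illustration; a real fitness function would play simulated games with the deck and return something like a win rate.

```python
import random

DECK_SIZE = 60            # sixty-card decks, as described above
CARDS = ("A", "B", "C")   # hypothetical names for the three cards


def toy_fitness(deck):
    """Placeholder score (an assumption, not a game simulator).

    It simply peaks at a made-up 24/20/16 split so the search has a target.
    """
    target = (24, 20, 16)
    return -sum((have - want) ** 2 for have, want in zip(deck, target))


def random_deck(rng):
    """Random split of DECK_SIZE copies across the three cards."""
    a = rng.randint(0, DECK_SIZE)
    b = rng.randint(0, DECK_SIZE - a)
    return (a, b, DECK_SIZE - a - b)


def mutate(deck, rng):
    """Swap one copy of a card for one copy of another; deck size is unchanged."""
    counts = list(deck)
    src, dst = rng.sample(range(len(counts)), 2)
    if counts[src] > 0:
        counts[src] -= 1
        counts[dst] += 1
    return tuple(counts)


def evolve(fitness, generations=200, population_size=30, seed=0):
    """Keep the better half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [random_deck(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return max(population, key=fitness)


best = evolve(toy_fitness)
print(dict(zip(CARDS, best)), toy_fitness(best))
```

Because the survivors are carried over unchanged, the best deck never gets worse from one generation to the next. With a noisy fitness (simulated games), each deck would probably need to be scored over many games before the ranking means much.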

~~~
yorwba
In absolute generality, it's impossible. The game is Turing complete:
[https://www.toothycat.net/~hologram/Turing/](https://www.toothycat.net/~hologram/Turing/)

Of course you may still be able to find a strategy that works well against
humans, but that would still be a cutting-edge research task.

Hidden information is a big problem; the classic benchmark for that is poker.
The latest breakthrough I'm aware of was for heads-up limit Texas hold'em:
[http://science.sciencemag.org/content/347/6218/145](http://science.sciencemag.org/content/347/6218/145)
That may be a difficult game, but it has vastly fewer possible states. The
same approach as used in the paper would almost certainly not scale for MTG.

Dealing with cards that can arbitrarily change the rules of the game would
require human-level AI almost by definition, unless you want to manually
translate them into code, in which case the system would be unable to deal
with new cards.

Now that I have told you all about how the problems you identified are much too
difficult, I want to add that that shouldn't discourage you from just playing
around. Starting with a minimal test case is how poker research took off, so
you're definitely on the right track.

~~~
alasdair_
>Hidden information is a big problem; the classic benchmark for that is poker.
The latest breakthrough I'm aware of was for heads-up limit Texas hold'em:
[http://science.sciencemag.org/content/347/6218/145](http://science.sciencemag.org/content/347/6218/145)
That may be a difficult game, but it has vastly fewer possible states. The
same approach as used in the paper would almost certainly not scale for MTG.

I'm a big AI poker fan (and a big poker fan in general). The current state of
the art is now a bot that can consistently beat the very best players at
heads-up no-limit (i.e. with a very large number of possible legal bets) in
real money games.

[http://science.sciencemag.org/content/early/2017/12/15/scien...](http://science.sciencemag.org/content/early/2017/12/15/science.aao1733.full)
\- note the date was yesterday :)

~~~
marcoperaza
How do online poker sites deal with this? No one would want to play if they
were constantly getting rinsed by bots.

~~~
alasdair_
Most people don't play heads-up no-limit; they play 9-10 person games, which
are currently tougher for AI to beat.

Eventually though, online poker will be full of bots.

~~~
lern_too_spel
If you can put your bots in more seats at the table, they can share
information and collude in their betting strategy. No amount of smartness in a
single player's strategy is going to beat cheating.

------
bo1024
I like the idea of starting with simpler games to get a feel for how close AZ
is to the minimax optimal strategy. For instance, just put chess or go on a
smaller board. Part of DeepMind's claim is that their algorithm doesn't need
tweaking when one changes the game, so this should be an easy experiment for
them to run.
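
To make the "compare against minimax" idea concrete, here is a minimal sketch that solves a tiny game exactly. It uses tic-tac-toe as a stand-in for small-board chess or go (my substitution, not something from the article). `optimal_moves` is a hypothetical helper: a learned player's chosen move could be checked against the list it returns.

```python
from functools import lru_cache

# Boards are 9-character strings of "X", "O" and "." read row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Exact minimax value from X's point of view: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0
    nxt = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], nxt)
               for i, cell in enumerate(board) if cell == "."]
    return max(results) if player == "X" else min(results)


def optimal_moves(board, player):
    """Squares that keep the minimax value of a non-terminal position."""
    nxt = "O" if player == "X" else "X"
    target = value(board, player)
    return [i for i, cell in enumerate(board)
            if cell == "." and value(board[:i] + player + board[i + 1:], nxt) == target]


print(value("." * 9, "X"))              # value of the empty board
print(optimal_moves("XX.OO....", "X"))  # moves that don't throw the position away
```

Tic-tac-toe is a draw under optimal play, so the first call gives 0. The same exhaustive search gets out of reach fast as the board grows, which is why only very small variants could serve as ground truth.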

I wonder if there is any principled statistical way to test, given a set of
players, how far the best of them is from optimal. Hmm....

