
Teaching an AI to play a simple game using Q-learning - ml-student
http://www.practicalai.io/teaching-ai-play-simple-game-using-q-learning/
======
hervature
One of my favourite undergraduate projects was applying Q-learning to the game
Flappy Bird.
[http://www.mast.queensu.ca/%7Emath472/FlappyQ.pdf](http://www.mast.queensu.ca/%7Emath472/FlappyQ.pdf)

~~~
ghostbrainalpha
That was a cool read. Thank you.

------
pigscantfly
If anyone is interested in learning more on this topic, Mykel Kochenderfer's
"Decision Making Under Uncertainty" offers a stellar treatment of
reinforcement learning from the ground up.
[https://mitpress.mit.edu/decision-making-under-uncertainty](https://mitpress.mit.edu/decision-making-under-uncertainty)

------
CGamesPlay
This game really is quite simple! The go-to example I use for a simple game is
called 21.

- There are N (usually 21) tokens in a pile.
- A turn consists of removing 1, 2, or 3 tokens from the pile.
- The player who removes the final token is the winner.
- The opponent will always take tokens equal to n mod 4 if that is a valid
  move, otherwise will play randomly (this is the optimal strategy).
- The AI plays first.
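
The opponent rule above can be sketched in a few lines of Python. This is only an illustration, not the code from the write-up: the function name `opponent_move` and the `epsilon` parameter (the chance of ignoring the n mod 4 rule and moving randomly) are assumptions made for the example.

```python
import random

def opponent_move(n, epsilon=0.0, rng=random):
    """Return how many tokens the scripted opponent removes from a pile of n >= 1.

    Hypothetical helper: with probability 1 - epsilon it takes n mod 4 when
    that is a legal move (1 to 3 tokens); otherwise it picks a legal move
    uniformly at random.
    """
    legal = list(range(1, min(3, n) + 1))
    best = n % 4
    if best in legal and rng.random() >= epsilon:
        return best
    return rng.choice(legal)
```

Taking n mod 4 always leaves a multiple of 4 behind, so whatever the other player removes next (k tokens), the opponent can answer with 4 - k and eventually take the last token. When the pile is already a multiple of 4 there is no such move, which is why the rule falls back to a random choice.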

You can see my write-up here: [1]. One of the most interesting things for me
was visually inspecting the action scores (at the end) to see how the agent
learned the optimal strategy over time. My configuration took 3000 games to
reach the optimal strategy against a strong opponent (opponent epsilon
= 0.1), and substantially longer as the opponent starts to play worse.
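
For anyone who wants to try this, here is a rough sketch of tabular Q-learning for the game. It is not the configuration from the write-up: the hyperparameters (`alpha`, `gamma`, `agent_eps`, the episode count), the +1/-1 reward scheme, and all function names are assumptions chosen for illustration.

```python
import random

def opponent_move(n, epsilon, rng):
    # Scripted opponent: take n mod 4 when legal, else (or with
    # probability epsilon) a uniformly random legal move.
    legal = list(range(1, min(3, n) + 1))
    best = n % 4
    if best in legal and rng.random() >= epsilon:
        return best
    return rng.choice(legal)

def train(episodes=20000, n_tokens=21, alpha=0.1, gamma=1.0,
          agent_eps=0.1, opp_eps=0.1, seed=0):
    rng = random.Random(seed)
    # Q[s][a]: s = tokens left on the agent's turn, a = tokens the agent takes.
    Q = {s: {a: 0.0 for a in range(1, min(3, s) + 1)}
         for s in range(1, n_tokens + 1)}
    for _ in range(episodes):
        s = n_tokens
        while s > 0:
            actions = list(Q[s])
            if rng.random() < agent_eps:
                a = rng.choice(actions)           # explore
            else:
                a = max(actions, key=Q[s].get)    # exploit
            rest = s - a
            if rest == 0:
                reward, s_next = 1.0, 0           # agent took the last token
            else:
                rest -= opponent_move(rest, opp_eps, rng)
                if rest == 0:
                    reward, s_next = -1.0, 0      # opponent took the last token
                else:
                    reward, s_next = 0.0, rest
            # Standard Q-learning update; terminal states have no future value.
            future = max(Q[s_next].values()) if s_next else 0.0
            Q[s][a] += alpha * (reward + gamma * future - Q[s][a])
            s = s_next
    return Q

def greedy(Q, s):
    # Action with the highest learned score in state s.
    return max(Q[s], key=Q[s].get)
```

Calling `Q = train()` and printing `greedy(Q, s)` for each `s` gives a quick way to inspect the action scores: under optimal play the preferred action should be `s % 4` whenever that is nonzero, although rarely visited states may not have converged with these settings.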

[1]
[https://www.dropbox.com/s/eooqlhgg98zc398/Q-Learning%2B21.ht...](https://www.dropbox.com/s/eooqlhgg98zc398/Q-Learning%2B21.html?dl=0)

~~~
bluetwo
How does it work for that _other_ game called "21"?

~~~
hervature
This is an awesome idea, thanks for the weekend project!

~~~
bluetwo
Let us know how it turns out!

