
Bandit based Monte-Carlo planning [pdf] - gwern
https://web.engr.oregonstate.edu/~afern/classes/cs533/notes/uct.pdf
======
zhanwei
UCT is a very simple idea that works surprisingly well across very diverse
domains. I can't emphasize its generality enough: you can throw different
problems at it and get a decent (though maybe not the best) result without
using any additional domain knowledge.

It is also very easy to work with: you can easily tweak the algorithm and add
heuristics for your specific domain.
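
To make the "simple idea" concrete, here is a minimal sketch of UCT, not a reference implementation. It assumes hashable states and a generative model supplied by the caller; the names `UCT`, `actions`, `step`, `toy_actions` and `toy_step`, and the defaults `c=1.4` and `horizon=20`, are made up for illustration. The only domain-specific parts are the two callbacks, and the random rollout is the natural place to plug in a heuristic.

```python
import math
import random
from collections import defaultdict


class UCT:
    """Minimal UCT sketch over a generative model (hypothetical interface)."""

    def __init__(self, actions, step, c=1.4, gamma=1.0, horizon=20, seed=0):
        self.actions = actions        # actions(state) -> sequence of actions
        self.step = step              # step(state, action, rng) -> (next_state, reward, done)
        self.c, self.gamma, self.horizon = c, gamma, horizon
        self.rng = random.Random(seed)
        self.N = defaultdict(int)     # visit count per (state, depth)
        self.Na = defaultdict(int)    # visit count per (state, depth, action)
        self.Q = defaultdict(float)   # mean return per (state, depth, action)

    def _select(self, s, d):
        acts = self.actions(s)
        untried = [a for a in acts if self.Na[(s, d, a)] == 0]
        if untried:                   # try every action once before using UCB1
            return self.rng.choice(untried)
        log_n = math.log(self.N[(s, d)])
        # UCB1: mean return plus an exploration bonus that shrinks with visits
        return max(acts, key=lambda a: self.Q[(s, d, a)]
                   + self.c * math.sqrt(log_n / self.Na[(s, d, a)]))

    def _rollout(self, s, d):
        # Uniformly random default policy; swap in a heuristic here if you have one
        total, discount = 0.0, 1.0
        while d < self.horizon:
            s, r, done = self.step(s, self.rng.choice(self.actions(s)), self.rng)
            total += discount * r
            discount *= self.gamma
            d += 1
            if done:
                break
        return total

    def _simulate(self, s, d):
        if d >= self.horizon:
            return 0.0
        if self.N[(s, d)] == 0:       # new node: add it and estimate by rollout
            self.N[(s, d)] = 1
            return self._rollout(s, d)
        a = self._select(s, d)
        s2, r, done = self.step(s, a, self.rng)
        ret = r + (0.0 if done else self.gamma * self._simulate(s2, d + 1))
        self.N[(s, d)] += 1
        self.Na[(s, d, a)] += 1
        # Incremental update of the mean return for (state, depth, action)
        self.Q[(s, d, a)] += (ret - self.Q[(s, d, a)]) / self.Na[(s, d, a)]
        return ret

    def plan(self, state, n_sims):
        for _ in range(n_sims):
            self._simulate(state, 0)
        return max(self.actions(state), key=lambda a: self.Q[(state, 0, a)])


# Toy problem (invented for this example): "safe" ends the episode with
# reward 0.1; "go" moves one step toward state 3, which pays 1.0 and ends.
def toy_actions(state):
    return ("go", "safe")


def toy_step(state, action, rng):
    if action == "safe":
        return state, 0.1, True
    nxt = state + 1
    return nxt, (1.0 if nxt == 3 else 0.0), nxt == 3


print(UCT(toy_actions, toy_step, seed=1).plan(0, 500))
```

Nothing in the planner knows about the toy problem, which is the point about generality: a different domain only needs its own `actions` and `step`.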

Also relevant:

UCT applied to a partially observable game (Poc-man):
[http://papers.nips.cc/paper/4031-monte-carlo-planning-in-lar...](http://papers.nips.cc/paper/4031-monte-carlo-planning-in-large-pomdps.pdf)

Another approach for Monte-Carlo planning
[http://papers.nips.cc/paper/5189-despot-online-pomdp-plannin...](http://papers.nips.cc/paper/5189-despot-online-pomdp-planning-with-regularization.pdf)

~~~
nurettin
I've coded and tested sparse lookahead in a trading algorithm before, using a
comprehensive example as a guide. Do you know of any comprehensive walkthrough
examples of a UCT scenario that I can use to implement UCT and verify my
results?

~~~
zhanwei
This tutorial seems quite good. It covers the basics and various useful UCT
extensions:
[https://webdocs.cs.ualberta.ca/~mmueller/courses/2014-AAAI-g...](https://webdocs.cs.ualberta.ca/~mmueller/courses/2014-AAAI-games-tutorial/slides/AAAI-14-Tutorial-Games-5-MCTS.pdf)

However, the tutorial doesn't build up to a working implementation. To verify
your results, I think you can run against benchmark problems. There are a
number of good implementations around:

UCT implementation with many MDP benchmark problems:
[https://github.com/bonetblai/mdp-engine](https://github.com/bonetblai/mdp-engine)

My favorite implementation. The code is quite easy to read:
[http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Applications_fil...](http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Applications_files/pomcp-1.0.tar.gz)
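
For very small problems you can also compute the ground truth yourself and compare your planner's root values against it. The sketch below is one way to do that, assuming a tabular model you write by hand: `exact_q`, `chain_actions` and `chain_transitions` are hypothetical names, and `transitions(state, action)` is assumed to return a list of `(probability, next_state, reward, done)` tuples. It does plain finite-horizon backward induction, so it only scales to toy state spaces.

```python
from functools import lru_cache


def exact_q(actions, transitions, state, horizon, gamma=1.0):
    """Exact finite-horizon Q-values at `state` by memoized backward induction.

    States must be hashable. `transitions(s, a)` is an assumed interface
    returning (probability, next_state, reward, done) tuples.
    """

    @lru_cache(maxsize=None)
    def value(s, steps_left):
        if steps_left == 0:
            return 0.0
        return max(q(s, a, steps_left) for a in actions(s))

    def q(s, a, steps_left):
        total = 0.0
        for prob, nxt, reward, done in transitions(s, a):
            future = 0.0 if done else gamma * value(nxt, steps_left - 1)
            total += prob * (reward + future)
        return total

    return {a: q(state, a, horizon) for a in actions(state)}


# Toy chain (invented for this example): "safe" ends the episode with
# reward 0.1; "go" moves one step toward state 3, which pays 1.0 and ends.
def chain_actions(state):
    return ("go", "safe")


def chain_transitions(state, action):
    if action == "safe":
        return [(1.0, state, 0.1, True)]
    nxt = state + 1
    return [(1.0, nxt, 1.0 if nxt == 3 else 0.0, nxt == 3)]


print(exact_q(chain_actions, chain_transitions, 0, horizon=10))
```

A UCT run on the same model should rank the root actions the same way. Its root estimates will generally sit below the exact optimum because they average over exploratory simulations.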

~~~
nurettin
Thank you. I like to verify as much as I like to implement :-)

