
The state of the game does not have to be what you would physically see as a player. Generally, you do not model the state space as a graph, but model the state as a data structure. To make the Markov property apply, you just need to add the right kind of information to your data structure. In effect, you can cheat by incorporating history in it. But the algorithms are happy regardless of the cheating: when landing a second time on the same data structure (aka state), the algorithms know that the reward distribution and the next-state distribution are the same in both occurrences, so they have some degree of knowledge the second time based on what happened the first time (and so on, with further experience improving that knowledge and increasing the confidence in it).
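To make that concrete, here is a rough sketch of what "state as a data structure with history folded in" could look like. Everything in it is made up for illustration (the `State` class, `next_state`, `RewardEstimator`, and the choice to keep the last two observations); it is not taken from any particular library or algorithm, just one way to show the idea.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class State:
    """Hypothetical state: what the player sees now, plus recent history.

    frozen=True makes instances hashable, so two states with the same
    contents compare equal and can serve as the same dictionary key.
    """
    observation: int
    history: Tuple[int, ...]  # most recent past observations, oldest first


def next_state(state: State, new_observation: int, history_len: int = 2) -> State:
    """Fold the current observation into the history (history_len must be >= 1)."""
    history = (state.history + (state.observation,))[-history_len:]
    return State(new_observation, history)


class RewardEstimator:
    """Running mean of the reward seen in each state (illustrative only)."""

    def __init__(self) -> None:
        self.counts: Dict[State, int] = {}
        self.means: Dict[State, float] = {}

    def update(self, state: State, reward: float) -> None:
        n = self.counts.get(state, 0) + 1
        mean = self.means.get(state, 0.0)
        self.counts[state] = n
        # Incremental mean: more visits to the same state refine the estimate.
        self.means[state] = mean + (reward - mean) / n

    def estimate(self, state: State) -> Tuple[float, int]:
        """Return (mean reward so far, number of visits) for this state."""
        return self.means.get(state, 0.0), self.counts.get(state, 0)
```

Landing on the same data structure twice means hitting the same dictionary key, so the second visit benefits from the first. Two situations that look identical on screen but have different histories become different keys and are learned separately.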



