
So if you ever find yourself at the mercy of a Superintelligence, simply challenge it to a round of Solaris for the Atari 2600 ;)

I still don't understand how the "prediction function" is generating frames?

The last line of the paper seems to suggest that MuZero is generalizable to other domains.

But the appendix states that "the network rapidly learns not to predict actions that never occur in the trajectories it is trained on".

Consider the problem of predicting the next N frames of video from a one-minute YouTube sample chosen at random, where there is a high probability of some sort of scene transition in the interval. Short of training on a large subset of the YouTube corpus.




It doesn't generate frames at all! That is actually the main idea of this paper. (Those aren't my words; the paper itself calls it "the main idea".) To quote:

> The main idea of the algorithm ... is to predict those aspects of the future that are directly relevant for planning. The model receives the observation ... as an input and transforms it into a hidden state... There is no direct constraint or requirement for the hidden state to capture all information necessary to reconstruct the original observation, drastically reducing the amount of information the model has to maintain and predict.
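
To make that concrete, here is a rough sketch of how I read the structure. The paper describes three learned functions: a representation function h (observation to hidden state), a dynamics function g (hidden state and action to reward and next hidden state), and a prediction function f (hidden state to policy and value). Everything else below is my own assumption for illustration: the toy sizes, the function and variable names, and the random weights standing in for trained networks. The point is only that nothing in the loop ever outputs a frame.

```python
import math
import random

random.seed(0)

# Toy sizes, chosen arbitrarily for illustration (not from the paper).
OBS_DIM, HIDDEN_DIM, NUM_ACTIONS = 16, 4, 3


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


# Random weights stand in for trained networks (hypothetical placeholders).
W_REPR = rand_matrix(HIDDEN_DIM, OBS_DIM)
W_DYN = rand_matrix(HIDDEN_DIM, HIDDEN_DIM + NUM_ACTIONS)
W_REWARD = rand_matrix(1, HIDDEN_DIM + NUM_ACTIONS)
W_POLICY = rand_matrix(NUM_ACTIONS, HIDDEN_DIM)
W_VALUE = rand_matrix(1, HIDDEN_DIM)


def representation(observation):
    """h: observation -> hidden state. The only place the raw frame is read."""
    return [math.tanh(x) for x in matvec(W_REPR, observation)]


def dynamics(hidden, action):
    """g: (hidden state, action) -> (reward, next hidden state)."""
    one_hot = [1.0 if i == action else 0.0 for i in range(NUM_ACTIONS)]
    x = hidden + one_hot
    reward = matvec(W_REWARD, x)[0]
    next_hidden = [math.tanh(v) for v in matvec(W_DYN, x)]
    return reward, next_hidden


def prediction(hidden):
    """f: hidden state -> (policy, value). There is no frame output."""
    logits = matvec(W_POLICY, hidden)
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    policy = [e / total for e in exps]
    value = matvec(W_VALUE, hidden)[0]
    return policy, value


def unroll(observation, actions):
    """Imagine a sequence of actions entirely in hidden-state space."""
    hidden = representation(observation)
    steps = []
    for action in actions:
        reward, hidden = dynamics(hidden, action)
        policy, value = prediction(hidden)
        steps.append((reward, policy, value, hidden))
    return steps


frame = [random.random() for _ in range(OBS_DIM)]  # stand-in for a real frame
for reward, policy, value, hidden in unroll(frame, [0, 2, 1]):
    print(f"reward={reward:+.3f} value={value:+.3f} "
          f"policy={[round(p, 3) for p in policy]}")
```

Note that the hidden state here has 4 numbers while the "frame" has 16, and there is no decoder going back the other way. In the real system the three functions are trained so that the predicted rewards, values, and policies match what was observed and searched, which is all the planner needs.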


It is the main idea of the algorithm, but as the paper itself says, there has been a trend towards this recently. I haven't read many of the papers they cite, but I do remember one of them ("Learning to predict without looking ahead: world models without forward prediction"). They had a neat site (learningtopredict.github.io). They write:

"We speculate that the complexity of world models could be greatly decreased if they could fully leverage this idea: that a complete model of the world is actually unnecessary for most tasks - that by identifying the important part of the world, policies could be trained significantly more quickly, or more sample efficiently".

That is just what this paper does, as I understand it, by bringing it together with the tree search from AlphaZero.



