Hacker News

You're mistaken: DeepMind uses a different agent for each game. What's shared across games is the architecture and learning method, but the model itself is trained separately for each game.
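To make the distinction concrete, here is a minimal sketch (toy tabular Q-learning, not DeepMind's actual DQN code; all names are hypothetical): one shared agent class encodes the learning method, but each game gets its own independently trained instance with its own weights.

```python
import random
from collections import defaultdict

class QLearningAgent:
    """One learning method shared across games; the learned model (here a
    Q-table) lives per-instance, so each game is trained from scratch."""
    def __init__(self, actions, lr=0.1, gamma=0.99, epsilon=0.1):
        self.q = defaultdict(float)   # separate model per agent instance
        self.actions = actions
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon

    def act(self, state):
        # epsilon-greedy action selection
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next):
        # standard one-step Q-learning update
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        target = r + self.gamma * best_next
        self.q[(s, a)] += self.lr * (target - self.q[(s, a)])

# Same class (same "learning method"), but one independently trained
# model per game -- nothing learned on one game transfers to another.
agents = {game: QLearningAgent(actions=[0, 1, 2])
          for game in ["breakout", "pong"]}
```

Updating `agents["breakout"]` leaves `agents["pong"]`'s Q-table untouched, which is the sense in which the model is per-game even though the method is general.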



I feel like I'm a bit confused about what constitutes a model. You basically start the RL process with the same 'agent' each time. You could just group the learned agents together and have the combined system automatically detect which game it has to play.

I guess what I wanted to say is that very little changes when the agent is trained on different games. Doesn't that make it a general game-playing agent?


If you group the agents together and call it a single "general" agent, the research community (and pretty much everyone else) will call you out on your BS. That's the difference.


Them 'calling me out on my BS' doesn't make it any less of a general agent if you train it on enough games. If you consider yourself a 'general agent' who can play games with reasonable scores, ask yourself this: given a new game you've never played, how would you score compared to a favorite game you played a lot in your childhood? With the favorite game, you basically remember the gameplay and reuse it when you play again. Is that me 'calling you out on your BS' about your ability to play games, just because you remembered your gameplay? I'm suggesting the same thing: a general agent remembers its gameplay and uses it, and that's what makes it a general game-playing agent.

I really doubt we will ever see an agent that becomes an expert at a game without doing some computation that falls into the grey area people consider game-specific rather than general gameplay. The general agent you're imagining, one that plays any game at the top level without any deliberation about its game-playing process, is a fantasy. It will definitely need a 'gameplan', which it can get by simulating outcomes without actually playing.
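The "simulating outcomes without actually playing" idea is essentially model-based lookahead. A minimal sketch (hypothetical names; `simulate` stands in for any learned or given model of the game, `value` for any state evaluator):

```python
def plan(state, actions, simulate, value, depth=2):
    """Pick the action whose simulated rollout scores best.

    simulate(state, action) -> (next_state, reward) is a model of the
    game, queried instead of the real environment; value(state) scores
    leaf states when the search depth runs out.
    """
    if depth == 0:
        return None, value(state)
    best_a, best_v = None, float("-inf")
    for a in actions:
        next_state, reward = simulate(state, a)     # imagined step, not real play
        _, future = plan(next_state, actions, simulate, value, depth - 1)
        v = reward + future
        if v > best_v:
            best_a, best_v = a, v
    return best_a, best_v
```

The point of the sketch: the 'gameplan' falls out of searching imagined futures, which is exactly the kind of computation that sits in the grey area between general machinery and game-specific thinking.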


Humans can learn how to figure out the score, can learn the state/action space itself, and can learn new games without external intervention. All of those things are hard-coded into a system where you just string together a bunch of agents trained separately on different games, and where another human has to add a new agent for each new game. Humans learn the gameplan and any game-specific features themselves; we don't have programmers plugging them into us for each new game.



