Playing FPS Video Games with Deep Reinforcement Learning (arxiv.org)
61 points by fitzwatermellow on Sept 21, 2016 | 19 comments

The paper links to some YouTube videos of the AI in action:


It appears to have the aim & shoot part down, but is less good at dodging (or even choosing a good weapon).

Dodging is not really an FPS skill, is it? Specifically in a close-quarters game like Doom with primarily non-projectile weapons. Having unpredictable or confusing movement patterns is a useful skill against humans at some levels of play, but in arena-style deathmatch FPS, usually you want to keep moving as it makes you a harder target and exposes you to more potential kills. So 'dodging' ends up being simply intermittent zigzags which may or may not marginally protect you from certain enemies, but definitely will not get you more kills. I don't think a camping bot would get nearly as much reward (kills) as this guy.

it definitely is, although it's pretty game dependent how it works, and rarely called simply dodging. i'll talk about counter strike because i know that game the best. sound and movement are huge components of the game. in the 1.6 days there was a trick called russian walking [1], where you bound the crouch command to mousewheel down. that allowed you to have the silent footsteps of a crouched player while moving at close to full speed, with your player model jumping around like crazy because of all the crouch commands. the classic bunnyhop [2] let you take advantage of quirks of the physics system in the game to move waaaaay faster than normal and beat people on the opposing team to contested areas of the map and be significantly harder to hit. and then the basic shoulder peek [3], which is ideally meant to bait out sniper fire so you can more safely push somewhere. then just normal little movement things; jumping out from contested corners to be harder to hit, crouching when you're spraying up close to be a smaller target, etc.

1. https://www.youtube.com/watch?v=fE-fhaIs_cQ
2. https://www.youtube.com/watch?v=M-3EMeU1DFg
3. https://www.youtube.com/watch?v=jbKnP7gmVqM

It is, actually. E.g. typically you can only shoot at one opponent at a time. Even in Doom, using cover effectively reduces the number of opponents who can shoot at you while you finish killing a single opponent.

In faster-paced shooters with more range of motion, moving randomly makes it harder to get hit.

Video of the network actually playing deathmatch: https://www.youtube.com/watch?v=oo0TraGu6QY&index=1&list=PLd...

The video really is amazing. We can see the reward system in play. The player runs through the map like a maniac (even running through groups of enemies) as a calculated attempt to score kills.

Even human players -- knowing full well what it takes to win -- still maintain some semblance of self-preservation. We'll use the map to our advantage and seek strategic high ground. The robot is simply a killing machine.

That only works if the people you run past, and forget about as soon as they're off your screen, don't shoot you. Good luck being a "killing machine" in that fashion against good players :P

But of course if you train it against that different kind of opponent, you would see very different behaviour.

Yeah, and I'd be super keen to see that :)

Not end-to-end learning. They've built a lot of a priori information into their models to significantly constrain the search space. Maybe I'm being a little too harsh in my opinions, but with this work coming out of CMU I was expecting a little more.

They trained it with some flags to say whether an enemy or item was onscreen, but it ran on pixel data only. I'm sure a bit more work could get it to train on pixels only as well, but my guess is it would take more time and tweaking than they were capable of applying.

However, at test time it runs on raw pixels only.
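
For anyone who wants the "train with flags, act on pixels" idea in concrete form, here is a minimal Python sketch. It assumes (this is my reading of the comments above, not something taken from the paper) that the flags only enter through an extra loss term added to a squared TD error, so choosing an action needs nothing but the Q-values. The names `training_loss`, `act`, and `aux_weight` are made up for illustration.

```python
import math

def binary_cross_entropy(prob, label):
    """Cross-entropy for one binary game-feature flag (e.g. 'enemy on screen')."""
    eps = 1e-7  # clamp to avoid log(0)
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def training_loss(q_pred, q_target, feature_probs, feature_labels, aux_weight=1.0):
    """Squared TD error plus an auxiliary loss on the flags (assumed form)."""
    td_loss = (q_target - q_pred) ** 2
    aux_loss = sum(
        binary_cross_entropy(p, y) for p, y in zip(feature_probs, feature_labels)
    )
    return td_loss + aux_weight * aux_loss

def act(q_values):
    """Test time: greedy action from Q-values alone; no flags are required."""
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

The point of the split is that the flag labels are only consumed by `training_loss`; `act` never sees them, which matches the "raw pixels only at test time" description.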

I'm anxious to see if the sorts of networks that learned how to play Atari games could be taught to play slither.io in the same way. I was interested in learning more about the sort of pathfinding challenges slither.io presents (safest path to large food reward with moving obstacles) and started to peel apart some JavaScript "bots" and realized they're doing some basic math that's not able to outperform a human. I wanted to make an iOS version of slither.io whose AI snakes for offline play did better than the current official app, but I don't know enough about the topic to dive into the deep end of it.

So if anyone has pointers on getting started with something like this for a game not already configured for the "gym" used by many current networks playing games, I'd love to hear more!
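
One possible starting point, sketched in plain Python: wrap the game in a small class that follows the classic `reset()` / `step(action)` convention, where `step` returns `(observation, reward, done, info)`. Everything below is a hypothetical toy (a 1-D track with a single food item standing in for the real game), and it does not import any RL library; `GridFoodEnv`, `run_episode`, and `greedy_policy` are illustrative names only.

```python
import random

class GridFoodEnv:
    """Toy stand-in for a game: a player on a 1-D track walks toward food."""

    ACTIONS = (-1, +1)  # action 0 = move left, action 1 = move right

    def __init__(self, size=10, max_steps=50, seed=None):
        self.size = size
        self.max_steps = max_steps
        self.rng = random.Random(seed)

    def reset(self):
        """Start a new episode and return the first observation."""
        self.player = 0
        self.food = self.rng.randrange(1, self.size)
        self.steps = 0
        return (self.player, self.food)

    def step(self, action):
        """Apply one action; return (observation, reward, done, info)."""
        move = self.ACTIONS[action]
        self.player = min(max(self.player + move, 0), self.size - 1)
        self.steps += 1
        reached = self.player == self.food
        reward = 1.0 if reached else -0.01  # small per-step cost
        done = reached or self.steps >= self.max_steps
        return (self.player, self.food), reward, done, {}

def run_episode(env, policy):
    """Roll out one episode and return the total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total

def greedy_policy(obs):
    """Hand-written baseline: always walk toward the food."""
    player, food = obs
    return 1 if food > player else 0
```

For a real game the observation would be a screen capture or extracted state rather than a tuple, but the loop in `run_episode` stays the same, and a learned policy can be swapped in for `greedy_policy`.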

So, can the AI bots play each other in Deathmatch? Could that then be used as additional training sets?

It's only a hop, skip, and a jump from this to creating real-life murder robots.

I quipped the other day after reading it, 'In retrospect, training the first RL agents on Doom may not have been the best idea.'

Is it just the camera angle, or is the AI crouching the whole time?

I wonder if you could make use of the timescale command that exists in certain game engines for faster training.
