
Playing FPS Video Games with Deep Reinforcement Learning - fitzwatermellow
https://arxiv.org/abs/1609.05521
======
kalid
The paper links to some YouTube videos of the AI in action:

[https://www.youtube.com/watch?v=oo0TraGu6QY&list=PLduGZax9wmiHg-XPFSgqGg8PEAV51q1FT](https://www.youtube.com/watch?v=oo0TraGu6QY&list=PLduGZax9wmiHg-XPFSgqGg8PEAV51q1FT)

~~~
loeg
It appears to have the aim & shoot part down, but is less good at dodging (or
even choosing a good weapon).

~~~
elif
dodging is not really an FPS skill, is it? Specifically in a close quarters
game like Doom with primarily non-projectile weapons. Having unpredictable or
confusing movement patterns is a useful skill against humans at some levels of
play, but in arena-style deathmatch FPS, usually you want to keep moving as it
makes you a harder target and exposes you to more potential kills. So
'dodging' ends up being simply intermittent zig zags which may or may not
marginally protect you from certain enemies, but definitely will not get you
more kills. I don't think a camping bot would get nearly as much reward
(kills) as this guy.

~~~
rfrank
it definitely is, although it's pretty game dependent how it works, and rarely
called simply dodging. i'll talk about counter strike because i know that game
the best. sound and movement are huge components of the game. in the 1.6 days
there was a trick called russian walking [1], where you bound the crouch
command to mousewheel down. that allowed you to have the silent footsteps of a
crouched player while moving at close to full speed, with your player model
jumping around like crazy because of all the crouch commands. the classic
bunnyhop [2] let you take advantage of quirks of the physics system in the
game to move waaaaay faster than normal and beat people on the opposing team
to contested areas of the map and be significantly harder to hit. and then the
basic shoulder peek [3], which is ideally meant to bait out sniper fire so you
can more safely push somewhere. then just normal little movement things;
jumping out from contested corners to be harder to hit, crouching when you're
spraying up close to be a smaller target, etc.

1\. [https://www.youtube.com/watch?v=fE-fhaIs_cQ](https://www.youtube.com/watch?v=fE-fhaIs_cQ)

2\. [https://www.youtube.com/watch?v=M-3EMeU1DFg](https://www.youtube.com/watch?v=M-3EMeU1DFg)

3\. [https://www.youtube.com/watch?v=jbKnP7gmVqM](https://www.youtube.com/watch?v=jbKnP7gmVqM)

------
taliesinb
Video of the network actually playing deathmatch:
[https://www.youtube.com/watch?v=oo0TraGu6QY&index=1&list=PLduGZax9wmiHg-XPFSgqGg8PEAV51q1FT](https://www.youtube.com/watch?v=oo0TraGu6QY&index=1&list=PLduGZax9wmiHg-XPFSgqGg8PEAV51q1FT)

~~~
matt_wulfeck
The video really is amazing. We can see the reward system in play. The player
runs through the map like a maniac (even running through groups of enemies) as
a calculated attempt to score kills.

Even human players -- knowing full well what it takes to win -- still maintain
some semblance of self-preservation. We'll use the map to our advantage and
seek strategic high ground. The robot is simply a killing machine.

~~~
PavlovsCat
That only works if the people you run past and forget about as soon as they're
off your screen kind of don't shoot you. Good luck being a "killing machine"
in that fashion against good players :P

~~~
FooHentai
But of course if you train it against that different kind of opponent, you
would see very different behaviour.

~~~
PavlovsCat
Yeah, and I'd be super keen to see that :)

------
rayuela
Not end-to-end learning. They've built a lot of a priori information into
their models to significantly constrain the search space. Maybe I'm being a
little too harsh in my opinions, but with this work coming out of CMU I was
expecting a little more.

~~~
devindotcom
They trained it with some flags to say whether an enemy or item was onscreen,
but it ran on pixel data only. I'm sure a bit more work could get it to train
on pixels only as well, but my guess is it would take more time and tweaking
than they were capable of applying.
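
The co-training trick described above amounts to a joint objective: the usual TD error on pixels, plus a cross-entropy term on binary flags the engine exposes during training only. A minimal numpy sketch (the function names and the `lam` weight are illustrative, not taken from the paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(p, y, eps=1e-7):
    # binary cross-entropy between predicted probabilities p and labels y
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

def joint_loss(q_pred, q_target, aux_logits, aux_labels, lam=0.5):
    """TD loss plus an auxiliary 'game feature' prediction loss.

    aux_labels are binary flags (enemy on screen? item on screen?) read
    from the engine during training only; at test time the agent acts
    from pixels alone. `lam` (hypothetical) weights the auxiliary term.
    """
    td = np.mean((q_pred - q_target) ** 2)       # standard TD error
    aux = bce(sigmoid(aux_logits), aux_labels)   # feature supervision
    return td + lam * aux
```

With `lam=0` this degenerates to a plain DQN-style loss; the auxiliary head only shapes the shared convolutional features, so it costs nothing at play time.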

------
spdustin
I'm anxious to see if the sorts of networks that learned how to play Atari
games could be taught to play slither.io in the same way. I was interested in
learning more about the sort of pathfinding challenges slither.io presents
(safest path to large food reward with moving obstacles) and started to peel
apart some JavaScript "bots" and realized they're doing some basic math that's
not able to outperform a human. I wanted to make an iOS version of slither.io
whose AI snakes for offline play did better than the current official app, but
I don't know enough about the topic to dive into the deep end of it.

So if anyone has pointers on getting started with something like this for a
game not already configured for the "gym" used by many current networks
playing games, I'd love to hear more!
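
For what it's worth, the "gym" interface is small enough to hand-roll: a class with `reset()` and `step(action)` returning observation, reward, done, and info. Here's a toy sketch with a stub game in place of slither.io (everything here is invented for illustration; a real wrapper would capture frames from the browser and inject inputs instead):

```python
import numpy as np

class SlitherLikeEnv:
    """Gym-style wrapper sketch for a game with no official env.

    The 'game' is a stub: a snake on a 2D plane chasing a food pellet.
    Swap the stub internals for real frame capture + input injection.
    """
    def __init__(self, size=10.0, seed=0):
        self.size = size
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.snake = self.rng.uniform(0, self.size, 2)
        self.food = self.rng.uniform(0, self.size, 2)
        return self._obs()

    def _obs(self):
        # observation: snake position + food position
        return np.concatenate([self.snake, self.food])

    def step(self, action):
        # action: 0..3 = move up/down/left/right one unit
        moves = np.array([[0, 1], [0, -1], [-1, 0], [1, 0]], float)
        self.snake = np.clip(self.snake + moves[action], 0, self.size)
        dist = np.linalg.norm(self.snake - self.food)
        done = bool(dist < 1.0)            # pellet eaten
        reward = 10.0 if done else -0.1    # shaped toward the food
        return self._obs(), reward, done, {}
```

Once a game fits this shape, any agent written against the standard interface can drive it, which is the whole point of the gym convention.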

------
mountaineer22
So, can the AI bots play each other in Deathmatch? Could that then be used as
additional training sets?
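
In principle yes: two agents can fill one replay buffer, each seeing the match from its own side. A stub of that loop (the match logic here is a zero-sum placeholder, not ViZDoom; a real setup would join two engine instances to one multiplayer game):

```python
import random
from collections import deque

def play_match(policy_a, policy_b, steps=50, seed=0):
    """Stub self-play match: both agents act on the same observation
    and both sides' transitions land in one shared replay buffer."""
    rng = random.Random(seed)
    buffer = deque(maxlen=10_000)
    for _ in range(steps):
        obs = rng.random()                  # stand-in for a frame
        a, b = policy_a(obs), policy_b(obs)
        # zero-sum stub reward: +1 to whichever "shot" first this tic
        r = 1.0 if a > b else -1.0 if b > a else 0.0
        buffer.append((obs, a, r))          # agent A's view
        buffer.append((obs, b, -r))         # agent B's view, sign flipped
    return buffer
```

Each match yields twice as many transitions as a solo run, and the opponent's skill scales with the learner's, which is the usual argument for self-play data.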

------
empath75
It's only a hop-skip-and-a-jump from this to creating real life murder robots.

~~~
aab0
I quipped the other day after reading it, 'In retrospect, training the first
RL agents on Doom may not have been the best idea.'

------
abrookewood
Is it just the camera angle, or is the AI crouching the whole time?

------
mitts
I wonder if you could make use of the timescale command that exists in certain
game engines for faster training.
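
ViZDoom effectively gives you this: its synchronous player mode only advances the engine when the agent submits an action, so the simulation runs as fast as the hardware allows instead of at real-time tic rate. The stub below illustrates the difference between an unthrottled and a real-time-throttled loop (`StubEngine` and `run` are invented for illustration; the 35 tics/second default matches Doom's engine):

```python
import time

class StubEngine:
    """Stand-in for a game engine tick; a real engine would
    simulate and render here."""
    def __init__(self):
        self.tics = 0

    def tick(self):
        self.tics += 1

def run(engine, n_tics, timescale=None, tic_rate=35.0):
    """Advance n_tics. timescale=None -> unthrottled (training mode);
    timescale=k -> sleep so the sim runs at k x real time."""
    for _ in range(n_tics):
        engine.tick()
        if timescale is not None:
            time.sleep(1.0 / (tic_rate * timescale))
```

Unthrottled, thousands of tics complete in the time a real-time loop spends on a handful, which is why sample-hungry RL training wants the synchronous path.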

