
Beating the World’s Best at Super Smash Bros. with Deep Reinforcement Learning - willwhitney
https://arxiv.org/abs/1702.06230
======
gwern
Note: it doesn't learn from pixels but from features read directly from RAM,
and it has superhuman reaction time, with performance degrading badly when
human-like delays are added.

Good discussions on Reddit:
[https://www.reddit.com/r/MachineLearning/comments/5vh4ae/r_a...](https://www.reddit.com/r/MachineLearning/comments/5vh4ae/r_a_new_foe_has_appeared_170206230_beating_the/)
[https://www.reddit.com/r/smashbros/comments/5vin8x/beating_t...](https://www.reddit.com/r/smashbros/comments/5vin8x/beating_the_worlds_best_at_super_smash_bros_melee/)
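
To make the "human-like delays" point concrete, here is a minimal sketch of one way to add a reaction delay to an agent: hold each chosen action in a queue for a fixed number of frames before it reaches the game. The class name `DelayedActions` and the `noop` placeholder are hypothetical, and this is not necessarily how the paper implemented its delay.

```python
from collections import deque


class DelayedActions:
    """Delays every action by a fixed number of frames (hypothetical helper)."""

    def __init__(self, delay_frames, noop=None):
        # Pre-fill with no-ops so the first `delay_frames` outputs do nothing.
        self.queue = deque([noop] * delay_frames)

    def step(self, action):
        # Enqueue the freshly chosen action and release the oldest one.
        self.queue.append(action)
        return self.queue.popleft()


delayed = DelayedActions(delay_frames=2)
outputs = [delayed.step(a) for a in ["jump", "shield", "grab"]]
```

With a two-frame delay, the action chosen on the first frame only reaches the game on the third frame.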

~~~
SerLava
This reminds me of Starcraft AI experiments. They can't actually make the
computer smart, so they just jam 2000 button presses per second down the tube,
giving every single unit its own simultaneous AI, and it out micromanages
anyone.

With Marines usually.

~~~
RoboTeddy
I heard that the DeepMind Starcraft project intends to limit their AI's APM
(actions per minute) down to something human-like.
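
A rough sketch of what an APM cap could look like, assuming a bot that chooses an action every frame: allow an action only if enough frames have passed since the last one. The `ApmLimiter` name and the 60 frames-per-second default are assumptions for illustration, not anything DeepMind has described.

```python
class ApmLimiter:
    """Allows at most `max_apm` actions per minute (hypothetical helper)."""

    def __init__(self, max_apm, fps=60):
        # Minimum number of frames that must separate two allowed actions.
        self.min_gap = fps * 60 / max_apm
        self.last_frame = None

    def allow(self, frame):
        # The first action is always allowed; later ones must respect the gap.
        if self.last_frame is None or frame - self.last_frame >= self.min_gap:
            self.last_frame = frame
            return True
        return False


limiter = ApmLimiter(max_apm=300)  # 300 APM at 60 fps: one action per 12 frames
decisions = [limiter.allow(f) for f in (0, 5, 12, 20, 24)]
```

Actions attempted too soon after the previous one are simply dropped here; a real system might queue them instead.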

~~~
wapz
I read that too, but I hope they know the difference between APM and EPM
(effective actions per minute). Pros spam a lot of APM that doesn't correspond
to real actions, so their EPM is considerably lower. If the bots are allowed
to act at pros' APM, they will have an insurmountable advantage.

~~~
placeybordeaux
Pros spam APM to keep warm, but during battles or production macros they will
frequently have a high EPM as well.

~~~
leereeves
There's also accuracy.

Unlike a human, a bot will always "click" exactly where it intends to.

------
brilee
Video of the AI here, playing as the black Captain Falcon:
[https://www.youtube.com/watch?v=dXJUlqBsZtE](https://www.youtube.com/watch?v=dXJUlqBsZtE)

------
swanson
We all know that Mew2King is the first reinforcement learning AI capable of
beating Super Smash Bros. pro players.

[https://www.youtube.com/watch?v=z-1YfhUFtbY&feature=youtu.be...](https://www.youtube.com/watch?v=z-1YfhUFtbY&feature=youtu.be&t=285)

~~~
forgotmysn
and he still can't beat Armada

~~~
Sniffnoy
I am possibly being the person here who accidentally takes the joke literally,
but Mew2King has in fact beaten Armada on three occasions: once at SKTAR 3,
once at Smash Summit 2, and most recently at UGC Smash Open.

~~~
forgotmysn
haha i know he has, but M2K isn't performing the way he used to and the record
is like 12-3 in favor of Armada, if I'm not mistaken

------
jwtadvice
While the AI might be cheating by taking salient features from RAM rather than
from pixel values, this is still an incredible feat. Just a few years ago we
did not have generic algorithms that could take even salient features and
learn policies on their own anywhere near this level this quickly.

~~~
willwhitney
Yup, it's definitely an advantage to get all the correct values from the game
state. But not as much as you might think; the vision portion of a DQN or
similar trains quite quickly.

Plus, our bot doesn't have any clue about projectiles. We don't know where
they live in memory, so the network doesn't get to know about them at all.

~~~
dyselon
Can I ask what the feature set looked like? I always kind of wanted to do this
with the Skullgirls AI, but never had the time while we were developing it. As
a developer, I obviously had full access to the game state, but I'm still not
really sure what the best way to represent that state to a neural network is.

~~~
vladfi1
It was just basic stuff like player positions, velocities, and animation
states.
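
For anyone wondering how features like these might be handed to a network, here is a minimal sketch, assuming positions and velocities are passed as raw floats and the animation state is one-hot encoded. The names `PlayerState`, `encode_player`, `encode_state`, and the value of `NUM_ANIMATIONS` are hypothetical and not taken from the paper's code.

```python
from dataclasses import dataclass

NUM_ANIMATIONS = 4  # hypothetical; the real game has many more animation states


@dataclass
class PlayerState:
    x: float
    y: float
    vx: float
    vy: float
    animation: int  # index in range(NUM_ANIMATIONS)


def encode_player(p):
    # Continuous values go in as-is; the discrete animation id is one-hot.
    one_hot = [0.0] * NUM_ANIMATIONS
    one_hot[p.animation] = 1.0
    return [p.x, p.y, p.vx, p.vy] + one_hot


def encode_state(players):
    # Concatenate each player's features into one flat input vector.
    features = []
    for p in players:
        features.extend(encode_player(p))
    return features


state = encode_state([
    PlayerState(1.0, 2.0, 0.5, -0.5, animation=2),
    PlayerState(0.0, 0.0, 0.0, 0.0, animation=0),
])
```

One-hot encoding avoids implying an ordering between animation ids; an embedding layer would be another reasonable choice for a large number of animations.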

------
smaili
As someone who's played for quite a while I can tell you SSBM is one of the
most complex games I've ever come across.

~~~
jensv
Why do you think the game is complex? Fairly simple game with low barrier to
entry which is great when you invite guests over for games. Super Simple
Button Mash!

~~~
chrisdbaldwin
Likely due to the advanced, non-intuitive mechanics that have been discovered
over the years. The entry barrier may be low, but the skill cap is high.

------
lanius
I'm impressed it beat the likes of S2J and Zhu. I wonder how it'd fare against
the Five Gods?

------
WhitneyLand
What's the key insight here compared to previous systems? As far as I can
tell, still no one can beat simple non-deterministic games that require some
planning.

My favorite example is Ms. Pac-Man because it seems so old and simplistic.
It's been tried by a dozen teams and no one can beat a decent human.

------
cerved
Civ AI has denounced this research

------
fiatjaf
I was expecting a video.

