
Deep Reinforcement Learning to Play StarCraft - wbthomason
https://arxiv.org/abs/1609.02993
======
wamatt
Generalized reasoning in strategy game AI is an especially difficult problem
for machine learning. IRL, top StarCraft players routinely model their
opponents' mental states and psychology to create an edge.

So perhaps it's worth pointing out that this paper specifically addresses a
sub-problem of StarCraft play: micromanagement ('micro'). [1]

The game engine runs at 24 frames per second. (As an aside, 'frames' in this
context likely does not map to physical FPS of the display).

 _> We ran all the following experiments with a skip_frames of 9 (meaning that
we take about 2.6 actions per unit per second)._

The research team found that attempting to move at a superhuman pace (e.g. one
action every frame) resulted in subpar performance, and hyperparameter tuning
indicated about 2.6 actions per second to be ideal.

In context, this translates to an APM of 156, or roughly half that of
professional Korean e-athletes. [2]
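
As a rough check of that arithmetic, here is a small sketch. It assumes the quoted figure of about 2.6 actions per unit per second and evenly spaced decisions; the function names are made up for illustration and are not from the paper.

```python
def actions_per_minute(actions_per_second: float) -> float:
    """Convert a per-second action rate into APM."""
    return actions_per_second * 60.0


def actions_per_second(engine_fps: float, frames_per_action: float) -> float:
    """Action rate if one action is issued every `frames_per_action` engine frames."""
    return engine_fps / frames_per_action


# Figure quoted from the paper: about 2.6 actions per unit per second.
quoted_apm = actions_per_minute(2.6)

# Assumption: acting exactly once every 9 frames at 24 frames per second.
naive_rate = actions_per_second(24, 9)
naive_apm = actions_per_minute(naive_rate)

print(f"quoted rate: {quoted_apm:.0f} APM per unit")
print(f"24 fps / 9 frames: {naive_rate:.2f} actions/s, {naive_apm:.0f} APM per unit")
```

Note that the quoted rate is per unit, so the total across a whole group of units would be higher than the per-unit number.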

[1]
[https://en.wikipedia.org/wiki/Micromanagement_(gameplay)](https://en.wikipedia.org/wiki/Micromanagement_\(gameplay\))

[2]
[https://en.wikipedia.org/wiki/Actions_per_minute](https://en.wikipedia.org/wiki/Actions_per_minute)

~~~
aab0
"The researchers found that attempting to move at a superhuman pace (eg one
action every frame), resulted in a subpar performance."

Moving at extremely fine-grained timesteps can make learning much more
difficult, because now a reward arrives millions of timesteps delayed rather
than hundreds or thousands. It's like trying to teach a NN to compose piano
music by starting down at the 1ms raw audio level. This is part of why audio
synthesis was so difficult up until recently with DeepMind's WaveNet. In
theory, being able to move every frame should enable extremely superhuman
performance, but in practice, you can't learn your way there. So often people
will chunk data to make it easier to learn the higher-level concepts: operate
on words, rather than characters, for example.
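
To make the chunking idea concrete, here is a toy sketch of frame skipping (repeating each chosen action for several frames). The environment, class names and episode length are all hypothetical, and this is not the paper's code; it only shows how the number of decisions between start and reward shrinks.

```python
class CountdownEnv:
    """Toy environment (hypothetical): an episode lasts `length` frames and
    pays a reward of 1.0 only on the final frame."""

    def __init__(self, length: int = 900):
        self.length = length
        self.frame = 0

    def reset(self) -> int:
        self.frame = 0
        return self.frame

    def step(self, action: int):
        self.frame += 1
        done = self.frame >= self.length
        reward = 1.0 if done else 0.0
        return self.frame, reward, done


class FrameSkip:
    """Repeat each chosen action for up to `skip` frames and sum the rewards."""

    def __init__(self, env, skip: int):
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action: int):
        total_reward = 0.0
        for _ in range(self.skip):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done


def decisions_until_reward(env) -> int:
    """Count how many decisions the agent makes before the episode ends."""
    env.reset()
    decisions, done = 0, False
    while not done:
        _, _, done = env.step(0)
        decisions += 1
    return decisions


every_frame = decisions_until_reward(CountdownEnv(900))
every_ninth = decisions_until_reward(FrameSkip(CountdownEnv(900), skip=9))
print(every_frame, every_ninth)
```

With the same underlying episode, the skipped version has to assign credit over far fewer decisions, which is the sense in which coarser timesteps are easier to learn from.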

~~~
raus22
Why not go the other way: decrease the actions per minute so you learn the
overall point of the game, and then increase the actions per minute with each
game?

~~~
comex
Or maybe extend the traditional categories of macro and micro with another
one, call it 'nano'... the micro agent indicates where each unit ought to be
in 9 frames, and the nano agent figures out how to take them there. Since the
timescale is so short, the agent could brute-force enumerate possible moves to
some extent and figure out which is optimal, like chess AI. Or use a separate
network.
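
A minimal sketch of what such a brute-force 'nano' step could look like, assuming a simple grid world, Manhattan distance, and a handful of discrete moves (all of these are illustrative assumptions, not anything from the paper or from BWAPI):

```python
from itertools import product

# Hypothetical move set for one unit on a grid: name -> (dx, dy).
MOVES = {
    "stay": (0, 0),
    "up": (0, -1),
    "down": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
}


def best_move_sequence(start, target, blocked, horizon):
    """Enumerate every move sequence of length `horizon` and return the one that
    ends closest (Manhattan distance) to `target` without entering a blocked cell."""
    best_seq, best_dist = None, None
    for seq in product(MOVES, repeat=horizon):
        x, y = start
        valid = True
        for name in seq:
            dx, dy = MOVES[name]
            x, y = x + dx, y + dy
            if (x, y) in blocked:
                valid = False
                break
        if not valid:
            continue
        dist = abs(x - target[0]) + abs(y - target[1])
        if best_dist is None or dist < best_dist:
            best_seq, best_dist = seq, dist
    return best_seq, best_dist


# The 'micro' level picks the target; the 'nano' level searches the short horizon.
seq, dist = best_move_sequence(start=(0, 0), target=(2, 1), blocked={(1, 0)}, horizon=3)
print(seq, dist)
```

The search space grows as 5^horizon per unit here, so this only stays cheap because the horizon is a few steps long.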

I guess that's inelegant when a deep network already has its own concept of
fine-grained versus coarse-grained layers, and should be able to do this on
its own with the right training method.

------
anirul
After losing on the Go territory, it seems like FB is trying to challenge
Alphabet on StarCraft. DeepMind already declared they will go for StarCraft
as their next challenge; does this mean that they accept the challenge? I'm
actually happy to see what the result could be! The only weak point for the
StarCraft community is that it would be on SC1 and not SC2.

~~~
whorleater
SC1 is not a weak point; it's generally considered to be better balanced, and
is the "gold standard" for RTSs. SC1 also has far more training data than
SC2.

~~~
ufo
In addition to the training data, there is also the BWAPI project, which lets
bots play the game against other bots or humans.

[http://bwapi.github.io/](http://bwapi.github.io/)

There isn't something similar available for SC2 due to a mix of technical and
nontechnical issues:

[https://github.com/bwapi/bwapi/wiki/FAQ#will-there-be-an-api...](https://github.com/bwapi/bwapi/wiki/FAQ#will-there-be-an-api-like-this-for-starcraft-ii-faq-sc2)

~~~
whorleater
Yep! But as someone who built something on top of the BW API, I'd wager
anyone going after SC1 AI will probably write their own thing. It's still an
amazing API for general heuristics and modeling, but there are a few issues
with it that stand in the way of making it a scalable foundation.

~~~
ktRolster
What is wrong with it?

------
h4nkoslo
It's a misconception that StarCraft is a strategy game. If you look at how
it's actually played by human pros, it looks closer to a fighting game; very
reflex-driven & heavy on micro-interactions. You would expect an un-gated AI
with effectively infinite actions per second to do very well.

~~~
PhearTheCeal
It seems like most people in this thread are claiming that high APM and the
ability to have perfect control over everything happening on the map will give
the AI an advantage that will force a win.

Here is a video of one of the best current StarCraft bots losing to a D-rank
(low skill) human player. The bot's APM is ~5500 while the human's is ~200.
[https://www.youtube.com/watch?v=ztNYOnx_YQo](https://www.youtube.com/watch?v=ztNYOnx_YQo)

The fact is, no AI has ever beaten even an amateur player in a tournament.
Even with great micro, if your play is too predictable then the human will
learn it and exploit it.

I, for one, am very excited to see the development of new StarCraft AIs, and
especially SC2 AIs, so that they can challenge the current world champions.

------
Analemma_
This is pretty cool, although I think MOBAs (Dota, LoL) would be an even
better test of AI skills than StarCraft. They also have imperfect information,
but place more importance on strategy and less on micro than StarCraft;
require some game theory and bluffing in the draft, and would need multiple
agents to cooperate (assuming you set it up so that you had 5 AIs play the
game, with well-defined communication channels, rather than one controlling
the five players, which I think is the right way to go).

Seems like there's more potential for useful AGI techniques in that direction.

------
loser777
Increased AI performance in RTS is always exciting, but part of me is
disappointed by the fact that the AI doesn't "see" or interact with the game
the way humans do. That is, humans don't play the game by querying the
state/status of each unit and then issuing commands via some API. It would be
fun (though it would complicate things significantly) to produce an AI that at least
has some notion of a mouse/keyboard so that you could see it in action from a
first person perspective.

------
NikolaeVarius
Screw Go. Beating the Terran Emperor in a game of Starcraft is when the
computers will finally take over the world.

It would be interesting though. How would a program that has PERFECT micro
fare against a professional StarCraft player? Would a program reliably figure
out how to kill 10 banelings with a few marines and a medivac by using the
fact you can micro them to be able to do it without taking losses? Even if it
could, would it know WHEN it's even worth it to do so?

~~~
danielvf
There are already incredibly good micro StarCraft AIs, able to operate at
thousands of APM - this isn't the limiting aspect of current StarCraft AI.

The "Hard" part of StarCraft is that it is a huge Rock Paper Scissors game
with only the information you fight for. You have to be able to piece together
a picture of your opponent's actions and forces from small cues.

~~~
pkfrank
Wow, I hadn't seen this before.

Here is "Automaton 2000" controlling 20 marines vs 40 banelings, without
losing a single unit.

[https://youtu.be/DXUOWXidcY0?t=52](https://youtu.be/DXUOWXidcY0?t=52)

Pretty cool.

~~~
danielvf
And as cool as that is, this is even more terrifying, as a hundred zerglings
dodge siege tank cannons and destroy them.

[https://youtu.be/IKVFZ28ybQs](https://youtu.be/IKVFZ28ybQs)

It's enough to make you scared for the future of humanity.

~~~
nickpsecurity
What... the... hell..!? That's basically what they show in movies where the
action stars have superhuman movement. Except, it's zerglings perfectly
coordinating the demolition of siege tanks. Awesome demo of AI micro.

------
pinouchon
I think Starcraft is a very interesting challenge for AI because it involves
planning in an environment that is only partially observable: you must scout
in order to see what your opponent is up to, and even then, you don't see
everything. If DeepMind works on this, I really hope that they constrain the
AI (APM-wise) so that its only chance of winning is by good planning and
strategy, not super-fast micro.

~~~
danielvf
According to this paper Facebook is only working on micro - attempting to win
a few simple (one to two unit types) battles that humans can win 100% of the
time against AI.

As you said, the glory of StarCraft is its strategic-level information game.
It will be interesting to see what comes out of attempting to learn that.

------
nickpsecurity
Prior work and why I love StarCraft as a testbed for AI described here:

[http://webdocs.cs.ualberta.ca/~cdavid/starcraftaicomp/report...](http://webdocs.cs.ualberta.ca/~cdavid/starcraftaicomp/report2015.shtml)

The two papers in the RTS techniques section are a must-read for an idea of what
problems it poses along with results of prior attempts. The ability of human
pros to detect AI patterns and defeat them with bluffs is pretty consistent.
StarCraft, like Poker, involves lots of psychological analyses and ploys.

Even if Google or Facebook make one, I still think of humans as superior until
it can learn how to beat them with mere dozens to hundreds of games rather
than what was fed into AlphaGo. That wasn't human equivalent or superior so
much as approximating the results of nearly all human activity in the space
then focusing it against one human. You could call it superhuman but it
required tons of activity by brilliant humans. Brilliant humans require little
data, with the champions needing a lot less than the automated techniques. Lots
of self-discovery with limited data. I want to see the AIs pull that off plus keep it
going when encountering humans with innovative, never-before-seen strategies.
That's when I'll give them credit as useful on barely-defined problems with
curveballs like humans.

------
scrollaway
If anyone is interested in deep learning around Blizzard games, there is an
active AI community around Hearthstone in the `#hearthsim` and `#hearthsim-ai`
channels on Freenode. cf [https://hearthsim.info](https://hearthsim.info).
Starcraft AI discussions welcome!

We're also discussing support for such projects using game replays from
HSReplay.net :)

