DeepMind and Blizzard Open StarCraft II as an AI Research Environment (deepmind.com)
603 points by nijynot on Aug 9, 2017 | 270 comments

A lot of people here seem to be underestimating the difficulty of this problem. There are several incorrect comments saying that in SC1 AIs have already been able to beat professionals - right now they are nowhere near that level.

Go is a discrete game where the game state is 100% known at all times. Starcraft is a continuous game and the game state is not 100% known at any given time.

This alone makes it a much harder problem than go. Not to mention that the game itself is more complex, in the sense that go, despite being a very hard game for humans to master, is composed of a few very simple and well defined rules. Starcraft is much more open-ended, has many more rules, and as a result it's much harder to build a representation of game state that is conducive to effective deep learning.

I do think that eventually we will get an AI that can beat humans, but it will be a non-trivial problem to solve, and it may take some time to get there. I think a big component is not really machine learning but more related to how to represent state at any given time, which will necessarily involve a lot of human-tweaking of distilling down what really are the important things that influence winning.

> I think a big component is not really machine learning but more related to how to represent state at any given time, which will necessarily involve a lot of human-tweaking of distilling down what really are the important things that influence winning.

I agreed with everything you said until here. Developing good representations of state is precisely what today's machine learning is so good at. This is the key contribution of deep learning.

You seem to be supposing that a human expert is going to be carefully designing a set of variables to track, and in doing so conveying what features of the input to pay attention to and what can be ignored. Presumably the ML can then handle figuring out the optimal action to take in response to those variables.

I think it's much more likely to be the other way around. ML is really good at taking high dimensional input with lots of noise and figuring out how to map that to meaningful (to it, if not to us) high-level variables. In other words, modern AI is good at perception.

What it's significantly less good at compared to humans is what might formally be called the policy problem. Given high level variables that describe the situation, what's the best course of action? This involves planning. We think of it in terms of breaking the problem into sub-objectives, considering possible courses of action, decomposing a high level plan into a sequence of directly executable actions, etc. AIs might "think" of this problem in different terms than these, but it seems like they still have to do this kind of work if they are going to have a chance to succeed.

We don't have obvious ways to model this part of the problem. For the perception/representation building problem, I can almost guarantee the solution is going to be a ConvNet to process individual frames combined with a recurrent layer to track state over time. On the other hand, I'm seeing some plausible solutions to the policy problem emerging in the literature, but it's still very much an open question what will emerge as the go-to. In AlphaGo, this part of the problem is where they brought in non-ML algorithmic solutions like Monte Carlo tree search, and one of the reasons StarCraft is interesting compared to Go is that those algorithmic solutions are harder to apply.

i feel like you misunderstood that part of the argument.

he is saying representing the state is very hard, and you are saying: given a well represented state, ML is very good at finding the important features, reducing the dimensionality, and finding mathematical transformations, etc.

deep learning has been so successful with images because representing them is trivial - flattened pixel vector.

the issue with your last paragraph is that, in starcraft, it raises some questions about what rules the AI is going to adhere to.

in SC, you don't view the entire board. you view the minimap / hear noises and alerts and decide where to focus your attention on the map. in battle, being able to click and accurately place attacks quickly is important.

Do you give the computer full view of what they would be able to see? does the computer have 10 million clicks per second abilities, essentially every action is like hitting pause and then making the next action?

I was actually assuming the input representation would just be a video stream, which (combined with audio) is enough for human players, but looking more into it, it's a lot more than a video feed[1].

It feels a little like cheating, but I guess processing the game UI video feed isn't the interesting part of the problem. Plus, it makes the problem much more accessible to hobbyists who can't afford the GPU cluster required to productively experiment on models that process streams of 1080p video.

Still, in principle, I think modern ML modeling approaches could handle the problem of transforming the video feed into a useful high level state representation. I don't think I misunderstood the OP in that regard at least.

[1] - https://github.com/deepmind/pysc2/blob/master/docs/environme...

Using just the video feed, the AI would be required to reconstruct an overview of the strategic situation, and then develop a forward strategy on top of that involving individual units. Even for a much simpler game like doom, video-only input is enough for strategies like "see an enemy, target and shoot it as fast as possible".

For an AI to be able to effectively compete in a complex game like SC2, preparing high-level inputs is important. Look at these as shortcuts: heuristic approximations of tasks that would be hard to represent and train with deep learning. I would guess an implementation would need multiple independent nets for various tasks, combined with heuristics. Then each could be separately trained to do the given task.

People should just read the article, I think. It answers all the things you are debating (limit on APM, what features are used, what models they already tried and how well they perform).

>a ConvNet to process individual frames combined with a recurrent layer to track state over time.

>are harder to apply

That's an understatement: Starcraft is immune to a Monte-Carlo approach or anything based on analyzing pixel data. The tree state of an actual battle has thousands of choices per unit per second with minor variations in location, and there is no discrete state like a chessboard (at best, millions of cells). Viewing the game at a low level (pixels) creates a gigantic amount of data: units constantly move/attack/die and get blocked by other units/terrain.

Predicting an enemy move (MC simulation) will be impossible, and you can make several moves per second (even at 120-140 APM) easily. That means: 1. you need real-time response, since unlike Go there isn't a time buffer to decide; 2. you always need to react at the current time (or allow enemy advances); 3. there are very few "good moves" in starcraft (moving randomly on the "board" will just waste time), so MC simulation will miss them more than 99% of the time due to randomness.

MC approach is vastly inferior in this case, i think they'll be forced to operate on higher level strategy rather than just microing every unit optimally(i.e. treating it like chess in real-time). Brute-forcing billions of potential moves simply won't work.
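
To put rough numbers on the scale argument above, here is a back-of-the-envelope sketch. Every figure in it is an illustrative assumption rather than a measurement: about 250 legal moves over about 150 turns for Go, and a deliberately modest 1,000 choices per decision at 5 decisions per second over a 10-minute StarCraft game.

```python
import math

def log10_sequences(branching: int, depth: int) -> float:
    """Return log10(branching ** depth), i.e. the order of magnitude
    of the number of distinct action sequences."""
    return depth * math.log10(branching)

# Illustrative assumptions only, not measured values.
go_digits = log10_sequences(branching=250, depth=150)

sc_decisions = 5 * 60 * 10  # 5 decisions per second for 10 minutes
sc_digits = log10_sequences(branching=1000, depth=sc_decisions)

print(f"Go:        about 10^{go_digits:.0f} sequences")
print(f"StarCraft: about 10^{sc_digits:.0f} sequences")
```

Even with these conservative guesses, the exponent for the real-time game is more than an order of magnitude larger than the one for Go, which is why uniform random rollouts are unlikely to stumble on useful lines of play.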

>Brute-forcing billions of potential moves simply won't work.

The problem is all AI/ML is essentially recorded, recursive, constrained brute forcing.

You can apply it at a higher level, like that guy who brute-forced the 7-roach rush for Zerg in SC2: http://lbrandy.com/blog/2010/11/using-genetic-algorithms-to-... The problem is that build orders just optimize the opening economy, and these unbalanced openings will just be patched out in the future.

I'd argue modern AI is sort of terrible at taking high dimensional data and finding an effective representation of it. It works better than a lot of other methods in ML, but as far as I know pure reinforcement learning applications are sort of lackluster, and even dimensionality reduction success stories tend to rely on scrubbed, careful data treatment by people.

I wonder if we will see any advanced cheese strats come out of this. I'm assuming some implementations will eventually develop micro control that is far beyond any human player's capabilities, which would make things like all-in probe rushing much more viable. Instead of playing the normal meta in a computer-vs-human, I imagine an advanced AI would simply send all of its workers off the mineral line as soon as the game starts, and attempt to out micro the human opponent before they can build an army-producing building.

I know this isn't the exact same as the article, but when genetic algorithms were introduced to solve for build orders, the "seven roach rush" was in vogue, something that was unexpected at the time and "discovered" using GA.
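
For readers who haven't seen a genetic algorithm applied to build orders, here is a minimal toy sketch. It is not the program from the linked article: the two actions, their costs, the mining rate, and the fitness function (roaches produced in a fixed number of steps) are all made-up assumptions chosen to keep the example small.

```python
import random

ACTIONS = ("worker", "roach")        # hypothetical two-action game
COST = {"worker": 50, "roach": 75}   # made-up mineral costs
STEPS, LENGTH = 60, 12               # simulated steps, build-order length

def fitness(build):
    """Simulate a toy economy and return how many roaches were produced."""
    minerals, workers, roaches, nxt = 50, 4, 0, 0
    for _ in range(STEPS):
        minerals += workers          # each worker mines 1 mineral per step
        if nxt < len(build) and minerals >= COST[build[nxt]]:
            minerals -= COST[build[nxt]]
            if build[nxt] == "worker":
                workers += 1
            else:
                roaches += 1
            nxt += 1
    return roaches

def evolve(generations=40, pop_size=30, seed=0):
    """Selection, one-point crossover, and single-gene mutation."""
    rng = random.Random(seed)
    pop = [[rng.choice(ACTIONS) for _ in range(LENGTH)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # keep the better half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, LENGTH)       # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(LENGTH)] = rng.choice(ACTIONS)  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print(best, fitness(best))
```

The interesting part in a real setting is the simulator inside `fitness`: the search can only "discover" a counter-intuitive build if the simulated economy is faithful enough for that build to score well.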

I think there is a space for finding strategies that have more leeway in execution and thus are more suitable for humans to pilot rather than have machine level micro.

I love the story of the Seven Roach Rush. To quote the linked article, "The most interesting part of this build, however, is how counter-intuitive it is. It violates several well-known (and well-adhered-to) heuristics used by Starcraft players when creating builds."

I'm fairly certain that this application of machine learning will present some surprising strategies.


Part of the problem is that SC2 units are not very balanced and each patch tries to make them "more balanced". SC2's design settled on making unique units at the cost of balance.

Roaches in fact are quite overpowered, with very fast regen and a quick ranged attack (they move as fast as hydralisks). A versatile and low-cost unit (cheaper than a Hydralisk). And the reason they're so powerful: SCII units of other races in general are more powerful than broodwar units and have fewer weaknesses. Broodwar instead has weak, easily dying units that force micro to extend their lifespan. SC2 units always have easy regen/heal/repair and the player just masses them in huge attack groups with minimal micro (their blocking boxes are tiny and pathing is good enough). The rock-paper-scissors from broodwar (which emphasized soft counters) morphed into hard counters to everything, which lowered the strategic depth to "make whatever kills the majority of the enemy unit type" (since it's the most cost-effective decision at any point). SC2 "pro matches" are never decided in micro battles; they're mostly a competition over who can spend resources more effectively. SC2 micro is laughably unoriginal and tactically irrelevant (resource competition is far more important). And the reason SC2 can't have good micro in principle is not the 3D engine overhead; it's server latency and lag. Perfect LAN games in broodwar with sub-10ms latency and millisecond reflexes can't exist within central servers hosting hundreds of players.

SC2 is very balanced when you look at the total game across a range of skill levels. Each unit is balanced around costs, attention, utility, requirements, and other units. Ex queens are awesome early game as they hit air, don't take gas or larva, and have high HP, but late game it's all about healing ultralisks plus anti air.

Roaches take a lot of supply, are armored so take more damage from tanks, and can't hit air, which is their counter. They are also ranged, so they don't share damage upgrades with lings or ultralisks. Late game, players will sacrifice them for more useful units, and they need a larger investment in overlords.

PS: If you watch some high level games units generally have their time to shine as part of individual games progression.

Currently bot micro can perfectly time hit and run "dance" maneuvers on all their attacking units independently (several top bots are Terran and do this with Vultures). But solving this in a way that takes terrain into account is much, much harder, and a skilled human could chase the whole army into a wall and kill it.

Similarly there are worker rush bots that do some impressive things against other bots, but positioning is hard and a skilled human can beat the bot by clumping its workers up in the right shape.
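
The hit-and-run "dance" mentioned above boils down to a very small per-unit decision rule, sketched here under simplifying assumptions. The function name, the three order strings, and the inputs are hypothetical; no real bot API is being quoted.

```python
def kite_action(cooldown_left: int, distance: float, weapon_range: float) -> str:
    """Pick one order for a ranged unit facing a slower, shorter-ranged enemy."""
    if cooldown_left > 0:
        return "retreat"   # weapon reloading: use the time to open the gap
    if distance <= weapon_range:
        return "attack"    # weapon ready and target in range: fire
    return "advance"       # weapon ready but target too far: close in

# Each unit evaluates the rule independently every frame.
for cooldown, dist in [(0, 4.0), (12, 4.0), (0, 7.5)]:
    print(cooldown, dist, kite_action(cooldown, dist, weapon_range=5.0))
```

Note what the rule leaves out: "retreat" has no notion of where it is safe to retreat to. That missing terrain awareness is exactly the weakness described above, where a human herds the kiting army into a wall.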

Previous AI StarCraft tournaments have put an actions-per-minute (APM) cap on the AIs, which prevents them from micro-managing individual units.

I think this definitely needs to be the case if this is going to be an interesting research project at all.

Edit: Never mind, my intuition on this seems to be wrong - someone more knowledgeable about this claims below that computer mechanics are still not as good as a good human's in the SC1 version of this.

In SC1 it's trivially easy to make a bot that spams so many actions it prevents your units from functioning properly. In fact it's easy to do accidentally. A lot of my early efforts into making a bot have been trying to find ways to reduce its APM without making it harder to code.

"More actions = better" makes sense because we're used to human players who are using all their actions for something relatively effective and because (I'll assert) they're well below the optimal APM. But the optimal APM is probably something like 1000, not the 10k a bot can easily reach.

So the challenge for the AI would be to figure out which action to do with the limited supply of actions and time.

I'd say we're certainly going to see crazy advanced cheese strats - ones that humans wouldn't be able to hope to pull off. This could definitely be done with computer micro and wouldn't be defendable with human micro.

An example would be moving probes around in such a way to maximize their shield regen - or switching the top clickable unit while stacked - who knows...

But if that's all the AI can do, and people know it, then defending should be pretty easy. Any worker or land-based rush can be fairly trivially defended by walling off.

A sensible thing for human-AI matches is to enforce a maximum number of actions per second and/or actions over period of time, which would be in line with a standard human player.

Actions per minute (APM) could be limited to a max of 100.
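
A cap like this is straightforward to enforce with a sliding window. The sketch below is one possible way to do it, not how any tournament or the DeepMind environment actually implements its limit; the class and method names are made up for illustration, and the 100 APM default is taken from the suggestion above.

```python
from collections import deque

class ApmLimiter:
    """Allow at most `max_apm` actions in any sliding window of `window_s` seconds."""

    def __init__(self, max_apm: int = 100, window_s: float = 60.0):
        self.max_apm = max_apm
        self.window_s = window_s
        self._stamps = deque()  # timestamps of recently allowed actions

    def try_act(self, now_s: float) -> bool:
        """Return True and record the action if the budget allows it."""
        # Forget actions that have fallen out of the window.
        while self._stamps and now_s - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) < self.max_apm:
            self._stamps.append(now_s)
            return True
        return False

# A bot that tries to act 10 times per second for one minute.
limiter = ApmLimiter(max_apm=100)
allowed = sum(limiter.try_act(t / 10) for t in range(600))
print(allowed)
```

With a plain window like this, a greedy bot burns its whole budget in the first ten seconds and then sits idle, so a fair cap probably also needs a per-second limit, as suggested above.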

As a long time StarCraft fan I don't share your point of view:

People usually refer to StarCraft as a strategy game but there's actually very little strategy involved: during the first weeks after a new map pool is released, the pro players explore different build orders that are strong on it. And after this period, when the meta-game has settled, the winner of a match (best of 3 or 5) is almost always the one who has the best mechanics (including scouting, unit micro-management and multi-tasking), and SC1 AIs are already way better than humans in that field.

Unless you add some artificial limitation to the AI (for instance, a hard limit on APM[1], at an arbitrary level) I don't really think the challenge will be exciting. Imho it will look like a race between a cyclist and a motorcycle: from the mechanics point of view, the machine wins easily without any need for intelligence.

[1] action per minute

As someone who probably has played at a higher level than you I disagree.

Yes, it's true that mechanics are a large determinant in who wins between people. But pro humans are not easily thrown off by odd or novel strategies or tactics. They can react to things that introduce small wrenches into their build order without serious issues. Players can even adapt to things they've never seen before. The issue with Starcraft is that the state space is so large that it will likely be hard to get an AI that can flexibly and intelligently react to unusual or bizarre things that mess with their build order, because the neural nets will have nothing to account for, say, a mid-game cannon rush, or whatever.

If it was simply a matter of computers taking the things humans do and executing them better, computers would already be better than humans at Starcraft (there have been plenty of AI competitions using Brood War), but they're not. Not even close.

> But pro humans are not easily thrown off by odd or novel strategies or tactics.

In tournaments, with all the induced stress, they definitely are: see Lilbow vs Life at BlizzCon 2015, or the whole run of Symbol in Iron Squid one. For BW, see Flash vs Stork in whatever MSL or OSL finals (in 2009 or 2010 I think).

> the neural nets will have nothing to account for, say, a mid-game cannon rush, or whatever.

The AI just needs to know how many 2-2 zerglings you need to destroy a cannon in that position (or hydras, or whatever unit it has available around) and pick the most cost-effective way to deal with the cannon. The thing is that the AI can deal with this in the most efficient way while perfectly microing two groups of mutalisks and defending against a reaver drop[1].

In fact you don't even need deep learning for that since there's a finite number of encounters like this (cannon vs any unit) and I'm pretty sure some guy on TeamLiquid already covered it in depth :p.

> there have been plenty of AI competitions using Brood War

AI competitions featuring matches of AI vs AI are interesting; my point is that AI vs man probably won't be.

[1] OK, I'm mixing events really unlikely to occur at the same time but you get the idea.

He said pros are “not easily” thrown off by novel strategies, not that they are “never” thrown off by such, so your examples are the exceptions that prove the rule.

FWIW I think you also drastically underestimate how many things the AI has to take into account. It isn’t just how many 2-2 zerglings you need to destroy a cannon, it’s making educated guesses of what your opponent is doing while you’re attacking the cannons, or how the terrain affects how you can attack, or what units the opponent may have in the fog of war ready to ambush. Representing all these factors, let alone calculating them, is no trivial task.

I'm pretty sure you're aware that Life has been banned from professional SCII for match fixing, but if not, I just want to throw that in there.

I used to be pretty active on iccup, and I was a masters-level sc2 player for a while during the beta and when it was first released. So I'm definitely familiar with starcraft and what it takes to become a good player.

I think you're misunderstanding a big part of what is "easy" and "difficult" for humans vs ai. Yes, go is absolutely a more challenging game for humans than starcraft (I also play go, although not very well - currently around ~7k on igs). Starcraft is strategically a much simpler game than go. You are correct in stating that mechanics is what makes starcraft hard for most people, and yes if the computer knew exactly what to do, it would be able to execute it faster and without making any multi-tasking mistakes. But strategy is not what makes starcraft a challenge for ai. Tasks that are trivial for humans can be extremely difficult for ai.

Computers are way better at tree searching than humans, for the obvious reason that they run much faster than brains. So games with relatively small state-spaces, like checkers, are solved quickly. But as you increase the state space, it becomes impossible to search all possible future moves, and this was why go was intractable for such a long time.
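
As a tiny illustration of why small state spaces fall to exhaustive search, here is a complete solver for a toy subtraction game (take 1 to 3 stones, whoever takes the last stone wins). The game is a stand-in chosen for brevity; it is not checkers, and nothing here is meant to describe how any real checkers or Go program works.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def current_player_wins(stones: int) -> bool:
    """Exhaustively search the game tree: the player to move wins if some
    legal move leaves the opponent in a losing position."""
    return any(
        not current_player_wins(stones - take)
        for take in (1, 2, 3)
        if take <= stones
    )

# Positions where the player to move loses against perfect play.
losing = [n for n in range(13) if not current_player_wins(n)]
print(losing)
```

The whole game is solved by visiting each pile size once. The same recursion is hopeless once the number of reachable positions explodes, which is the point being made about go.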

The big advancement in alphago is that by using deep learning it is able to evaluate different board-states without doing any search, using a neural net. This allows it to massively prune the search space. Humans are able to do this through "intuition" gained through experience - talk to any advanced go player and ask them about specific moves and they will tell you things like "this shape is bad" or "it felt like this was a point of thinness". AlphaGo was able to gain this "intuition" by training on a massive dataset of go board positions.

In go, the rules are very simple - 19x19 board, each turn you can put a stone in any not-surrounded open space. It's also a turn based game. The state at any given time is fully known. Starcraft is real-time, there are tons of different actions you can take, the actions are not independent (pressing attack does something different if you have a unit selected or not), the game state is not fully known and a given state can mean different things depending on what preceded it. Not to mention that the search space is massively massively larger. To create a representation of this that can be fed into a neural net and give meaningful results (something like at a given tick, score all possible actions and find the best one) is going to be incredibly difficult. An order of magnitude more difficult than go, imo.

>The big advancement in alphago is that by using deep learning it is able to evaluate different board-states without doing any search, using a neural net.

It still uses a Monte-Carlo Tree Search to get to the level where it can beat human pro players.

>Starcraft is real-time, there are tons of different actions you can take, the actions are not independent (pressing attack does something different if you have a unit selected or not), the game state is not fully known and a given state can mean different things depending on what preceded it.

And yet StarCraft is extremely primitive as far as strategy games go. Most of the stuff you can do in the game simply doesn't matter, and the stuff that matters could be modeled at a much coarser level than what people see on the screen. Knowing how this stuff works, I'm willing to bet this is exactly how DeepMind will approach the problem. They will try many different sets of hand-engineered features and game representations, then not mention any of the failed efforts in their press releases and research papers.

The choice of StarCraft as their next target reeks of a PR stunt. Sure, there might be no AIs that play at pro level now, but there wasn't any serious effort or incentive to build one either, and now Google will throw millions of dollars and a data-center worth of hardware at this problem.

As far as I'm concerned, real AI research right now isn't about surpassing human performance at tasks where computers are already doing okay. It's about achieving reasonable level of performance in domains where computers are doing extremely badly. But that won't get you a lot of coverage from the clueless tech press, I guess.

What are their other options besides Starcraft2? This doesn’t seem like a PR stunt (not that the PR isn’t a bonus), but there’s already a history of AI competitions for Brood War, the game is more balanced than arguably any other RTS, and even though it is “primitive” as a strategy game in your estimation, AI isn’t ready to tackle a more advanced strategy game.

>What are their other options besides Starcraft2?

Uh, real strategy games? Something like this:


Very simple ruleset, huge strategic depth.

Why do you not think StarCraft is a real strategy game?

> It still uses a Monte-Carlo Tree Search to get to the level where it can beat human pro players.

I'm not sure this was true of the first AlphaGo, probably wasn't true of the Sedol AlphaGo, and definitely isn't true of Master.

Do you have an article about this? I never heard that AlphaGo moved away from MCTS.

It uses MCTS, but that's not the same thing as the claim, now is it? If you look at the win rates in the AG paper for the NN vs MCTS+NN and then consider the performance curve, use of a single TPU, crushing superiority of Master's flawless 60 blitz matches and Ke Jie matches despite very fast moves, the released self-play matches, and comparing with FB's Dark Forest, it's clear that the AG NN all on its own, without any MCTS, is a truly formidable player that would likely crush many pros, although I don't know if it would reach Sedol or Ke Jie levels of play.

This comment is insightful, thanks!

> Not to mention that the search space is massively massively larger

That's what I'm not really convinced about. The build-order space is not that big (compared to Go's positions) and once you've got a good micro-management engine I'm afraid this will lead to something like: if protoss or zerg, pick protoss, then 8 gate -> 9 pylon -> scout; if no counter to 4-gates, then 4-gates and win from out-microing.

The preferred opening for Protoss in PvZ (on most maps) is the forge fast-expand. If the Zerg player doesn't want to play an economic game in response, they have a variety of all-in strategies available. There is a lengthy article on Team Liquid about how Protoss should respond to these.


What I'd note here is that:

1. It's a rather long list.

2. Good scouting is required for most of these situations.

3. There are terrain-based considerations all over the place.

4. There are considerations based on how many units were lost in earlier engagements all over the place.

Enumerating all the build orders (#1) is pretty easy (as you said, build order space isn't that big), but the interaction between terrain and building placement (#3) is a lot more complex and starts to interact with the full game's massive search space more, and the followups are dynamic (#2, #4) so I don't think the game will degenerate into a solved solution as long as it looks anything like regular play.

It's possible that there's some degenerate micro-based solution that turns everything on its head, of course. Bot-based vulture micro might rewrite part of the Terran matchups, but it doesn't seem insurmountable yet. My own bot gets units across the map 5% to 10% faster than normal, but that doesn't look like enough to break the game even with a 4pool.

Protoss hasn't gone FFE in PvZ in a long time. It used to be good, though.

Well, egg on my face then. What's the current choice?

Adept openings are more popular nowadays

Actually I realize that my cyclist vs motorcycle analogy is better than I first thought:

You might think that a race between the winner of the last Tour de France and an automated motorcycle is a good challenge because automated driving is hard (especially with crowds running around on the road and temporary road signs for the race circuit).

But in fact it wouldn't be fun, because all the motorcycle has to do is follow the cyclist during the whole race (which is not really challenging in terms of self-driving AI) and just sprint during the last couple dozen meters, with no hope for the cyclist to win.

It might only become interesting AI-wise if you add arbitrary rules like «limit the power of the motorcycle» and/or «limit the amount of fuel to limit the number of accelerations». But you're not really doing a Man vs Machine challenge.

That's only assuming they are competing at the same time. An average of timed trials across a pre-defined route (like TDF) would allow you to present a similar challenge to both the human & the automated motor cycle.

Couldn't an analogous structure be used to assess the AI for the SC II as well?

Disclaimer: I only know about SC. Not really a player.

What about AIs fighting each other in Starcraft 2? Will that be interesting?

Interesting. So would you say that there are two parts here, figuring out a general strategy for a new map and then maximizing execution?

Skill is often divided into 3 components: macro-management; micro-management; and mechanics.

Macro refers to decisions regarding economy. It includes finances, build order, counters, etc. Macro is mostly strategic.

Micro refers to decisions regarding battle. It includes troop positioning, focus fire, kiting, etc. Micro is mostly tactical.

Mechanics refers to execution. I.e. do your fingers have the dexterity and APM to accomplish your goals effectively? If not, practice makes perfect.

Except you can only execute your plan until you have contact with your opponent, and then it's improvisation. The rest of the game is only 'easy' if you've executed your opening far better than your opponent, or you have a rock-paper-scissors situation where your build dominates theirs. Both situations are not very common at the pro level.

More often than not, yes, even though there are some counter-examples, with some players playing against the meta with great success.

I wonder if AI will be able to bring that to another level. Recognize the counter and adapt.

Interesting to see unfold.

It's possible you're referring to the tradeoff between exploration vs exploitation.
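
For anyone unfamiliar with the term, the simplest textbook treatment of that tradeoff is epsilon-greedy selection. The sketch below applies it to picking a build; the build names and win-rate numbers are invented for illustration.

```python
import random

def choose_build(win_rate_estimates: dict, epsilon: float, rng: random.Random) -> str:
    """Epsilon-greedy: usually exploit the best-known build, sometimes explore."""
    if rng.random() < epsilon:
        # Explore: try any build, including ones that currently look bad.
        return rng.choice(sorted(win_rate_estimates))
    # Exploit: play the build with the highest estimated win rate.
    return max(win_rate_estimates, key=win_rate_estimates.get)

# Invented numbers, purely for illustration.
estimates = {"standard macro": 0.55, "off-meta rush": 0.40, "fast expand": 0.50}
rng = random.Random(0)
picks = [choose_build(estimates, epsilon=0.1, rng=rng) for _ in range(20)]
print(picks)
```

Playing against the meta corresponds to the explore branch: it costs games on average, but it is the only way the estimates for unusual builds ever get updated.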

Do StarCraft AIs (the ones that are included with the game) cheat? (e.g. can they see past the fog of war?)

The ones built into the game can see past fog of war. Blizzard published a nice overview of how they work, though I don't think it actually mentions that cheat. http://classic.battle.net/scc/faq/aiscripts.shtml

The ones created using BWAPI cannot (you can call a function that lets you, but it's banned in all tournaments). The one thing that BWAPI bots know that a human doesn't is a persistent unit ID - if a marine leaves fog of war and comes back, you can check if the ID is the same and know whether it's the same marine. It also tells you where cloaked units are, but a skilled eye can see those already.

Are you sure? I remember the video about the whole blizzard story and the Ai creator of starcraft2 said that the new AIs, compared to those of starcraft, have only the same information a player has and no way to "cheat"

I thought the question was about the original game?

I don't know much about the SC2 AIs. You could play some games and watch the replays to tell if it consistently looks like it's responding to things it shouldn't know. I remember it being pretty blatant in BW.

Last I checked most of them don't cheat, but a few do (I think they're labeled as cheating though?).

Minor nitpick: video games running on digital computers are by definition still discrete even if they feel continuous. Networked multiplayer wouldn't be possible in RTS games if that wasn't the case. The granularity of unit positions and turns in Starcraft obviously leads to a much larger state space, so I get what you're saying: for AI it's effectively continuous.
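
A minimal sketch of that discrete model, assuming the usual deterministic lockstep design in which each tick is a pure function of the previous state and that tick's inputs. The state layout and command format are made up for the example and do not reflect StarCraft's actual engine.

```python
def step(state: dict, commands: list) -> dict:
    """Advance a toy game by one tick: new state = f(old state, inputs)."""
    positions = dict(state["positions"])  # copy, so the old state is untouched
    for unit_id, (dx, dy) in commands:
        x, y = positions[unit_id]
        positions[unit_id] = (x + dx, y + dy)
    return {"tick": state["tick"] + 1, "positions": positions}

def replay(initial: dict, commands_per_tick: list) -> dict:
    """Apply a recorded list of per-tick inputs to an initial state."""
    state = initial
    for commands in commands_per_tick:
        state = step(state, commands)
    return state

initial = {"tick": 0, "positions": {1: (0, 0)}}
inputs = [[(1, (1, 0))], [], [(1, (0, 2))]]

# Two machines that start from the same state and see the same inputs
# end up in the same state without ever exchanging the state itself.
print(replay(initial, inputs))
```

This is also why replays can be stored as a list of inputs rather than a video: the state at any tick can be regenerated from the start.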

They're discrete with such high cardinality that successful approaches will likely model them assuming they're basically continuous. Neural network layer activations are also discrete after all, but they're often 256+ dimensional vectors of float32s or float16s.

Well, WaveNet[0] outputs audio in the time (not freq.) domain using PixelCNN, so it's not unthinkable.


If you're gonna be like that, our 'real' universe may well be discrete given that there are minimum possible lengths and time intervals.

For a game of go the entire game state is known to each player. That's the diff. For vidya games state is hidden to the player if the player cannot 'see' it. Therefore u wrong fam.

I didn't say anything about hidden information. That clearly makes SC more challenging than Go, as it requires the AI to build some kind of mental model of possible player states from limited information.

The term the grandparent post meant to use is "imperfect information game" versus Go, which is a "perfect information game."
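
One way to see what "imperfect information" means in practice: the agent never receives the game state, only a filtered view of it. The sketch below uses a made-up unit format and a Manhattan-distance sight rule purely for illustration.

```python
def observe(units: list, my_player: int, sight: int) -> list:
    """Return only what `my_player` can see: its own units, plus enemy
    units within `sight` (Manhattan distance) of any of its own units."""
    mine = [u for u in units if u["owner"] == my_player]

    def visible(u: dict) -> bool:
        return any(abs(u["x"] - m["x"]) + abs(u["y"] - m["y"]) <= sight for m in mine)

    return [u for u in units if u["owner"] == my_player or visible(u)]

scout = {"owner": 1, "x": 0, "y": 0}
nearby_enemy = {"owner": 2, "x": 3, "y": 0}

# Two different true states: the hidden enemy army is in different places.
state_a = [scout, nearby_enemy, {"owner": 2, "x": 50, "y": 50}]
state_b = [scout, nearby_enemy, {"owner": 2, "x": 90, "y": 10}]

print(observe(state_a, my_player=1, sight=5) == observe(state_b, my_player=1, sight=5))
```

Both states produce the same observation, so the agent has to keep some belief about where the unseen units might be. In Go the observation and the state are the same thing.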

I don't know if I would label SC2 as continuous. I don't think anything happens to the game state at a finer granularity than tick level. So to me it seems that it's also discrete (but with the state changing 44.8 times a second at default speed). I agree though that this looks more challenging for ML methods.

I haven't looked at if they limit the rate of commands that the AI can issue, otherwise this will be something that can be a very big advantage to the AI once it learns to micro ...

It's not literally continuous, but it is real-time rather than turn-based, and positions of units are essentially floats rather than (a small range of) ints. That makes it effectively continuous (too large to just generate a tree of all possible actions and then prune).

Are you sure positions of units are essentially floats? Given how the units seem to arrange themselves (from what I see), I would guess that it's not close to the full range of floats, and instead there are just a few fractional pixel locations that units snap to. This is just a guess however.

-- If this is the case though, the space could be represented by taking larger integer values (say, one or two orders of magnitude higher) to represent positions at a fractional pixel level (say, in 100ths of a pixel).

Buildings snap to a grid. Units take up space according to a hitbox. Hitbox size varies according to each type of unit (E.g. Thors are huge). This becomes important when dealing with AoE.

Consider a group of mutalisks. If you select-all and issue an attack-command or move-command, the mutalisks will bunch up tight and then disperse. Cf a video on the "magic-box technique".

So I wouldn't be surprised if position-values were floats.

I haven't looked at SC2 specifically, but would be surprised if they were floats; I sometimes read game dev blogs and RTSs on average tend to be implemented using int/fixed-point based positioning, to eliminate floating point imprecision as a source of multiplayer sync issues.
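To make the fixed-point argument concrete, here is a minimal sketch. The scale of 1/256 of a map unit and the helper name `to_fixed` are assumptions for illustration, not anything SC2 is known to use. It shows that float sums can depend on evaluation order while integer sums cannot, which is one reason lockstep games tend to prefer integers.

```python
# Hypothetical fixed-point scheme: positions are integer counts of 1/256 of a
# map unit. The scale is an assumption for illustration only.
SCALE = 256


def to_fixed(x: float) -> int:
    """Convert a map coordinate to fixed-point, rounding to the nearest step."""
    return round(x * SCALE)


# Float addition is not associative, so two machines that sum the same
# offsets in a different order can end up with slightly different positions.
float_a = (0.1 + 0.2) + 0.3
float_b = 0.1 + (0.2 + 0.3)
floats_agree = float_a == float_b

# Integer addition is exact, so the order of evaluation does not matter.
fixed_a = (to_fixed(0.1) + to_fixed(0.2)) + to_fixed(0.3)
fixed_b = to_fixed(0.1) + (to_fixed(0.2) + to_fixed(0.3))
fixed_agree = fixed_a == fixed_b

print(floats_agree, fixed_agree)
```

The cost is resolution: anything finer than 1/256 of a unit is rounded away when converting.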

Even if you just bucketized things at the pixel level, that leaves you with a range in the thousands for each dimension.

> Starcraft is a continuous game and the game state is not 100% known at any given time.

It seems to me that multiplayer games may feel continuous to a human player but are still designed around a series of discrete states called ticks where each tick is determined from the previous state plus inputs.

Why is this distinction made in the context of how difficult it is to develop an AI?
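For anyone unfamiliar with the tick model, a toy sketch may help. The `State` and `Command` shapes below are made up for illustration and have nothing to do with Blizzard's actual engine. The point is only that the next state is a pure function of the previous state plus that tick's inputs, so replaying the same inputs always reproduces the same game.

```python
from typing import Dict, List, Tuple

State = Dict[str, Tuple[int, int]]   # unit name -> (x, y) on an integer grid
Command = Tuple[str, int, int]       # (unit name, dx, dy), hypothetical format


def step(state: State, commands: List[Command]) -> State:
    """Compute the next tick's state from the previous state plus inputs."""
    nxt = dict(state)  # copy so the previous state is left untouched
    for unit, dx, dy in commands:
        x, y = nxt[unit]
        nxt[unit] = (x + dx, y + dy)
    return nxt


def replay(initial: State, inputs_per_tick: List[List[Command]]) -> State:
    """Apply one list of commands per tick, in order."""
    state = initial
    for commands in inputs_per_tick:
        state = step(state, commands)
    return state


initial = {"marine": (0, 0), "zergling": (5, 5)}
inputs = [[("marine", 1, 0)], [], [("zergling", -1, -1), ("marine", 0, 2)]]

# Two independent runs over the same inputs end in the same state.
final_a = replay(initial, inputs)
final_b = replay(initial, inputs)
print(final_a == final_b, final_a)
```

Note the empty command list in the middle: a tick with no input is still a tick, which is part of what separates this from a turn-based game.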

Technically you're right, but there's a real qualitative difference. Each "tick" in a game like StarCraft is on the order of tens of milliseconds. When you send out an army to attack your opponent, it's quite possible that the actual confrontation won't happen until 10,000 ticks in the future.

Also, the dimensionality of the state space in a "continuous" game is orders of magnitude larger. In a game like chess or Go, you may have dozens or hundreds of moves available at each turn, but only a few of them will be "locally optimal". In StarCraft, there are many more degrees of freedom -- attack timing, positioning, formation, banking versus spending resources, and so on. A good AI will need to be able to abstract that huge state space down to something more tractable.

To the best of my knowledge, the only thing in SC2 that requires pixel-level precision is selecting units. Everything else can just as easily be represented as a fairly coarse grid with no loss of expressiveness. Buildings are explicitly snapped to a grid, and moving your units several pixels to either side simply doesn't matter. So calling SC2 "continuous" in terms of space is misleading.

I don't think there is anything that requires super-fast response times either, so you could conceivably get ~1 frame per second and not lose much information.

Well, IIRC, there are some visual indicators that rely on blinking, but I don't think they are crucial.

Even with the restrictions you put in place, SC2's state space is much larger than any board game.

A typical game might last 15 minutes = 54000 60fps frames and a typical map is larger than 10k x 10k in terms of 'coarse units'.

For any given frame of animation there are at least a million valid actions - if you have 10 units then you can move any subset of those units to any place on your screen.
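As a rough check on those figures, here is the arithmetic spelled out. The 1920x1080 screen size is my own assumption; the 10 units, 15 minutes and 60 fps come from the numbers above. It counts raw move commands only, many of which would lead to near-identical outcomes.

```python
# Assumed for illustration: a 1920x1080 screen. Unit count, game length and
# frame rate are the figures quoted above.
units = 10
screen_w, screen_h = 1920, 1080

non_empty_subsets = 2 ** units - 1      # which units receive the order
targets = screen_w * screen_h           # which pixel they are sent to
move_actions = non_empty_subsets * targets

frames = 15 * 60 * 60                   # 15 minutes at 60 frames per second

print(non_empty_subsets, targets, move_actions, frames)
```

So "at least a million" is a conservative lower bound for raw move commands under these assumptions.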

>Even with the restrictions you put in place, SC2's state space is much larger than any board game.

That depends entirely on how you represent it. You're forgetting that there are plenty of AIs that actually play the game right now. They just don't use the same kind of "interface".

>For any given frame of animation there are at least a million valid actions

That's Google's marketing Kool-Aid. "Actions" that produce the exact same result cannot be meaningfully counted as separate actions, especially when you're trying to compare them to board game moves.

You're basically correct in terms of SC not technically being continuous. There are discrete steps under the hood.

One of the significant challenges is figuring out how to use 42ms (the frame duration on fastest speed) of computing time to decide what actions, if any, to take next. You don't have the luxury of taking many minutes to decide one move as you would in a game like chess or go. You also don't alternate taking discrete turns with your opponent, despite having discrete frames. It may be best to not take an action in a given frame. This is particularly true if the AI is attempting to stay under an APM threshold, as it has to decide if an action is worth the opportunity cost.

It is also necessary for a quality SC AI to remember what has happened in the past. A chess board position is identical regardless of how the game got there, but this is not the case in StarCraft. An AI has accumulated lots of information about its opponent that is no longer visible to it in the current frame (unit movements, gas/mineral counts, number of workers active, etc), and this needs to be recalled and play into decision making.
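A minimal sketch of that kind of memory, assuming a made-up observation format (unit id mapped to type and position); none of these names come from a real bot API. It keeps the last sighting of every enemy unit, so information survives after the unit leaves vision.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Sighting:
    unit_type: str
    position: Tuple[int, int]
    frame: int


class EnemyMemory:
    """Remembers the most recent sighting of each enemy unit id."""

    def __init__(self) -> None:
        self.last_seen: Dict[int, Sighting] = {}

    def observe(self, frame: int,
                visible: Dict[int, Tuple[str, Tuple[int, int]]]) -> None:
        """Record every currently visible unit; unseen units keep old data."""
        for unit_id, (unit_type, pos) in visible.items():
            self.last_seen[unit_id] = Sighting(unit_type, pos, frame)

    def count(self, unit_type: str) -> int:
        """How many units of this type have ever been seen."""
        return sum(1 for s in self.last_seen.values()
                   if s.unit_type == unit_type)

    def staleness(self, unit_id: int, current_frame: int) -> int:
        """Frames elapsed since this unit was last visible."""
        return current_frame - self.last_seen[unit_id].frame


memory = EnemyMemory()
memory.observe(100, {1: ("drone", (10, 10)), 2: ("drone", (11, 10))})
memory.observe(500, {3: ("zergling", (40, 42))})  # drones are out of vision

print(memory.count("drone"), memory.staleness(1, 500))
```

A real bot would also have to handle units that died while unseen, which this sketch ignores.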

SC2 ticks faster than SC1 - you only have 22ms. You don't need to tie everything to tick rate though, a strategy module could update way slower.

Bingo. If the tasks were split up correctly among multiple threads/processes, and written in a fast language by good developers, the tick rate would be less important. An army control module could manage unit micro within the bounds of a tick, with other modules updating the other information the system draws on to perform actions.
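A single-threaded sketch of the idea, with made-up module names and an arbitrary period of 45 ticks: micro runs on every tick, strategy only every so often. A real bot would more likely use separate threads or processes as described above, so treat this only as an illustration of the two update rates.

```python
class Bot:
    """Runs micro every tick and strategy only every `strategy_period` ticks."""

    def __init__(self, strategy_period: int) -> None:
        self.strategy_period = strategy_period
        self.plan = "expand"
        self.micro_calls = 0
        self.strategy_calls = 0

    def update_strategy(self, tick: int) -> None:
        # Placeholder rule standing in for an expensive planning step.
        self.strategy_calls += 1
        self.plan = "attack" if tick >= 100 else "expand"

    def update_micro(self, tick: int) -> None:
        # Placeholder for cheap per-tick unit control that reads self.plan.
        self.micro_calls += 1

    def on_tick(self, tick: int) -> None:
        if tick % self.strategy_period == 0:
            self.update_strategy(tick)
        self.update_micro(tick)


bot = Bot(strategy_period=45)
for tick in range(225):
    bot.on_tick(tick)

print(bot.micro_calls, bot.strategy_calls, bot.plan)
```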

It's computationally difficult to calculate every possible move a character might make in the occluded region. This requires a long attention span and running simulations of possible opponent actions based on previous 'frames'. AlphaGo solved some of this by reducing the amount of space searched, instead searching the likely set of possible choices by the opponent, but each evaluation is for a single frame. If for some reason a piece on a Go board could disappear and reappear, then during the time it was gone its impact on decision making would be either nil or skewed heavily towards nil compared to the rest of the opponent's pieces, depending on how many frames in the past are used.

From what I saw in the API, the AI will potentially have some key advantages like more accurate micromanagement, and that can make a significant difference in a combat setting. They can try to compensate for this by throttling the number of actions per minute, but that won't compensate for extremely well-planned pixel-perfect clicks. This is a very powerful tactical advantage that can offset strategic deficiencies, if any.
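For reference, APM throttling could look something like the sliding-window limiter below. This is a hypothetical sketch, not how the released API does it. Notice that it only caps how many actions are issued and says nothing about how precise each one is, which is exactly the loophole described above.

```python
from collections import deque


class ApmLimiter:
    """Allows at most `max_apm` actions in any sliding window of time."""

    def __init__(self, max_apm: int, window_seconds: float = 60.0) -> None:
        self.max_apm = max_apm
        self.window = window_seconds
        self.times = deque()  # timestamps of recently accepted actions

    def try_act(self, now: float) -> bool:
        """Return True and record the action if the budget allows it."""
        # Forget actions that have fallen out of the window.
        while self.times and now - self.times[0] >= self.window:
            self.times.popleft()
        if len(self.times) < self.max_apm:
            self.times.append(now)
            return True
        return False


limiter = ApmLimiter(max_apm=3)
results = [limiter.try_act(t) for t in (0.0, 1.0, 2.0, 3.0, 60.0, 60.5)]
print(results)
```

With a budget like this the bot has to decide which actions are worth spending on, but every action it does spend can still be a perfect click.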

Now, I would not compare SC1 bots to whatever DeepMind is going to create. SC1 bots were, for the most part, just rule-based bots with hand-coded strategies. DeepMind will create a machine-learning-based bot, train it with data based on thousands if not millions of replays, and test it privately, maybe hiring a professional in the process (as they did with Fan Hui 5p), and make it play itself millions of times. It's a matter of time until they get it right and they get to pick when that time is. They will not organize a match until they feel their probability of winning is significant.

This. Somehow I was expecting the implementation of mechanics to be the easy bit, compared to high-level strategic planning and tactics. Curious to see whether these will emerge by themselves, or if they will need to provide some heuristics (use drops, harass, all-in, etc. )

I know nothing about what they are trying to solve, but it would be interesting if their goal was not just to beat humans but to make a game AI that was actually fun to play.

>A lot of people here seem to be underestimating the difficulty of this problem. There are several incorrect comments saying that in SC1 AIs have already been able to beat professionals - right now they are nowhere near that level.

Mostly because no one cared enough about solving this to spend 1/100th of the resources Google will undoubtedly throw at it.

As a long-time high level SC2 player, one additional thing that makes SC2 so difficult is that the game has multiple layers of tactics and strategy that require specialized logic, but those layers also interact and synergize in a deep way.

- There is the overall strategic game of 'Who is ahead economically? Given that, should I be expanding, attacking, or defending?', with the implicit understanding that the player with the current economic advantage puts pressure on its opponent to attack

- There is a resource management and build-order system where you need to plan and optimize building as big and as effective a unit composition as quickly as possible, except there are a lot of tradeoffs: you can build for a stronger army sooner, as opposed to a weaker army later

- There is a tactical micromanagement battle where small groups of units are pitted against one another, and where small tactical movements can gain very large materiel advantages. Units are relatively short ranged, so to damage or defend effectively requires effective positioning. Most armies fight better as a cohesive group ('ball'), except there are units that specifically punish and do splash damage that need individual micromanagement. Battles can take place over a short period and be over quickly, or can be long-running positional skirmishes that last for half the game, where each player is constantly probing for weakness before one finally goes for the throat.

- The economy fundamentally depends on worker units that are vulnerable to harassment, so the tactical battle requires a choice between putting everything into one large army and pushing, or splitting units into smaller groups and harassing in multiple places, or various mixes (small group to harass, bulk of army to defend, etc.)

- If keyboard and mouse action rates are capped, then at every moment in time, the player must decide whether it is more profitable to devote actions to managing the army (micro) or managing the overall economy (macro). Choosing wrongly usually results in a loss

- There is an implicit rock-paper-scissors tradeoff at the highest levels of the game: a 'greedy' strategy that cuts corners and favors economy over military will generally beat a 'safe' balanced strategy. Very aggressive strategies win against greed and generally lose against safe

- There is the ability to scout your opponent to see whether they are going greedy, safe, or aggressive, but scouting requires an early investment in units and making subtle inferences about the opponent's build order, so the choice of whether to scout and how is not a trivial one

- There can be bluffs where your opponent purposefully allows a scout of a key building, kills your scout, then cancels that building and chooses an entirely different technology instead

And all these layers interact:

- For example, if you go for an aggressive strategy, then you must commit blindly at the beginning of the game and often try to deny enemy attempts to scout you

- If you scout that your opponent's army consists of units that are faster than yours, then they generally have much higher harassment potential, which pushes you towards a defensive posture. On the flip side, your opponent can use this threat to improve their economic position instead of attacking.

There is long-term planning at the strategic, informational, and also tactical levels. Effective high-level play requires an accurate model of what your opponent is doing in an environment where it's easy for your opponent to deny acquiring that information.

I'd wager that if you took two evenly matched professional level players, and then revealed the entire map to one player but not the other, you would go from a 50% to a 95%+ win rate.
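The greedy/safe/aggressive cycle from the list above can be written down as a tiny lookup table. This is only a sketch of the general tendency; real matchups are probabilistic, and the function names are made up for illustration.

```python
# BEATS[x] is the style that x generally beats, per the description above.
BEATS = {
    "greedy": "safe",
    "aggressive": "greedy",
    "safe": "aggressive",
}


def favored(a: str, b: str) -> str:
    """Return which of two opening styles is generally favored."""
    if a == b:
        return "even"
    return a if BEATS[a] == b else b


def counter_to(style: str) -> str:
    """Return the style that generally beats the given one."""
    return next(s for s, loser in BEATS.items() if loser == style)


print(favored("greedy", "safe"), counter_to("greedy"))
```

This is why scouting matters so much: once you know which of the three your opponent picked, `counter_to` is trivial, and the hard part is getting that information at all.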

Related: Today I learned that a group of AI researchers has released a paper called: STARDATA: A StarCraft AI Research Dataset. According to one of the authors: "We're releasing a dataset of 65k StarCraft: Brood War games, 1.5b frames, 500m actions, 400GB of data. Check it out!"

> Article: https://arxiv.org/abs/1708.02139

> Github: https://github.com/TorchCraft/StarData

The great thing about this is that it includes the game state throughout the game. It's been pretty easy to find lots of Starcraft replays, but the replays only include enough information to recreate the game (basically just the player actions). If you wanted to know what was happening in the game at the time the player made an action, you had to load up Starcraft and simulate out the game until that point. This dataset has already run the game for you and provided the data!

Is it that much computation to simulate an entire game? You obviously don't need to render the graphics or anything, it should just be a list of events that occur, which doesn't seem all that slow to process.

Until today's release of the headless Linux client, you still had to run the full StarCraft program, which gets expensive fast. And it massively complicates the workflow to have to play through every game serially to recreate the state rather than simply reading random rows of data from a 300GB dataframe on disk.

Oh I see, thanks, I didn't know. But man, 300 GB per game sounds completely nuts!

No, total. For comparison they quote the replay files at what was it, 5GB? It's a classic space-time tradeoff, but in deep learning right now, hard drives are far cheaper than CPUs/GPUs. Playing out the games as you need individual datapoints would probably be at least twice as slow, while anyone can easily store 300GB these days.

I believe the 400GB is the total amount for the 65000 different game replays

@wfunction: yes, TorchCraft includes a serializer that compresses the useful game state into a relatively small struct. That is then further compressed with other tricks and zstd.

Oh but how does that work? That's ~6 MB per game which sounds like just a list of actions rather than precomputed data per frame. Is it compressed somehow?
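Working through the rounded numbers from the announcement (65k games, 1.5b frames, 500m actions, 400GB), and treating 1 GB as 10^9 bytes, which is my assumption:

```python
# Rounded figures quoted in the announcement above.
games = 65_000
frames = 1_500_000_000
actions = 500_000_000
total_bytes = 400 * 10**9   # assumes 1 GB = 10^9 bytes

mb_per_game = total_bytes / games / 10**6
frames_per_game = frames / games
bytes_per_frame = total_bytes / frames
frames_per_action = frames / actions

print(f"{mb_per_game:.1f} MB per game")
print(f"{frames_per_game:.0f} frames per game")
print(f"{bytes_per_frame:.0f} bytes per frame")
print(f"{frames_per_action:.1f} frames per player action")
```

So the ~6 MB per game works out to a few hundred bytes per stored frame on average, which is small for full per-frame state and suggests heavy compression.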

"The full dataset after compression is 365 GB, 1535 million frames, and 496 million player actions." - Yes

FYI, there are two things being discussed here. The dataset linked in the comment above is for Brood War. The headless client released today is for SC2.

I am aware of that. The point remains the same: both Brood War and SC2 are expensive to run, so you really don't want to, and it's worth spending disk space to cache the results of playing out replay files. This will probably also be true of the replay files DM/Blizzard will be releasing even with the lite client.

The API Blizzard is exposing is really nice. Sadly most of the advantages AI had in SC1 were just due to the fact that an automated process could micro-manage the tasks the game didn't automate for you (a lot of boring, repetitive work). SC2 got rid of a lot of that while still allowing room for innovative and overpowered tactics to be discovered (MarineKing's insane marine micro, SlayerS killing everyone with blue flame hellions, some more recent stuff I'm sure from the newest expansions). Hopefully the API lets AIs converge on optimal resource management and get to exploring new and innovative timings, transitions, army makeups, etc.

I'm afraid that I'm essentially nitpicking here, but the games don't really compare that linearly.

For one, "insane micro" was around before SC2 and was more of a deciding factor in BW than SC2. For instance, if you pay attention and analyze pro games you'll notice that macro (the boring repetitive work) that you think was flushed out from BW was actually just translated into other, equally monotonous tasks in SC2. Also, SC2's fights (outside of early skirmishes) are MUCH more based around creating concaves or fighting in favorable positions, and not actual micro. In BW micro is far more of a deciding factor in fights. (see: any pro-game fight consisting of supply over approximately 120 aka deathball)

Another thing, MarineKingPrime didn't really "invent" marine micro, he just excelled at it. And SlayerS isn't a player, it's a team name.

For the last part regarding optimal resource management and exploring timings and makeups, build orders have been virtually completely fleshed out and maximized. There really isn't much that say, 1000 APM (just a stupidly high impossible number to represent computer APM) could do that 300 APM (pro player human APM) couldn't in terms of gaining an early advantage in build orders.

> And SlayerS isn't a player, it's a team name.

I think the parent comment is fair. At MLG Anaheim in 2011 SlayerS unleashed their TvZ blue flame build and slaughtered with it. Blizzard nerfed it pretty hard shortly afterwards. 3 of the top 4 were SlayerS and they almost exclusively used that build against Zerg. http://wiki.teamliquid.net/starcraft2/2011_MLG_Pro_Circuit/A... http://www.majorleaguegaming.com/news/anaheim-starcraft-2-re...

As a Zerg player at the time, I remember the frustration of having to deal with that.

Yes, very nitpicky. MarineKingPrime didn't invent marine micro but he was known for it/popularized in the early GSL seasons before it became a standard tactic. SlayerS (the team) dominated MLG Anaheim with blue flame hellions and basically caused them to get nerfed if I recall.

I don't believe he's being that nitpicky. Brood War had a lot of micro, including marine micro...

Boxer (SlayerS_Boxer is who I think you are referring to, as he also played starcraft 2) was well known for having really insane marine micro in the community before SC2 even came out. Starcraft 2 was just way bigger in the west so more people over here associate MarineKingPrime with micro than brood war pros.

e.g. https://www.youtube.com/watch?v=WJp0t9n8DWk

I guess what I was trying to say, in general, was that since SC2 required less micromanagement of your macro (auto-harvesting, etc), players had more free time to spend on unit tactics.

In GSL season 2, MarineKing showed that stimmed marines could counter banelings. Then the rest of the playerbase quickly adopted his tactics.

At MLG Anaheim (2011), the SlayerS terrans (i.e. multiple members of the SlayerS team) placed 2nd, 3rd, and 4th and destroyed all zergs with their blue flame hellion play.

Those are just two examples of the sorts of tactics that once discovered are quickly assimilated into the metagame and quickly go from innovative to standard play. So my hope was that AI could speed up the pace of tactical innovation.

If there's no APM limit you can do things like micro every worker's resource gathering. See http://www.teamliquid.net/forum/brood-war/484849-improving-m....

> SC2 got rid of a lot of that

You think so? My impression is that SC2 had a lot more repetitive tasks you had to do. E.g. wall off the ramp, send a worker scouting, ... and you have to perform certain actions every X seconds (like using chrono boost). A lot of mastering the game is rote learning, and polishing a build order. Another big part is constantly scouting and reacting to what the enemy is doing.

Due to those reasons I found SC2 a bit tedious (it was still fun, just felt more like work than SC1). Granted, this is maybe because I played SC1 more on LAN, and there wasn't all the metagame going on. But I think SC2 really does focus on "grinding" and rote learning to get better, probably this was chosen to make it more "eSports"-like.

If I would get to design a SC2.5 or SC3, I would remove all the rote - the actions you always have to perform - and I would give the player the opportunity to trade off more between macro and micro.

Actually, it would be cool if you could "research" certain AI features in game for a cost. For example, have one upgrade that micros your marines like a pro, or positions your units in sensible formations. Another player could counter this with a "radio jam" ability that would make your units in an area take bad formations, or be controlled by a very simple AI. And if you are good at micro, you could skip that upgrade, or invest in an upgrade that makes macro simpler instead. And so on, I think there are a lot of things one could explore there. Maybe opening SC2 to AI exploration can lead to such gameplay innovations.

> You think so?

It's an absolute fact. Here are a few things that reduced the necessary repetitive action count considerably compared to SC1:

1. Larger control groups

2. Worker queues (including sending them to the resource patches)

3. Smart casting (no longer have to select individual units to correctly chain cast certain AOE spells)

4. Pathfinding actually works now so no need to click 15 times to get a unit where you want it to go.

> I played SC1 more on LAN, and there wasn't all the metagame going on.

Oh, I see. So you never played competitively?

> Oh, I see. So you never played competitively?

Almost only in LANs, not un-competitively, but not "professionally" and not in ladders. I was more of a Counter-Strike dude when it came to competitive play.

I find the characters of the games very different, and the difference might be LAN vs BattleNet rather than SC1/BW vs SC2 (although Blizzard did emphasize certain aspects more in SC2).

I never played SC1 with contorted arms, frantically hitting the keyboard, nervously checking my production buildings. It was more like, let's build a pretty base, and then settle the age-old question: Battlecruisers or Carriers?

I don't think I've even seen a wall-off. We probably would have laughed at how ridiculous it was to misuse a building like that, especially since you couldn't lower your supply depots.

Sometimes we would have ridiculous battles over secondary bases - in modern play, you would punish a player who overextends and go to their base, but back then that would have killed the fun. While you might have won, you would have been considered a spoilsport.

I think the idea of spending to upgrade your helper AI sounds really interesting and potentially novel, but it's important to push back a little bit on the idea of "removing all the rote" from an RTS. day[9] (ex Broodwar pro, current caster of many games) probably makes the argument better than I can here: https://youtu.be/EP9F-AZezCU?t=55, but one of the important implications of having repetitive aspects in an RTS is that they turn the player's attention into a resource. You can either concentrate on microing your units to squeeze the most possible value out of them, or you can focus on hitting every single production cycle back at base, but you can't do both at the same time, so deciding when to focus on what becomes part of the game. Additionally, it allows for some crazy "overcoming the odds" scenarios where a small number of units can pull off a surprising win because the player was willing to donate far more attention to them than usual. Basically, an RTS where you don't feel like you have too much to do all at once is really more like a turn based strategy game (which can be fun too).

It would just be a game for a different audience, casual vs. pro-gamer. Now casual sounds nasty and makes you think of candy crush - god forbid... I mean somebody who plays as a hobby, sometimes in the evening, but doesn't train for the game. For one, I can't put in the hours required anymore with a full-time job and a family. For another, the way we used to play RTSs (~2000) on small LANs was very different. Still competitive, but somehow more relaxed. Exactly this "managing your attention" aspect was missing, so the early game was more like a building strategy game, and the late game was less paper-scissor-stone and more about outwitting your enemy.

I think SC2 is fine as it is, for what it is. I would not want to take that away from anybody in a potential version 3; but I think there is space for a more casual game in the StarCraft universe, thus "SC2.5".

>You think so?

Brood War is the most mechanically demanding game I have ever played, and certainly the most demanding that has ever been an esport.

[This clip](https://www.youtube.com/watch?v=UXH8eCcvQMI) of Flash playing SK Terran style is illustrative of what I mean.

Let's go down the list of things Flash needs to do:

He needs to click on every one of his production structures every 15-19 seconds, and click M or C. Otherwise, no army.

He needs to click on every one of his command centers every 13 seconds, and click s. Otherwise, no workers.

He then needs to tell every newly built worker to go mine, otherwise they just stand around doing nothing.

He needs to build supply depots roughly every 20 seconds, or his entire production grinds to a halt.

His army consists of well over a hundred marines and medics, which Flash needs to stim, split, and maneuver to take on lurkers and defilers who will crush him in an unmicro'd fight.

His science vessels need to be irradiating defilers constantly to prevent the Zerg reaching critical mass, while dodging scourge using the Chinese triangle technique.

If Flash just clicks his science vessel hotkey and casts irradiate, every ship on that control group will waste its irradiate on the same target - he needs to manually select each vessel before casting spells.

His entire army, and his scans, must be controlled with just 10 group hotkeys of at most 12 units each. Except for buildings, of which you can have at most 1 hotkeyed.

To maximize speed of his army going up and down ramps, he needs to spam-click the move command. A single instruction causes the units at the back to spaz out and take a long time to do anything.

As the above implies, Brood War unit AI is terribad.


Which means everything Flash does above needs to be double and triple checked to ensure the AI doesn't decide to go off and pick its nose. Building supply depots was especially bad for this.

Brood War is a game that no human being will ever play perfectly. Even the top tier professionals, like Flash, Bisu, or Jaedong, can't do all of the above all the time - they have to prioritize some activities over others and treat their actions per minute like a resource the same as minerals and gas.

>Granted, this is maybe because I played SC1 more on LAN, and there wasn't all the metagame going on. But I think SC2 really does focus on "grinding" and rote learning to get better, probably this was chosen to make it more "eSports"-like.

Starcraft 2 is a game I could get to masters league in, despite only playing a few hours a week for a month or two. In Brood War, that much practice wouldn't get you a D ranking on ICCUP. Everything about SC2 is designed from the ground up to lower the grinding, remove the muscle memory requirement (you don't get to "play" Protoss in BW until you can hit your P key blindfolded), and encourage new or unskilled players to get on and play. At that it succeeded, albeit kind of at the cost of its professional scene.

> Starcraft 2 is a game I could get to masters league in, despite only playing a few hours a week for a month or two.

Was this in beta? I played from beta to end of WoL and got masters in NA. It was not easy.

Mid-way through WoL and again shortly after launch of HoTS.

I played BW a lot as a kid, and the fundamental concepts carry over pretty decently. Always build workers, don't get supply blocked, expand as fast as your opponent lets you get away with, never let your resources bank, upgrades are always worth it, etc. etc.

As I recall someone in WoL actually went from bronze to masters building literally no units except marines, medivacs, and scvs to prove the point that raw mechanics are all you need to carry you up the ladder.

To be fair, marines and medivacs are amazing units in WoL. The same likely wouldn't apply with most other two-unit combos. Roach ling maybe, maybe blink stalkers.

Destiny did this with just queens and drones. A lot of games were insane creep spread and spine crawler pushes.

I really like the idea of a helper AI that takes high level commands and handles the manual dexterity aspect of the game. That would make RTS games a lot more attractive for someone like me who prefers simple, thoughtful games rather than complex games with a steep learning curve.

There are actually 2 parts to the complexity of a modern RTS game like starcraft:

1. Memorizing certain well known strategies and counters very well and recalling them immediately.

2. Having decent speed with the mouse/keyboard to actually execute those strategies within a very short period of time.

I think what you're talking about is automating 2)... which I completely agree with. How to do it... that is more complex though...


ByuN's reaper micro!

This seems all in good fun but I wonder if it's come too late.

Starcraft 2 is at its twilight.

The biggest leagues of South Korea have disbanded. [1] The prolific progamers who transitioned to Starcraft 2 have gone back to Broodwar. [2]

Blizzard itself has scrubbed all references to Starcraft 2 on the very home page of Starcraft. [3] Except for the twitter embed, it has only one "2" character... in the copyright statement.

My take is that the future for the Starcraft franchise will be through remastered and potential expansion packs following it.

Starcraft 2 had a good run but, with the entire RTS genre stagnating [4], I don't think Blizzard wants to bet on anything less than the top horse.

[1] https://www.kotaku.com.au/2016/10/the-end-of-an-era-for-star...

[2] http://www.espn.com/esports/story/_/id/18935988/starcraft-br...

[3] http://starcraft.com

[4] http://www.pcgamer.com/the-decline-evolution-and-future-of-t... (Aside from MOBAs)

I don't quite agree, FWIW.

SC2 does seem to be at its twilight in Korea, and I agree progamers and fans there are super interested in Remastered.

But I don't think Remastered will be very popular outside KR. The SC2 "war chest" promo appears to have made more money than expected, as measured by hitting its funding ceiling within a few days.

So I don't think it's "Remastered replaces SC2", I think it's a divergence into KR playing Remastered and non-KR playing SC2, and the number of progamers and players doesn't have to be zero-sum: it could enlarge the population playing either game, too.

I agree that Starcraft 2 won't suddenly drop dead. People do play it and FWIW, I liked it! I played all the expansions, online, and even the arcade mode. It was a good game.

But I disagree that Blizzard has faith in Starcraft 2 for America or any other country.

The removal of Starcraft 2 from Starcraft's English-Speaking homepage is one sign of finality. In-universe, Blizzard has also ended the main dramatic arc of Starcraft 2's story, leaving room only for half-hearted spin-offs.

Numbers-wise, we're seeing 50% drop-offs in user activity in the last 2 years alone. Even with the release of "Legacy of the Void", the number of daily games played for 1v1 since 2015 has gone from 321,000 to 138,000. The new, much-advertised, much-worked-upon Archon Mode has gone from 11,000 games a day to a measly 1,000 [1]. This isn't just because of Korean disinterest: we're seeing players leave across the board in all countries.

In 6 years, Starcraft 2 went from millions of players concurrently to an average of 20k a day.

Compare with the lifespans of League-of-Legends, Dota, Counterstrike, even the original Broodwar, and the reason for remastered becomes more obvious.

Blizzard knows Starcraft 2 won't lead to the resurgence of the RTS genre, so they're trying another route.

[1] http://www.rankedftw.com/stats/population/1v1/#v=2&r=-2&sy=c...

As an esport / spectator sport, SC2 has been waning for a long time, and similarly the War Chest was capped at 200.000 for prize pool money and an unknown amount after that (which, compared to Dota 2 or League of Legends prize pools, is not a lot).

I think, given the matchmaking update for remastered, that SC1 will see a resurgence both inside and outside of KR, but I am not sure either SC1 or SC2 will stay competitive in the long run.

Personally, I think focusing on BW would have been more interesting (as long as the APM limit still stands), but I guess SC2 is alright too. The fact that they're even doing this though makes me happy.

The reason I say BW would be especially interesting is simply because the game has remained basically unchanged balance-wise since v1.08 which came out in 2001. Despite that, the pro scene never left, and we're still seeing some shifts in the meta even today. It would be cool to see a strong AI flip the script completely for such an established and "well understood" game. Opportunities like that are kind of rare, at least when it comes to video games.

I wonder if it's come too late.

Couldn't it be the opposite? Blizzard was willing to do this release exactly because SC2 is dead?

Why would popularity be a detriment to this API?

They released a headless Linux port you can download for free. This wouldn't be something they'd contemplate for a game in its prime.

I disagree. I got into Starcraft recently and find it very much vibrant, both in the pro scene and casual. But that’s irrelevant. The point is it’s still a great AI challenge.

It's a great AI challenge but the pro/casual scene is very diminished from what it once was. Practically every single streamer who introduced me to the concept of streaming by playing/casting SC2 has either moved on to other games or quit streaming altogether. I can't believe it was only a few years ago, but I used to watch Husky/HD/Day9 every day.

Unless I'm mistaken, even the top SC2 streamers today receive a fraction of what other streamers who stream games like Hearthstone or Dota2 get. I'm not suggesting it's a 'ded gaem' but to me it's become a little like AoE2 in that it's a niche e-sport, which is certainly nothing to be ashamed about. But I think it's far from what Blizzard had hoped for, which I think is reflected in their next batch of games, Hearthstone, Overwatch, and HotS, which all have some level of competitive play while still being far friendlier to casual users than SC2.

It's an objective question: Twitch viewership of SC2 is simply smaller than it used to be.

People were still very excited about Go even if people in the US likely didn't really play a lot of Go before AlphaGo. It will be super good PR for DeepMind and Facebook AI Research (who are doing Broodwar). It will probably not reanimate the pro scenes in any lasting manner, however.

I mean, the techniques developed through such a research project would map onto many other domains, obviously including any other RTS.

Using SC2 as a starting point isn't really of much consequence. "Too late"? It's not as if the algorithms developed will die alongside the game.

It's a bit too bad they're having to move towards supervised learning and imitation learning.

I totally understand why they need to do that given the insane decision trees, but I was really hoping to see what the AI would learn to do without any human example, simply because it would be inhuman and interesting.

I'm really interested in particular if an unsupervised AI would use very strange building placements and permanently moving ungrouped units.

One thing that struck me in the video was the really actively weird mining techniques in one clip and then another clip where it blocked its mineral line with 3 raised depots...

They can always fine-tune using RL later. Supervised training was the first step in making AlphaGo work.

Well the unsupervised ai couldn’t even do basic tasks from the video I saw, so looks like we have a long way to go.

I also want to see the algorithm win on unorthodox maps. Perhaps a map they have never seen before, or one where the map is the same as before but the resources have moved.

Don't tell the player or the algorithm this, and see how both react, and adapt. This tells us a great deal about the resiliency of abilities.

I am considering a random map generator for just this reason.

When Watson won at Jeopardy, one of its prime advantages was the faster reaction time at pushing the buzzer. The fairness of that has already been hashed out elsewhere, but.....

We already know that computers can have superior micro and beat humans at Starcraft through that(1). Is DeepMind going to win by giving themselves a micro advantage that is beyond what reasonable humans can do?

(1)https://www.youtube.com/watch?v=IKVFZ28ybQs as one example

My understanding is that in a full match, AIs still have no hope against humans, since even though they can crush humans at micro, their macro is still abysmal [1]. I'm not aware of a match where any AI has beaten a pro human player at Starcraft -- I'd be interested in learning otherwise!

[1] http://spectrum.ieee.org/automaton/robotics/artificial-intel...

That's because there hasn't been too much concentrated effort on this problem yet, since you'd have to spend quite a bit of effort just integrating with the game engine.

Certainly a lot less research has been done on computer SC2 than computer go, and nobody expected a pro to be beaten there 1.5 years ago, either.

It's not that their macro is abysmal (macro in Starcraft refers to the mechanics of managing production and economy), it's that their strategy and tactics are real bad.

would you love to be proven wrong?

Of course.

That example might be misleading because I assume the AI has perfect information- I don't know how it could know which zergling was targeted before the tank fire landed without knowledge of the game's internal state.

In any case I saw in the comments above they are planning on limiting the APM. But right now they're not at the stage where they can compete with the in-game rules based AI, so it may be a little while.

Thanks for that video. That's exactly what I hope to see. AI vs. AI with insane micro capabilities. I want to see SC2 played as close to a "perfect" game as possible.

Yes it would be amazing to watch

I wonder if limiting APM would be a simple way to make the AI's play more "human" and less exploit-y.

Limiting APM is definitely a step in the right direction, but there are ways to have super-human reaction times, beyond what a human can do, even while limiting APM.

So if we watch a match and see things that no human could physically do, we will know that the machine didn't win because of intelligence.

It would still be great, it just would be a simplification of the problem.
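For anyone wondering what an APM cap looks like in code, here is a minimal sketch, assuming a sliding 60-second window. `ApmLimiter` and `try_act` are names made up for illustration; they are not part of any Blizzard or DeepMind API.

```python
from collections import deque


class ApmLimiter:
    """Sliding-window cap on actions (hypothetical helper, not a real game API)."""

    def __init__(self, max_actions, window_seconds=60.0):
        self.max_actions = max_actions    # actions allowed per window
        self.window = window_seconds      # window length in seconds
        self.timestamps = deque()         # times of recently accepted actions

    def try_act(self, now):
        """Record an action at time `now` (seconds) if the budget allows it."""
        # Forget actions that have fallen out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_actions:
            self.timestamps.append(now)
            return True
        return False


limiter = ApmLimiter(max_actions=3)
results = [limiter.try_act(t) for t in (0.0, 0.1, 0.2, 0.3)]
# The fourth action inside the same minute is rejected.
```

Note that a cap like this says nothing about reaction time: the agent can still spend one of its allowed actions a millisecond after an event, which is the concern raised above. A separate minimum reaction delay would be needed to address that.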

What if the ai machine predicted very accurately what their opponents would do? Does that count?

Blizzard should put in an AI-assisted play mode where players are limited to X lines of code that can be launched with keyboard commands.

I know that, as a player, the high mechanical limitations of Starcraft are part of why it's such a difficult, high-skill-ceiling game. But.. I've tried to enjoy watching SC2 on Twitch, and while it's kinda fun, it's just so disappointing when a complicated strategic game is thrown away because a player doesn't react fast enough to workers being sniped or a drop being shot down.

I wish the individual units had some automatic behavior -- for example, marines could run in spread-out formations near tanks or banelings; workers would flee from hazards; flying units would avoid turrets unless specifically directed to fly over them. It would require a lot of rebalancing, of course, but it would make the game so much more tactical and strategic and (imo) enjoyable to watch.

Yeah I can even imagine a thriving "marketplace" for specialty code that top players would keep secret.

And it doesn't have to just be for micro. For people who are bad at macro, maybe code can be written to consistently maintain X workers at all bases.

The difficult part here would be how to balance the AI-assistance. Is lines of code (or number of characters) a good proxy for complexity? What's the number-of-character to benefit ratio?

I guess that's ultimately determined by the individual player's strengths and weaknesses. If a player sucks at macro, then the macro script is worth the number of characters.

You would like Company of Heroes 2, the units require much less micro and the game is a bit slower paced. If your infantry come under fire, they all dive into cover to protect themselves. Defense is a bit more automated (you can set artillery to automatically return fire on enemy artillery once it's exposed) and you have more options for static emplacements to defend critical areas without needing your attention.

I have the same complaints about SC2 as you - way too fast and intense - and I really enjoy Company of Heroes. The whole game is basically set up to get rid of "nags" so you can just play around with the units and have fun.

That would be quite interesting, having humans handle the macro while the AI focuses on the micro. I'm reminded of "Advanced Chess": https://en.wikipedia.org/wiki/Advanced_Chess

> Advanced Chess is a relatively new form of chess, wherein each human player uses a computer chess program to help him explore the possible results of candidate moves. The human players, despite this computer assistance, are still fully in control of what moves their "team" (of one human and one computer) makes.

Is there any known arbitrary code injection for StarCraft? Like how you can use a regular controller to reprogram Super Mario World to play Pong?



Is this how we are going to accidentally let AGI loose into the world!? /s

On a more realistic note I think this will degenerate into a game of who can fuzz test for the best game breaking glitch. Think of all the programming bugs that turned into game mechanics in BW that we haven't discovered for SC2 yet: http://www.codeofhonor.com/blog/the-starcraft-path-finding-h...

The StarCraft 1 BroodWar AI scene has been thriving for a few years now: https://sscaitournament.com/ You can watch 24/7 live AI vs AI games on Twitch at: https://www.twitch.tv/sscait Support for voting on who to play next and even a betting system are in place, too. For those who wish to get their feet wet with BW AI development, here are the Java / C++ tutorials: https://sscaitournament.com/index.php?action=tutorial

Some thoughts and analysis on why Starcraft AI by one of the active AI developers Dan: https://dangant.com/2017/08/09/why-starcraft-ai/

The SCAI bots I've seen are more hardcoded tactics engines than machine learning models. They're still impressive, but their logic isn't 'learned'; it's hand-coded, which is a crucial difference.

That's surprising. I thought Bliz didn't want anyone near sc2 but approved of sc1 being used for this purpose.

SC1 really doesn't make sense for this, 80% of the skill is just keeping on top of the mindless but mechanically intensive stuff, which is trivial beyond trivial for an AI.

SC2's automated away most of this (pretty much everything but production cycles), which makes it a better measure for AI vs human.

> SC1 really doesn't make sense for this, 80% of the skill is just keeping on top of the mindless but mechanically intensive stuff, which is trivial beyond trivial for an AI.

If that were true, then AIs would be dominant in BW instead of still bad at the game.

If they're limiting APM to that of human levels, I don't see it being much of an issue though. APM would just be a limited resource like any other. In fact, I sort of want to see how a strong AI would choose to spend its limited pool of actions. How different would it look compared to a pro? Maybe not much, but I'm not actually certain.

In SC1, just the act of moving a large army is a commitment and takes quite a few resources. Moving your armies under fog of war and not letting your opponent know exactly how you're set up in order to get a good angle on you is incredibly important. I want to see how much of an importance a great AI puts on that vs the other things it could be doing instead. Are the strongest AIs going to be more methodological, safer, and slow moving? Or will the best AIs try to exploit the imperfect information aspect of the game and try to lure the opponent into making a wrong decision? I feel like AIs tend to excel at the former, but the latter has been a huge component for the very best pros in SC1.

I don't follow SC2, so I don't know how much of this also applies there. I just feel like SC1 isn't as mechanical as it's made out to be. There's definitely that huge initial barrier, but once past that, the game actually feels very delicate and is about good use of resources (including mouse/keyboard actions), timing, and transitions in unit composition to catch your opponent off-balance.

> ... 80% of the skill is just keeping on top of the mindless but mechanically intensive stuff, which is trivial beyond trivial for an AI.

This statement is wrong. SC1 has admittedly less tech-tree depth & strategic approaches than SC2 does (purely because of lower number of different units/upgrades), but there are innumerable variations that are imperceivable to the lay observer.

I'll go out on a limb and say that SC1 has a more refined rock-paper-scissors system than SC2 ever had (taboo to speak of on reddit).

A lot of the balance of SC2 will immediately be destroyed by a semi-competent AI. One of the races has especially powerful early tactics that are primarily limited by a human's inability to multitask (reapers, medivacs, and liberators). I honestly don't think it will be possible for a human to beat an AI that just focuses on those strategies.

I agree that SC2 is much easier for a human player. However, top players still have to do quite a bit of micromanagement.

I thought this was already happening. Right after AlphaGo beat Lee, I remember hearing about it. Did they give up on having their AI play SC2? I wondered whether it would work, since AlphaGo seemed to take its turns in Go at the same speed as a normal player, presumably computing the most likely winning move each turn and the late-game implications of those moves. If it tried that in a fast-paced game, how would it deal with the speed? It obviously would need to develop a set of pre-baked strategies that win it the game. Would it play the same build every round, or would it realize that changing things up each match wins it more games?

There's something funny about a company that is actively developing bleeding edge AI technology, but who can't design a webpage that works on mobile without crashing.

Just goes to show how complicated web tech is, even ai researchers can't get it right!

When I used to play a lot of StarCraft, and then later with Total Annihilation, I wished for the ability to customize the AI.

So then BWAPI came along ... and ... AI is hard. The best SCBW bots are still pretty pathetic compared to a human player, never mind an expert human player.

I'd be really interested in how differently tiered data sets (ladder rank) would work as sources for teaching.

Is it possible that training on diamond players is less effective than training on, say, silver? Is that actually even an interesting thing to look at?

Any predictions for how long it will take for an AI to win against the world's best player?

Awhile. This just isn't like Go or Chess. The gap from perfect information to imperfect information is quite a chasm, and from turn-based to real-time is even more vast.

I play Age of Empires 2 semi-competitively, and I just can't imagine the research progress that would have to be made for a pro to lose to an APM-limited AI agent. So much of the game comes down to intuiting what your opponent is planning without being able to see what they're doing, and more importantly intuiting what your opponent isn't ready for.

The biggest difference, though, is the "RT" in "RTS"-- real time. This isn't turn-based anymore, where at a given moment you have a single choice to make, a single piece to move as in Chess and Go, and can then wait for the singular and visible reaction your opponent makes before making your next choice.

My understanding is that the moves a program like AlphaGo makes are not interconnected-- it picks each move individually as an ideal move for that board state. It could take over halfway through the game for someone else and would make the same move that it would have made at that point if it had been in control the whole time and arrived at that board state on its own.

But that doesn't work in a real-time game, since you and your opponent are now moving simultaneously and the "board" is never static. Your moves must be cohesive and planned and flow continuously without time to ponder, each connected to the last. There is no "one" move for a given state.

Another facet of real-time play is the idea of distraction. It's very important in RTS's to keep your opponent distracted, to disrupt their plans and their focus, by coming from unexpected directions at unexpected times, sometimes concurrently with other operations against them. This can't happen in Chess or Go, where the demands on your focus are far less urgent and two things can't happen at once in a literal sense. Can an AI agent learn to appreciate the power of distraction? Can it learn to intuit what will be most disruptive to a human, and what won't be disruptive at all? How can you teach a computer to learn to be annoying?

I will say, of course, that nobody saw AlphaGo coming. And I hope it's the same with RTS's. That would be so exciting. I would love to see an AI blow us away with previously unthought-of strategies. That would be the coolest thing ever. So I hope it happens. But I'd be astonished. RTS is just such a whole new level of thinking for AIs.

At pro level, much of the game is about what information you can gain, and about choosing what to show and, more importantly, what you don't show (hide) and acting on non-triggers.

An example of a non-trigger is knowing that if I haven't seen a certain unit at time X, I know I'm safe to do Y. It is acting upon the information that something didn't happen.

To expand: I saw my opponent starting two gases at my 21 supply scout. When I scouted again at 47 supply, I saw no gas heavy units, so I can deduce the gas was used for better technology. This will allow me the opportunity to increase my worker count by Z before building army, or I could try and kill my opponent right there for his technological greed.
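A toy way to formalize acting on a non-trigger is a Bayes update on the absence of a sighting. This is only a sketch: the two build labels and every probability below are made-up numbers for illustration, not measured game data.

```python
def update_beliefs(prior, p_no_sighting):
    """Bayes update after NOT seeing gas-heavy units at the second scout.

    prior:         P(build) before the scout
    p_no_sighting: P(no gas-heavy units seen | build)
    """
    unnormalized = {b: prior[b] * p_no_sighting[b] for b in prior}
    total = sum(unnormalized.values())
    return {b: v / total for b, v in unnormalized.items()}


# Hypothetical numbers: two gases at 21 supply could mean either build.
prior = {"gas_units": 0.5, "tech": 0.5}
# If the opponent were massing gas-heavy units, a scout would rarely miss them.
p_no_sighting = {"gas_units": 0.1, "tech": 0.9}

posterior = update_beliefs(prior, p_no_sighting)
# posterior["tech"] is about 0.9: the missing units are themselves evidence.
```

The point is that "nothing showed up" shifts the belief just as much as a positive sighting would, provided the scout would probably have seen the units if they existed.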

> intuiting what your opponent isn't ready for

I haven't played AOE2 so I don't know if the mechanics are similar enough to translate, but my goal for my Starcraft bot is to do precisely this. If you can enumerate the possible builds (what's available when) and assess the matchups between builds, you can make this happen using some intuitive expansions on adversarial search.
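As a rough illustration of the idea (not the actual bot), the simplest version is a one-shot maximin over a build-versus-build table: pick the build whose worst matchup is least bad, then shrink the enemy's candidate list as scouting rules builds out. The build names and win probabilities here are invented for the example; real adversarial search would also look ahead over time.

```python
def best_build(win_prob, my_builds, enemy_builds):
    """Pick the build whose worst-case matchup is best (pure-strategy maximin)."""
    return max(
        my_builds,
        key=lambda mine: min(win_prob[(mine, theirs)] for theirs in enemy_builds),
    )


# Hypothetical estimates of P(I win | my build, enemy build).
win_prob = {
    ("rush", "rush"): 0.5,   ("rush", "expand"): 0.7,
    ("expand", "rush"): 0.3, ("expand", "expand"): 0.5,
    ("tech", "rush"): 0.2,   ("tech", "expand"): 0.8,
}
my_builds = ["rush", "expand", "tech"]

# Before scouting, every enemy build is still possible.
blind_choice = best_build(win_prob, my_builds, ["rush", "expand"])  # "rush"
# After scouting rules out an enemy rush, a greedier build becomes safe.
scouted_choice = best_build(win_prob, my_builds, ["expand"])        # "tech"
```

Ruling out builds the opponent can no longer reach is where the "what they aren't ready for" part comes from: the narrower the enemy list, the greedier the safe choice.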

> Your moves must be cohesive and planned and flow continuously without time to ponder, each connected to the last.

Recomputing the entire plan from the current state works in RTS too, but only if your decision-making takes every already-in-motion thing into account and has no internal discrepancies. That's a pretty big if; this sort of weakness accounts for a lot of bot weakness currently. Units spinning around due to slight changes in perceived state cause lots of wasted resources.
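One common way to damp that spinning is hysteresis: keep the current decision unless an alternative is better by a clear margin. A minimal sketch follows, where `choose_target`, the scores, and the 0.15 margin are all assumptions for illustration rather than anything taken from an existing bot.

```python
def choose_target(current, scores, switch_margin=0.15):
    """Keep the current target unless another one scores clearly better.

    current:       the target chosen on the previous frame (or None)
    scores:        mapping of candidate target -> estimated value this frame
    switch_margin: how much better a rival must be before we switch
    """
    best = max(scores, key=scores.get)
    if current in scores and scores[best] - scores[current] < switch_margin:
        return current  # the difference is within noise, so stay committed
    return best


# A small wobble in perceived state does not flip the decision...
steady = choose_target("A", {"A": 0.50, "B": 0.55})    # stays on "A"
# ...but a clearly better option does.
switched = choose_target("A", {"A": 0.50, "B": 0.80})  # moves to "B"
```

The trade-off is responsiveness: too large a margin and the bot sticks with stale plans, too small and the units go back to spinning.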

> Can an AI agent learn to appreciate the power of distraction?

Despite multitasking theoretically being one of the strengths of an AI, a lot of the current field can't handle more than one military situation at a time. In this year's SSCAIT a lot of bots completely fell apart when confronted with one of the top bots (Bereaver) doing reaver drops.

I'm not sure a bot can meaningfully learn distraction, but I'm not sure it's necessary - attacking on simultaneous fronts is optimal anyway. The army can only be so many places at once.

On the other hand, bots are starting to beat professionals at (thousands of hands of repeated) Poker, so I think we can't say that imperfect information is something that's especially intractable for machine learning algorithms.


Yeah but compare the search space of Poker vs Starcraft.

Exactly. And poker is, again, turn-based with a single move to be made. And it may not be "perfect information," but it again is a game of static board state where there is probably a statistically optimal move for a given state provided you have memory of the other players' previous moves so far in a round.

This wasn't an issue for Bridge either.

As soon as you can build a probability distribution over possible states, you can use Monte Carlo like methods.
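A bare-bones sketch of that idea: sample hidden states from the belief distribution, score each action in every sampled state, and pick the action with the best average. The toy payoffs and the "rushing" hidden state are invented for illustration; a real agent would run a rollout or a learned value function where `simulate` just does a table lookup.

```python
import random


def monte_carlo_action(actions, sample_state, simulate, n_samples=200, rng=None):
    """Pick the action with the highest average value over sampled hidden states."""
    rng = rng or random.Random(0)
    totals = {a: 0.0 for a in actions}
    for _ in range(n_samples):
        state = sample_state(rng)          # one guess at what we cannot see
        for a in actions:
            totals[a] += simulate(state, a, rng)
    return max(actions, key=lambda a: totals[a])


def make_sampler(p_rush):
    """Belief: the opponent is rushing with probability p_rush (hypothetical)."""
    return lambda rng: "rushing" if rng.random() < p_rush else "not_rushing"


# Hypothetical payoffs for (hidden state, my action).
PAYOFF = {
    ("rushing", "defend"): 1.0,     ("rushing", "expand"): 0.0,
    ("not_rushing", "defend"): 0.4, ("not_rushing", "expand"): 1.0,
}


def simulate(state, action, rng):
    """Stand-in for a rollout: just look up the toy payoff."""
    return PAYOFF[(state, action)]


choice = monte_carlo_action(["defend", "expand"], make_sampler(0.2), simulate, n_samples=500)
```

With a belief that a rush is unlikely, the sampled average favors expanding; raise `p_rush` and the same code starts choosing to defend.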

Speaking of AoE 2, I really hope this research ends up benefiting that game. The AI in it has always been so bad but it's still my favorite game of all time.

Microsoft hired one of the best scripters (Promi) on the AoE2 AI circuit to write the official AI for AoE2 HD, and it's quite good, although the rules engine doesn't have enough features to beat early harassment. If you let it boom it's pretty scary for new players.

If you play using the UserPatch on Voobly, where the serious custom AIs are written, you can play vs. Barbarian. It is _very_ good, if you're not a semi-competitive player it will certainly beat you.

The upcoming UserPatch 1.5 will add even more features, so Barbarian and other custom AIs will become stronger.

To be clear, AoE2 AIs are all rules-based and written by pro players themselves, which is quite different from what DeepMind is trying to do.

Have you played against the updated AI in Aoe2 HD?

It is hard but my impression is it's because it cheats. I'm more looking for interesting AI (creativity, adaptation) rather than raw difficulty due to its ability to spam military.

Churchill, of the SC AI competition, guesses it might take as much as 5 years: https://www.wired.com/story/googles-ai-declares-galactic-war...

Ooh, this'll be interesting to see, as with AlphaGo, a lot of people disputed the "experts believed it would take ~10 more years" claim retrospectively.

With SC2, no AI even comes close to beating even a silver level player, so even a 5 year timeline seems really soon. Let's see if DeepMind can beat it!

What's your totally unscientific guess, Gwern?

I think it is doable in under 5 years, but this critically depends on the resources invested by DM and other DL orgs. Deep RL is hugely demanding of computational resources to iterate your designs - for example, the first AlphaGo took something like 3 GPU-years to train it once (2 or 3 months parallelized); however, with much more iteration, DM was able to get Master's from-scratch training down to under 1 month. Now an AG researcher can iterate rapidly with small-scale hobbyist or researcher resources, but if they had had to do it all themselves, Ke Jie would still be waiting for a worthy adversary... When I look at all the recent deep RL research ( https://www.reddit.com/r/reinforcementlearning/ ) I definitely feel that we can't be far from an architecture which could solve SC2, but I don't know if anyone is going to invest the team+GPUs to do it within that timeframe. (It might not even be as complex as people think: some well-tuned mix of imitation learning on those 500k+ human games, self-play, residual RNNs for memory/POMDP-solving, and use of recent work on planning over high-level environment modeling*, might well be enough.)

* "Learning model-based planning from scratch" https://arxiv.org/abs/1707.06170 , Pascanu et al 2017; "Imagination-Augmented Agents for Deep Reinforcement Learning" https://arxiv.org/abs/1707.06203 , Weber et al 2017 (blog: https://deepmind.com/blog/agents-imagine-and-plan/ "Agents that imagine and plan"); "Path Integral Networks: End-to-End Differentiable Optimal Control" https://arxiv.org/abs/1706.09597 , Okada et al 2017; "Value Prediction Network" https://arxiv.org/abs/1707.03497 , Oh et al 2017; "Prediction and Control with Temporal Segment Models" https://arxiv.org/abs/1703.04070 , Mishra et al 2017

> (It might not even be as complex as people think ...

Yeah, I suspect you're right. Eliezer was alluding to this with the AlphaGo victory as well:

> ... Human neural intelligence is not that complicated and current algorithms are touching on keystone, foundational aspects of it. https://www.facebook.com/yudkowsky/posts/10153914357214228?p...

I can't decide if I would be bummed or excited if that turns out to be the case. On the one hand, we'd be that much closer to AGI. On the other, we'd be continuing down the path of brute-forcing intelligence, rather than depending on those elegant, serendipitous breakthroughs that much of human progress has been built on.

> rather than depending on those elegant, serendipitous breakthroughs that much of human progress has been built on

That's brute forcing as well. One such elegant idea comes every million(billion?) people. Random people would just output random ideas.

Yeah! I mentioned the same sentiment in this 2012 post when it was becoming clear that computers were reaching human strength at Go via brute force: http://blog.printf.net/articles/2012/02/23/computers-are-ver...

AI can already win against the world's best player now. It's just a question of whether they can win with human-level micro.

Do you have a citation? As far as I know, existing AIs do not stand much of a chance against humans [1]. My understanding is that even if AIs can beat humans in small skirmishes due to superior micro, they lose so badly at macro that it simply doesn't matter.

[1] http://spectrum.ieee.org/automaton/robotics/artificial-intel...

Yeah, you're right. Not sure what the guy is talking about.

> the guy is talking about

I don't want to be overly semantic or PC on HN, but just saying the GP may be female, judging by their name on profile. Being misgendered could be very offputting and discourages participation, so you may have wanted to say "this person" even if it doesn't sound as offhand as you'd have liked it to come across.

In the NorthEast US, guy is often a gender neutral term. It's totally fine to say, "Hey you guys" to a group of only women.

Personal experience tells me it's a pretty common thing across the US.

> I don't want to be overly semantic or PC on HN

Proceeds to be overly semantic or PC on HN...

Calling a group of women "you guys" is different from calling a person of unknown gender "guy" because in the former case the gender-neutral usage is the only possible interpretation, whereas in the latter it might either be gender-neutral usage or an assumption of the commenter's gender. Given that this ambiguity is nearly always a distraction from the primary purpose of the conversation (as it is here), rightfully or not, pragmatically it is better to use terms that steer clear of it. See Bryan Garner's Modern English Usage for fuller advice on safely navigating this linguistic and sociocultural terrain.

I am so happy to be European... you guys in America always take everything to the total extreme.

How are you so sure the commenter is a person? ;)

It's not exactly macro. AI can run perfect macro commands. Humans right now have a better priority model (in this case, protecting/killing workers), for which the AI in question wasn't trained.

Are you sure you're interpreting 'macro' correctly? 'Macro' in the context of RTS games isn't the same as 'macro commands' in scripting, but refers to basically all high level decisions. RTS is divided up into Macro, tactics and micro:

- Macro: resource prioritization (army vs expansions vs upgrades), scouting to understand the opponent's macro choices, choosing the right posture in response (defensive, harassment or offensive), and resource optimization (not getting supply blocked, scaling production with income, increasing income at the maximum possible rate, removing bottlenecks, etc), scouting the enemy army composition to prepare the ideal counter army composition

- Tactics: Grand army decisions - flanks, baits, sneak attacks, hiding composition, multiprong attacks, positioning of siege units, timing attacks, knowing when to retreat (hit & run), scouting to gain advance notice of your opponent's tactics

- Micro: optimizing unit lifespan and effectiveness within an isolated skirmish for the given goal (usually to 'win' the engagement) -- pulling back weakened units to avoid aggro while it still deals damage, healing, surface area for melee units, trapping enemy units with terrain or skills, optimizing spellcaster energy usage, prioritizing targets based on multiple parameters (range, damage, cost, count, follow-ups), etc.

AI can "run" macro well, but they are poor at the macro decision-making part, which includes priority model as you mention (the responsive posture choice above and others). Up to low grandmaster tier, being significantly better at macro than your opponent while close in tactics and micro is usually enough to win consistently. It is the most impactful part of an RTS (and is where most of the 'S' lies).

Your definition of macro is a lot more encompassing than how I see it used in SC2. It normally refers to just spending resources properly (constantly producing units and workers, setting up infrastructure, etc).

Yeah, interesting! I agree I don't usually hear macro used that way.

But if someone was microing a battle and as a result didn't look at the minimap and see a drop arriving at their base, I can totally see concluding that this player is bad at macro as a result -- macro is referring to there being a macro cycle of tasks you have to perform all the time whether you want to or not, and non-production tasks like checking the minimap and sending in a scout seem like good examples of those tasks too.

Exactly! Macro is everything "big picture" -- being aware of the overall state of the game, managing all resources involved (including attention, APM, time, mindspace), and grand decision-making ("I'm up against X overall strategy. How should I respond? What are the weaknesses in my approach? What should I be looking for that would exploit those weaknesses? How can I address those situations?")

Execution of the answers to those thoughts is in the form of tactics and micro, the other two aspects of the game. If you're following current SC2 pro meta, however, a "strong macro player" has more right answers to most of those questions (Stats, Innovation) and that's their strength, versus a "strong tactical player" (TY, sOs) or a "strong micro-based player" (ByuN, herO).

I hear you. Usually the bottleneck of macro-effectiveness is the part you mention, the decision-making parts can be simplistic in mid-to-high levels and still plenty effective. So "improving your macro" usually refers to remembering to build workers and not getting supply blocked. Getting supply blocked or not hitting production cycles is more likely to get you behind in StarCraft.

That said, in high-level play, "better macro" usually refers to the other things, not just mechanically hitting stride with production, as most players in the top .01% are on the same level with those mechanics.

There isn't any SC2 AI that's anywhere near as good as a player in an actual game. Sure, they can demonstrate some sick unit control, but that alone isn't enough to win games.

Cool Starcraft related username!

> even strong baseline agents, such as A3C, cannot win a single game against even the easiest built-in AI.

Then, why not release code for the built in ai, and improve on it ? Or is the built in ai cheating ?

The goal is to "grow" an AI through reinforcement learning and other techniques that are broadly applicable outside Starcraft. The existing rules-based systems in the game A) are extremely Starcraft specific, with methods that have comparatively limited utility outside video game AI, and B) are unlikely to scale to fairly beating a professional human (who introduces too much variability to capture reliably in rules).

The built in ai is scripted whereas they’re trying to teach agents to learn the game from scratch with some sort of reward-based/machine learning approach.

Still, why not use machine learning to tweak the script instead of starting from scratch?

Someone needs to link this to FB's ELF platform (An End-To-End, Lightweight and Flexible Platform for Game Research). That was specifically made for RTS games like SC.

Great that they opened it up. I'm sure reinforcement learning / deep learning will solve this. It has been a tough problem before, but honestly it doesn't seem that tough compared to all the harder AI problems.

Such as?

This gives me great ideas

I think I know what my final year project will be.


--why are there not more fanboy comments?!

Probably because they don't contribute much to the conversation.

"so agents must interact with the game within limits of human dexterity in terms of “Actions Per Minute”."

I am really glad they are limiting APM because otherwise things just get stupid.

OTOH, altering the source to remove any human dexterity limits and watching subsequent AI vs AI battles play out at 60fps would be really fun to spectate ;)

For SC1 BW, you can already watch https://www.youtube.com/user/certicky for a weekly highlights broadcast / commentary of AIs. Right now there is an AI arms race where the previously dominant AI (Iron Bot) is being beaten by challengers. It used to happen when Iron Bot did something silly. Now it is happening because the competitors have really stepped up their game.

The current board leader (krasi0) has a strategy similar to their predecessor (Terran Mechanical units: Tanks, Goliaths and Vultures). The alternative strategy I really enjoy watching is a Mutalisk heavy build by Arrakhammer.

The bots have styles and differing capabilities. tscmoo is one of the more fascinating ones to watch in this regard, as they mix it up better than anyone else (and tscmoop, the Protoss variation has the best High Templar storm going).

The AI APMs get into the 10k ranges at times. Watching the minimap can be like watching insects swarm.

If you want to see a game between two AIs in progress, you can watch: https://www.twitch.tv/sscait

While these AIs can pull off tactics a human player could not (what they can do with Vultures is incredible), at this point they wouldn't be able to compete with the professionals. Going back to an earlier example, I think I could use High Templars more effectively than any AI I've watched.

This surprises me - I would have expected the computer mechanics to be much better than what a human can achieve. Do you have insight into why that isn't true?

AIs can click faster, but humans still have better overall strategy and planning, which matters a lot in this game.

I'm trying to come up with a simple and satisfying explanation, but the best I can do is "Starcraft is a complex game of balancing your ability to attack and defend. AIs have difficulty with situational awareness."

But I think that case analysis of what I've observed may be more telling.

Case 1, The AIs tend to over-react, or under-react. For example, when a Zerg player sees flying units they may start to go scourge heavy. If a human player notices this, they may build a Wraith or two, causing the Zerg player to waste a lot of money. This can happen naturally between AIs as well. Terrans depleting the Command Center's invisibility sweep too soon is another one... Something is getting hurt, but you may want to wait until after the Tanks are in siege mode before attacking (not while they are converting).

Case 2, Lack of Memory. This one happens a lot. You'll see an AI do something bad. Then, in a situation where circumstances obviously wouldn't have improved, it tries again a few seconds later. I'd want to blame the fog of war, but futile attacks on static towers are a common example.

Case 3, Fight or Flight. Sometimes you'll see units fleeing from a battle they cannot win. Sounds good. But sometimes they are being pursued by units that can pick them off during retreat. And they aren't fleeing for more support, they are just avoiding a bad situation. When in reality, those units are dead no matter what, might as well try to take down an enemy unit or two. The inverse can happen too, where units stand and fight in a situation they could run away from and get reinforcements. I've heard predicting combat outcomes is really difficult in SC.

Case 4, Under utilization. Vultures are the unit that stands out in my mind as one that an AI can handle better than a human. They are fast, have a great punch, and can deploy mines really effectively (some bots mine much of the map). What was a harassment unit becomes an offensive unit that can hold its own in a "fire/flee/repeat" pattern (imagine having 5 of them do that in the same area). High Templar are the opposite. I tend to expect AIs to do poorly when storming; it is currently a highlight/joy to see them utilize the ability effectively. But the ability is meant to discourage enemies from grouping units too closely together. Poor storming allows Carriers and Mutas to do a lot more clumping / damage than they would otherwise be able to do. Under current circumstances, it would be possible to see a dozen or more Mutas being crippled in a single storm. I rarely see nukes/Ghosts or Defilers used. Queens also seem to be under utilized.

Case 5, Target Prioritization. When you see a Carrier, kill it rather than the Interceptors. When you see a Medic or two, kill them before the Marines. Same with SCVs repairing in some cases (with sufficient firepower, the unit will be dead faster than it can be healed). One AI loves Carriers, and part of the reason their strategy works is that a lot of units go after Interceptors rather than Carriers, allowing the Carriers to retreat (the AI judges when to do this well) and rebuild.

It isn't that AIs can't play at a professional level, but this represents the current level of the bots. Looking over it... The AIs can pull off things we can't, but professional level situational awareness / judgement is tough.

Positioning matters in basically every decision in Starcraft and humans are incredibly good at precisely that.

IMO there should also be a precision limit. The timing of actions should include human-typical jitter, and the wrong action should sometimes be activated to simulate misclicks/fat-finger keypresses, e.g., messing up a control group by assigning a unit to the wrong number key. The bot must also not be able to act faster than human reaction times (~250ms); this could be enforced by adding a fixed delay to the observations.
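A minimal sketch of what such a handicap layer might look like, assuming the agent sees one observation per fixed step. The class name `HumanLikeInterface` and its parameters are hypothetical, not part of any released API. It delays observations by a fixed number of steps and occasionally swaps the intended action for a random one; timing jitter is left out for brevity.

```python
import random
from collections import deque


class HumanLikeInterface:
    """Hypothetical wrapper: delays observations and perturbs actions."""

    def __init__(self, delay_steps, misclick_prob, num_actions, seed=0):
        self.delay_steps = delay_steps      # how many steps old each observation is
        self.misclick_prob = misclick_prob  # chance that an action is replaced
        self.num_actions = num_actions      # size of the (assumed discrete) action set
        self.rng = random.Random(seed)
        self.buffer = deque()

    def observe(self, fresh_observation):
        """Return the observation from delay_steps ago (None until the buffer fills)."""
        self.buffer.append(fresh_observation)
        if len(self.buffer) <= self.delay_steps:
            return None
        return self.buffer.popleft()

    def act(self, intended_action):
        """Occasionally swap the intended action for a random one (a 'misclick')."""
        if self.rng.random() < self.misclick_prob:
            return self.rng.randrange(self.num_actions)
        return intended_action
```

If the agent observed every 50 ms, `delay_steps=5` would approximate the ~250ms reaction time mentioned above.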

I wouldn't be surprised if human Starcraft II play isn't so much limited by decision-making as by the translation of decisions into mechanical actions, which in turn dilutes the attention devoted to actual decision making.

Since existing bots are far from being competitive with human players why further handicap them in ways that deal with an entirely different domain?

Yea, agreed. Specifically they should add a loss function that compares the AI's action stream against a pro human action stream, and attempt to minimize that loss.
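One way to read "compare the action streams" is as an imitation loss: the average negative log-likelihood of the human's actions under the agent's policy. This is only a sketch under that assumption; the function name and the dict-based policy representation are made up for illustration.

```python
import math


def imitation_loss(policy_probs, human_actions):
    """Mean negative log-likelihood of the human's actions under the policy.

    policy_probs: list of dicts mapping action -> probability, one per step.
    human_actions: list of the actions the human actually took at those steps.
    """
    eps = 1e-12  # avoids log(0) when the policy assigns zero probability
    total = 0.0
    for probs, action in zip(policy_probs, human_actions):
        total += -math.log(probs.get(action, 0.0) + eps)
    return total / len(human_actions)
```

The loss is near zero when the policy puts all its probability on what the pro did, and grows as the two diverge.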

Right -- likewise it would be nice to have it mimic human cognition limits, like time lag for loading a new info source into memory.

ideally they'd train it on real keypresses rather than actions

Why would that be ideal? Wouldn't that just make ML at the strategy layer harder without doing anything to make the discoveries more valuable?

to emulate human handicaps at the interface layer. I didn't say it would be free

But why is that desirable? Why would we want to emulate the human physical handicaps in our quest to advance AI at a strategy level?

For the same reason the APM are limited: to ensure that what we are doing is really focusing on advancing strategy rather than brute mechanical skill. If I played against an AI using nothing but the rendered frames and sound of a game as input, I might not even make the stipulation on reflexes. I'd be humbled if I lost.

As it stands now, most of the games I like have bad AI. Sure, it can be fun to play a hack and slash against lots of little, dumb minions, but FPS, RTS AI these days still don't cut it as savvy opponents. Often they have inhuman perception, direct knowledge of game state, or higher starting resources, but they make abysmal decisions.

Yes, I realize these are unlikely, expensive goals and incremental progress is how things are done. I just want to know if it's possible or desirable to emulate actual human reaction time.

Do you disagree this would in principle help separate strategy from godlike reflexes?

This is not the same AI you normally face in a game. Most (all?) of those AI opponents use rules written by the game developers to make decisions and some of them simply cheat to be competitive (cough Mario Kart 64 cough).

This blog is about creating AIs that interact with the game the same way humans do, the computer plays by the same rules and has no special access to the game state beyond what the player would have. With these constraints there are no existing bots for StarCraft or StarCraft 2 that can even beat the built-in rule-based AI. They aren't even close to beating professional players.

If the strategy abilities are so weak today that we can't even beat the tutorial AI then why introduce further arbitrary handicaps on the bots? How do those handicaps advance the state of the strategy layer? The AI has many potential advantages over the human player beyond just reaction time. Should we also limit the amount of data the bot considers to emulate the amount of inputs a human player can process? What about emulating human memory, can a human really learn from 60,000+ games? What about 1.5 million?

I do not think it is desirable to emulate human limitations in AI unless you are trying to create an artificial human. I think the advantage of creating an AI is to do something people can't already do so why should we impose our physical constraints on them?

I do not think it is important to separate reflex from strategy. Since every player has a different APM ability some strategies are more valid than others for each individual. If I do not have the reflexes of a professional player there are strategies I cannot employ. As long as StarCraft is not imposing APM limits on human players to maintain competitiveness the bots should also not have a limit.

Okay, thanks. I appreciate the counterpoint. I guess I'd like to see it both ways: bots limited to human speed and bots not.

Why not first allow the AI to have unlimited APM and beat humans, then restrict it later? Because I don't think we're even close to the easier problem.

In what universe is taking 10,000 actions per minute an easier problem for a neural net than 100 such actions?

"In what universe is taking 10,000 actions per minute an easier problem for a neural net than 100 such actions?"

StarCraft is precisely such a universe. If you could micromanage units perfectly, you can do some amazing tricks. Here's an example of what I'm talking about: https://www.youtube.com/watch?v=IKVFZ28ybQs

That's a rules engine, designed to do basically one specific thing. It was told how to micro. In that context APM is a meaningless constraint. May as well ask how many times you can print a message in a for-loop per second. yes...quite a lot, and the computer is unfazed by the workload.

This is a different type of bot we're talking about here. A neural net could not learn to work with unlimited apm more easily than limited apm. That just doesn't make sense. That's like saying it's easier to compute 1000 hashes in a second than it is to compute one hash in a second.

I'm imagining an AI war where the next advancement is the micro of the siege engines to optimise targets and timing of shots to hit large groups after the initial splash avoidance. AI on both sides keep trying to maintain a one-step-ahead strategy which minimises/maximises casualties based on predicting the exact shooting/dodging strategy of the opponents. Will be interesting to follow developments in this area!

Have you played SC2 before? It's hard to explain if you haven't.

Yes, I have.

Philosophically, I wonder if it's better to just acknowledge the differences between AI and humans, and let them play to their strengths. It seems common to think that we need to constrain AI in certain ways to be more like humans, but the constraints are always artificial. APM is one constraint, but what about working memory? What about multithreading? We already allow the AI access to computational resources humans don't have. Why draw that line at APM, exactly?

I don't think they're constraining APM to make the AI more human-like. It seems more like they're controlling a variable, so that if the AI wins, they know it wasn't just because the AI could out-click a human.

Would not actions per second be a better limit tho?


Speed is in MPH or KPH, would it be better to go by m/s?

The convention in gaming is APM, so they're just using the nomenclature that is already understood.

Maybe what usaphp is getting at is that the AI could still gain an advantage by doing a set of actions much faster than humanly possible in just a fraction of a second as long as it kept its total number of actions that minute below the cap.

Exactly. Most actions in idle game mode (i.e. when there's no active battle, micro or macro to be done) are null, the gamers simply repeat random meaningless keystrokes; this helps with being always alert and ready for anything that requires a rapid response.

An AI doesn't need to do this.

Source: Ex pro Starcraft (Brood War) gamer.

> this helps with being always alert and ready for anything that requires a rapid response.

I always wondered why they do that, just spam meaningless things like open/close the stats window or click on the background. Some WoW PVPers did it too (like Laintime, I think it was - played a warrior and ran around spamming the character window open/closed and mashing 'weapon swap' even when there was no earthly reason to do so). I figured it was just the result of too much caffeine for 18 hours a day.

This sounds like it's more an equivalent of tennis players' dancing/hopping around, or martial artists doing their ducking/weaving thing?

There were no BW pro-gamers outside of Korea.

A 65MPH speed limit doesn't mean I can go 100MPH for some section of the road, and 30MPH for another section and still be in compliance.

Similarly, an APM limit wouldn't directly imply the kind of "gaming" that you're talking about, where a user/agent simply has to have an average APM over a certain period to be in compliance.

usaphp's point seems to be that the period should be less than one minute. When the police measure your speed with radar they are not waiting an hour to average your speed. The same idea applies here.

Yep, I entirely understand the point. Consider that when the police measure your speed, they're still measuring in MPH. Just because the units contains a specific time-frame (hours), doesn't mean the measurement is made over an hour.

Similarly, just because the units for APM contain a specific time-frame (minutes) doesn't mean the measurement must be made over a minute. A 150APM limit doesn't necessarily mean that the running average over a minute must stay below 150 actions, any more than a 65MPH speed limit means that the running average over an hour must stay below 65MPH. If a police officer catches you going faster than 65MPH even for a single second (or however long a radar gun takes), they can pull you over.

The units of a measurement do not dictate how the measurement is made.
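To make that concrete, here is a small sketch (the function name `peak_apm` is made up for illustration) that reports the same action stream in actions per minute while measuring it over windows of different lengths:

```python
def peak_apm(timestamps, window_s):
    """Highest action rate, in actions per minute, seen in any window of window_s seconds.

    timestamps: sorted action times in seconds.
    """
    best = 0
    start = 0
    for end in range(len(timestamps)):
        # Shrink the window until it spans less than window_s seconds.
        while timestamps[end] - timestamps[start] >= window_s:
            start += 1
        best = max(best, end - start + 1)
    return best * 60.0 / window_s


# 150 actions crammed into the first second, then nothing for the rest of the minute.
burst = [i / 150.0 for i in range(150)]
print(peak_apm(burst, 60.0))  # 150.0
print(peak_apm(burst, 1.0))   # 9000.0
```

Both numbers are in APM; only the measurement window differs.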

You say you understand the point but you seem to have entirely missed it, the same problem remains.

The point of the APM limit is presumably to fairly emulate a human player, but the APMs for humans are obviously averaged over a minute.

However, for certain things humans can initiate actions in SC2 in quick succession that, if sustained over a greater time period, would yield a ridiculous APM rate. Think a Terran player highlighting the barracks and tapping "A" really fast, say with a 10ms delay; sustained, that would work out to 6,000 APM.

Let's say a human can build 5 marines like that in quick succession. You're going to have to allow for temporary spikes in APM to not unfairly give the human player an advantage.

But if you do that the computer is able to really rapidly execute more complex actions that the human can't because he's limited by the SC2 UI, whereas every action via the API is equally difficult for the computer. E.g. moving 3 different subgroups of marines out of the way of a High Templar Storm. I doubt any human player could select 3 different subgroups of marines from one big group and move them out of the way in 3 different directions within the span of 60ms (10ms for each action of select/move for 3x groups).

So any "fair" APM limit really needs to be dynamic in some way, or at least take into account the complexity of actions (say highlighting a group v.s. tapping "A"). It's not at all obvious how to do this while retaining fairness and not giving either the human or the computer an unfair advantage.
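One possible shape for such a dynamic limit is a token bucket: a sustained rate plus a small burst allowance, with more complex actions costing more tokens. This is only a sketch of the idea, not how any actual limit is implemented; the class name, the `burst` size, and the per-action `cost` are all hypothetical, and picking fair values for them is exactly the hard part.

```python
class BurstyActionLimiter:
    """Token bucket: sustained rate of `apm` actions per minute, bursts up to `burst` actions."""

    def __init__(self, apm, burst):
        self.rate = apm / 60.0       # tokens refilled per second
        self.capacity = float(burst)  # maximum tokens that can be saved up
        self.tokens = float(burst)
        self.last_time = 0.0

    def allow(self, now, cost=1.0):
        """Return True if an action of the given cost may be issued at time `now` (seconds)."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill for the time that has passed, never exceeding the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `apm=180, burst=5`, five quick taps go through at once, a sixth is refused, and a higher `cost` for something like a select-and-move would drain the bucket faster than tapping "A".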

> the APMs for humans are obviously averaged over a minute

This is wrong. I don't know the exact number, but APM is averaged over around a second. I suspect it is reported per minute because APM is more meaningful than APS, for humans at least.

Here is a graph from Scelight that highlights this: https://goo.gl/photos/9cjNxDwWoB1pmWkg9

You still end up with a computer that can perform 3 actions all within the first millisecond of a second and still end up with human-like 180 APM (3 APS), even though no human could replicate what it just did.

How is that any different than someone who goes 100MPH and then 20MPH and claims to have not violated a 65MPH rule because the average speed was 60MPH for the trip?

The units of the measurement do not dictate how the measurement is made.

Because when the cops measure your speed they can do so in an arbitrarily small time period of their choosing, and inertia and power requirements mean you can't be going 1000 MPH one millisecond and 10 MPH the next.

You could decide to measure APM by saying that the time difference between any two actions extrapolated to sustaining that rate over a minute or seconds couldn't exceed the APM or APS, but as I've explained such a measurement would unfairly give the human player an advantage because humans are capable of bursts they couldn't sustain over longer periods.

APM isn't averaged over a minute time frame. I don't know the exact calculations they use (and it has been changed a couple of times) but roughly, if you go above 5 actions per second your APM will be shown as 300.

I believe the reason they use APM over APS is because 3.4 APS isn't as meaningful as 204 APM.

Well, it means the AIs could exploit it a bit by issuing hundreds of actions in the first millisecond and then waiting 59.999 seconds. I'm not sure how much of an advantage that would be though.

Which is still about 350 actions per minute for a professional SC2 player, roughly 5 to 6 actions per second.

Currently, though, APM isn't a real advantage for AIs. They're still too stupid.

I would like to see IBM Watson dominate Jeopardy without a superhuman finger.

Watson’s hand looks like a clear, Plexiglas, cylindrical soda can with a few metal screws in the top and a wire extending from the bottom that is connected to Watson’s Front-End Controller. The mechanical hand wraps around the Jeopardy! buzzer which is inserted in the bottom of the Plexiglas cylinder and is held in place by a clamp. Watson’s hand uses a solenoid to physically press the same button that the humans must press.

Watson’s hand is pretty fast in terms of raw speed — it takes somewhere between five and ten milliseconds for Watson to activate the buzzer once it decides to answer. This delay is affected by the speed of the solenoid and other small, sometimes hard-to-pin-down delays inherent in the software stack.


I wonder if the human decision-to-buzzer time is much slower?

Typical human response time to a stimulus is about 200ms. That of course doesn't factor in the decision time, but once the brain decides on an action it takes about 200 ms for the signal to propagate to the relevant muscles.

https://www.humanbenchmark.com/tests/reactiontime seems to agree with you. The median there is 215ms, which probably includes some unmeasurable rendering and input latency.

In Jeopardy, though, people can anticipate the moment the buzzer becomes active.

How about exceptional human response time, as for pro gamers or Jeopardy players who have predicted the stimulus?

Yes and no. I guess the more "fair" way to compare them at trivia would be to have a format where all 3 give an answer to every question, and whoever gets the most points wins. That's obviously not really jeopardy, and I'd be curious to know how Watson would perform purely on the trivia knowledge vs clicking speed. I'm guessing it would still win though.

"What is your mother's name?"

"......let me tell you about my mother...."
