"Hey everyone, I just wanted to make a quick statement about this article that was posted in WIRED. If Tom had told me that his intention for this article was to focus on the loss of ANY bot, Facebook or not, I would not have given him the information that I did to write this article. I am normally a huge fan of Tom's AI articles, but I think that this is a shameful clickbait title that focuses on the wrong side of the story.
It turns out that there may actually be a bug in how Facebook's serialization library interacted with our tournament environment, breaking their strategy selection and causing a huge drop in their bot's performance. We will be investigating over the next few days to see whether it was my mistake in setting things up, or whether something as simple as a Windows firewall setting is at fault. Unfortunately, with competitions like this, competitors do not have access to our tournament environment/machines to fully test 100% of their code, and sometimes issues like this can arise.
I appreciate Facebook's effort in this competition and applaud them for actually having the balls to submit an open source bot, knowing the scrutiny they would face if it wasn't already superhuman. It's a real shame that this is what the media chooses to focus on. Tom also asked me what time the results would officially be made public so that the article could be released at the same time, but instead the article ended up being published at 7am the next morning so everyone could wake up to the headline :( Also, the word "quietly" implies some sort of sneaking around on the part of Facebook, when their registration has been publicly visible for months.
Note: There is nothing factually incorrect in this article; I just believe its focus is in poor taste, hurts the SCAI community as a whole, and unfairly singles out a single competitor."
I'm surprised this one hasn't been called out yet but Starcraft: Brood War and Starcraft II are completely different games, not just different versions.
And apart from what was already mentioned regarding information on the scene, this link is pretty useful: http://sscaitournament.com/index.php?action=tutorial
An older post on this topic: http://www.teamliquid.net/blogs/485544-intro-to-scbw-ai-deve...
As for SSCAIT in particular: http://wiki.teamliquid.net/starcraft/SSCAIT
From a high level, it seems like the elements for an AI to learn are mostly the same: Resource gathering and optimization, map/location battle optimizations, building/technology decisions...
It's not as if Starcraft: version X differs from Starcraft: version Y the way Starcraft and Hearthstone are different games.
One thing to add is the OpenBW project (http://www.openbw.com). It is truly amazing: the guys there managed to reverse engineer the full SC:BW engine. (I thought that was nearly impossible, as the whole flow depends heavily on implementation details; some of those details are even considered bugs, which turned out to be just the right implementation to balance the game perfectly...)
The number of units in SC2 makes it much harder to micro for humans, hence the game provides help. But the human player experience shouldn't define the AI player's behavior, especially for one that is to learn the game on its own.
Breaking down the goals, components, and constraints of the games, I'd actually say SC1 and SC2 are pretty similar games, especially to an AI on both micro and macro levels.
However, you are saying they share nothing in common.
However, that wasn't my argument. It was that they are not completely different. There are similarities between the two.
Do StarCraft and Red Alert share absolutely nothing in common?
If they do, you agree with me!
Which is not to say that there weren't impressive parts. I don't think a human could stutter step that many dragoons that well. But there's clearly a lot left to do in terms of tactics.
That particular scene at the sunken colonies was equally frustrating to watch. I fixed the most likely cause of that behavior shortly after the tournament deadline.
But for sure, you're correct. There's a lot left to do on all fronts. There's noticeable progress every year -- none of last year's entries finished much over 50% -- and the growing research community should help accelerate that even more.
The biggest problems I'm personally facing at the moment are midgame macro with imperfect information -- how many Gateways do I need to be producing from before I can safely take a third base or transition to endgame tech? -- and engaging Terran mech armies. It's very hard to know when or how to engage a Terran mech army, and you usually only get one chance to get it right. Dithering results in bleeding units you can't afford to lose.
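For intuition on the Gateway-count question, here's a hedged back-of-envelope sketch. The function name is mine, not PurpleWave's, and the costs, build times, and income figures are only approximate Brood War values:

```python
# Back-of-envelope: how many Gateways can a given mineral income sustain?
# All numbers are rough Brood War approximations, for illustration only.

def sustainable_gateways(mineral_income_per_min: float,
                         unit_cost: float = 125.0,     # Dragoon: 125 minerals
                         build_time_sec: float = 32.0  # roughly, on fastest speed
                         ) -> float:
    """Gateways that can produce continuously without banking minerals."""
    units_per_gateway_per_min = 60.0 / build_time_sec
    cost_per_gateway_per_min = units_per_gateway_per_min * unit_cost
    return mineral_income_per_min / cost_per_gateway_per_min

# Two reasonably saturated bases yield very roughly 1200 minerals/min:
print(round(sustainable_gateways(1200), 1))  # about 5 Gateways
```

Of course the hard part the comment describes is exactly what this sketch ignores: the income estimate itself is hidden by imperfect information, and the answer changes if you're banking for an expansion or a tech switch.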
The biggest problem in general is that rules-based approaches are asymptotically approaching the limit of their capabilities. Each of the problems I've described is a big undertaking, but they only scratch the surface of what PurpleWave needs to advance. Machine learning is the way forward. But so far, real-time strategy games have proved remarkably impenetrable.
Consider that professional players often place their initial buildings to provide Scarab-proof escape routes for their workers against Reaver drops that arrive ten minutes later. Such placements are only effective if you know to react to Reaver drops by funneling your workers through them to glitch the Scarabs. How are you going to let your bot learn that on its own? There's a lot of work to do.
Or are there no people that produce such scenario-like maps anymore?
I had no hope of having a competitive score on the leaderboard, but I loved playing around with that little world.
Edit: Found more info on dinosaur island: https://github.com/robertdimarco/puzzles/tree/master/faceboo...
Lessons learned: don't run CTFs "too close" to your real infrastructure.
Edit: HN mods updated title to be less click-baity (thanks). Earlier it said "FACEBOOK QUIETLY ENTERS STARCRAFT WAR FOR AI BOTS, AND LOSES". The "and loses" at the end was misleading.
Both absolute and relative timing have to be handled. And relative timing matters, since specific salient actions...
Plus the real reward is very sparse. Say, crippling mineral production early may or may not snowball. Likewise being a unit or two up...
On the specific issue of encoding time-dependent behaviors in models, I think it is related to a broader issue that shows up in many application areas. To me the critical factor is that these models are ruthlessly good at exploiting local dependencies and totally forgetting long-term global dependencies or respecting required structure in control/generation.
This basically means it is very difficult to train long-term, time-dependent behavior without tricks (early/mid/late-game models, extensive handcrafting of the inputs, or using high-level "macro actions"). Indeed, FAIR's recent mini-RTS engine ELF directly provides macro actions, in part to look more closely at how well global strategies are really handled and to remove one factor of complexity.
Gabriel's PhD thesis was entirely on Bayesian models for RTS AI, applied to SC:BW, so I am sure he is well aware of the "classic"/rules-based approaches for this.
As to the sparsity of reward, I'm not sure this is such a big problem. Once the AI learns that e.g. 'resources are good', it can then learn how to optimize resource production. You could even give the process a head start by learning a function of time+various resources+assorted features to win rate from human games to use as the reward function.
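One standard way to formalize that kind of head start is potential-based reward shaping (Ng et al., 1999), which adds a dense signal without changing the optimal policy. The sketch below uses a made-up linear potential over a couple of features; a real version would learn phi from human games as the comment suggests:

```python
# Sketch: potential-based reward shaping for a sparse win/loss signal.
# phi() is a stand-in for a value estimate learned from human games
# (time, resources, supply, ...); the weights here are invented.

def phi(state):
    """Hypothetical potential: higher when the game state looks better."""
    return 0.001 * state["minerals"] + 0.01 * state["supply"]

def shaped_reward(prev_state, state, terminal_reward=0.0, gamma=0.99):
    """Sparse terminal reward plus a potential difference.
    Potential-based shaping provably leaves the optimal policy unchanged."""
    return terminal_reward + gamma * phi(state) - phi(prev_state)

s0 = {"minerals": 50, "supply": 4}
s1 = {"minerals": 200, "supply": 10}
print(round(shaped_reward(s0, s1), 4))  # positive: the state improved
```

The point is that "resources are good" becomes a dense per-step signal, while the true win/loss reward still dominates at the end of the game.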
Yes resources are good, but how do you know when to expand?
Judging from the opponent's movements, you can tell if they're turtling, going for some cheese strat, or doing some build where they may not be able to respond to an aggressive expansion.
Of course, if you choose wrong, you lose the game.
While all of that can be extrapolated from the current state, I think in Starcraft it's much easier to go for immediate gains (destroying more supply/resource value in units than you lose) and extrapolate from there.
I noticed this building in this position at this time and I haven't been attacked by X unit yet, so he's probably doing strategy Y. I better skip some unrelated building I was going to make, so I can have an extra unit Z in case he's doing that strategy. Then I'll place the units at a particular spot to try to trap him because that unit will be vulnerable in this other spot so he's unlikely to move through that spot.
It wouldn't be a surprise if some research team put out a bot achieving superhuman victories purely by out-microing an opponent with minimal strategic choices.
Yeah they did pretty much that. But the problem is it's a very brute-force approach and violates some rules of the game.
They jam thousands of commands per second into the game, and give each unit its own rudimentary AI. The units basically just dance at maximum range, magically dodge hits, etc.
If they limited it to 600 actions per minute (10 keystrokes hitting the keyboard every second, conceivable for the human mind but beyond human fingers), it would become a much harder AI problem.
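A cap like that is easy to sketch as a token bucket refilled once per game frame. The frame rate, burst size, and class name below are illustrative, not from any actual tournament rules:

```python
# Sketch: capping a bot at a fixed APM with a token bucket.
# Brood War runs at roughly 24 frames/sec; all numbers are illustrative.

class ApmLimiter:
    def __init__(self, apm_cap=600, frames_per_sec=24):
        self.tokens_per_frame = apm_cap / 60.0 / frames_per_sec
        self.max_tokens = apm_cap / 60.0   # allow ~1 second of burst
        self.tokens = 0.0

    def on_frame(self):
        # Refill once per game frame, up to the burst limit.
        self.tokens = min(self.tokens + self.tokens_per_frame, self.max_tokens)

    def try_issue(self):
        """Return True if the bot may issue one command right now."""
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

limiter = ApmLimiter(apm_cap=600)
issued = 0
for _ in range(24 * 60):          # simulate one minute of game frames
    limiter.on_frame()
    while limiter.try_issue():
        issued += 1
print(issued)                     # roughly 600 actions in the minute
```

The interesting design question is what counts as an "action": per-unit commands make the cap bite much harder than per-click commands would.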
In the case of certain unit matchups, say zergling versus vulture, the vulture should be able to kill an infinite number of zerglings given that it is microed correctly. However, despite the zergling being useless against a vulture on paper, in a human game you just don't have enough time to babysit your vultures with everything else going on, so you end up seeing zerglings used against vultures somewhat cost-effectively even at professional levels.
While it certainly isn't fair to play against, it does have a certain elegance.
There's also the problem that even if it's AI vs AI, the races and units are balanced around reaction times of humans.
The chess equivalent would be letting Deep Blue take 10 years to evaluate each move; it's not a very interesting system anymore since it isn't playing under normal rules (~90 minutes per turn).
Any "real" SC AI will have limitations on input, say 300 actions per minute. It'd be pretty interesting to see how few actions per minute an AI could use to defeat the top human players.
Even worse and less interesting - it's a bit like allowing the computer to move two pawns in each turn.
With hundreds of hero and item combinations, jungling, Roshan, cooldowns, and pick + ban, you can actually get to SC's level of complexity.
So, SC is still a much more complex space. Dota has non-player units, but they are similar to SC buildings and follow very simple rules.
In Dota, item combinations open the road to brand new moves. Some items teleport, some regenerate, some cancel buffs, some crit, some cleave, some cut trees, some slow, some stun, some give vision, etc.
Now in SC, you have 3 or 4 main builds for a given matchup. You see the buildings, and you know where this is going.
In Dota, depending on the 10 heroes, current gold and item combinations, and player skill, you may expect one build or another.
Also, one zergling or 10 zerglings is pretty much the same thing to consider from a behavioral point of view. The number doesn't matter that much, only the intensity of the effect. And a zergling will always do the same things: move, attack, burrow.
The same unit in Dota can have a completely different role depending on the context.
My guess is that an AI would give you a much bigger advantage in SC, because it can sustain far higher APM than a human, strategy or not, while in Dota at a high level, strategy matters more in the long run.
For example larva are one of Zerg's most valuable resource and there are several ways of attacking that resource by killing units or simply forcing them to go more defensive.
According to the player interviews and Reddit discussion threads, the "break" you are talking about was more about being really unpredictable, thereby finding a play style that the AI had never encountered.
The players were flailing to find a way to defend against an AI that learns faster over time than they do.
Call again when it can actually learn the game from limited inputs available to human players.
Do you really need a 'research paper on machine learning' to understand that?
And Google still hasn't figured out how to best their search advertising network business, so I assume that people click on ads, even if I'm not in that demographic.
You should really watch a child play a game that has interstitial ads. It's quite obvious that they often click on ads because they want to learn more (maybe not fully convert, but intentionally click).
I am curious if that millions in revenue is for Microsoft (not surprising) or advertisers (more interesting) - I would love to read through their thought process either way.
They are supposed to be mini maps where your bot can train separate aspects of the game on a much smaller scale.
Extremely useful if you're an aspiring bot author.
The public exploited tricks to beat it. They did not beat it 'handily'.
Afterwards, the pros who do beat it only manage to do so ~2-3 times for every 100 games played. I believe they have been playing the same version that was shown at The International and not an iterated version.
Correct, we've been playing a number of pros using the same bot played at TI. We do have a stronger version which is just two days more of training (gets a 70% win rate vs the one at TI), but haven't seen a need to test it out. We'll likely do a blog post in upcoming weeks with more stats and commentary from the pros; would be curious what people would like to know!
Incidentally, the various exploits that people used are all similar to how we actually develop the bot. We try to find areas of the strategy space it hasn't explored, and then make a small tweak to encourage it to explore that area. Lots of progress comes from removing hardcoded restrictions, which are nice to get started. So the fact there exist exploits wasn't surprising to us — what would be surprising would be exploits we couldn't fix.
1v1 has always been a proof-of-concept for us. The fun part is really 5v5, which is what we're working on now (and hiring for! ping me if you're interested: email@example.com).
I understand this is a perfectly reasonable early test, but there are so many complaints about "it was just a restricted subset of the game and 1v1".
This is like complaining that Google doesn't release first-pass code (with minimal unit tests and no stress testing) to their production sites across the world. Everything that loops starts with the first iteration.
Also, keep up the good work, OpenAI! And please remember Asimov's 3 rules.
Anyone have links to the matches?
AIs have a long way to go to beat good human starcraft players.
Just from a pure economy standpoint, any computer process has quite an advantage simply by optimizing action queues and keeping idle workers working.
People have already created TAS programs that can take out infinite numbers of enemies with minimal supply (e.g., medivac/tank vs. infinite ultralisks), or that split zerglings so perfectly that each siege blast hits only one zergling.
As far as micro, AI already has proven capability to absolutely dominate human micro, like not even close.
As for build paths and macro decisions, AI isn't there yet, but all it takes is one player and one programmer to come up with an in-the-middle, well-rounded build path that doesn't lose to any cheese (sacrifice some economy to just have an army at all times), and the AI will out-micro the rest in humanly impossible army trades (I mean winning a 40-supply vs. 200-supply army battle).
Honestly, just imagine having ONE mutalisk perfectly microed all game, never taking fatal damage, dealing as much damage as possible from every angle. And you could have all 20-140 supply of your army doing this at all points in the game.
No contest. Just hasn't had time devoted to it yet.
And EVEN THOUGH they have access to ridiculous APM and the ability to do BS cheese strategies like that, they STILL suck.
Seriously, go watch some of these games. The AIs are freaking terrible, despite the fact that they basically cheat at the game.
Also, the AIs' "perfect" micro frankly only applies at the individual unit level. I.e., they can kite like no tomorrow with a single marine, but as soon as you have anything more complicated than that, such as "fight with 10 marines", you learn that the AIs can't so much as form a concave.
Yeah, those 10 marines are all INDIVIDUALLY stutter stepping, but it turns out that perfect stutter stepping doesn't matter much when your army is cut in half due to it being split up.
Controlling more than a couple units "perfectly" (with regards to each others actions) seems to be out of reach of any AI out there.
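For contrast, here is how simple the solved per-unit piece is: a toy one-dimensional stutter-step rule (ranges, cooldowns, and the function name are all made up). Coordinating ten of these into a concave is exactly the part this sketch does not attempt:

```python
# Toy sketch of single-unit stutter-step ("kiting") logic in one dimension.
# The hard part the thread describes -- coordinating many units into a
# concave -- is NOT handled here; this is only the easy, per-unit piece.

ATTACK_RANGE = 4.0      # invented numbers, not real unit stats

def stutter_step(my_pos, enemy_pos, cooldown_left):
    """Attack when the weapon is ready and in range; back off otherwise."""
    dist = abs(enemy_pos - my_pos)
    if cooldown_left == 0 and dist <= ATTACK_RANGE:
        return "attack"
    if cooldown_left > 0 and dist <= ATTACK_RANGE:
        return "retreat"   # open distance while the weapon cycles
    return "advance"

print(stutter_step(0.0, 3.0, cooldown_left=0))   # attack
print(stutter_step(0.0, 3.0, cooldown_left=7))   # retreat
print(stutter_step(0.0, 9.0, cooldown_left=0))   # advance
```

Running this rule independently on each of 10 marines is precisely what produces the "perfectly kiting but split in half" armies described above; the group-level objective never appears in the per-unit rule.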
It turns out that being really good at narrow micromanagement situations doesn't add up to winning complete games. StarCraft is messy and difficult.
A harder, more complete way:
- read "Deep Reinforcement Learning: An Overview": https://arxiv.org/abs/1701.07274
- progressively implement subparts in Python with OpenAI Gym: https://gym.openai.com/read-only.html
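Before reaching for Gym, the core loop that overview covers can be sketched as tabular Q-learning on a toy chain MDP. Everything below (the environment, constants, and hyperparameters) is a minimal made-up example, not from the paper:

```python
# Minimal tabular Q-learning on a toy 5-state chain MDP (invented example):
# states 0..4, action 0 = left, 1 = right; reward 1 only on reaching state 4.
import random

random.seed(0)
N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.1   # learning rate, discount, exploration

def step(s, a):
    """Deterministic chain dynamics: move left or right, clipped to [0, GOAL]."""
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

for _ in range(500):                # episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        a = random.randrange(2) if random.random() < eps else \
            max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # standard Q-learning update
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) * (not done) - Q[s][a])
        s = s2

# The greedy policy should now walk right toward the goal:
policy = [max((0, 1), key=lambda x: Q[s][x]) for s in range(GOAL)]
print(policy)
```

The same loop structure (observe, act, update from reward) is what the Gym API packages up behind `env.reset()` and `env.step()`; the thread's whole point is that scaling it from a 5-state chain to StarCraft is where it breaks down.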