Hacker News
DeepMind – StarCraft II Demonstration (blizzard.com)
314 points by steve_musk 31 days ago | 183 comments



For those interested in building their own AI to play Starcraft II, here is a very comprehensive 17-video series by sentdex (Python AI in StarCraft II): https://www.youtube.com/watch?v=v3LJ6VvpfgI&list=PLQVvvaa0Qu.... He walks you through building and training a neural network to play Starcraft II via the same API used by DeepMind.


And for those interested in making their own AI to play Starcraft 1 BroodWar and becoming part of the huge BroodWar AI community (get in touch on Discord at: https://discord.gg/w9wRRrF), there is this tutorial: https://sscaitournament.com/index.php?action=tutorial

Your AI would then be able to participate in tournaments and fight other AIs on the existing ladders.

Additionally, one could watch BW bots fighting each other on: https://www.twitch.tv/sscait


And for those who would like to follow the development of a bot from scratch, I'd like to plug my blog here: www.makingcomputerdothings.com. There, I also host a podcast about SC AI development, which may be interesting for everyone. The podcast is uploaded to YouTube as well: https://www.youtube.com/channel/UCHPl6OFov2v8SK14oQUHY2w


Very useful. Thank you!


"DeepMind has been hard at work training their AI (or agent) to better understand StarCraft II. Once it started to grasp the basic rules of the game, it started exhibiting amusing behavior such as immediately worker rushing its opponent, which actually had a success rate of 50% against the 'Insane' difficulty standard StarCraft II AI!"

Well, that's fun. I mean, if it works, it works.


50%?! Wow. I'd imagine that against any decent human player it would be more like a 10% success rate, considering that by the time the workers reach the other camp and rush in unison, the other player would have 1-2 more workers and would have to really suck at micro to lose that battle.


I think you are severely underestimating the APM and micro ability of these bots. https://www.youtube.com/watch?v=udIA6uvWS2Y

I know they have said they will be limiting the maximum APM of the bot to around 180 for the final bot but they may not have implemented it at the time this was happening.

Edit: another example https://www.youtube.com/watch?v=IKVFZ28ybQs


The second example is likely a 'cheat' because the splash damage would be applied the same frame as the shot hits the centre zergling. For an AI to dodge that it would have to have knowledge of which zergling was going to be hit before it was fired upon. The video creator likely used map triggers to simulate the effect.


Here's the original mention of that AI in a teamliquid forum thread started in 2011: https://www.teamliquid.net/forum/starcraft-2/210057-automato...

As far as I can tell, the AI must be cheating, at the very least to obtain map vision beyond what is available in a match. For the zerg AI to observe each tank's turret rotation and predict which zergling it's about to fire on, the zerglings need to see the tanks, but zerglings have a sight radius of 8, smaller than the tank's 11. In a fair match, the zerglings would be fired on before getting a chance to see the tanks. Even as the group of zerglings gets close enough to see the first tanks, other tanks start firing from beyond the zerglings' vision, so I'm not convinced this AI could pull this off in an actual match.


In the linked video around 1:07, you can see that they already have vision on the tanks when the first one rotates to fire. This could be the result of a cheat, but SC2 does include NPC watchtower buildings that can be captured to grant vision over a large area of the map (you could also send flyers to scout). From there it is (or at least can be) 'just' ridiculously fast and precise angle-tracking and micro, so this is a situational tactic at worst.


> would have to really suck at micro to lose that battle

I'm not sure there would need to be that large a skill difference, relatively speaking - worker micro has a really high skill ceiling, and players start with 12 workers instead of 6 nowadays, so 2 extra is a much smaller difference.


You're right, which means DeepMind must have substantially better micro than the Insane AI. Not sure how impressive that is, but I do remember the stock AI does register a ludicrous APM (600+) in replay mode.


Good micro leaves a lot of room for tactics and outplaying; there's a lot more to it than having a high APM, so it's not really a surprise that DeepMind can outplay the AI, even with a lower APM.

A fight between evenly matched armies looking for a micro-based edge basically turns into a fighting game like Street Fighter, where it's about reading your opponent, baiting them into a mistake, pre-empting them, taking decisive action, crisp reflexes/timing, doing something unexpected or unorthodox to get an advantage at a huge risk, etc.


Worker rushing is pretty much a guaranteed losing strategy. One progamer who did it in competition was criticized for being disrespectful toward his opponent and viewers and lost his tournament seed as a result.


This is overstated. In the last few years Solar successfully worker rushed Zest (both are world-class Korean pros, and neither lost any reputation from it), and Scarlett will regularly worker rush in ZvZ.

Especially in ZvZ where spine crawlers can be built, it's not an auto loss -- it's a bet that the opponent is playing a little greedy, and a chaotic fight.


You're referring to this game? https://www.youtube.com/watch?v=dpGZY9O_dF4

That's not a worker rush, that's an early pool where you brought some drones.


The video has "drone rush" in the title, and the commenters echo the term without disagreement, so I think you're being idiosyncratic.


Those aren't worker rushes.


> One progamer who did it in competition was criticized for being disrespectful toward his opponent and viewers and lost his tournament seed as a result.

...what? That's an absurd consequence -- sc/2 has a hundred different ways to rush, and they all come with the note: if it works, you win, if it doesn't, you're dead. And everyone accepts it. What's so special about drone rushing that it'd receive such consequences?

I mean hell, there's Has, who specializes in cannon rushing and goes damned far; but drone rushing? No, that's just too much!


It was a proof of concept.

It meant that the AI actually found a working strategy against one of the standard bots, where it had not found any before.

It will probably never work when it starts to play against itself, but we don't know if the AI is already at that point.


Which player? I've followed SC2 for a while and never saw that.



Here's the video for anyone interested. Looks like Naniwa didn't put any effort into making the rush effective. https://www.youtube.com/watch?v=5roOPKm5lOw


The announcers said that the game didn't matter. I don't think Naniwa was trying to win, just get the game over with.


If I understand the article correctly the verdict would have been different if he had won the match or at least an interesting fight had resulted. So, I'd guess that if an AI plays this strategy well enough (somehow) it would not be a problem.


Yeah I remember that. That was just seen as him throwing the game, not a sincere attempt to win. He had a reputation for throwing temper tantrums.


Thanks for sharing!


This should be discouraged at a professional level


The professional level should involve doing whatever the game permits that is most effective.

If it is a viable strategy and not a real edge case to punish some maximal greedy strategy, the game should probably be patched.


I would expect professionals to be able to deal with it defensively as well. It's a game, if it's possible to do in the game it's legal. And Blizzard has regularly pushed updates / rebalancing in both SC1 and 2 to make certain strategies more or less viable.


Pretty sure Insane AI plays better than most bronze-silver-gold league human players.


I've seen this in some of the player-made AI bot competitions; it's exciting to see how an opponent reacts to an all-in strategy that its programmer likely never trained it to expect.


That doesn’t really sound like it understands the game... is this not training by playing against itself? How is it not learning the flawed reasoning behind this strategy?


The reasoning behind worker rushing is sound. StarCraft II is a game of hidden information and there's no time where that's more relevant than the beginning of the game.

The problem is that there's a sort of rock-paper-scissors game of (conservative)-(greedy)-(all in). That is, a player who builds defenses early on will crush a worker rush but they'll be far behind a player who ignores defenses and focuses on growing their economy. The worker rush is designed to punish this greedy build order.

There's nothing that can be done, really. If you decide to send a worker to scout super early on then your economy suffers so much that you'll be behind no matter what you do.

Edit: to those of you drawing a distinction between an immediate worker rush and something like an early pool or a cannon rush, that's splitting hairs. These are all classified as all-in strategies that almost automatically lose if they encounter a conservative build order.
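
To make the rock-paper-scissors framing above concrete, here is a minimal Python sketch. The three labels and the `BEATS` table are an illustrative simplification of this comment, not anything taken from the game or from DeepMind, and real matchups are matters of probability rather than fixed outcomes.

```python
# Hypothetical opening categories, following the comment's framing:
# conservative crushes all-in, greedy out-scales conservative,
# all-in punishes greedy.
BEATS = {
    "conservative": "all_in",
    "greedy": "conservative",
    "all_in": "greedy",
}


def outcome(mine: str, theirs: str) -> str:
    """Return how the opening `mine` fares against `theirs`."""
    if mine not in BEATS or theirs not in BEATS:
        raise ValueError("unknown opening category")
    if mine == theirs:
        return "even"
    return "favored" if BEATS[mine] == theirs else "unfavored"


if __name__ == "__main__":
    for mine in BEATS:
        for theirs in BEATS:
            print(f"{mine:>12} vs {theirs:<12} -> {outcome(mine, theirs)}")
```

Because each opening is favored against exactly one other and unfavored against the third, no single choice is safe without scouting information, which is the point the comment is making.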


The rock paper scissors element is there, but not with worker rushing. Worker rushing is a huge disadvantage.

The term means you take your initial workers and charge the enemy. If you know where the enemy is you'll arrive about the time they are up one or two workers on you, then you just have to fight when you're down one or two workers and you can't back off (they'll return to mine).

A human can win like this if they are so much better at micro that they can overcome the material disadvantage (unlikely) or if their opponent panics and does something dumb (e.g. pulling only half their workers to fight).

Because worker rushes arrive so early it doesn't matter what plan you have for the match, you'll have at least as many workers (if you are also worker rushing) or you'll have more. Maybe if you're going six pool you'll have fewer but you'll also be able to get lings and win with them.

The computer can lose to worker rushing only because it can get caught in loops of poor decisions (e.g. you lure away a couple workers and kill them).

Worker rushing is always a bad plan.


Maybe certain races have stronger workers if you have perfect micro (e.g. probes w/ shields)


Possible, but I highly doubt the differences are significant enough to make up the deficit. The other side has one or two more workers, which is just an insurmountable material advantage. If you back off to recharge your shields, I return to mining and get the resources for another worker or two (plus, my guys can recharge, regen, or repair too).


I'd argue that with AI-level micro, the Protoss probe shield regen advantage would definitely come into play. Shields regen only after 10 seconds out of combat, and they regen at 2/second after that, so it takes about 20 seconds in total (10 of delay plus 10 of regen) to restore 20 shields.[1] This could definitely matter enough, if micro'd well enough, to make a difference, maybe... It'll be exciting to find out.

[1]https://starcraft.fandom.com/wiki/Plasma_shield
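
As a rough check on the arithmetic above, here is a small Python sketch using the figures quoted in this comment (10 seconds out of combat before regen starts, then 2 shields per second, 20 shields to restore). The function name and its defaults are made up for illustration; the real values depend on the game version and speed setting.

```python
def seconds_to_restore_shields(
    missing_shields: float,
    delay_seconds: float = 10.0,    # out-of-combat delay quoted in the comment
    regen_per_second: float = 2.0,  # regen rate quoted in the comment
) -> float:
    """Time a unit must stay out of combat to regain `missing_shields`."""
    if missing_shields <= 0:
        return 0.0
    return delay_seconds + missing_shields / regen_per_second


if __name__ == "__main__":
    # 10 s of delay plus 20 / 2 = 10 s of regen.
    print(seconds_to_restore_shields(20))
```

Under these assumptions a fully stripped probe is out of the fight for 20 seconds, which is the window the defender gets to keep mining.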


20 seconds is pretty long. Your opponent is up 2 workers on you, so assuming perfect micro from both sides, the defender can use one worker to continue to mine minerals and pump out more workers, while microing with the other 11 vs the attacker's 10 workers.



It should be noted that this is one of, if not the shortest map to ever have been played in official matches.


Very weird how the zerg somehow got their drones through the SCV line; it looked pretty bad until they got that surround going.


Workers can fade through units (unless they are sieged) when they are collecting resources (minerals, vespene). So Idra commanded his drones to mine minerals and canceled the command shortly after.

This is a common technique, mostly used to get a scouting drone back through a wall-off (a unit standing on "hold" in a small choke). It is also used when a mineral line is under siege (e.g. by a Siege Tank or Liberator): the player can select all his workers and send them to mine a single mineral patch outside of the siege range (the command has to be spammed, otherwise the workers redistribute) until the siege is cleared up.


I thought Idra was notoriously against "cheese" of any kind?


That's why he apologized when he won.


Zerg have faster build time for drones, so they can get a numeric advantage (or at least parity, once you account for travel time).


The very fact that it's a bad plan means that it is surprising, which means that it is occasionally a good decision!


I think the chess equivalent would be choosing to play with only your King and your pawns under the logic that your opponent won't have practiced playing against only pawns and won't be expecting you to do it. If you're up against a novice they might panic or fail to react and you win! Otherwise, not.


Most cheese in SC2 is a micro check; you're testing to see if you can control your units better than they control theirs, even if they might have an economic advantage. A worker rush isn't nearly as stupid as what you're describing in chess -- especially if it's a ZvZ match, because in that case you can start planting spine crawlers on their creep.


If you didn't get any other pieces until the 10th move and then you got one piece for each pawn that hadn't moved, then that analogy would work.


Chess is not a good analogy I think because there is perfect information in Chess.


The surprise for a human player won't gain the deepmind AI much advantage. A human can recognize the concept "I am being worker rushed" and immediately counter it, necessarily having more workers at their starting site than the attacking player who sent only their starting workers.

As the grandparent stated, a worker rush only works against the built in starcraft computer player, because the computer AI is too dumb to recognize "I am being worker rushed" and may not commit all of its workers to defending.


I disagree. The part that everyone seems to forget about Starcraft is the micro; humans are imperfect at controlling their units, and even relatively decent players (say Diamond-ish) will sometimes lose to something like a worker rush.


> Edit: to those of you drawing a distinction between an immediate worker rush and something like an early pool or a cannon rush, that's splitting hairs. These are all classified as all-in strategies that almost automatically lose if they encounter a conservative build order.

Things like early pools and cannon rushes are (depending on the details of the meta at the time) a rock-paper-scissors game. Cheese (early pool/cannon rush) beats aggressive play beats conservative play beats cheese (roughly speaking).

Worker rushes are just bad. Apart from playing against 6 pools on really old versions of SC2 on maps with ridiculously short rush distances they just lose. Even then I don't think I'd call them good.


Other commenters have covered my issue with worker rush. My larger point was that it seems like the AI has learned that this works against the standard AI, because the standard AI can't properly adapt to such a strat.

That might solve its task, but it's a far cry from learning to play Starcraft well. I was surprised they seemed to have trained it against the standard AI, as that's not useful.


>How is it not learning the flawed reasoning behind this strategy

If it's playing against the Starcraft 2 blizzard AI, it will learn a lot of "bad habits" because outright bad strategies work against the AI. Thus the worker rushing.


> flawed reasoning

Your "reasoning" is actually a false perception. It doesn't make these mistakes and tries strategies that seem improbable to human players.

A 50% win rate against the Insane AI is very good, as at the higher AI levels the computer actually cheats (no fog of war, more resources, etc).


A moderately skilled human player will be able to beat the insane AI ~100% of the time.

The AI cheats, but is also incredibly predictable, and bad at decision-making.


As a moderately skilled human player myself, I developed a foolproof cheese strategy for defeating insane AI. They go crazy if you attack their worker line and recall every attacking unit that's not currently engaged. I played terran and built a single supply depot in the middle of the map to give me enough advance warning when they were coming to attack me, and then dropped 2-3 marines behind their minerals to stall them.

If you do this long enough you can build a virtually undefeatable siege tank army that the AI would simply smash its face into, and then corral them to 2 or 3 bases. They would always run out of minerals while I expanded everywhere and slowly inched tanks forward. Doing this same strategy against humans got me to about gold tier in the early days of sc2.


It's simpler than that: just play Protoss and cannon rush the AI.


Even a below average one: it's normal to belong in bronze league (~5th percentile) if you beat the AI on every level and then start playing other humans. The AI is not a serious opponent.


I had a super simple rule-based bot, which had about an 80% win rate against the 'Insane' AI (depending on the starting position), so 50% is not that impressive.


"Cheesing", as it's called, is classic SC2 strategy. People absolutely hate it, but it occasionally works, even though it feels cheap. It's kinda like Blitzkrieg in chess...


I used to be bitter and hate it with a passion. Then I realized it has a very important part in the psychology of the game, especially in series. It creates very exciting moments for spectators when used well and helps keep people honest in their openings. If you as a pro never cheese, you will get taken advantage of.


As I get older I appreciate cheesing much more. 'Has' is one of my favorite players these days.


It's a valid strategy that brought some super exciting moments, like BoxeR bunker rushing YellOw 3x at the 2004 EVER OSL.


I believe the technical term you want here is 'legitimate strategy'.

https://www.youtube.com/watch?v=bLMYTQUw8Lc


Who hates it? 2017 and 2018 saw a resurgence of aggressive play that resulted in more varied games than ever. Some strategies that were previously considered all-in plays are now standard openers.


I hate it when people call it cheesing.

It's not my fault if you don't scout. Planetary Fortress rushes, proxy barracks, cannon rush... it's all fair game.


Like someone else mentioned, early game SC2 is a bit like rock, paper, scissors: scout too early and you'll be behind in economy against someone who's super greedy, but scout early against cheese and you've got a free win. You can take the "conservative" approach, which is relatively safe against early cheese while not falling "too far behind" against a greedy opponent, but you'll still be behind in most of the games. If you're the better player though, I'd argue it's still better to play conservative, be slightly behind, but win in the late game, while not falling victim to early cheese.


People keep saying this but it's really not true in current starcraft. You start with 12 probes now, and terran and protoss probe scout 90% of the time at least with their first probe.


What is Blitzkrieg in chess?


Another name for the "Scholar's mate". Quoting wikipedia [1]:

> Scholar's Mate has sometimes also been given other names in English, such as Schoolboy's Mate (which in modern English perhaps better connotes the sense of 'novice' intended by the word Scholar's) and Blitzkrieg (German for "lightning war"), meaning a quick and short engagement (Kidder 1960).

[1] https://en.wikipedia.org/wiki/Scholar%27s_mate


The following sentence:

> After feeding the agent replays from real players, it started to execute standard macro-focused strategies, as well as defend against aggressive tactics such as cannon rushes.


What makes it flawed? It is a legit strategy that occasionally works at the pro level.


Not a worker rush. Worker rushes only really work against lower ranked players that panic and do not know how to respond.

The reason is that while the attacking player sends over their workers to attack, the defending player is still constantly making workers. By the time that attackers get there the defending player should have more workers. He simply has to command all of them to fight the opponent's workers.
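
The worker-count argument above can be sketched in a few lines of Python. The 12 starting workers come from elsewhere in this thread; the 30-second travel time and 12-second build time are placeholder assumptions for illustration only, and the helper name is made up. The sketch also ignores supply caps and resource limits.

```python
def workers_on_arrival(
    start_workers: int,
    travel_seconds: float,
    build_seconds: float,
) -> int:
    """Workers a player has after producing non-stop from one base
    for `travel_seconds`, one worker at a time."""
    if build_seconds <= 0:
        raise ValueError("build_seconds must be positive")
    return start_workers + int(travel_seconds // build_seconds)


if __name__ == "__main__":
    # Assumed numbers: both sides start on 12; the attacker stops
    # building and walks for 30 s while the defender keeps producing.
    attacker = workers_on_arrival(12, 0, 12)
    defender = workers_on_arrival(12, 30, 12)
    print(f"attacker: {attacker}, defender: {defender}")
```

With these assumed numbers the defender is up two workers when the fight starts, which matches the "one or two workers" figure several comments in this thread use.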


It's plausible that a self playing AI would still learn to worker rush in some race combos though, at least for awhile.

If the opponent doesn't react, you win because you kill all their workers.

If the opponent overreacts, you win because you are mining more than them (IIRC this is a problem with the insane ai).

If the opponent reacts poorly in other ways, you might be able to win by out microing them. E.g. SCVs (terran workers) consistently beat drones/probes (the other races' workers) 1-1 without micro. Repairing can be an advantage. The offensive AI being better at stacking damage on a few units. Etc.

The defensive AI should learn to fix these problems and stop losing to worker rushes, but it could take awhile. In other words the defensive AI starts out as a lower ranked player.

PS. I'm not sure how serious a tournament this was, but here's an example of a real worker rush (not an early pool with workers attached): https://www.youtube.com/watch?v=_AtqgaJzJP4


https://www.youtube.com/watch?v=KP_ZcpZIBDQ This is a pro-level game where one of the players worker rushes, and the rush is moderately successful.


That's a 12 pool though. Little different. The article was talking about an immediate worker rush, not building some army and then going over.


There was no pool, no lings. Just double gas cancel to get 2 extra drones without needing an overlord.


12 pool is a valid opening. A worker rush is not.


Isn't the idea that because your opponent spent resources on things like their first barracks and gas collector that your workers should outnumber them for the attack?

Of course you lose all of that time it takes for your workers to drive across the map, so it's a pretty dicey strategy. A 50% win rate against the AI sounds about right.


You shouldn't cut worker production at the beginning of the game - it's critical to keep pumping them uninterruptedly. You only build when you have enough resources to ensure that.


Worker rushes do not work at pro level. Cannon rushes and proxy buildings sometimes work if the opponent is caught off guard. The openings in SC2 are more akin to the mind games of poker than chess. Information is hidden so there is nothing to keep you honest.


ZvZ is an exception, when a worker rush is paired with an early pool, making it a worker and spine rush.


I am a layman, but isn't the point that it doesn't ever understand that game, it just wins?


What does "understand" mean? Do you understand the game if you know what all the units do, their hitpoints and attack values, how to build them etc.? Do you understand the game if you build a model in your head of the current state of the game and use it to try and predict what moves give you the best chance of winning? If so then it's probably fair to say that any decent AI "understands" the game on some level.

Or is "understanding" an attribute which is only afforded to humans who do these things, much like a machine can process the same inputs in the same ways as a human but will probably never be considered to have 'qualia'?


I think you bring up a good point. People imprint their own perception of things to this AI game. The bot is probably working like other deep learning machine learning things. It runs some models or whatever and tries to minimize error. Not surprising really that the computer thinks differently than humans. A worker rush sounds hilarious to us but hey it can work.


It doesn't sound hilarious to a Starcraft player. It's a very common strategy in the lower ranks that's very effective against players who panic or don't have good micromanaging skills. Everybody has been beaten by a worker rush at least once.


Except that worker rushes are actually terrible and very easy to beat if you don't panic. The Insane AI is easy to panic, though.


For a complicated game that lacks a dominant strategy, as Starcraft does, the two are the same. You must be able to infer the state of the game given your limited knowledge of what's going on. You must be able to choose the strategy that is most likely to result in victory.

Using this strategy is objectively bad in that it can be repelled by an opponent almost regardless of their initial strategy.

If DeepMind's goal is to beat the standard AI, then sure, I guess it understands that. But that's a very different problem from understanding Starcraft, and a much less difficult one at that.


They are still learning. So at some point their AI should explore worker rushes, but also explore how to defend against them.

And once the defense gets good enough, the attack portion of the AI should probably do the worker rushes much less often.

Of course, given that the AI has different constraints in micro-management and attention, and that the game is balanced given human skills, some things that would be totally bonkers to do as a human might work for the AI.


But it is not a flawed strategy. It works 50% of the time against the "insane" AI!


This works really well against humans too. I can't remember how the technique was called, cheese rush?


I wonder how Starcraft 2 compares to Dota 2 in terms of difficulty for the AI. Moreover, I'd be interested to see how OpenAI's and DeepMind's techniques and models differ in approaching these games.


The main difference would come in the simultaneous control of all of the different units in Starcraft. Both MOBAs and RTS share many characteristics, since Warcraft 3 was the origin of MOBAs to begin with. For instance, micro, unit pathing, lowground highground, fog of war, spellcasting, etc. However, in a MOBA there is more emphasis on team-play, on the interactions of 5-hero team compositions, which is decided upon at the start of the game. There is also itemization, which is much more stylistic than anything that can be purchased in an RTS.

In Starcraft 2, it would be fantastic to individually control all 16 marines in your 2 medivac stim timing, pulling them back when they become the targets nearly instantaneously, perfectly shuffling them into/out of the medivacs, while focus-firing the defenses. Another example would be a reaper rush, a beastly AI could in theory micro 5-7 reapers individually, potentially breaking the game as human fingers/muscles/nerves have limits that CPUs and GPUs do not have.

The latter concern, about sheer speed and potentially limiting APM of the Starcraft 2 AI, is a very interesting one. It would be interesting to allow the AI to match the average APM of the Global Finals players, for instance, which might be around 300 actions per minute. If the computer was allowed to utilize 3,000 actions per minute, it could surely perform much greater feats of micromanagement.

The largest edge I see human players having vs the Starcraft AI would be something strategic, get a sense for how the AI plays then pull off a specific strategy which counters it. It may not work well in the current human-meta, but perhaps the AI will have its own meta that you can out-think it on?


> In Starcraft 2, it would be fantastic to individually control all 16 marines in your 2 medivac stim timing, pulling them back when they become the targets nearly instantaneously, perfectly shuffling them into/out of the medivacs, while focus-firing the defenses.

Single unit pullbacks, medivac shuffling, and focus firing, all together, are commonly seen in pro-level matches. The guys are superhumanly insane. The current pro players can sustain 500 APM during battles. The only reason they don't do this even more is that they are usually fighting on two or three fronts at the same time, while managing base and unit production to the second.

The real limit of the players is the single viewport and the single instruction input pipeline. Managing dozens of units with those heavy restrictions is pure madness. That's the main difference between RTSs and MOBAs. MOBAs are defined by strategy and tactics, teams win and lose because of those two factors (at pro levels). A game like Starcraft II adds another factor: players can simply break down when they're not able to keep up with the game. Even at pro level, players can be overwhelmed by the amount of information they have to process in tenths of a second, over and over again, for 15-25 minutes, or they can get into a spot where there's too much to do and not enough time to do it.

AI has two advantages. It has better information channels, in the form of perfect vision, while humans have a single pair of eyes that have to scan through the screen. It also doesn't get tired over the course of the game.


> Single unit pullbacks, medivac shuffling, and focus firing, all together, are commonly seen in pro-level matches. The guys are superhumanly insane. The current pro players can sustain 500 APM during battles. The only reason they don't do this even more is that they are usually fighting on two or three fronts at the same time, while managing base and unit production to the second.

Not to the level AI can do it. Did you see the video where zerglings dodge tank shot splash? A human could never do that.


That zergling video depends on a zero reaction time, it's true that's an advantage of AI. No matter how high the APM, if the reaction time is high enough, some things just cannot be done.

Perfect micro against siege tanks is impossible for humans, because the time between the beginning of the shot and the moment the damage is applied is seen by us as instant. The game features other units where that time is bigger, and pro-level games usually feature perfect micro on those instances (e.g. widow mines, banelings, disruptors, locusts, hellions...)


> Even at pro level, players can be overwhelmed by the amount of information they have to process in tenths of a second, over and over again, for 15-25 minutes, or they can get into a spot where there's too much to do and not enough time to do it.

Consequently, limiting the AI's APM only adds a meta-game around APM planning that a human would still have a hard time keeping up with.


> since Warcraft 3 was the origin of MOBAs to begin with.

Incorrect. It was with Starcraft 1, a custom map called Aeon of Strife (AoS).

https://web.archive.org/web/20101111183559/http://www.getdot...


Yes, although MOBAs were popularized with War3, specifically DotA All-Stars became utterly dominant and more popular than the base game.


When DeepMind announced the project, they noted they were going to limit the machine's APM to something human-like.
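
For anyone wondering what an APM cap could look like in a bot, here is one possible sketch in Python: a rolling-window limiter that refuses an action once the budget for the last 60 seconds is used up. This is not DeepMind's mechanism (the thread only says a cap was planned); the class name, the `try_act` method, and the 180 figure used in the example are assumptions for illustration.

```python
from collections import deque


class ApmLimiter:
    """Allow at most `max_apm` actions in any rolling time window."""

    def __init__(self, max_apm: int, window_seconds: float = 60.0) -> None:
        self.max_apm = max_apm
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of recently accepted actions

    def try_act(self, now: float) -> bool:
        """Return True and record the action if the budget allows it."""
        # Forget actions that have aged out of the rolling window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_apm:
            self._timestamps.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = ApmLimiter(max_apm=180)
    # A bot that tries to act once per second for five minutes of game time.
    accepted = sum(limiter.try_act(float(t)) for t in range(300))
    print(f"accepted {accepted} of 300 attempted actions")
```

A cap like this only bounds how many actions are issued, not how precise or well-timed each one is, so it does not by itself make the bot's micro human-like.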


A problem I see with limiting APM to the same level as a human is that for that APM, a human will move units tightly packed together, while a computer will likely move a bunch of units at different ends of the map


The conclusion would be to patch the game's unit control with the new AI. That would augment human play and the implementation would need to be careful not to become the equivalent of auto-aim in first person shooters.



It is somewhat disturbing that the AI learned to creep-block on its own.


Where did you read that it learned "on its own"?

It was specifically trained to do that (https://blog.openai.com/more-on-dota-2/) and the devs made edits to the model to improve it (https://twitter.com/Smerity/status/897959521661759488)


That was from 2017 1v1 bot. OpenAI Five apparently learned it on its own. From: https://blog.openai.com/openai-five/

> Creep blocking can be learned from scratch. For 1v1, we learned creep blocking using traditional RL with a “creep block” reward. One of our team members left a 2v2 model training when he went on vacation (proposing to his now wife!), intending to see how much longer training would boost performance. To his surprise, the model had learned to creep block without any special guidance or reward.


It’s going to get even creepier for you then. I’d be very surprised if a decade from now any human player has a chance of beating a state of the art AI player in pvp games.


that's not creepy at all. there's many sports today where a machine specially built for that task would dominate any human from ever winning.

a baseball "player" that can hit a homerun on every swing. or throw a 150mph fastball.

a basketball "player" that can make every single shot. you dont even need to defend that well when youre shooting 99%

a "golfer" that can get a hole in one on every hole

emulating human movement and agility will keep a few more sports safe, but not for long. im just sitting here waiting for the robot olympics


You are really underestimating the difficulty in making humanoid robots that actually work well.

Sure, we can make a basketball-shooting machine with high accuracy, but one that can participate in a game, dribble, pass, etc.? Near impossible with today's technology.


It just needs to be tall, move with some speed, and be able to receive passes from other players; essentially an easier-to-hit hoop that can annoy the opposing team by getting in the way.

The requirement that it'd be humanoid is just an artificial restriction that reveals you are not really asking for the best basketball machine that can be made.


It'll have to follow the rules of basketball, though, which were written for human players and refer frequently to 'hands', 'feet', 'steps', 'arms', 'legs', and so on.


More interesting would be humanoid robots that could win at these sports. The state of the art today is a long ways away from being able to compete in any of the listed sports. Baseball is probably closest because with sufficiently good pitching and hitting you could win, which would avoid the much more dynamic problems of fielding, running bases, etc.


The first two, maybe, but there's no way we could build a golfing robot that gets a hole in one on every shot.


In an indoor field it's totally possible. Standard golf is affected by weather, though.


at the very least the machine will 1-putt from anywhere on the green


Reminds me of this woz quote an associate and I say to each other on occasion when creepy AI comes up: "The future is scary and very bad for people" which we shorten to "The future is scary and very bad." :)

https://www.washingtonpost.com/news/the-switch/wp/2015/03/24...


state of the art AI has been unbeatable in FPS since the first aimbot was created.


I'm interested to see how advanced AI can get at Spy Party


I would think SC2 is harder than Dota 2. In Dota you control 5 units while SC2 you can build hundreds.


Technically, in Dota you only control 1 unit. That being said, I don't know if OpenAI's algorithm had 5 independent agents controlling 1 unit each, or 1 agent controlling 5 units.


I believe it's five identical agents: https://blog.openai.com/openai-five/

But because the agents were all trained together, they kind of operate as a unit.

Coordination

OpenAI Five does not contain an explicit communication channel between the heroes’ neural networks. Teamwork is controlled by a hyperparameter we dubbed “team spirit”. Team spirit ranges from 0 to 1, putting a weight on how much each of OpenAI Five’s heroes should care about its individual reward function versus the average of the team’s reward functions. We anneal its value from 0 to 1 over training.
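
A minimal sketch of how such a "team spirit" weight might blend rewards, assuming a plain linear interpolation between a hero's own reward and the team average. The function name and the exact formula are illustrative assumptions, not taken from OpenAI's code:

```python
from statistics import mean


def blend_rewards(individual_rewards, team_spirit):
    """Blend each hero's own reward with the team average.

    team_spirit = 0 -> each hero sees only its own reward.
    team_spirit = 1 -> every hero sees the team average.
    The linear interpolation is an assumption made for illustration.
    """
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be in [0, 1]")
    team_average = mean(individual_rewards)
    return [
        (1.0 - team_spirit) * reward + team_spirit * team_average
        for reward in individual_rewards
    ]


# Five heroes; only the first and last earned anything this step.
rewards = [1.0, 0.0, 0.0, 0.0, 4.0]
print(blend_rewards(rewards, 0.0))  # purely selfish
print(blend_rewards(rewards, 0.5))  # halfway
print(blend_rewards(rewards, 1.0))  # purely team-oriented
```

Annealing the weight from 0 to 1, as the quote describes, would mean calling this with a `team_spirit` that grows over the course of training.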


It would be fun to see the AI players spamming the ingame chat to communicate with each other in its own language :)


A truly superb AI would guilt the support AI into buying wards by constantly whining about it in chat. It would also ping the minimap when one of its teammates makes a mistake. Finally, at the end of the match, it would pronounce "bad team" if it lost.

(I never played dota2, that's all I know about the game ;)


In the last model they trained, they shared part of a layer across all the agents. So it was like one agent trained with shared knowledge of the rest.


OpenAI Five had a separate agent per unit, no explicit coordination.


This is incorrect. They shared neural net activations which is explicit communication.


at the same time that could be seen as an advantage for the machine, as multitasking is much easier for a massively parallel computer not limited by muscle movement or vision than for us. we'll see. depends on the usual stuff like limited apm ofc


I wonder if they stuck with their original game playing API that takes in some "visual" features. In that case it would be "limited" by vision.


in dota there are 10 players, so you have to coordinate. apples to apples this would mean 5 independent bots per team that have to learn that. there are also summons to control and elaborate spell interactions.


I'm confused, I don't see anything new here. I looked at the youtube channel and don't see a demonstration...


It's not published yet. It's scheduled for Jan 24.


then maybe OP should've waited until Jan 24 to post this...


But then I wouldn't know to set a reminder to see this live.


> On January 24, at 19:00 CET, head over to...

You looked too early.


> On January 24, at 19:00 CET

Guess we have to wait


Jan 24th I guess is the publishing date


I wouldn't be surprised if an AI were good at StarCraft. If it's allowed to have a constantly high APM, an AI could do micro maneuvers that most people can't reasonably do, like consistent unit splits and fixing any weird unit pathing issues at bottlenecks.


I’m pretty sure they’ll limit the APM (or do something more fine grained since not all APM is equivalent). It’s already well known that bots can outmicro humans in certain aspects like splitting and focusing the correct units.


They limit APM under 200 or something around that number.


180 APM, specifically: pg6 https://arxiv.org/abs/1708.04782
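
For anyone wondering how a cap like that could be enforced on a bot, here is a minimal sketch using a sliding one-minute window. `ApmLimiter` is a hypothetical helper, not part of any DeepMind or Blizzard API, and the paper's actual mechanism may well differ:

```python
from collections import deque


class ApmLimiter:
    """Allow at most `max_apm` actions in any sliding 60-second window."""

    WINDOW_SECONDS = 60.0

    def __init__(self, max_apm=180):
        self.max_apm = max_apm
        self._timestamps = deque()  # times of recently accepted actions

    def try_act(self, now):
        """Return True and record the action if the cap allows it at `now` (seconds)."""
        # Forget actions that have fallen out of the one-minute window.
        while self._timestamps and now - self._timestamps[0] >= self.WINDOW_SECONDS:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_apm:
            return False  # over budget: the bot has to skip this action
        self._timestamps.append(now)
        return True
```

Note that a cap like this still lets the bot spend its whole budget in one burst, which is one reason a plain APM limit is not the same as human-like play.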


I have a hypothesis and I’m curious to hear responses. The ability of AI to work as a team with other AI has been shown to be relatively weak. They can do individual tasks well but are relatively agnostic to others (thinking of the Dota AI from a while ago).

Might this be improved by forcing the AIs to send messages to each other at a constant interval? I feel like this would lead to the development of a simple language built around intentions and help the AI to consider its allies.


I'm woefully unqualified to answer any AI-related questions, but I must ask: why do you think forcing bots to send messages to each other at a constant interval would lead to the development of a simple language and encourage bots to be more aware of each other? If you reward the behavior of writing in chat, bots would start writing gibberish in chat all the time.

Communication in animals can be traced all the way back to cells sending each other messages to organize themselves. Cells began to have division of labor and work together, and as multicellular organisms increased in complexity, so did the complexity in their communication. But what would happen if you put these cells in charge of playing Dota and forced them to send each other messages? Would the cells somehow develop a simple language built around intentions? Or would they still not understand the concept of intention?

Video games like Dota are extremely complicated and made specifically for human biases. We've obtained these biases over many many many generations of evolution. Machines are getting better at doing human tasks, but they still do not possess all the human biases (and I think it'd be quite scary if we actually get there). That's why these bots/algorithms/AIs can do simple tasks like identifying objects really well but they don't know what to do with them, yet.

Having machines engage in meaningful conversations with other machines and plan together is a much bigger problem to solve than having them farm up, buy items, and attack things.


I don’t think you understood my point. I’m not saying train the robots to say things by rewarding them for doing so. I’m saying “force” them to do so as a hard coded rule. And also force them to read the chats of the other AI allies as an input. Continuously train the model to map the messages into an embedding space and use the embedding for decision making.

This would likely benefit from some cherry-picked training data, but I think AIs could get good at communicating their state to each other. StarCraft has too much entropy to make a simple example. Imagine the game was just this: you pick a random number and your ally has to guess what it is. I think AIs could learn to communicate this using a similar training design.
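
To make the number-guessing game concrete, here is a toy sketch. It swaps the embedding model for something much simpler: tabular Roth-Erev (urn) reinforcement in a Lewis signaling game. The sender is forced to emit one message per round and the receiver is forced to use it as its only input. All names are made up for illustration, and nothing here says the idea scales to StarCraft:

```python
import random


def train_signaling_game(num_values=2, rounds=5000, seed=0):
    """Train a sender and a receiver with Roth-Erev urn learning."""
    rng = random.Random(seed)
    options = list(range(num_values))
    # sender_weights[value][message] and receiver_weights[message][guess]
    sender_weights = [[1.0] * num_values for _ in options]
    receiver_weights = [[1.0] * num_values for _ in options]
    for _ in range(rounds):
        value = rng.randrange(num_values)  # the secret number
        # Hard-coded rule: the sender must always send some message.
        message = rng.choices(options, weights=sender_weights[value])[0]
        # The receiver sees only the message, never the value.
        guess = rng.choices(options, weights=receiver_weights[message])[0]
        if guess == value:  # shared reward reinforces both choices
            sender_weights[value][message] += 1.0
            receiver_weights[message][guess] += 1.0
    return sender_weights, receiver_weights


def greedy_accuracy(sender_weights, receiver_weights):
    """Fraction of values guessed correctly when both sides act greedily."""
    correct = 0
    for value, message_row in enumerate(sender_weights):
        message = message_row.index(max(message_row))
        guess_row = receiver_weights[message]
        if guess_row.index(max(guess_row)) == value:
            correct += 1
    return correct / len(sender_weights)


sender, receiver = train_signaling_game()
print(greedy_accuracy(sender, receiver))
```

The messages start out meaningless; any "language" that shows up is just whatever mapping happened to get rewarded.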


I recently got my ass beat by a player that had a 400+ APM. Is that even humanly possible or is it one of these bots?


Most of those APM aren't actually doing anything productive, they are spamming build unit buttons or repeating a command order. It's really common in Starcraft for people to press lots of buttons they don't strictly need to in order to "keep warm".


Pro players have such APM. Even higher for most intense periods in the game.

It's also possible to cheat by spamming the same commands over and over to look "like a pro" in the stats and some people do this.


Definitely possible. A progamer like Serral for example can maintain 500+ apm on 20+ min games, and can peak to 700+ apm.


70 real actions and 630 useless clicks.


APM accounting in SC2 is dumb and broken for Zerg: if you're building 100 zerglings, you select all of your hatcheries with one action, select your larvae with one action, and then hold down Z for 100 actions in a split second.

It's not unreasonable to have 200+ real APM in bursts, though. For example, at the beginning of games in Brood War, pretty much anyone competent is going to do something along the lines of the following in about 3 seconds:

  click command center
  build scv
  select all scvs
  mine a mineral patch
  select a scv
  mine a different mineral patch
  select a scv
  mine a different mineral patch
  select a scv
  mine a different mineral patch
  click command center
  set command center hotkey
  set command center rally point
  set camera hotkey
which is 14 actions in about 3 seconds, or 280 APM. You'll see similar bursts during macro cycles, which are generally just (click building, press M) repeated 10 or so times in a short period.
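
The arithmetic, as a quick sketch (the function name is made up; the action list just mirrors the opening above):

```python
def burst_apm(action_count, seconds):
    """Convert a burst of actions into an actions-per-minute rate."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return action_count * 60 / seconds


opening_actions = [
    "click command center", "build scv", "select all scvs",
    "mine a mineral patch", "select a scv", "mine a different mineral patch",
    "select a scv", "mine a different mineral patch", "select a scv",
    "mine a different mineral patch", "click command center",
    "set command center hotkey", "set command center rally point",
    "set camera hotkey",
]

print(burst_apm(len(opening_actions), 3))  # 14 actions in 3 s -> 280.0
```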


It's more like 300 or 400 real actions during the heavy, action-packed moments. It's just insane. I doubt I can even click a mouse that fast.


This comment makes me think you have never played starcraft.


Have you? It's "easy" to get an APM over 1,000:

Step 1: Hotkey units on 1-9

Step 2: Slam face on keyboard numpad

Holding down a hotkey to build things (even if you don't have the minerals) also increases your APM.

Lots of players (especially the pros) will spam actions in the quieter parts of the early game to keep their fingers warmed up, or to start building a unit the millisecond they have enough minerals to afford it.


I play a ton of SC2 and he is mostly right.


I've reddit[1] as programmer and it's terrifying to think somebody can bang out code at that speed for 20+ min beside you.

[1] pun intended


I've seen a super meat boy player say across consoles and pcs it's roughly 6 button presses a second (not apm) that most pros can handle. On a mouse you can go higher, but the real skill is making those presses useful.


As though StarCraft multiplayer wasn't difficult enough!

In all seriousness, this is pretty neat. Does anyone know if it already knows everything on the map, or if it has to discover units like the rest of us?


It has full fog of war and has to use the camera like any human does. As of November at Blizzcon, apparently it wasn't doing enough exploring: http://starcraft.blizzplanet.com/blog/comments/blizzcon-2018... https://www.youtube.com/watch?v=IzUA8n_fczU&t=1360

More detailed discussions at https://www.reddit.com/r/reinforcementlearning/comments/aioc... & https://www.reddit.com/r/MachineLearning/comments/aip7vu/d_d...


> As though StarCraft multiplayer wasn't difficult enough!

But it just won't be the same without someone telling me that I'm awful and my mother is of questionable morals.

... machine learning will probably pick that up too.


Personally I've found that Starcraft is one of the less salty/bm games out there. At least in diamond 1v1, people doing "gl hf" at the beginning and "gg" when they lose is common. And you don't have much time to whine in practice because there's no downtime until either you or them are basically dead.


The previous attempts at this would plug in the AI to a regular player interface. Tournaments would also impose APM limits to prevent inhuman rate of actions.


Does APM alone somehow win games? If that's the case this AI stuff is trivial.


It makes a huge difference. In this video, 20 tanks kill 100 zerglings and lose only two tanks. With super-human micro, the results are... quite different. https://www.youtube.com/watch?v=IKVFZ28ybQs


Reaction time is the real difference in this video. A human can't react to the twitch of the siege tank cannon that happens just an instant before the shot lands, we treat it as something that happens instantly and thus, we can't react to it, not even with a single unit. If you apply a human reaction time to that AI, that behavior isn't possible anymore, even with perfect micro.
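
One way to "apply a human reaction time" to a bot is to feed it stale observations. A minimal sketch, assuming a hypothetical `DelayedObservations` buffer and a delay measured in game frames. None of these names come from the SC2 API:

```python
from collections import deque


class DelayedObservations:
    """Hand the agent each observation only `delay_frames` frames after it happened."""

    def __init__(self, delay_frames):
        if delay_frames < 0:
            raise ValueError("delay_frames must be non-negative")
        self._delay = delay_frames
        self._buffer = deque()

    def push(self, observation):
        """Store the newest frame; return the frame the agent may see now, or None."""
        self._buffer.append(observation)
        if len(self._buffer) > self._delay:
            return self._buffer.popleft()
        return None  # nothing is old enough to be visible yet


# With a 2-frame delay, the agent reacts to frame 0 only when frame 2 arrives.
delayed = DelayedObservations(delay_frames=2)
for frame in range(4):
    print(frame, delayed.push(frame))
```

If the cannon twitch and the shot landing are closer together than the delay, the bot sees the twitch too late to dodge, no matter how good its micro is.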


The SC community seriously obsesses over APM, but I'm not entirely sure if it's correlation or causation. I personally think it's overblown and became popular because it's one of the few things you can easily measure and compare between players.


Well, ignoring spam clicks, APM is a measure of the number of decisions that a player makes throughout the course of the game. You'd expect a player making more decisions to come out ahead in a game as fast paced and action oriented as Starcraft.


It's clearly correlation. A pro will still play at a high level with mouse only, or even keyboard only.

Example keyboard only game: https://youtu.be/cY0d2kHzrxY


it's largely overvalued, but you probably can't be a pro without at least ~150 APM (about 2.5 actions per second), increasing as the game goes on.


I hope the paper release on Jan 24 will include such details.


There may not be any papers. They only announced a YouTube livestream of a 'demonstration'.


It has partial observability constraints.


I couldn't find an actual video of a game.


The demo is tomorrow on Twitch and Youtube (24th Jan)


In other news water is wet


tl;dr: wait for jan 24th

I really hate marketing pre-news...


Everyone likes sci-fi. Star Wars, Halo, Star Trek, etc.: all these franchises have something in common: they depict AIs working alongside humans. I think that it’s an important demonstration of cognitive dissonance about AI. Fans of these franchises aren’t bothered by the glaring contradictions that some fictional elements create. For example, Star Trek features transporters that teleport people across space, and in reality the existence of such a machine would change the economics of many different things very drastically. Life would look a lot different if teleportation were allowed to affect the economics of human life. Another example is communication technology. Star Trek had what we now call smartphones. But they still lived their lives the way people did in the 80s. We now know that when smartphones are present, people stare at them all the time, don’t go out much, are never in trouble because they can’t communicate (car broken down), etc. It’s not the best example but it’s the only one that actually played out irl. My point here is that people in general do not seem to appreciate the changes in economics that are prompted by new things; people are only able to imagine some new technology being used in some niche case. It doesn’t bother anyone, or break their suspension of disbelief, when some groundbreaking new technology completely fails to alter the world and society in a realistic way in a fictional setting. But any other breach of “realism” is somehow very offensive.

In halo, you have an AI that lives in your power armor and her name is Cortana. Cortana can sense what you see, hear, etc. Cortana is seen to be super smart on multiple occasions and it is clear to the player. This doesn’t bother anyone. It bothers me now, though. If all of this is true, the following would happen.

Cortana senses the presence of enemies faster than you. She hears them and is able to identify how many there are based on their footsteps. She becomes annoyed with how sloppy you are, how many enemies you didn’t even notice and how slow you are to respond to them. She suggests that she take over your power suit. You find yourself in a situation you can’t handle and are forced to give her the reins. She jumps into action, dispatching enemies with speed and accuracy that you’ve never seen before. She shoots enemies as they step out from cover. Your movements are so fast that you vomit. She wipes out army after army as you passively watch, bewildered.

As you go through the game, you are on a mission to unravel the mysteries of halo and find a way to survive. Cortana assesses the battle space and concludes that you need to travel east. She notices several patterns of ancient runes and directs you to investigate a hidden passage. Her assessments seem incredible and bizarre but they consistently prove to be correct and eventually she ends up making every decision.

At the end of the day, if Halo were real then it would be a story about a man who passively sat inside a suit and watched an ai save the world. He probably would die in the suit at some point due to experiencing high g forces during some adventurous fight or maneuver and it would be a story about an ai who saved the world with a dead guy in the suit. It doesn’t make you feel warm and fuzzy like halo is supposed to, does it?


> It doesn’t make you feel warm and fuzzy like halo is supposed to, does it?

I take it you haven’t played Halo Reach :)


Wrong thread?


My comment is about ai. The thread is for an ai post.


It's not about an AI in the sci-fi sense though. It is about an actual real world AI playing the game StarCraft II.


My comment is about ai. It is about how people perceive ai, and sci-fi is a very good demonstration of how incorrect human intuition can be when it comes to ai. This incorrectness is present not only in stories about ai, but policy and opinions regarding the very real and dangerous developments that are taking place. Developments such as the one that this thread regards.

It is therefore only through deep, contemplative and unintuitive ruminations that one can come to understand that AI is going to end human society as we know it, and that the result will be very unpleasant. This is not a guess, it is the truth. People have a huge amount of difficulty seeing this. I thought the example of halo and Cortana would make it easier for people to understand. I could have written it better but why bother? Nobody ever seems to get it.



