DeepMind StarCraft II Demonstration [video] (twitch.tv)
163 points by Permit 22 days ago | 116 comments

I was left unconvinced (reproducing my comment from reddit):

AlphaStar had 0 cost to sense the entire map simultaneously, and when they introduced the attention/context switching cost (camera control) in the showmatch, it lost.

Also very notable for me in the showmatch was that it seemed completely blind to some strategies and exploitable by them. When AlphaGo played top players, it made a few mistakes, but there wasn't anything obvious that it just couldn't see. Here it simply couldn't think of making Phoenixes against the Warp Prism harass, which shows that strategically it isn't near human level yet. It could clearly be exploited by back-and-forth harassment too (probably due to the limited memory those networks have).

Finally, DeepMind kept emphasizing average APM, when it clearly reached totally superhuman levels at times -- even a top professional can't execute 900+ flawless APM in a battle like the ones we saw.

David Silver was clearly expecting the showmatch to be totally one-sided (hence his remark that 'this is another historic victory for AI'), but I was left with the opposite impression: that strategically, top humans are still ahead in this game. This is not the end of the game for humans yet!

The original AlphaGo showing (vs. Fan Hui) had obvious problems. They weren't catastrophic (it was clearly playing at a professional level stronger than 1d), but it also wasn't obviously superhuman. Lee Sedol could reasonably have expected to win his matches without being arrogant about it.

The gap between that AlphaGo and the AlphaGo that beat Lee Sedol was probably something like 10 years of human improvement compressed into a couple of months. 500 Elo is huge.
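To put a number on that claim, here's a back-of-the-envelope sketch (my own, not from DeepMind) using the standard logistic Elo model, which puts a 500-point favorite at roughly a 95% expected score:

```python
def elo_expected_score(rating_diff):
    # Standard logistic Elo model: expected score of the higher-rated
    # player, given the rating difference in their favor.
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400.0))

print(round(elo_expected_score(500), 3))  # roughly 0.947
```

In other words, the later AlphaGo would be expected to win about 19 games out of 20 against the version that played Fan Hui.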

Even if they still have further to go, comparing this to AlphaGo's progress is reasonable. I'm going to live dangerously and make an assumption - you probably understand Starcraft better than Go, so can see the flaws more easily. Any mistake in Go comes back to something the AI just didn't see.

Yeah, stepping back a little, I agree it's a fundamental achievement. It's all about expectations -- I was expecting a Lee Sedol kind of event, and it was more of a Fan Hui (or a bit further) kind of demonstration. A milestone for sure, considering how difficult the problem is, with various previously unconquered subchallenges (uncertainty, partial observability, massive control space) -- different from, but in some ways harder than, OpenAI's Dota 2 matches.

> I'm going to live dangerously and make an assumption - you probably understand Starcraft better than Go, so can see the flaws more easily

Correct :) Although I'm not great at either.

This comes closer to opening doors to real uncharted territory in robotics and general agency; it trained in crazy low wall-clock time. But now that we're this far, it makes me question the cost, flexibility, and adaptability of the AI less forgivingly. How much did it actually cost to train? Would it be economical to train one of these for every single task (e.g. in a factory or warehouse setting)? Would it be vulnerable to those exploitable strategic weaknesses in a real-world setting? Etc.

It's definitely not over yet as far as SC2 goes though :)

StarCraft is also a FAR bigger challenge than Go. So much so that even saying AlphaStar is similar to AlphaGo is quite a stretch (see https://deepmind.com/blog/alphastar-mastering-real-time-stra... - the architecture and, to some extent, the training regime are very different). I'd be surprised if they can learn the macro without imitation, unlike in Go; but we'll see. Would sure be nice if they also released the paper...

I'm always partial to a good argument for being sceptical of such announcements (I'm a professional skeptic; "researchers", some people call us), but in this case the question is whether it is ever possible to make the comparison fair at all. For example, we can restrict the AI player to perceive the game world in the same way (ish) as the human player, but we can't really make the human player see the game world in the same way as the AI player. So if the AI player has an inherent advantage because of its superior perception, there's just no way to level the field and compare the human to the AI in a fair manner. Perhaps we just have to accept that a human will never have an advantage against a computer player in a computer game, or that computer games are just not very interesting testbeds for AI after all (that would be very controversial).

On the other hand, it's also useful to remember that previous attempts have been made to develop strong Starcraft players, and they all had access to the same API as the DeepMind player, yet they didn't perform as well. Of course it makes a difference that it's DeepMind who achieved this, and they seem to have spared no costs in hardware and training time, unavailable to a smaller team. Perhaps we shouldn't be that surprised to learn that a powerful computer can do better than a human in some cases. But, at the end of the day, the result they got against MaNa shows that it is possible for an AI player to beat a strong human player (fairly or unfairly) in a game that was rightly considered very hard for AI players, under any assumption.

A fair comparison is simple: make the AI play the game using a standard screen, mouse and keyboard (the game was designed to be played this way after all, and non-standard peripherals are not allowed in tournaments by the way). It suddenly seems WAY more challenging for the machine, doesn't it? You could also invent some contraption that interfaces directly with the human's brain to even out on the other front, but this sounds even more futuristic.

The average APM limitation is just a publicity stunt. There are similar caps one could apply that would make the challenge much fairer. For instance: a max (instead of average) APM cap, or limiting mouse travel distance (no one can get even remotely close to clicking 300 times/minute on alternate sides of the screen with pixel-perfect accuracy AND timing).
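To make the average-vs-max distinction concrete, here's a toy sketch (names and numbers are my own, not DeepMind's): an average cap over a whole game still permits huge bursts, while a hypothetical rolling-window max cap would not.

```python
from collections import deque

def within_average_cap(total_actions, game_minutes, cap):
    # An average cap only constrains the whole-game mean, so a 900-APM
    # burst is fine as long as quieter stretches compensate.
    return total_actions / game_minutes <= cap

class RollingAPMLimiter:
    """Hypothetical max-APM cap: at most `cap` actions in any 60 s window."""
    def __init__(self, cap):
        self.cap = cap
        self.times = deque()  # timestamps of recently allowed actions

    def allow(self, t):
        while self.times and t - self.times[0] >= 60.0:
            self.times.popleft()  # drop actions that left the window
        if len(self.times) < self.cap:
            self.times.append(t)
            return True
        return False

# A 20-minute game at an average cap of 180 APM allows 3600 actions total;
# nothing stops 900 of them from landing in a single battle minute.
print(within_average_cap(3600, 20, 180))  # True
```

A per-window limiter like the second one would actually rule out the 900+ APM spikes seen in fights.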

Finally, the AI only plays 1 of the 6 possible match-ups, on 1 of infinitely many possible maps, and the live game showed severe strategic mistakes (mainly the indecision when getting harassed, and the inability to either build a Phoenix to counter the Prism or keep harassing with its own Oracles instead of stupidly watching over the Prism with them). It still has a long way to go before I'd consider it "superior" to humans in intelligence (not just superhuman control).

All in all, this is still an impressive achievement though!

Very impressive... but it seems like the AI relies entirely on abusing Blink Stalkers, which with perfect micro are basically impossible to counter. It's no surprise it can crush pros when it has perfect timing and makes zero mistakes using these units.

I think the coolest thing is how the AI's play mirrors the style pros have developed (micro harassing early, early expansions, a very good understanding of when to attack/retreat). It looks just like your average pro... until you see its god-tier micro.

Well, after seeing Mana crush it in the live game it seems the AI had zero clue as to what to do... it seems like it calculated it couldn't win a fight so let Mana destroy its entire base. So, just like with the Dota AI we see pros can exploit and win easily once they play around the micro advantages.

It had a variety of strategies, not always making a lot of stalkers. And humans can do some really impressive blink micro too up to a certain number of units.

So that's one aspect of it but isn't the main strength.

Those were not one and the same agent. If you look at the figure they released on their website the given agent would probably always have gone for a lot of blink stalkers.

I was amused to see on their website that one of the top-rated agents we didn't get to see built almost nothing but Void Rays.

We're talking about the whole package of the AI, what the entire system is capable of, so that technical detail doesn't affect kmnc's point or mine.

The biggest effect is that it's harder to change strategy midgame, which is not all that critical.

It makes the trained agent much less like a human; individual agents are not creative, they've just learned to execute a certain build order. It might turn out that this is enough when coupled with flawless execution and micro.

"A human agent isn't creative, it just has a set of initial build orders it developed and picks one before the game starts, and after that it adapts to the situation weighted by the units it prefers."

Also the execution of the build order is nowhere near flawless.

As the commentators mentioned, it's no use building units to counter your enemy's army (Immortals over Stalkers) when the enemy can control their army so much more effectively.

I have to wonder if future competitive games will need to take into account the abilities of reinforcement learning algorithms when releasing balance patches.

But there IS use building units to counter your enemy's army. In the last live match when Mana won, his immortal archon zealot composition was what sealed the deal in the end.

I mean, the AI in that last match couldn't see the whole map at once.

Fog of War was enabled in all cases, but the last match was them enabling the "Scrolling window" that humans are forced to look at the game with.

The AI in all of the other games could see and control all of its own units on the map simultaneously. No human has this ability due to the limitation of the screen.

But they did say that the way the AI focused was roughly equivalent to moving the screen every couple seconds, not very different from pros, and that adding the limit didn't affect its performance against the old version.

And his far superior positioning.

Stuff like that is why StarCraft is not a very good game to test AI on. It's good for publicity, because it is well-known and has a pro scene so you can claim to "beat humans", but it's too complex in some ways (ruleset) and too simplistic in others (mostly just killing stuff). APM and micromanagement are huge, while long-term strategies are fairly limited compared to many other strategy games.

With all of this, it's really hard to see how much/which parts of the game the AI really "understands".

I'd love to see an AI that can play MAX or MoO from raw pixel inputs, even if it sucks at it.

There are limitations they put on the AI to try to restrict it to human levels, such as an action counter. And in the demonstration they showed today, they actually limited the information the AI receives to the screen space, which is probably along the lines of what you were wanting.

Their action-counter rationale seemed fallacious: allowing 300 actions per minute sounds like it would give the bot an edge, since the bot presumably has a much better ratio of meaningful actions to total actions.

>There are limitations they put on the AI to try to restrict it to human levels, such as an action counter.

Which is exactly why StarCraft is not a very good game to test AI on. It's absurd to put arbitrary limitations on something to make the game "fair" and then pat yourself on the back simply because the algorithm won. If it can already win through pure micromanagement, why not pick a game where that isn't possible?

There are tons of strategy games which don't revolve around micromanagement where APM simply doesn't matter. All turn-based games, for example. Or real-time games where building stuff is more important than combat.

> All turn-based games, for example.

You mean like chess? And go?

I think turning to a real time game with a complex rule set after showing they mastered turn based games with simple rule sets was very sensible.

> Or real-time games where building stuff is more important than combat.

Can you name one that is played professionally (important for balance and comparison to humans) where this is more true than starcraft 2? I think of starcraft 2 as very macro focused as games go.

I used to be ~80th percentile in North America (worst region) and I'm certain at the time any pro could have beat me without clicking anything outside of their own base (except the minimap).

>You mean like chess? And go?

What a clever reply. No, like the ones I named in the post above.

>Can you name one that is played professionally (important for balance and comparison to humans)

If you pause to think about it, things that make for a "good" pro scene are exactly the things that make computer games less than impressive for innovative AI research.

It looks like a human/AI centaur[1] would far exceed the abilities of either alone. In chess that lasted for about a decade, before the human became superfluous. In Go it seems to have happened already? But given the complexity of Starcraft I expect it to take longer than Go.


>I think the coolest thing is how the play of AI mirrors a similar style to how pros have developed

I mean, the agents have learned the game from pro replays so to me it seems obvious it would evolve to play the same strats human pros use.

My understanding is that AlphaStar is trained with reinforcement learning. I suspect this is unsupervised, so it would have learned this behaviour independently, without pro replays.

Per DeepMind's blog post[1], an agent was initially trained via supervised learning on pro matches. Then, the agent was forked repeatedly, as the population of agents learned via tournament-style self-play. So, while initial strategies could have been seeded by pro play styles, the final models were the result of models learning from games with other models.

[1] https://deepmind.com/blog/alphastar-mastering-real-time-stra...
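A toy sketch of that scheme (my own simplification; `train_step` is a stand-in for the actual RL updates, which the blog post doesn't spell out):

```python
import copy
import random

def train_league(seed_agent, generations=3, forks=2, train_step=None):
    # Start from one imitation-learned agent, then repeatedly fork members
    # of the league and train each fork against opponents sampled from it.
    league = [seed_agent]
    for _ in range(generations):
        new_agents = [copy.deepcopy(random.choice(league)) for _ in range(forks)]
        for agent in new_agents:
            opponent = random.choice(league)  # self-play vs. past versions
            if train_step is not None:
                train_step(agent, opponent)
            league.append(agent)
    return league

# 3 generations of 2 forks each grow the league from 1 agent to 7.
print(len(train_league({"weights": 0})))  # 7
```

The key property is that old forks stay in the league, so later agents keep being tested against earlier strategies rather than only the latest one.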

Meaning that the final models evolved by playing against each other, and they all started from pro strategies; so, again, it seems kind of obvious to me that they would end up using pro strategies and, at best, just try to improve on them.

Ya, that one game was a bit... 'fresh'. But saying that it relies 'entirely' on abusing blink stalkers isn't quite right.

It did it in 1 game of the 5 they were showing.

I only had a chance to see the live game, but very impressive stuff, especially in the early game! However, it does seem to have the same issue as all other AIs in that it is inflexible and fails to adapt to unusual situations - As the commentators pointed out, it kept building oracles when being constantly harassed by an immortal drop, whereas any amateur player would be able to react more effectively by building a single phoenix.

It also fails to recognize patterns - MaNa was able to find and repeatedly abuse a border between two strategies: whenever he was not actively dropping, AlphaStar would attempt to move out and push, then immediately retreat to defend when being dropped, whereas a human would recognize the reoccurring theme and either keep pushing or stay in their base.

While not groundbreaking, still exciting to see an AI that can hold its ground against pro players - it certainly demonstrates the potential of machine learning for constrained problem spaces.

Here is another link (YouTube) that allows you to go back in time if you've missed anything: https://www.youtube.com/watch?v=cUTMhmVh1qs

The live exhibition match definitely made Alphastar look like a machine making the decisions, not a super smart being. The micro was obviously impressive but that should also be the easiest part to master. I share the sentiment that DeepMind is hosting big events to paint a very one-sided picture of man vs machine, so this last win of mana feels oddly satisfying.

It is a little disappointing that Mana's win was achieved in part by simply exploiting Alphastar's poor response to the immortal drops by doing it over and over again. In contrast, Lee Sedol's win versus AlphaGo involved profound strategy and a particularly inspired "divine move" that humans get to brag about.

On the one hand, it feels a bit cheap. On the other, getting ahead due to micro but losing due to lack of pattern recognition and problem solving seems like a rather complete demonstration of both the strengths and weaknesses of current AI.

I agree that this makes Lee Sedol's one win against AlphaGo even more impressive. But DeepMind also felt ready to host this event, and it chose European pros to compete with, not Asian pros, who arguably have better micro and innovate a lot of the new meta.

TLO is to Starcraft* as Fan Hui is to Go. They are strong players but not the best. I am guessing that DeepMind knows their bot isn't ready to take on the very top players, and wanted to show off against a well-known personality in the Starcraft II scene.

Incidentally, though, the strongest Starcraft II player currently is arguably Serral [1], who is Finnish. He beat the strongest Korean players to win the WCS World Championship last year.

[1] https://liquipedia.net/starcraft2/Serral

* at least at TLO's skill level when offracing as Protoss. With his main race, Zerg, he is very strong, ranked around 70th-80th in the world.

Lee Sedol's "divine move" actually doesn't work; even top human players would have punished it.

It was really just a weird mistake by AlphaGo.

As they say... there are no good moves in Go/chess/zero-sum games etc. There are only bad moves.

That doesn't prevent professional players such as Gu Li 9p from describing it as a divine move though. And it does feel good for Team Humanity.

Thanks for clarifying that, never really got that deep into Go.

They limited the APM to 180 for AlphaStar's input I think. An average professional player consistently plays at 200-300 APM with peak levels generally going to 350-400 APM during engagements.

To me, it seems like AlphaStar is actually at a disadvantage in its ability to micro, and has to make up for this in its macro level strategy.

Most human APM is wasted on "spam clicks". The effective APM, or EPM, is probably quite a bit higher for AlphaStar than for human pros.
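One crude way to illustrate the APM/EPM gap (a toy heuristic of my own, not how the game or stats sites actually compute EPM): drop immediate repeats of the same command before counting.

```python
def effective_apm(actions, game_minutes):
    # Toy EPM estimate: treat an immediate repeat of the same command
    # as a spam click and drop it before counting.
    effective = [a for i, a in enumerate(actions)
                 if i == 0 or a != actions[i - 1]]
    return len(effective) / game_minutes

# Four raw actions in one minute, but only two distinct commands:
# raw APM is 4, toy EPM is 2.
print(effective_apm(["move", "move", "move", "attack"], 1.0))  # 2.0
```

For a human pro spamming clicks, EPM lands well below APM; for AlphaStar, which emits almost no redundant actions, the two numbers would be nearly identical.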

TLO and MaNa are in the top 0.01%. They are the best of the best, and then some. The numbers I stated were for the average professional.

When you have geniuses like TLO and MaNa, average APM is more on the level of 300-400 consistent and 500+ in engagements. You're definitely correct that APM is not 100% effectively utilized. I would say maybe around ~40% of actions could generally be considered spam clicks. That still puts AlphaStar at a definite disadvantage in its ability to enact micro level strategy.

Two additional points, aside from APM effectiveness:

- The APM constraint on AlphaStar seems to be on average APM (or some variation thereof), while it reached 900+ APM in battles -- again with 100% effectiveness. That's insane and definitely beyond human capacity.

- It didn't waste APM in the early game (when humans are warming up and spamming APM), further leaving slack for when engagements came.

I mean, when gauging practical applications, we shouldn't be too concerned about good reflexes and superhuman effectiveness; however, part of the point here was to gauge other aspects of cognition more related to strategy, large decision spaces, etc., and in that setting it makes sense to constrain the non-strategic (more mechanical/tactical) aspects. I'm not sure that constraining humans to <300 APM, for example, would change player style/effectiveness that much, while e.g. the Blink Stalker micro displayed in game 4 vs. MaNa really relied on massive parallel single-unit/group control (blinking every single low-health Stalker individually, engaging from multiple sides simultaneously, etc.).

Wow, AlphaStar was able to go 5-0 vs. TLO (when he was offracing) and 5-0 vs. MaNa in an impressive manner. I am quite impressed that DeepMind has been able to take down Go and now Starcraft. From here on out, if DeepMind makes an announcement about a demonstration of X, I expect them to beat the top humans at X.

Edit: In the live game Mana is able to beat Alphastar. Alphastar was trained with a more limited camera than the previous games. Mana was able to harass Alphastar with a warp prism then push in and win.

TLO and Mana are good players, but I'm not sure they fulfill your criterion of top humans at X.

According to a site [1] Mana is 19th. While that is probably overly high, I would say he is easily in the top 200 Starcraft 2 players in the world. I would also say that to get from beating a top 200 human to beating all humans is much smaller than from scratch to beating a top 200 human.

[1] https://www.gosugamers.net/starcraft2/rankings

Aligulac's had Mana around 50th in the world consistently for a recent 12 month period, so I think that's a more reasonable claim, and I'm happy saying that the 50th best player is a "top player in the world".

Aligulac is not that good; the current top 10 is highly debatable. I'd argue the only thing the current top 10 gets right is Maru/Serral as the top 2, though Maru being first is a stretch, as he hasn't beaten Serral and was easily countered at Blizzcon by simple strategies. I'd agree with him as #1 if he had any flexibility beyond just cheesing his opponents out (see the Keen Ro16 matches, sOs, numerous proxies vs. TY, etc.). Maru's a fantastic player, but Aligulac has no way to judge his strategic inflexibility.

Is Mana the best Protoss player in Europe?

Second-best, probably. Best Protoss players, approximated from older Aligulac data:

1. Classic (Korea)

2. Zest (Korea)

3. herO (Korea)

4. Stats (Korea)

5. Neeb (USA)

6. Dear (Korea)

7. Trap (Korea)

8. ShoWTimE (Europe)

9. MaNa (Europe)

He had the 2nd-most WCS points of all European Protoss in 2018, with ShoWTimE having vastly more: https://liquipedia.net/starcraft2/Main_Page

I am somewhat disappointed that they stream only hand-picked replays. For a big, announced presentation, they should be at a stage to stream live games (like OpenAI did with Dota).

However, apart from that: great work by DeepMind, and I am excited to follow the progress in this area.

They're streaming a single live game right now! They explained that the research build of the game being used doesn't have an "observer" mode so it's not quite suitable for streaming live games — and, in fact, the games that they were showing were played over a month ago. (The current live game is being shown from the human player's point of view, and as such is a bit… dizzying.)

They just announced there will be live streamed game(s) after Mana's series.

> For a big, announced presentation, they should be at a stage to stream live games (like OpenAI did with Dota).

This is such an inane complaint. They didn't do it all live because they have results they want to show and don't have observer mode because they've been focusing on research and don't want to delay things.

They said all replays will be posted online.

Apparently there is going to be a live exhibition match after showing the replays of the matches against TLO and Mana.

Some of the commentary shows that the UI is a barrier. The fact that we can understand the strategies but cannot physically make them happen shows that at least some of the advantage is just the precision of the inputs.

Could be a new way to play SC2-like games where we can better communicate our intentions to the game. For example, a type of move action where the Stalkers automatically retreat and stop to fire, instead of having to issue a move, attack, move, attack series of inputs.

This thought process (of the UI limiting the player) is a valid point when we are talking about Human vs. AI strength and weaknesses in RTS games. However, when pitting one human player against another, the UI limitation actually adds strategic depth to the game. Unlike pure strategy games like chess, in Starcraft when you have to move, attack, move, attack, it puts a physical burden on the player. This physical burden to work around the limited UI allows one player to outperform the other on both a strategic and physical playing field, in real time. If both players were freed of their physical shackles, it would become a much more boring display of pure strategy, which eventually would be "solved" and the game would cease to be fun.

If you look at the difference between Starcraft 1 and Starcraft 2, the two most significant UI changes are the ability to select more than 12 units at a time, and the ability to "multi-cast" spells in sequence when more than 1 spellcaster is selected. These two small UI improvements greatly reduce the skill gap between an average player and a "pro" player in Starcraft 2, to such a degree that many Starcraft 2 pro players went back to Starcraft 1, where their fast reflexes give them a much greater advantage against other players due to the limited UI. So an improvement to the UI would help human players beat AI, but it takes a strategic element out of the game when considering human vs. human.

I agree there are pros and cons to the physical mechanic. For example misclicks become a part of the game, whereas the RNG would be fair to each player about errors in preprogrammed walk/attack patterns.

It's really just a matter of the game designers deciding what types of skills they want the game to require. If you want to reward micromanagement skills, then remove aids like that. If you want to reward strategy, then throw in all the micro aids.

Personally, I think Starcraft 2 has chosen a near perfect balance, although I recognize that it leans more toward micromanagement than pretty much every competitive RTS game other than its predecessor. Coming from Brood War, I can say "you should be happy even being able to select huge numbers of units and hotkey groups of buildings." :)

I was just daydreaming the other day about an RTS where the UI controls and inputs are fully customizable by the users, for this exact reason. There could be even a community repository where the best known custom UIs are available.

The best custom UI would have one button, "Launch DeepMind AI and let it win the game".

The intermediate UIs would be along the continuum between full micro and this.

Maybe developing these AI/UIs would be fun (until Google/DeepMind crushes you), but the game itself?

This is mad impressive, for sure. With most AI problems, I can at least comprehend the approach, and the necessary combination of models. Not so much here.

My first question is what is the input to the AI? Is it the raw pixel array of the display? Or does it get API-level readouts of what’s happening? Because implementing the CV just to segment the display output in real time is crazy enough. I would assume the latter.

I think this basically proves that any problem that can be exhaustively simulated is solvable now. This may mark a tipping point: as every problem for which simulations exist (essentially infinite labels) gets solved, the balance will tip back toward making faster and more accurate sims (think multi-scale, first-principles physics stuff).

Blizzard released a client with an output accessible to machines, but still preserving fog of war. See here:


Here's what I want to know, were these agents developed from scratch a la AlphaZero in chess, or did they have to create a number of abstractions in order to get the AI to start learning the game? In the initial demonstration they could hardly get the AI to mine minerals or do anything. How did they make the jump to actually good play?

They mentioned that they initially used imitation learning on human replays.

Open source API. Please re-watch the start of the presentation.

The AI makes some awful decisions, such as building five observers. This calls into question its "understanding" of the game. It looks like a lot of its ability comes from micro, which it's unsurprising a computer can do better than a human.

This is impressive, but with a lot of caveats. DeepMind's work on chess and go was impressive with no caveats whatsoever.

I do believe the live match was with a newer version of the agent that hadn't been tested against humans yet.

I'm not proficient in Starcraft, but if they had to make pros play against 5 different agents for a Bo5, is it because the same agent would merely repeat the same game overall, and the human player would be able to see through its strategy?

AFAIK, at least for Go/chess, DeepMind wasn't handpicking agents to send against human opponents; it was simply a trained agent that would try its own strategy and respond to the opponent. In this case, isn't it like each single agent only ever plays a single type of meta strat? If so, I think this is a bit less impressive than what I had predicted.

There's a rock-paper-scissors like aspect to StarCraft openings where you essentially have to randomize your strategy so that the other player can't blindly counter what you're doing. I assume each of their agents learns one particular strategy; in that case, they could simply create a "meta-agent" that acts like one of their trained agents at random (probably using some weighted distribution).
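Such a meta-agent is trivial to sketch (my own toy design, not something DeepMind has described; the strategy names are made up):

```python
import random

class MetaAgent:
    """Hypothetical wrapper: before each game, commit to one trained
    agent sampled from a weighted distribution, so an opponent can't
    blindly counter a fixed strategy."""

    def __init__(self, agents, weights):
        self.agents = agents
        self.weights = weights

    def pick(self, rng=random):
        # random.choices samples one agent per the given weights.
        return rng.choices(self.agents, weights=self.weights, k=1)[0]

meta = MetaAgent(["blink_stalkers", "void_rays", "phoenix_opener"], [5, 3, 2])
print(meta.pick() in meta.agents)  # True
```

The weights could even be adapted between games based on which strategies the opponent has been countering, giving a crude mixed-strategy equilibrium.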

Well, why wasn't this implemented then? For a hyped and live-streamed event, I expected to see something that blows me away, not hand-picked agents still at an embryonic stage.

Different agents for now, playing 1 of the 6 possible matchups and just on one map (Starcraft is usually played on multiple different maps/boards with different layouts over the course of a series). Given that this was just a demonstration match, these are reasonable restrictions, and they surely will lift them one by one (just like they removed the zoomed-out vision in the final, live-played match). I'm really impressed with their performance, and even though my brother was just playing offrace, I didn't expect him to lose.

10:0 AI wins. Precision micro control of units to save them from dying.

This was incredible. The games against MaNa even more so. TLO games were like MNIST, but AlphaStar going against MaNa was like ImageNet level honestly. Hats off to DeepMind.

I haven't played SC in years. Surprised that it is still going strong considering I haven't heard much SC news in a while. I remember one trick against the SC AI was to send a probe to the AI's base and lure all the AI's probes out of its base. That was the easiest way to win against AI back then. Comparing DeepMind/Alphastar now to what SC AI was a decade ago really puts AI advancement into perspective.

Twitch.tv chat is full of shit-posting no doubt, but I did find some of the comments amusing:

  - "You only have 50 GPUs in 2019, LuL"
  - "AlphaStar is a cyberbully"
  - Many were claiming the AI had "1500 APM", not sure where that idea came from
  - and lots more

There were moments during clutch blink micro where the onscreen APM number for AlphaStar exceeded 1000.

I'm starting to get annoyed with these DeepMind publicity stunts. They don't release any code and don't let anyone verify their results. Their chess AI beating Stockfish involved an at least somewhat questionable setup. And here? I'm a big fan of TLO, but he's currently not even in the top 50 with his main race. With Protoss, he's clearly making amateur-level mistakes. Why choose this setting to show off your supposedly superhuman AI?

(Edit: This comment was made during the stream, and it looks like my point will be addressed before the stream's even finished: they'll let the AI play against a true top Protoss. Yeah!)

>I'm a big fan of TLO, but he's currently not even in the top 50 even with his main race

For what it's worth, this comment reminds me a lot of[1]:

> FYI, Fan Hui, the current European Go champion that Google DeepMind defeated, is ONLY a 2nd Dan Go player. The Highest Dan ranking in Go is 9th Dan!

[1] https://news.ycombinator.com/item?id=10983898

>but he's currently not even in the top 50 even with his main race.

I don't understand this part of the complaint, and it also seems to go counter to the idea that they should be releasing more data. Do we only want to see the state of these systems once they're beating everyone? I prefer to see the progress of these systems over time. Seeing an AI getting stomped on by a pro player doesn't really show us what an AI is currently capable if most people would also get stomped by a pro player.

Even so - to me this is kind of a big deal. Go is fully observable: there might be an intractable number of combinations, but the game is well defined and everything is out in the open.

In StarCraft you don't have full knowledge of the game (fog of war), there are vastly more game paths, you have to scout, plan for the long term (macro), react tactically in the short term, micromanage, etc. I'm actually shocked at its performance.

Because it's still insanely impressive. I would have been blown away if the AI was merely at the platinum level. This is an AI playing starcraft. Starcraft!

>This is an AI playing starcraft. Starcraft!

Yeah, it's not like computers were ever able to play StarCraft before. Oh, wait...

The AI normally picks a set strategy and attempts that. On harder difficulties the AI gets to cheat (more income per mineral collected). The SC2 AI is extremely primitive and at best Bronze level. The AI never really "reacts" to your actions; it's basically always on a set path. DeepMind is orders of magnitude better than the built-in StarCraft AI.

FWIW cheating AI in RTS games is extremely common. Often they don't get fog of war, or have higher income rates. RTS AI is very very hard, so this is very impressive.

  but he's currently not even in the top 50 even with his main race
Does that really matter? AFAICT he's in the top 100 [1]. For a game as complex as SC2, it's pretty significant that DeepMind is showing these kinds of results already. It's only going to improve from here, right?

[1]: https://www.gosugamers.net/starcraft2/players/14064-tlo

With Protoss, he's not remotely near that. And that does matter if you intend to make a huge press release in the shape of "DeepMind AI beats top pro player". We'll see in an hour if that's what they're going for.

there is a LiquidMana coming on now ;)

I'm excited. :)

The fact that the community has been able to use the published paper to replicate AlphaZero's results in chess, Go, and a number of other similar combinatorial games, and get comparable performance, should lay your concerns to rest.

Can you provide some links?

http://zero.sjeng.org/ for AlphaZero playing Go, and http://lczero.org/ is at least on par with AlphaZero Chess based on match reproductions.

LeelaChess is one example for Chess.


This project actually proves the root comment's point. The community spent tons of time tuning/training this network, and it still routinely loses to Stockfish, which runs on inferior hardware. It illustrates that:

1. DeepMind kept a lot of information about their methods undisclosed.

2. Despite all the claims of generality, the algorithm requires insane amounts of fiddling to train. Just read their blog.

3. The hype around AlphaZero crushing every other algorithm in chess is overblown. It's competitive, but not clearly superior.

Still, kudos to the people running the Leela project for doing what DeepMind should have done: describing how things really work and testing in real-life conditions.


> Their chess AI beating Stockfish involved some at least a bit questionable setup.

I believe their eventual Science paper addressed these concerns.

Which paper? How did they address it?

They played a newer version of SF (the newest version at the time the paper was written), let SF use an opening book, gave SF a stronger CPU, and used a non-fixed time control for SF.

In all of the games AlphaStar was slower than TLO in the build order. I agree that TLO could have been faster, but this was definitely not the deciding factor.

I think the main factor was that AlphaStar just had an incredibly accurate assessment of exactly which part of its army it needed to defend/attack with, and it was very careful.

The DeepMind folks on stream just mentioned that AlphaStar's inference speed is about 50ms from observation to action.

I thought they said 350ms.

E: https://youtu.be/cUTMhmVh1qs?t=3272

What's AlphaStar's APM? Can it just move every single unit perfectly in a single frame?

They said they capped the APM, so the game is fair from this point of view. They also said that while the agent can see the entire map, it mostly uses local views of a sort. I assume that's because moving the camera costs actions.

Commentator just mentioned that the APM they've observed from AlphaStar is comparable to pro players.

Yeah, in game 4 vs Mana the commentators mentioned there are APM limitations to keep the APM within the realm of human possibility, but the decision making and multitasking are beyond even the best players.

"There is not a pro in the world that can control stalkers that way" said Artosis (a commentator).

It really isn't. Human APM is inflated and isn't true APM: humans spam-click, repeat actions, and make lots of pointless actions. A human can have 300 APM but make only one meaningful action per second.

DeepMind never misclicks, doesn't have to move a mouse or fingers, never spam-clicks, and doesn't make pointless actions, so its APM isn't inflated.
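The raw-vs-effective APM distinction is easy to illustrate with a toy filter that drops near-duplicate actions from a timestamped action log. The action format, the 0.25 s spam threshold, and the numbers below are all made up purely for illustration:

```python
from typing import List, Tuple

def effective_apm(actions: List[Tuple[float, str]], window: float = 0.25) -> float:
    """Estimate 'effective' APM from (timestamp_sec, action) pairs by
    dropping actions that merely repeat the previous kept action within
    a short window -- a crude proxy for spam-clicking."""
    if not actions:
        return 0.0
    kept = [actions[0]]
    for t, a in actions[1:]:
        prev_t, prev_a = kept[-1]
        if a == prev_a and t - prev_t < window:
            continue  # same action repeated almost immediately: count as spam
        kept.append((t, a))
    duration_min = max(actions[-1][0] - actions[0][0], 1e-9) / 60.0
    return len(kept) / duration_min

# A human spamming "move" five times in half a second, then one deliberate action:
spam = [(0.0, "move"), (0.1, "move"), (0.2, "move"), (0.3, "move"),
        (0.4, "move"), (60.0, "attack")]
print(effective_apm(spam))  # 3.0, versus a raw APM of 6.0
```

The point of the toy example is just that a human's raw counter can read double (or more) their meaningful rate, while an agent issuing only deliberate actions has the two numbers coincide.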

I do believe that with this progress AlphaStar with more training could beat humans even with a more realistic APM limit.

They gimp the AI to human-level APM because they want to win through superior strategy, not through 30,000 APM.

It's not perfect, but definitely superhuman from what I see. These are demonstration matches, and AFAIK they want to tune the "mechanical" abilities of the agents to be roughly in the realm of the strongest human players. That's not trivial: not every action is equal, and just picking an APM number would make the AI far weaker than a human in some areas (splitting marines, building/morphing lots of units) and far stronger in others (multitasking).
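One way a cap can account for bursts rather than being a single average number is a sliding-window limiter. This is a hypothetical sketch of that idea, not how AlphaStar's limit actually works; the 30-actions-per-5-seconds figures are invented:

```python
from collections import deque

class ApmLimiter:
    """Sliding-window action limiter: allow at most `max_actions`
    within any `window_ms` span, so short bursts are capped too."""
    def __init__(self, max_actions: int, window_ms: int):
        self.max_actions = max_actions
        self.window_ms = window_ms
        self.times = deque()  # timestamps (ms) of recently allowed actions

    def try_act(self, now_ms: int) -> bool:
        # Forget actions that have fallen out of the window.
        while self.times and now_ms - self.times[0] > self.window_ms:
            self.times.popleft()
        if len(self.times) < self.max_actions:
            self.times.append(now_ms)
            return True
        return False  # over the cap: the agent has to wait

# Cap bursts at 30 actions per 5 seconds (360 APM sustained);
# an agent hammering an action every 50 ms gets throttled hard:
lim = ApmLimiter(max_actions=30, window_ms=5000)
allowed = sum(lim.try_act(t * 50) for t in range(200))  # 200 tries over 10 s
print(allowed)  # 60 of 200 attempts get through
```

A per-window cap like this still leaves the tuning problem the comment describes: one number that permits human-like marine splits will also permit inhuman multitasking.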

I'm having a hard time telling whether it's using a single agent out of their five top agents or an ensemble of all five.

It sounds like each match in the Bo5 is against a different agent. It's a hair dirty, but I don't think it's too bad because the same kind of random-selection strategy is just as possible "in the wild".

They select 5 agents and then put up the professional player against a different agent in each game of the series. So it's a single agent, but a different one in each of the five games.
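The selection scheme described above amounts to something like this sketch; the league structure, agent names, and ratings are invented for illustration:

```python
import random

def pick_series_agents(league, n_games=5, seed=None):
    """Pick the top-rated agents from a training league and assign
    one distinct agent to each game of a best-of-n series."""
    rng = random.Random(seed)
    top = sorted(league, key=lambda a: a["rating"], reverse=True)[:n_games]
    rng.shuffle(top)  # the opponent can't know which agent plays which game
    return top

# Toy league of 10 agents with made-up ratings:
league = [{"name": f"agent_{i}", "rating": 1000 + 40 * i} for i in range(10)]
for game, agent in enumerate(pick_series_agents(league, seed=42), start=1):
    print(f"game {game}: {agent['name']}")
```

Each game is thus against one network, but the human faces a different (and unknown) one every game, which is roughly what a ladder opponent would experience "in the wild".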

The singularity is near (5 years away max)
