AlphaStar had zero cost to sense the entire map simultaneously, and when they introduced an attention/context-switching cost (camera control) in the showmatch, it lost.
Also very notable for me in the showmatch was that it seemed completely blind to, and exploitable by, some strategies. When AlphaGo played top players it made a few mistakes, but there was nothing obvious that it simply couldn't see. Here it just couldn't think of making phoenixes against the warp prism harass, which shows it isn't near human level strategically yet. It could clearly be exploited by back-and-forth harassment too (probably owing to the limited memory those networks have).
Finally, DeepMind kept emphasizing average APM, when AlphaStar clearly reached totally superhuman levels at times -- even a top professional can't execute the 900+ flawless APM we saw in a battle.
David Silver was clearly expecting the showmatch to be totally one-sided (hence his remark that 'this is another historic victory for AI'), but I was left with the opposite impression: strategically, top humans are still ahead in this game. This is not the end of the game for humans yet!
The gap between that AlphaGo and the AlphaGo that beat Lee Sedol was probably something like 10 years of human improvement compressed into a couple of months. 500 Elo is huge.
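To put "500 Elo is huge" in perspective, the standard Elo expected-score formula (a sketch of the usual rating math, not DeepMind's own numbers) says a 500-point gap means the stronger player is expected to win roughly 95% of the time:

```python
# Expected score for the stronger player given a rating gap,
# using the standard Elo formula: 1 / (1 + 10^(-diff/400)).
def expected_score(elo_diff: float) -> float:
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

print(round(expected_score(500), 3))  # → 0.947
```

So a 500-Elo jump isn't "a bit better"; it's near-certain victory against the previous version.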
Even if they still have further to go, comparing this to AlphaGo's progress is reasonable. I'm going to live dangerously and make an assumption - you probably understand Starcraft better than Go, so can see the flaws more easily. Any mistake in Go comes back to something the AI just didn't see.
> I'm going to live dangerously and make an assumption - you probably understand Starcraft better than Go, so can see the flaws more easily
Correct :) Although I'm not great at either.
This comes closer to opening doors to real uncharted territory in robotics and general agency; it trained in remarkably low wall-clock time. But now that we're this far, I question the cost, flexibility, and adaptability of the AI less forgivingly. How much did it actually cost to train? Would it be economical to train one of these for every single task (e.g. in a factory or warehouse setting)? Would it be vulnerable to comparable exploitable strategic weaknesses in a real-world setting? And so on.
It's definitely not over yet as far as SC2 goes though :)
On the other hand, it's also useful to remember that previous attempts have been made to develop strong Starcraft players, and they all had access to the same API as the DeepMind player, yet they didn't perform as well. Of course it makes a difference that it's DeepMind who achieved this, and they seem to have spared no cost in hardware and training time, resources unavailable to a smaller team. Perhaps we shouldn't be that surprised to learn that a powerful computer can do better than a human in some cases. But, at the end of the day, the result they got against MaNa shows that it is possible for an AI player to beat a strong human player, fairly or unfairly, in a game that was rightly considered very hard for AI players, under any assumption.
The average APM limitation is just a publicity stunt. There are similar caps one could apply that would make the challenge much "fairer". For instance: capping max (instead of average) APM, or limiting mouse travel distance (no one can get even remotely close to clicking 300 times a minute on alternate sides of the screen with pixel-perfect accuracy AND timing).
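The difference between an average cap and a max cap is easy to see with toy numbers (all of these figures are made up for illustration, not DeepMind's actual limits): under an average cap, an agent that idles early effectively banks actions to spend in one superhuman burst.

```python
# Sketch: why an *average* APM cap still permits superhuman bursts.
WINDOW_MIN = 5            # hypothetical 5-minute averaging window
AVG_CAP = 300             # actions per minute allowed on average

budget = WINDOW_MIN * AVG_CAP      # 1500 actions allowed in the window
idle_minutes = 4
idle_apm = 50                       # agent barely acts while macroing up

# Everything unspent is available for the final minute's engagement.
burst = budget - idle_minutes * idle_apm
print(budget, burst)                # 1500 actions total, 1300 in one minute
```

A hard per-minute (or per-second) max cap would rule that burst out entirely, which is why "average APM" is a much weaker constraint than it sounds.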
Finally, the AI only plays 1 of the 6 possible match-ups on 1 of effectively unlimited possible maps, and the live game showed severe strategic mistakes (mainly the indecision when getting harassed, and the inability to build a phoenix to counter the prism or to keep harassing with its own oracles instead of stupidly hovering over the prism with them). It still has a long way to go before I'd consider it "superior" to humans in intelligence (not just superhuman control).
All in all, this is still an impressive achievement though!
I think the coolest thing is how the AI's play mirrors the style pros have developed (micro harassing early, early expansions, a very good understanding of when to attack/retreat). It looks just like your average pro... until you see its god-tier micro.
Well, after seeing MaNa crush it in the live game, it seems the AI had zero clue what to do... it apparently calculated that it couldn't win a fight, and so let MaNa destroy its entire base. So, just as with the Dota AI, we see pros can exploit it and win easily once they play around the micro advantages.
So that's one aspect of it but isn't the main strength.
The biggest effect is that it's harder to change strategy midgame, which is not all that critical.
Also the execution of the build order is nowhere near flawless.
I have to wonder if future competitive games will need to take into account the abilities of reinforcement learning algorithms when releasing balance patches.
Fog of war was enabled in all cases, but in the last match they also enabled the scrolling camera window that humans are forced to view the game through.
The AI in all of the other games could see and control all of its own units on the map simultaneously. No human has this ability due to the limitation of the screen.
With all of this, it's really hard to see how much/which parts of the game the AI really "understands".
I'd love to see an AI that can play MAX or MoO from raw pixel inputs, even if it sucks at it.
Which is exactly why StarCraft is not a very good game to test AI on. It's absurd to put arbitrary limitations on something to make the game "fair" and then pat yourself on the back simply because the algorithm won. If it can already win through pure micromanagement, why call it a test of strategy at all?
There are tons of strategy games which don't revolve around micromanagement where APM simply doesn't matter. All turn-based games, for example. Or real-time games where building stuff is more important than combat.
You mean like chess? And go?
I think turning to a real time game with a complex rule set after showing they mastered turn based games with simple rule sets was very sensible.
> Or real-time games where building stuff is more important than combat.
Can you name one that is played professionally (important for balance and comparison to humans) where this is more true than starcraft 2? I think of starcraft 2 as very macro focused as games go.
I used to be ~80th percentile in North America (worst region) and I'm certain at the time any pro could have beat me without clicking anything outside of their own base (except the minimap).
What a clever reply. No, like the ones I named in the post above.
>Can you name one that is played professionally (important for balance and comparison to humans)
If you pause to think about it, things that make for a "good" pro scene are exactly the things that make computer games less than impressive for innovative AI research.
I mean, the agents learned the game from pro replays, so to me it seems obvious they would evolve to play the same strats human pros use.
It did it in 1 game of the 5 they were showing.
It also fails to recognize patterns - MaNa was able to find and repeatedly abuse a border between two strategies: whenever he was not actively dropping, AlphaStar would attempt to move out and push, then immediately retreat to defend when being dropped, whereas a human would recognize the reoccurring theme and either keep pushing or stay in their base.
While not groundbreaking, still exciting to see an AI that can hold its ground against pro players - it certainly demonstrates the potential of machine learning for constrained problem spaces.
Incidentally, though, the strongest Starcraft II player currently is arguably Serral, who is Finnish. He beat the strongest Korean players to win the WCS World Championship last year.
* at least TLO's skill level when offracing as Protoss. His main race, Zerg, is very strong, ranked around 70-80th in the world.
It actually was just a really weird mistake in AlphaGo
That doesn't prevent professional players such as Gu Li 9p from describing it as a divine move though. And it does feel good for Team Humanity.
To me, it seems like AlphaStar is actually at a disadvantage in its ability to micro, and has to make up for this in its macro level strategy.
When you have geniuses like TLO and MaNa, average APM is more on the level of 300-400 consistent and 500+ in engagements. You're definitely correct that APM is not 100% effectively utilized. I would say maybe around ~40% of actions could generally be considered spam clicks. That still puts AlphaStar at a definite disadvantage in its ability to enact micro level strategy.
- The APM constraint on AlphaStar seems to be average APM (or some variation thereof), while it reached 900+ APM in battles -- again with 100% effectiveness. That's insane and definitely beyond human capacity.
- It didn't waste APM in the early game (when humans are warming up and spamming APM), further leaving slack for when engagements came.
I mean, when gauging practical applications we shouldn't be too concerned about good reflexes and superhuman execution; but part of the point here was to gauge other aspects of cognition, those related to strategy, large decision spaces, etc., and in that setting it makes sense to constrain the non-strategic (more mechanical/tactical) aspects. I doubt constraining humans to <300 APM would change player style or effectiveness much, whereas the blink stalker micro displayed in game 4 vs MaNa really relied on massive parallel single-unit/group control (blinking every single low-health stalker individually, engaging from multiple sides simultaneously, etc.).
Edit: In the live game MaNa was able to beat AlphaStar. This AlphaStar was trained with a more limited camera than in the previous games. MaNa was able to harass it with a warp prism, then push in and win.
1. Classic (Korea)
2. Zest (Korea)
3. herO (Korea)
4. Stats (Korea)
5. Neeb (USA)
6. Dear (Korea)
7. Trap (Korea)
8. ShoWTimE (Europe)
9. MaNa (Europe)
Apart from that: great work by DeepMind, and I am excited to follow the progress in this area.
This is such an inane complaint. They didn't do it all live because they had results they wanted to show, and they don't have an observer mode because they've been focusing on research and don't want to delay things.
Could be a new way to play SC2 like games where we can better communicate our intentions to the game. For example, make a type of move action where the stalkers automatically retreat and stop to fire, instead of having to do move, attack, move, attack series of inputs.
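That intent-level order could be sketched roughly like this (all names and thresholds here are hypothetical, just illustrating the idea of one command replacing a move/attack/move/attack input stream):

```python
# Hypothetical "smart kite" command: each stalker attacks when its
# weapon is ready and healthy enough, and retreats otherwise,
# replacing the manual move/attack alternation the comment describes.
from dataclasses import dataclass

@dataclass
class Stalker:
    weapon_ready: bool
    health: int

def kite_order(units, retreat_point, target, min_health=20):
    """Return one (unit, action) pair per unit from a single command."""
    orders = []
    for u in units:
        if u.weapon_ready and u.health > min_health:
            orders.append((u, ("attack", target)))
        else:
            orders.append((u, ("move", retreat_point)))
    return orders

squad = [Stalker(True, 80), Stalker(False, 80), Stalker(True, 10)]
actions = [kind for _, (kind, _) in kite_order(squad, "home", "enemy")]
print(actions)  # → ['attack', 'move', 'move']
```

The point is that the player expresses intent once ("kite toward home") and the game resolves the per-unit micro, which would narrow exactly the mechanical gap AlphaStar exploited.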
If you look at the difference between Starcraft 1 and Starcraft 2, the two most significant UI changes are the ability to select more than 12 units at a time, and the ability to "multi-cast" spells in sequence when more than 1 spellcaster is selected. These two small UI improvements greatly reduce the skill gap between an average player and a "pro" player in Starcraft 2, to such a degree that many Starcraft 2 pro players went back to Starcraft 1, where their fast reflexes give them a much greater advantage against other players due to the limited UI. So an improvement to the UI would help human players beat AI, but it takes a strategic element out of the game when considering human vs. human.
Personally, I think Starcraft 2 has chosen a near perfect balance, although I recognize that it leans more toward micromanagement than pretty much every competitive RTS game other than its predecessor. Coming from Brood War, I can say "you should be happy even being able to select huge numbers of units and hotkey groups of buildings." :)
The intermediate UIs would be along the continuum between full micro and this.
Maybe developing these AI/UIs would be fun (until Google/DeepMind crushes you), but the game itself?
My first question is what is the input to the AI? Is it the raw pixel array of the display? Or does it get API-level readouts of what’s happening? Because implementing the CV just to segment the display output in real time is crazy enough. I would assume the latter.
I think this basically proves that any problem that can be exhaustively simulated is solvable now. This may mark a tipping point: once every problem with a good simulator (essentially infinite labels) is solved, the balance tips back toward building faster and more accurate sims (think multi-scale, first-principles physics).
Here's what I want to know, were these agents developed from scratch a la AlphaZero in chess, or did they have to create a number of abstractions in order to get the AI to start learning the game? In the initial demonstration they could hardly get the AI to mine minerals or do anything. How did they make the jump to actually good play?
This is impressive, but with a lot of caveats. DeepMind's work on chess and go was impressive with no caveats whatsoever.
AFAIK, at least for Go/chess, DeepMind wasn't handpicking agents to send against human opponents; it was simply one trained agent that would try its own strategy and respond to the opponent. In this case, isn't it more like each agent only ever plays a single type of meta strat?
If so, I think this is a bit less impressive than what I had predicted.
- "You only have 50 GPUs in 2019, LuL"
- "AlphaStar is a cyberbully"
- Many were claiming the AI had "1500 APM", not sure where that idea came from
- and lots more
(Edit: This comment was made during the stream and it just looks like my point will be addressed before the stream's even finished, they'll let the AI play against a true top Protoss. Yeah!)
For what it's worth, this comment reminds me a lot of:
> FYI, Fan Hui, the current European Go champion that Google DeepMind defeated, is ONLY a 2nd Dan Go player. The Highest Dan ranking in Go is 9th Dan!
I don't understand this part of the complaint, and it also seems to run counter to the idea that they should be releasing more data. Do we only want to see the state of these systems once they're beating everyone? I prefer to see their progress over time. Seeing an AI getting stomped by a pro player doesn't really show us what it is currently capable of, since most people would also get stomped by a pro player.
In Starcraft you don't have full knowledge of the game (fog of war), there are vastly more game paths, and you have to scout, plan for the long term (macro), react tactically in the short term, micromanage, etc. I'm actually shocked at its performance.
Yeah, it's not like computers were ever able to play StarCraft before. Oh, wait...
FWIW cheating AI in RTS games is extremely common. Often they don't get fog of war, or have higher income rates. RTS AI is very very hard, so this is very impressive.
but he's currently not even in the top 50 even with his main race
1. Deep mind kept a lot of information about their methods undisclosed.
2. Despite all the claims of generality, the algorithm requires an insane amount of fiddling to train. Just read their blog.
3. The hype around AlphaZero crushing every other algorithm in chess is overblown. It's competitive, but not clearly superior.
Still, kudos to the people running the Leela project for doing what DeepMind should have done: describing how things really work and testing under real-life conditions.
I believe their eventual Science paper addressed these concerns.
NYT article summarizing some of the issues addressed: https://www.nytimes.com/2018/12/26/science/chess-artificial-...
I think the main factor was that AlphaStar just had an incredibly accurate assessment of exactly which part of its army it needed to defend or attack with, and was very careful.
"There is not a pro in the world that can control stalkers that way" said Artosis (a commentator).
DeepMind's agent never misclicks, doesn't have to move a mouse or fingers, never spam-clicks, and doesn't make pointless actions, so its APM isn't inflated.
I do believe that with this progress AlphaStar with more training could beat humans even with a more realistic APM limit.