It also provides the paper and an archive of all of the AI's matches for anybody who wants to take a closer look. These can be viewed with the free version of SC2 (afaik).
https://doi.org/10.1038/s41586-019-1724-z (supplementary data available here in json form)
https://rdcu.be/bVI7G (public paper)
https://www.youtube.com/playlist?list=PLtFBLTxDxWOSrWZ8krQt6... (list of older AlphaStar matches cast by an SC2 player)
https://www.youtube.com/watch?v=l82wBa3UoZU (one of the newer matches cast by another player)
https://old.reddit.com/r/starcraft/comments/dpaunw/deepminds... (win/loss rates across all the played matches by race, includes apm)
I'm a gold league SC2 player, so maybe in the 30th-50th percentile. Three years ago, when DeepMind started this project (and after nearly two decades of research into SC/SC2), I could probably have beaten the best non-cheating AI. Now, after 3 years, this AI is playing at a Grandmaster level, under at least a reasonable approach to fairness. By comparison, according to the AlphaGo paper, the best Go AIs prior to AlphaGo were playing at a 6 Dan level, which looks to be somewhere in the 90-98th percentile.
The speed at which AlphaStar overtook previous AIs seems to me to be nearly unprecedented in AI research. This is like if the world's best chess AI had gone from losing high school tournaments to being competitive with Kasparov in less than 3 years. Valid criticisms aside, this feels like an incredible achievement.
It's amazing that most people here don't understand that AI performance in any one computer game relative to humans is largely irrelevant. A system that can play many games at a mediocre level, but does it without any hand-holding, clever APIs or architecture adaptation is infinitely more impressive than a system that can beat everyone in a specific game with all those things applied.
Remember, most humans are completely mediocre at Chess, Go or StarCraft.
Yes, a human is still needed to decide which approach to use, but we are slowly approaching a world where building an AI becomes easier and faster. It will become absolutely irrelevant that a single AI can not play all games, because whenever a new game comes out, someone will be able to build a superhuman AI on it within a week/month.
The brains of an AI are also transferrable. Built a superhuman AI? Send it to a friend! Compare it to a human, who would need to spend enormous amounts of time to transfer their game-brain to someone else. If you want, you can bundle all those AIs into one and pretend it can play any game.
"It's amazing that most people here don't understand that human performance in all computer games relative to AIs is largely irrelevant"
So far I haven't seen Google sending their AIs to a friend.
"All" neural network models is a stretch. Some researchers, including some that work in the industry, do release their models. The majority don't.
Is it not simply the case that, before AlphaStar, very little money and effort was being put into developing AIs for Starcraft 2?
I'm sure that the AlphaStar effort absolutely dwarfed everything that came before in terms of investment (as I'm sure AlphaGo did for Go as well) but based on my reading I'd also bet that Starcraft has likely seen the most continued AI research effort of any imperfect information, real-time game over the last decade - I think that's also part of what made it appealing for DeepMind over Dota 2 or any of the other options in this class of game.
Even now, I wouldn't really expect to see most game developers start using the latest machine learning techniques to build their own AIs. Maybe the really successful games with competitive leagues might be interested?
I really hope this will change, though. Skilled AIs could make so many games more interesting and better.
Many online games really suffer from bad multiplayer experiences. Players that drop in the middle of a game, ...
If Overwatch, Destiny, ... had really good AIs, that would make the game more attractive. If a player drops, they could replace it with similarly skilled AI.
Instead of waiting for 12 players of the same skill-level, they could just play with some AIs.
They could even make it such, that humans win 70% of the time, and let the AIs just lose more often.
AI development probably needs to be made easier, somehow. Game developers have a lot to do.
This point is being brought up a lot, but I don't really buy it. Yes, there have been instances where the AI being too good discouraged players from playing the game as much, but this almost always happens due to the AI having some inhuman advantage that a person could not really replicate. I think players do want the AI to pose a challenge - that's what a lot of casual PvP games are about. Humans are (were?) the only ones that could offer a fair match against another human.
I will absolutely grant you the computational power point though. However, as hardware advances, this cost will become more and more acceptable.
I don't expect to see widespread adoption of this in commercial games within a decade, but I think that in 2 or 3 decades this will be the norm. In fact, I would bet that once we can make an AI that is good at a game, we can also make it weaker. A game could estimate a player's skill rating silently and then adjust the AI's strength to give the player a difficult/fun time.
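To make the "estimate skill silently, then adjust the AI" idea concrete, here is a minimal sketch. It assumes an Elo-style rating model, which is my assumption and not something any particular game is known to use. The function names, the K-factor of 32 and the starting rating are all hypothetical; the 70% target is the figure floated earlier in the thread.

```python
import math


def expected_win_prob(player_rating: float, ai_rating: float) -> float:
    """Elo-style probability that the player beats the AI."""
    return 1.0 / (1.0 + 10 ** ((ai_rating - player_rating) / 400.0))


def update_player_rating(player_rating: float, ai_rating: float,
                         player_won: bool, k: float = 32.0) -> float:
    """Nudge the hidden skill estimate after one match (k is an assumed step size)."""
    expected = expected_win_prob(player_rating, ai_rating)
    actual = 1.0 if player_won else 0.0
    return player_rating + k * (actual - expected)


def ai_rating_for_target(player_rating: float, target_win_prob: float = 0.7) -> float:
    """AI rating at which the player is expected to win target_win_prob of games."""
    odds = target_win_prob / (1.0 - target_win_prob)
    return player_rating - 400.0 * math.log10(odds)


if __name__ == "__main__":
    rating = 1500.0  # hypothetical starting estimate
    ai = ai_rating_for_target(rating, 0.7)
    print(f"AI plays at {ai:.0f} so the player wins about 70% of the time")
    rating = update_player_rating(rating, ai, player_won=True)
    print(f"updated skill estimate: {rating:.1f}")
```

The hard part this sketch skips is actually having an AI whose strength can be dialed to an arbitrary rating, which is the bet the comment above is making.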
Of course, it could just be that AlphaStar is able to play well against humans, because players treat it like a human. Maybe the AI still has standard game AI-like weaknesses that can be exploited if people know that they're playing against an AI. Eg some Diamond league player went mass ravens and kicked AlphaStar's ass. The AI would have to learn how to deal with stuff like this on the fly and I don't think we're there yet.
Why does this matter? Because the trick is often less effort than the real deal. The point of videogames is to entertain you, and the goal of game publishers is to turn a profit while doing so. Obviously there are some exceptions, and indie/free games might have different goals (we all know indie games that can be punishing, like Dwarf Fortress).
I remember with the original Left 4 Dead by Valve, they claimed there was an "AI Director" that modified the map according to how well you were doing, "adapting" to how you were playing. In practice it amounted to closing some sections of the map according to how easily you were killing zombies without dying. So this wasn't really AI but a glorified IF-THEN-ELSE trick, but what does it matter? It served its purpose. Do we really need something more complex (and expensive!) when killing zombies in L4D?
Throwing people into the deep end of the pool usually doesn't work so well. Scaling down the difficulty helps, but really what you want is challenges that teach you something. And you don't really need AI for that.
Last year I started learning to play accordion out of a method book. It has a bunch of songs ordered from easy to hard, and they're also chosen to teach certain skills. It reminded me quite a bit of good level design.
Just look at the achievements for Left 4 Dead. Only 40% of all players have got the achievement for "kill one of each Uncommon type", which is trivial to get if you play for a few hours.
Just today I saw something that looked interesting, and it was 85% off because of a Halloween sale. So I grabbed it for a couple bucks. Free time being such a rare commodity, there's a solid chance I never even install it... and then there's always Stellaris or Mount & Blade or Medieval Total War that I could happily play until the heat death of the universe, competing for my attention.
Most people who buy your game aren't going to invest even 10 hours in it or come close to maxing out the AI because they only picked it up because it was 85% off during a Steam sale.
Maybe you'll be able to figure out from Early Access numbers how popular your game is and whether you need to invest in super-awesome AI. But it seems like for most games it would be a waste of effort. More likely it would be retrofitted into an already popular game in a DLC or future patch or something.
Maybe the reason people quit your game is because of the AI. What made me stop playing Heroes of Might and Magic 6, Civilization 6, and Total War Three Kingdoms is the poor AI. It takes the fun out of the game, because at lower difficulties the AI puts up no challenge and at higher difficulties you're just exploiting the AI the same way over and over again to keep up.
You might think that that's fine, because you already got the money, but then you'd be looking at it from the perspective of singular games. But studios usually don't just release one game and then disappear. After my experience with HoMM 6, I had zero interest in HoMM 7. 2K Games could release Civilization 7 tomorrow, but without them showing that the game's AI is much better than in civ6, I probably wouldn't pick it up. Yes, the Civilization series will still have players, but it could easily just fizzle out like many other franchises and even genres.
You also can't add this in as a DLC, because people make up their mind about a game near the start. This only works if the game already has longevity, but at that point the game is already popular, so does a better AI at that point make a difference?
Yes, and the current game AIs are completely and utterly incompetent at it. AI in video games only provides a challenge when the mechanics of the game are challenging or the AI is cheating.
>Do we really need something more complex (and expensive!) when killing zombies in L4D?
Not in L4D, because you don't play the game to kill zombies. You play the game to play with other people. If L4D didn't have coop then the game would likely not have gained popularity.
You're right that players want a challenge that they can overcome, but current AI in video games does not offer that. The AI always ends up being so weak that play against the AI cannot be the focus of the game. The focus has to be in mechanical play, puzzles, story, atmosphere, multiplayer or something else. The only way AI keeps up is with inhuman mechanical play or cheating. Neither feels fun to play against, because the counter to it is to figure out how to exploit the AI, after which the game becomes more or less trivial.
I wouldn't say "incompetent", but yes, the mechanics of the game must be challenging and/or (often) the AI must cheat. I'm saying this is enough most of the time; there's really no incentive to build a better AI for videogames because (my hypothesis) most gamers don't care. The illusion of challenge is enough, most of the time. The goal of videogames is to entertain, not to really challenge (again, with the exception of tournaments and competitive gamers, who are a niche within a niche).
You mentioned Heroes of Might & Magic in another comment. I played the heck out of HoMM2, some of (I think) HoMM3 and got bored with the rest. But I wasn't bored with the AI; I got bored because it was the same formula again and again. The genre itself soon felt like a tired formula. Note that the "AI" -- cheating or not -- of HoMM felt horribly difficult to me. I lost battles more often than I won; the only trick that reliably worked was to start the fight with a lot more troops than the computer enemy, and that's not much of a trick! Not a lot of places to learn effective tactics back then either, maybe the hellpit that was GameFAQs?
The article describes "exploiter agents that focus on helping the main agent grow stronger" as key to their approach. That seems promising? Maybe it could be used to make bosses with specific strengths and weaknesses where part of the game is figuring out how to beat them.
Chess and Go are already quite different from NLH, because they are purely about strategy. The best strategy in Chess either wins or leads to a draw if played by both sides every game. In NLH an optimal strategy just breaks even (ignoring the rake) against other optimal players and makes money on average against anybody else. But over even a few hundred hands you can't tell.
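A rough back-of-the-envelope sketch of why a few hundred hands tell you so little. The win rate (5 big blinds per 100 hands) and standard deviation (80 big blinds per 100 hands) are assumed illustrative numbers, not figures from this thread, and the calculation treats 100-hand blocks as independent samples.

```python
import math


def standard_error_bb_per_100(hands: int, std_dev_bb_per_100: float) -> float:
    """Standard error of the measured win rate (in bb/100) after `hands` hands."""
    blocks = hands / 100.0
    return std_dev_bb_per_100 / math.sqrt(blocks)


def hands_to_resolve(win_rate_bb_per_100: float, std_dev_bb_per_100: float,
                     z: float = 1.96) -> int:
    """Hands needed before a z-sigma interval around the win rate excludes zero."""
    blocks = (z * std_dev_bb_per_100 / win_rate_bb_per_100) ** 2
    return math.ceil(blocks * 100)


if __name__ == "__main__":
    # Assumed numbers: a solid winner at 5 bb/100 with 80 bb/100 of variance.
    print(standard_error_bb_per_100(300, 80.0))  # noise dwarfs a 5 bb/100 edge
    print(hands_to_resolve(5.0, 80.0))
```

Under those assumptions the noise after 300 hands is several times larger than the edge being measured, and it takes on the order of a hundred thousand hands before the edge is distinguishable from zero.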
Which is why the most complex game of all is actually intergalactic horseshoes.
Even basic strategies will win if they’re done faster. APM (actions per minute) is a very significant factor in who wins. Apparently they limited their AI player to 264 APM, but that’s still incredibly high and done with machine-level consistency. That’s almost 4.5 actions per second!! I know there are human players at and probably above that level, but that really allows for basic strategies to win out.
This isn't really true. Basic strategies done faster still lose miserably to humans, we can see this from the long history of SC Broodwar AI tournaments where they have a human play the best AI at the end (and always win).
Faster helps, better strategy helps more, maybe we can't say the AI is doing as well as the top humans at strategy since it's faster, but we can say that it's doing pretty good.
AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE)
IEEE Conference on Computational Intelligence and Games (CIG)
Student StarCraft AI (SSCAI) Tournament
BWAPI Bots Ladder
A great example would be seeing the faint image of cloaked units. How does that work with the AI? Can they just parse the current screen and instantly see any cloaked unit? I can imagine an attention system where more attention dedicated to part of the screen would increase the probability of noticing a cloaked unit there.
Actually 792 APM. They limited their AI to 264 "things" (choose unit+ability+target or change view) per minute. Each "thing" can be counted as between 0-3 actions by starcraft. So it really has a peak APM limit of 792.
From the article:
>Agents were capped at a max of 22 agent actions per 5 seconds, where one agent action corresponds to a selection, an ability and a target unit or point, which counts as up to 3 actions towards the in-game APM counter. Moving the camera also counts as an agent action, despite not being counted towards APM.
But from replays it has an average APM of 248 at the highest, so nothing crazy going on. Plenty of pros have peak APMs during fights that are higher than 800.
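For anyone who wants to check the arithmetic in the quote above, here is a small sketch. The constants come straight from the quoted passage (22 agent actions per 5 seconds, up to 3 in-game actions each). The `SlidingWindowCap` class is a hypothetical illustration of one way such a cap could be enforced; the quote does not say whether DeepMind used a sliding or a fixed window.

```python
from collections import deque

WINDOW_SECONDS = 5.0
MAX_AGENT_ACTIONS_PER_WINDOW = 22
MAX_GAME_ACTIONS_PER_AGENT_ACTION = 3

# 22 agent actions per 5 s is 264 per minute; at up to 3 in-game actions each
# that is a ceiling of 792 on the in-game APM counter.
agent_actions_per_minute = MAX_AGENT_ACTIONS_PER_WINDOW * 60 / WINDOW_SECONDS
peak_in_game_apm = agent_actions_per_minute * MAX_GAME_ACTIONS_PER_AGENT_ACTION


class SlidingWindowCap:
    """Hypothetical limiter: at most max_actions in any trailing window."""

    def __init__(self, max_actions: int = MAX_AGENT_ACTIONS_PER_WINDOW,
                 window: float = WINDOW_SECONDS) -> None:
        self.max_actions = max_actions
        self.window = window
        self.timestamps = deque()

    def try_act(self, now: float) -> bool:
        """Return True and record the action if the cap allows it at time `now`."""
        # Forget actions that have aged out of the trailing window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_actions:
            self.timestamps.append(now)
            return True
        return False


if __name__ == "__main__":
    print(agent_actions_per_minute, peak_in_game_apm)
    cap = SlidingWindowCap()
    burst = sum(cap.try_act(i * 0.01) for i in range(30))
    print(f"{burst} of 30 rapid actions allowed")
```

Note that camera moves count as agent actions but not towards in-game APM, so the 792 figure is an upper bound rather than something the agent would normally hit.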
I don't think it's like that at all. On the high level, there is no "chess AI", "go AI", "image classification AI" and "dexterous manipulation AI". These are all sides of the same coin, that gets significantly better every year. Adding support for the new game or new "environment" to existing deep learning based backbone still requires a bit of engineering work and a few creative tricks to unlock the best possible performance, but the underlying fundamentals are already there and are getting better and better understood.
There is a reason why the progress in AI is so hard to measure. Anytime a next task is solved, there is a crowd saying it's not a "real AI" and that scientists are solving "toy problems". Both statements are totally true. But the underlying substance is that each of these toy problems is of increasing complexity and brings us closer and closer to solving the "real problems", which are mostly so undeniably complex that we couldn't attack them upfront. Still, the speed of progress in the field of AI research is staggering and it's hard to keep up with it even for professional researchers who spend all their waking hours working on these things.
6 years ago we were able to solve some Atari games from pixels. Today, that feels like a trivial exercise compared to modern techniques. With billions of dollars of investment pouring in and steady supply of fresh talent, it is very hard to predict what the pace of research will be in the coming years. It is entirely possible we'll encounter a wall we won't be able to overcome for a very long time. It is also possible that we won't, and in that case we're in for a very interesting next few decades.
On a practical level, this is not true. There are different algorithms, different architectures, different hyperparameters required for each of these problems, and often for each subdomain within each of these problems, and often for each specific instance of these problems. It's difficult to draw any kind of holistic picture that combines all of the individual advances in each of these problem instances; that's why progress in AI is so hard to measure, and why a statement like "each of these toy problems...brings us closer and closer to solving the 'real problems'" is probably a bit too coarse-grained to be fair as well.
Deepmind's best-in-class chess and Go AIs are the same code (AlphaZero) just given respectively rules and game state input for either chess or Go and then allowed to train on the target game.
One of the fun works in progress in this space is teaching AIs to play a suite of 80s video games. Getting quite good at several games where the idea is to go right and not die is pretty easy these days, but Deepmind's work can do a broader variety only coming badly unstuck on games where it's hard to discern your progress at all without some meta-knowledge.
You claimed that "different architectures" are needed. Not true. And further you claimed this is true even for "each subdomain". This would have been a fair point in 1989. Traditional chess AIs approach the opening very differently for example, relying on fixed "books" of known good openings. But AlphaZero since it is a generalist doesn't do this, it plays every part of a match the same way.
Now you've gone from asserting that Chess and Go need separate AIs to claiming that since BERT and AlphaZero are different software it makes your point. Humans pretty clearly don't have a single structure that's doing all the work in both playing Go (AlphaZero) and understanding English (BERT) either - so that's a pretty bold bit of goalpost moving.
> Anytime a next task is solved, there is a crowd saying it's not a "real AI" and that scientists are solving "toy problems". Both statements are totally true. But the underlying substance is that each of these toy problems is of increasing complexity and brings us closer and closer to solving the "real problems"
I wonder if this is true. This belief may seem like common sense, but it's not obvious to me that domain-specific problems must generalize to General AI ("real problems") or even bring us closer to it. That is, it's not evidently true that many small problems will eventually lead to a general solver of everything (or to human-like intelligence). Or to say it in yet another way, it's not obvious to me that human-like intelligence is the sum of many small-problem-intelligences.
Again, common sense may lead us to believe this, and maybe it's true! But I think this conclusion is far from scientifically evident.
You can even interleave the training for the second task with a few training rounds for the first task to maintain proficiency. There's a group that's using this sort of technique to make a general "plays videogames" AI. I couldn't find a good link from my phone, but here's a less good link about something similar: https://towardsdatascience.com/everything-you-need-to-know-a...
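The interleaving idea can be sketched as a training schedule. This is only an illustration of the ordering, with hypothetical task names and parameters; it does no actual training and is not the method from the linked article.

```python
def interleaved_schedule(new_task_steps: int, rehearse_every: int,
                         rehearsal_rounds: int,
                         new_task: str = "task_b",
                         old_task: str = "task_a") -> list:
    """Order of training rounds: mostly the new task, with periodic
    refresher rounds on the old task so it is not forgotten."""
    schedule = []
    for step in range(1, new_task_steps + 1):
        schedule.append(new_task)
        if step % rehearse_every == 0:
            schedule.extend([old_task] * rehearsal_rounds)
    return schedule


if __name__ == "__main__":
    # One refresher round on the old task after every two rounds on the new one.
    print(interleaved_schedule(4, rehearse_every=2, rehearsal_rounds=1))
```

How often to rehearse and for how long is the part that needs tuning in practice; the values above are placeholders.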
As another poster said these are all tasks performed by different systems. For chess and Go AI it's Deep Reinforcement Learning with Monte Carlo Tree Search. For image recognition it's Convolutional Neural Networks. Importantly, these systems are very task-specific. You won't find anyone trying to beat humans at games using CNNs, for example, or using Deep-RL to do text recognition. Far from "a few creative tricks" these are systems that are fundamentally different and are not known to generalise outside their very limited domains. They're one-trick ponies.
The OpenAI paper on "dexterous manipulation" reported learning to manipulate one cube, the same cube, always, after spending a considerable amount of resources on the task. It was a disappointing result that really shouldn't be grouped with CNNs and Deep-RL for game playing. The level of achievement does not compare well.
>> Anytime a next task is solved, there is a crowd saying it's not a "real AI" and that scientists are solving "toy problems".
This used to be the case a decade or more ago. In the last few years the opposite is true. The press is certainly very eager to report every big success of "AI"- by which of course is meant deep learning.
>> 6 years ago we were able to solve some Atari games from pixels. Today, that feels like a trivial exercise compared to modern techniques
6 years ago DeepMind showed superhuman performance in seven Atari games with Deep-RL (DeepQN in particular): Beam Rider, Breakout, Enduro, Pong, Q*bert, Seaquest and Space Invaders. Since then more Atari games have been "beaten" in the same sense, but many still remain. I'm afraid I can't find references to this, but I've seen slides from DeepMind people a few times and there is always a curve with a few games at the top and most games at the bottom, below human performance. There are some games that are notorious for being very difficult to solve with Deep-RL, like Montezuma's Revenge, which was claimed to be solved by Uber a couple of years ago. However, this was done using imitation learning, which means watching a human play. The result is nothing like the result in Go, which remains the crowning achievement of Deep-RL (and its best buddy, MCTS).
Bottom line: Atari games remain anything but a trivial exercise.
And the architectures that play Atari do not perform as well in Go or chess, say. You are mistaken that it's simple to train the same system to do all of those things. The AlphaZero system that played Go, chess and shogi well enough to beat its predecessor (you will excuse me that I don't remember which incarnation of Alpha-x it was) had an architecture fine-tuned to a chessboard and pieces with discrete moves, so it would not be possible to reuse it to play Starcraft, say, or even tic-tac-toe. The cost to train AlphaZero is also very high, in the hundreds of thousands of dollars.
In pretty much any field, top performing humans are at the physical limitation level, you will not see any sort of breakthrough, just incremental improvement.
Machines, on the other hand, can be scaled arbitrarily. Once you've built a small crane, you can build ever larger ones; it's just a function of money and interest.
Some say that intelligence is not like that, that it can't be scaled arbitrarily. But the burden of proof is on them, especially given the consequences if it can.
It doesn't matter how much money or interest we have, but right now it isn't technically feasible to build a 36000 km tall crane (also known as a space elevator). Humanity simply couldn't get it done even if we poured all our current resources into that project. It isn't physically impossible, but such a behemoth has requirements that current materials science cannot meet. Building tall cranes generates useful know-how for a space elevator, but it's a really different problem, not just a matter of resources.
Following the analogy, a general purpose AI simply isn't a bigger Deep Blue or AlphaGo; it's probably something different that requires knowledge that we currently don't have. Sure, building Deep Blue and AlphaGo most likely generated part of that knowledge, but that doesn't mean we have everything we need.
You can argue that this is also simply a matter of money and interest, and that's true, but it's not the same as going from building a 10 meter crane to building a 20 meter crane.
I agree with this, but it isn't clear to me that a general AI will have a significantly different impact on society than a world where task-specific well performing AIs are easy for anyone to develop.
Sure, a general AI has a set of properties that are really fascinating to discuss and debate (including what is consciousness and whether AIs should be given rights), and perhaps a general AI is required for doomsday computers-taking-over scenarios, but the impacts that AI will have on our economy and politics don't require general AI.
However, I disagree that the impact is the same. I think that the difference between a lot of specific jobs being automated and the world being run by a MULTIVAC world optimizer is pretty big.
I'm happy to see that they've greatly improved the APM cap. During the earlier showmatch they had an extremely naive cap on the average APM during the whole game. Which meant that the AI could take it easy for most of the time and then during battle sequences hit peaks of 2000 APM. Also the previous cap was based on the in-game APM counter, which doesn't count some things like moving the camera, which is now addressed. The current state sounds a lot better.
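To see why a whole-game average cap is so weak, here is a quick sketch of the budget arithmetic. The 2000 APM burst is the figure mentioned above; the game length, the average cap of 300 and the idle rate of 100 are hypothetical numbers picked for illustration.

```python
def max_burst_minutes(game_minutes: float, average_cap: float,
                      idle_apm: float, burst_apm: float) -> float:
    """Minutes the agent can spend at burst_apm while keeping the
    whole-game average at or under average_cap, idling otherwise."""
    if burst_apm <= idle_apm:
        raise ValueError("burst_apm must exceed idle_apm")
    if average_cap < idle_apm:
        return 0.0  # even pure idling would break the cap
    # Actions left over after idling for the whole game.
    spare_actions = (average_cap - idle_apm) * game_minutes
    return min(game_minutes, spare_actions / (burst_apm - idle_apm))


if __name__ == "__main__":
    # Assumed: 10 minute game, average cap 300, idling at 100 APM.
    minutes = max_burst_minutes(10, 300, 100, 2000)
    print(f"{minutes * 60:.0f} seconds of 2000 APM fit under the average cap")
```

With those assumed numbers the agent gets roughly a minute of 2000 APM, which is more than enough to cover the decisive fights. A cap over a short window removes that loophole.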
However it seems still superhuman in mechanical non-strategy ways, e.g. a human can misclick (click too early/late when moving the mouse, or just miss the target with the cursor completely) or do accidental double clicks etc. These end up being very costly mistakes against a zero mechanical mistake AI. Which in turn means that the AI can win with an inferior strategy due to the extra gains it has via superior mechanical accuracy. In other words this artificial intelligence is still under significant artificial motor skills welfare, and thus even if it starts beating pro players we shouldn't be too quick to talk about how it's the intelligence part that got it the win.
All of that said, I'm liking the progress and am excited to see what they achieve with this next. Would love another showmatch against pro players.
You can also see in replays that the AI often makes mechanical mistakes, missing spells, missing units, even ordering wrong units from outside the screen - so it surely seems that if its win rate was conditioned in any strong way on its sheer mechanical ability, it would have learned to not make any such mistakes. Since the mistakes are clearly there - it seems that its power comes primarily from somewhere else, likely from the AI's ability to choose the right actions, not from mechanical power of executing them perfectly.
Thus it is not likely that "this artificial intelligence is still under significant artificial motor skills welfare". They have consulted top players before they set this AI up, and if those details were important, they would have baked them into the limitations list already.
Also I think you overestimate the AI/IT knowledge of these top players that they're consulting. I have great respect towards them, but they're not renaissance men who both play 10 hours of StarCraft per day and also know the subtleties of how computers work, not to mention bleeding edge AI. You can watch the previous showmatch to see how DeepMind people lack knowledge of StarCraft and how the pro players they're consulting lack understanding of the AI and both are learning new things live on air. It's obvious that their cooperation is bearing fruit as they spend more time talking to each other, as evidenced by the new APM limits. However I'm willing to bet that they would reach many more good conclusions if they just continue working together.
The problem with both parties (the pros & deepmind) is that they're so overspecialized. I'm nowhere near as good as them at their respective fields, but I am a professional programmer and diamond in StarCraft II. In addition I've built StarCraft II AI myself, although with different goals related to finding optimal strategies.
Certainly the game is different, meta has changed, etc. But he's definitely not an amateur and is probably in the top percentile of players.
So many things wrong with this comment.
You're nowhere near a professional StarCraft player if you're in Diamond league (I play casually and I'm on high plat, bordering Diamond), and Oriol Vinyals, the lead research scientist behind this project, is one of the most renowned scientists in the field and used to play StarCraft at a professional level. They also said that other employees at DeepMind are at Masters level, and helped test the AlphaStar.
My main argument revolves around APM. Oriol Vinyals might be great, but I've also seen him make a statement on video in 2019 about how restricting the AI based on average APM during the whole game is reasonable. He has blindspots that someone like me can immediately spot.
You must be one of the people who think reality tv programs were really real.
The AI is supposed to call this API. Then actions per minute would be irrelevant.
Since players take so many actions during the course of the game (well into the tens of thousands), inevitably some clicks will be sub-optimal, and they all have a tiny impact on the outcome. Some professional players do specific exercises to improve their clicking accuracy in order to gain efficiency by reducing misclicks, but generally clicking accuracy is not considered a big factor compared to raw speed. Most players try to attain the highest possible clicking speed while maintaining an accuracy level that is "good enough".
AI was also much more interesting when playing the protoss race, and really felt like it was responding to the opponents actions, and the other races not so much.
But most surprising is that it didn't make any "breathtaking" moves or actions, as opposed to AlphaGo. Actually not a single game made a pro player realize something new about the game. Which is really embarrassing, because it suggests that the AI just was able to correctly reproduce existing strategies and build orders, that it probably "learned" from existing pro games in the training sample.
I was really hoping for a more interesting report, honestly explaining the shortcomings of the techniques used and giving hints for the obvious blunders. As well as a roadmap for a second round, this time aiming at beating the very top players.
Personally I felt disappointed by the fact that real-time strategy played a pretty minor role in RTS games. If you left the well-beaten path of cookie cutting and tried something new you invariably gave up a pretty obvious advantage to do so.
It’s just that a lot of that strategy is irrelevant unless you have amazing mechanics - and the difficulty of those mechanics means that you can’t as easily think, plan, or adapt in-game (because your brain is busy)
A big part of the game is thinking about “how do I win against somebody who does that...” between games
It's kind of like competing against a gorilla in boxing chess. Just because a gorilla is dominant doesn't mean boxing chess doesn't require chess skills, only that a gorilla doesn't need them.
Designing a game that is a fun and balanced experience whilst also encouraging true innovation is a difficult problem. Every game ever has this problem, putting thousands of minds on a problem and letting them confer about their results usually solves all the low-hanging fruit in about a week. Creating a game where you only play to share a new idea makes each match feel like a dice roll.
Warcraft 3 and broodwar were much more micro-oriented than sc2, because the unit ai was much worse. For example, if you let dragoons walk to their target on their own, they spend half the time fighting broken pathfinding while the enemy shoots at them :) This almost never happens in sc2, and microing units below master league is usually detrimental to the game.
I think micro vs macro is not the full picture, and sc2 lets players express themselves in what i would call the mediumgame (something that is between build orders and microing - like deciding when to expand, what tech-switch to make, how to position your army in between fights, etc).
There's lots of decision-making that isn't APM-limited but thinking-time limited in SC2, and it's not as codified as build orders. But you can certainly simplify the games to "archon immortal vs bio terran", and then it's very repetitive.
Raw, spur of the moment, innovation probably isn't a good idea generally, as you'd be doing something with no practice. People do have a fairly big bag of tricks from older tactics that are less common these days.
People are now over saturating their mineral line (making more probes than before), so I don't think that's true.
> People oversaturated in wol and hots because you didn't expand a lot in these games and bases have a lot less minerals in LOTV, not because of some kind of lost knowledge like some comments seem to think here.
> Meanwhile alphastar was going 2 gate robo staying on one base making a fuck ton of probes to take a super late natural, it was bad, and anybody calling it the future really bothered me.
Alphastar has not introduced anything new. Its play is strategically poor.
Is it because they expect to lose workers to harassment? Or is it so that they can saturate new expansions quicker?
I'm interested to understand why Alpha did this since worker production seems like one of the most solved and optimized parts of the games and not where you'd expect innovation
I don't play any competitive StarCraft so my view might be moot but I was surprised at the number of siege tanks it uses. It made me wonder if there's some critical advantage to having so many tanks stacked up in a line so that the splash is spread out.
Also I find it weird that it did not build any marauders at all, in any of the games I've seen.
And just in general, no sexy units. No battlecruisers, no infestors, no swarm hosts...
This is off-topic, but I was thinking about this the other day - couldn't AI be used to balance a competitive game in this sense? Imagine that the AI becomes so good that human players very rarely win against the AI in a best of 7 series. Then we find out that AI doesn't ever build a specific unit. That would be a pretty strong indicator that said unit isn't good enough compared to the rest.
Based on the initial AlphaStar against TLO/Mana, you would think that Stalkers are insanely OP and the only thing worth building.
I (and others) have wondered whether, if you continued to lower AlphaStar's APM, you would see a greater diversity of units. The interesting part would be where AlphaStar decides to spend its really tight APM budget. Is it worth casting that Psionic Storm?
I personally feel like it has an insane micro advantage by being able to select arbitrary units on the battlefield, as opposed to dragged squares or control groups. But I'm not a pro gamer, so I don't know what that feels like.
> AlphaStar is an intriguing and unorthodox player – one with the reflexes and speed of the best pros but strategies and a style that are entirely its own. The way AlphaStar was trained, with agents competing against each other in a league, has resulted in gameplay that’s unimaginably unusual; it really makes you question how much of StarCraft’s diverse possibilities pro players have really explored.
- Diego "Kelazhur" Schwimer, professional StarCraft II player
Why is it embarrassing if the AI behaved like a collection of all the knowledge of existing pro players?
Obviously it'd be more interesting if we also learned something new about the game. Or AI, or both.
I watched the first OpenAI 5v5 Dota 2 show match at TI8, and the bots did take different actions, like which spells they used to initiate, how some spells were used, etc.
But that AI was very limited in its hero pool.
So instead of fixing the unfairness... they tried to hide it?
> When watching replays of these matches, players noticed that the account owner was performing actions that would be extremely difficult, if not impossible, for a human.
Framing it like the agent was doing some sort of insane play, when in reality the main evidence of a player being AlphaStar (other than its garbage decision making) was the fact that it wasn't using hotkeys!
The configuration of control groups is visible in replays; AlphaStar's replay data does not have any control groups in it.
Though, via its API, it is able to select arbitrary groups of units from anywhere on the map.
Personally, I would have loved to see it work through control group management because I think that is something that is important and steals attention away from humans. But probably to the researchers, it is just annoying data to model that doesn't get to the "core" of StarCraft.
Similarly, AlphaStar would not play with group hotkeys, but use a different technique. However, none of the analyses found anything that would give AlphaStar an unfair advantage.
One of the videos I watched compared APM (Actions Per Minute) with EPM (Effective actions Per Minute). AlphaStar always has them nearly identical, which would be (according to him) basically impossible for humans.
Error rates, EPM and APM are all red herrings.
Scrolling the screen is a major action for gathering info and controlling units.
If one can control units near the edge without scrolling, it gives a more stable view and a lower chance of making mistakes.
That humans cannot reliably perform these actions because of the limitations of our corporeal form means that Alphastar has an advantage over a human player. Limiting APM isn't enough.
One could, of course, add more restrictions (like not noticing things on the minimap all the time...), but that's not what it's about anymore. At this point, micro is comparable to humans, and we can start to compare macro and strategy.
Fwiw, when I watched some of the replays, I was disappointed by AlphaStar. It's a very consistent player with few mistakes, but it isn't very reactive and definitely not inventive. Instead of switching tactics when things don't work out, it generally continues with the chosen strategy. Often, that's enough: a well executed strategy with few errors often wins, even if it wasn't optimal.
I think this strikes a chord with why a lot of people are talking about this bot as cheating more than they are being outplayed.
It's like fighting an aimbot in an FPS. They aren't outsmarting you, but the mechanical consistency is inhuman and you are unable to force mistakes. At best it'll feel like playing someone on their best day.
The heart of gaming, and why StarCraft 2 in particular is popular, is finding ways to play around each other's mistakes.
Honestly, they should reduce Micro to the point that humans can bully it a little bit. How does the AI handle finding new ways to trade units to its advantage against a mechanically stronger opponent? How does it try to bait the human into having to take fights with unfavorable units / strategy.. as opposed to beating through them with mechanic mastery
Where AlphaStar shines is in making a strategy work. It won't attack into something that looks dangerous. It won't forget to build more units. It won't be distracted on two fronts. (That said: AlphaStar made some really stupid mistakes as well.)
I agree with you, though: it would be much more interesting, if Alphastar was actually handicapped in the micro (at least at a later point in the game). Then it would really need to find interesting strategies and counter opposing units with more cost-efficient ones.
I guess I need to learn more about the details here. Robotic consistency is a significant advantage, so unless it is handicapped I don't think the result is much of a leap from what we learned when it was insane at micro.
You can feel consistency on your opponent and it is extremely intimidating. Usually, players are consistent at different things.. like my micro might be much better with marines than tanks. To have consistent human-level micro across all units is already a huge advantage.
I agree the aimbot comparison is excessive, but the hyperbole was good for getting the point across.
You could take an FPS AI with an aimbot and give it 220 ms of input lag to say it doesn't have an advantage over players anymore, but it's still going to climb to the top of the ladder because it isn't going to miss. It is either advantaged or disadvantaged at aiming based on that one number.
One of the things people noticed in replays was the lack of control groups and, in the case of Zerg, the ability to select larvae directly, which no player ever does. It could have been as simple as removing these quirks.
It might have some minor unfair advantage in terms of being able to click with pixel-perfect accuracy, but that's marginal, and from watching this project evolve, it's pretty clear that the strategic planning aspect of AlphaStar has indeed become better than humans.
"Agents were capped at a max of 22 agent actions per 5 seconds, where one agent action corresponds to a selection, an ability and a target unit or point, which counts as up to 3 actions towards the in-game APM counter. Moving the camera also counts as an agent action, despite not being counted towards APM."
This is worse micro than top human players.
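To put numbers on the quoted cap, here is a rough sketch. It assumes the cap is enforced as a sliding 5-second window; the quote doesn't say how DeepMind actually implements it, and `ActionLimiter` and `apm_ceiling` are made-up names for illustration.

```python
from collections import deque

WINDOW_SECONDS = 5.0            # from the quoted cap
MAX_ACTIONS_PER_WINDOW = 22     # from the quoted cap
MAX_APM_PER_AGENT_ACTION = 3    # selection + ability + target


class ActionLimiter:
    """Sliding-window limiter (assumption: not DeepMind's actual code)."""

    def __init__(self, max_actions=MAX_ACTIONS_PER_WINDOW, window=WINDOW_SECONDS):
        self.max_actions = max_actions
        self.window = window
        self.timestamps = deque()

    def try_act(self, now):
        # Forget actions that have fallen out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_actions:
            self.timestamps.append(now)
            return True
        return False  # over budget: the agent has to wait


def apm_ceiling():
    """Return (agent actions per minute, worst-case in-game APM)."""
    per_minute = MAX_ACTIONS_PER_WINDOW * 60 / WINDOW_SECONDS
    return per_minute, per_minute * MAX_APM_PER_AGENT_ACTION
```

So the cap works out to 264 agent actions per minute, which can show up as at most 792 on the in-game APM counter if every agent action counts as three.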
Is that supposed to be an intention in design? I'd figure an AI could easily outplay a human if that weren't the case, given its inherent advantages (e.g. better accuracy, the ability to instantaneously prioritize which units to blink in/out).
For example, [this](https://youtu.be/pETcAm82vXU?t=322) game shows a case where an AI could perform even better using such tactics.
EDIT: another commenter mentioned it would make the game too unfair, so it was an intention in design.
100 zerglings vs. 20 siege tanks. Without insane micro the zerglings barely kill 2 siege tanks. With insane micro the 100 zerglings mop up the whole army with ease.
It's fascinating & fun to watch, but if your goal is to make an AI that can out-think a human it's super not useful, either.
This is not true. Top-level human players take into account who their opponent is, what builds they've used in recent games, their proclivities, strengths, and weaknesses. In a 7-game match they'll intentionally use builds which (in order to deceive) appear the same but have very different effects.
AlphaStar has better strategy than other AIs, but that's a low bar. My young son has better strategic appreciation of the game, and easily points out the difference between good strategy and AlphaStar's next-level micro.
Edit: Note that the version they sent out to the ladder had a significantly larger delay, significantly lower APM, and, unlike the first iteration, didn't get any information not visible on the screen.
Of course to any reasonable person a consistent 5 ms is obviously vastly superior to once in a blue moon lottery-winning luck.
Sounds like it's the same deal here, where the designers made some effort to level the interface playing field, but also left in some advantages like instantly being able to select individual units (or whatever it was the other players felt was not humanly possible to do). Probably because, at some level, they really want their system to win.
If we have reached the point where to even risk a loss requires implementing artificial restraints on the AI...well...
You aren't really testing the thing you really want to test otherwise.
You could probably play tennis pretty well against Nadal if you made him wear goggles that gave him a quarter second delay in response time, but that's not really a fair test.
All other things being equal, if you match up two chess players, and one can just say or think the position instead of having to physically move the piece before they hit the clock, that advantage will accumulate over time.
It will be a different game, but you can see how AI does against humans tactically and strategically.
Strategy and tactics are a function of constraints. If an unconstrained computer can beat a constrained human, have we really shown that its strategy and tactics are better? To do that, you'd have to have them play under the same constraints. Or at least have the constraints be close enough, which is what people are debating.
Any aspect of "fairness" is relevant if and only if it helps that goal. APM limits are important because we want the computer to choose strategies that would actually be good choices in that situation even for a human player, instead of being able to win with strategies that are powerful only because of superhuman clicking speed. On the other hand, requiring the computer to press buttons with a robot hand doesn't really facilitate improvements to the decisionmaking part, so it's irrelevant even if it would make it more fair.
After all, isn’t that the point of AI? To perform better at certain tasks than humans? We are visitors in the digital realm, just like when we put on a scuba suit and jump in the ocean—even the smallest fish can out-swim us.
This current modality is important, IMO, because we could potentially see neural networks performing tasks on other software, not just SC2. Imagine a neural network performing copy-editing in Word, writing code for CRUD applications, etc. Those are some mind-blowing potentials, we'd be losing out if we slowed down to work on robot hands.
AlphaGo had an "unfair" advantage in its games against Lee Sedol in that it was able to "think" far faster than you could reasonably expect a human to, and could therefore evaluate many, many more moves than a human player possibly could have. If you artificially limited the speed at which AlphaGo was able to "think" to match human limitations, would it have been able to win against Lee? I'd argue probably not.
Similarly, here we have an AI that's easily able to crush top SC players if it plays with an uncapped APM and no camera limitations, but only reaches grand master level when you impose artificial restrictions that force it to behave more like a human. The question of which configuration is more "fair" is kinda arbitrary; it depends on what your goal is.
I hope Civ5 could open up its API.
Interesting. But also consider this: in casual strategy videogames -- actually, strike "strategy" and just consider videogames -- most players don't want a really hard opponent. A computer opponent that is really very hard to beat is not what we want, because that'd be frustrating and many of us play videogames (yes, even strategy games!) to unwind; we want the illusion of challenge, an opponent that is challenging to beat but within the possibilities of every person who buys the game.
Which, by the way, is also the case with Starcraft II. Most people who bought it aren't tournament players. They expect a challenge, but not a really hard challenge. Game difficulty is all about perception ;)
PS: I shamefully confess I reloaded my X-COM (DOS!) game every time I lost one of the soldiers I was emotionally attached to. I don't like losing! :P
I don't know whether a more capable non-cheating AI would help though. Not unless it specifically imitated how a (good) human opponent would play, which I guess is an additional and difficult to implement constraint.
When so many players play a game, even an almost irrelevant advantage can tip the scale.
Or maybe it's a sociological factor, not a physical one.
But the data doesn't lie.
Here is a video of someone finding out that he played against alphastar. (He won)
I welcome skepticism and criticism for this sort of thing, and think most of it that I've seen here is well founded. But I would like to take a second to explain why I think this, and really all the progress in this area, is actually a really impressive achievement to me.
Let me try and frame this from the computer's perspective. Let's assume a resolution of 1024x780. I'm not sure what size frames they actually feed their agent, but it's not that important to the discussion; the point is it's a big image, and according to the article this agent is learning from pixels. So you, the computer, are given let's say 1024*780 = 798720 numbers to look at. You then choose a number between 0 and 798720 (or the crazy 10^26 number the article gives as the possible number of actions at each frame) as your action for that frame, and then you get another 798720 numbers to look at. After the round is over (on average 20 minutes; if you make a decision every frame, that's 20x60x60 = 72000 decisions), you get one number telling you how well you did. You repeat the process and get a new number. It's higher this time! But what is the cause? Was it that click you made on frame 22456? Or maybe that unlikely move you made on frame 4567?
Obviously I'm oversimplifying here, and the numbers are probably wrong. But I still think what I've said gives the right idea of what kind of task we (as a society/community/whatever) have somehow gotten a computer to solve. Computers are DUMB; the fact that it's able to play this game at all, let alone at a high level, is still a minor miracle to me.
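The back-of-the-envelope numbers here can be checked in a few lines. The sketch below also shows one textbook way of spreading a single end-of-game reward back over earlier decisions (a discounted return); that part is only an illustration of the credit-assignment problem, not a claim about how AlphaStar is trained, and `discounted_credit` is a made-up helper name.

```python
WIDTH, HEIGHT = 1024, 780   # resolution assumed in the comment
FPS = 60                    # implied by the 20x60x60 arithmetic
GAME_MINUTES = 20

inputs_per_frame = WIDTH * HEIGHT               # numbers seen per frame
decisions_per_game = GAME_MINUTES * 60 * FPS    # one decision per frame


def discounted_credit(final_reward, num_steps, gamma):
    """Credit given to each step when the only reward arrives at the last step.

    Step t gets final_reward * gamma**(steps remaining after t), so earlier
    decisions receive exponentially less credit than later ones.
    """
    return [final_reward * gamma ** (num_steps - 1 - t) for t in range(num_steps)]


print(inputs_per_frame, decisions_per_game)
print(discounted_credit(1.0, 3, 0.5))
```

With 72000 decisions and one reward, any scheme like this is a guess about which clicks mattered, which is the whole difficulty.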
‘Question: ... Two top grandmasters have gone down to chess computers: Portisch against “Leonardo” and Larsen against “Deep Thought”. It is well known that you have strong views on this subject. Will a computer be world champion, one day ...?
Kasparov: Ridiculous! A machine will always remain a machine, that is to say a tool to help the player work and prepare. Never shall I be beaten by a machine! Never will a program be invented which surpasses human intelligence. And when I say intelligence, I also mean intuition and imagination. Can you see a machine writing a novel or poetry? Better still, can you imagine a machine conducting this interview instead of you? With me replying to its questions?’
> Google AI beats top human players at strategy game StarCraft II
> DeepMind’s AlphaStar beat all but the very best humans at the fast-paced sci-fi video game.
Also, this part seems a bit weird from the article:
> The AI wasn’t able to beat the best player in the world, as AIs have in chess and Go, but DeepMind considers its benchmark met, and says it has completed the StarCraft II challenge.
So they didn't manage to beat the best players but consider the challenge complete anyway? I thought the goal was to build something that could do things better than humans.
I'm not sure if you've actually played or followed competitive SC2. This is absolutely normal. Players will pull something completely unexpected out of a hat and win. The losing player will learn from it in future games. That's just how it goes. Unexpected strategies are really hard to counter when you've never seen them before, and they're often employed to directly counter what you're doing right then and there. So you've been countered and dealt a devastating blow, which means figuring out how to come back from that can be hard to impossible. It's a thoroughly human failing in every way. I'd be more concerned if the AI wasn't able to learn to counter it in future matches.
You can lose a Starcraft game easily if someone is doing something novel and you don’t happen to scout the right place on the board soon enough.
I see these huge AI brains creating a new class divide between people who have access to these new AI brains and those who don't. The mission of open source has always been to break down these barriers to empowerment with technology. Thus, this is a great area for open source innovation.
In all these AI v. Human games I see, it is really apples to oranges because the human consumes vastly fewer resources and compute cycles to perform at the same level as the AI. And when I say 'vast' I mean Vast. There is like a quintillion-fold difference between the AI and the human.
There is no way the AI is even comparable to the human. To be comparable, we'd have to parallelize the game and lock millions of humans onto the game full time for millions of years.
At the end of the day, the AI is just a more sophisticated lookup table. There is as yet no analogous AI to human play.
Is the ability to use less resources a limit of our current hardware? Do we see the current hardware performance improvement trajectory being able to reduce the amount of resources an AI consumes to perform a task?
Are our algorithms simply not tuned well for using smaller resources? Could we build better algorithms around resource deficient environments?
What part of human processing capacity is this slow?
There is a problem with the MMR calculation used for AlphaStar. More specifically, there is a matchmaking problem: AlphaStar did not get matched against equal-MMR opponents, and most of its Protoss games were against much lower-MMR players, skewing the data for the MMR calculation.
The example given in the video: you won 10 times against 5100 and lost once against 7200. The calculated MMR would probably be around 6300. The problem is that you won't be matched like this in a real game. When matched mostly against weaker opponents, the MMR calculation has a lot of uncertainty.
The video, unfortunately in Russian: https://youtu.be/mpAUufSzaUo?t=1323
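For anyone who wants to play with the 10-wins-vs-5100, 1-loss-vs-7200 example, here is a minimal sketch. It assumes a plain Elo-style logistic model with a 400-point scale, which is NOT Blizzard's actual MMR formula, and it finds the rating whose expected number of wins matches the observed number. `expected_score` and `estimate_rating` are hypothetical helper names.

```python
def expected_score(rating, opponent, scale=400.0):
    """Elo-style win probability (assumed model, not Blizzard's MMR)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / scale))


def estimate_rating(results, lo=0.0, hi=10000.0, iters=60):
    """Bisect for the rating where expected wins equal actual wins.

    results: list of (opponent_rating, score), score 1 for a win, 0 for a loss.
    """
    actual = sum(score for _, score in results)
    for _ in range(iters):
        mid = (lo + hi) / 2
        expected = sum(expected_score(mid, opp) for opp, _ in results)
        if expected < actual:
            lo = mid   # rating too low to explain the results
        else:
            hi = mid
    return (lo + hi) / 2


results = [(5100, 1)] * 10 + [(7200, 0)]
estimate = estimate_rating(results)

# Flip the single game against the 7200 player and the estimate is unbounded:
# the search just runs into the upper limit.
all_wins = estimate_rating([(5100, 1)] * 10 + [(7200, 1)])
```

Under these assumptions the estimate lands between 6300 and 6400, and it hinges almost entirely on the one game against the 7200 player, since the ten wins against 5100 say very little about where the ceiling is.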
> 61 wins out of 90 games against high-ranking players
This doesn't seem to be quite as commanding as it was in Go. Do we know what MMR it reached or if it consistently beat players like Serral?
They didn't say anywhere they scored a win against the top 10 either.
To elaborate: being among the "top 0.15%" of 90k means place ~135 (90,000 x 0.0015) in the worst case, so nowhere near the "commanding" abilities mentioned by GP
StarCraft II also has a rock-paper-scissors nature to it though, so you wouldn't expect even a perfect player to win 100% of the time. There are some strategies that are hard-counters to other strategies, and because of the imperfect information nature of the game, by the time you scout your opponent and see what they're doing, it may be too late to shift and deal with it.
Unless you're Serral and your observers magically come out of nowhere and cover every single pixel of your base.
Flash had a 70% winrate overall and had time periods of 90%+ winrate.
>Humans play StarCraft through a screen that displays only part of the map along with a high-level view of the entire map, to e.g. avoid information overload. The agent interacts with the game through a similar camera-like interface
What exactly does that mean? Does it or does it not play by operating purely on image data human players would see on the screen?
How much of the system's interaction with the game's interface is learned as opposed to hand-crafted and filtered through APIs?
It's amazing that most people here seem to think that a system's ranking in a computer game is more important than its ability to learn from and interact with unstructured data.
Human limitations that AlphaStar shares:
- Data that requires the camera to see (e.g. enemy location, enemy HP)
- Inability to examine/target cloaked units
Possibly unfair, super-human things AlphaStar has access to:
- Instantaneous awareness of cloaked units
- Knowledge of things humans need to infer/click (e.g. upgrades)
- Global map awareness of unit positions (taking into account fog of war)
- Can select arbitrary collections of units, including outside of camera view
Also I wonder how their "camera-like interface" works with tactics like flying a building above units to make them harder to target.
I think you're asking whether it suffers from occlusion during selection. Based on how APIs typically work, I would say no, it does not. So yeah, that's another thing humans can't do. AlphaStar could hypothetically stack units into a single mass in which human players could not target them individually.
From what I'm reading in the paper, it sounds like there is some custom interface in play:
>AlphaStar can target locations more accurately than humans outside the camera, although less accurately within it because target locations (selected on a 256x256 grid) are treated the same inside and outside the camera.
It's really hard to parse what such statements mean. The fact that someone who is cited as a co-author of the paper approved the interface as "fair" isn't particularly reassuring.
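One possible reading of the quoted sentence is that every target point is snapped to a fixed 256x256 grid laid over the map, whether or not the point is on screen. The sketch below illustrates that reading only; the map dimensions and the `snap_to_grid` helper are assumptions for illustration, not something taken from the paper.

```python
GRID = 256  # grid resolution from the quoted sentence


def snap_to_grid(x, y, map_width, map_height, grid=GRID):
    """Snap a map coordinate to the centre of its grid cell.

    Assumes the grid covers the whole map uniformly, so precision is the
    same for on-screen and off-screen targets.
    """
    cell_w = map_width / grid
    cell_h = map_height / grid
    col = min(int(x / cell_w), grid - 1)   # clamp points on the far edge
    row = min(int(y / cell_h), grid - 1)
    return ((col + 0.5) * cell_w, (row + 0.5) * cell_h)


# On a hypothetical 256x256-unit map each cell is one map unit wide.
print(snap_to_grid(10.3, 20.9, 256, 256))
```

Under that reading, a human clicking inside the camera view has finer precision than one grid cell, while off screen the agent can still hit a cell directly and a human has to go through the minimap.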
They are not just some random person, they were top SCII players. Who else if not them would know this well enough to make assumptions?
There is a custom interface in the sense that the bot does not read pixels from the screen - it reads the information through the API, same information that is usually presented on a screen. But the amount of information is limited to exactly what a human would see at the same time using the standard SCII interface.
It is worth watching the match between AlphaStar and Serral, who is currently the best player. Serral beats it like a walk in the park.
The paper claims: AlphaStar’s action space is defined as a set of functions with typed arguments
Looking at citation 7, it seems like they are structuring the action space as (First pick high level action)->(Pick argument 1 for action)->...->(Pick argument n for action). If this is the case, calling this AI seems like "cheating", as humans have completely picked out the actions. That is, the achievement here is this: given what humans consider useful actions, AlphaStar can play at a grandmaster level.
The achievement here is mostly engineering, in my opinion. One that extends far further than the 40ish people listed on the paper. Probably an effort of over 1,000 people. From casually looking over the paper, there is nothing significantly different from AlphaZero or prior art. Again, the achievement here is listed under the infrastructure section of the paper.
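To illustrate what "functions with typed arguments" could look like, here is a toy sketch of picking a function first and then each argument in turn. The catalogue of actions, the argument types, and the uniform random choice standing in for a learned policy are all made up for illustration; none of it is taken from the paper or from the SC2 API.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionSpec:
    name: str
    arg_types: tuple  # argument type names, in the order they are chosen


# Hypothetical toy catalogue, hand-written by a human (which is the point).
ARG_CHOICES = {
    "unit": ["marine_1", "marine_2", "tank_1"],
    "point": [(x, y) for x in range(4) for y in range(4)],
}
ACTIONS = [
    ActionSpec("no_op", ()),
    ActionSpec("select_unit", ("unit",)),
    ActionSpec("move", ("unit", "point")),
    ActionSpec("attack", ("unit", "unit")),
]


def sample_action(rng):
    """Pick the function first, then one value per typed argument."""
    spec = rng.choice(ACTIONS)  # stand-in for a learned policy head
    args = tuple(rng.choice(ARG_CHOICES[t]) for t in spec.arg_types)
    return spec.name, args


def count_actions():
    """Size of the flat action space implied by the catalogue."""
    total = 0
    for spec in ACTIONS:
        n = 1
        for t in spec.arg_types:
            n *= len(ARG_CHOICES[t])
        total += n
    return total


print(sample_action(random.Random(0)), count_actions())
```

Even in this toy version the structure (which functions exist, which argument types they take) is supplied by hand, and the learner only chooses within it.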
In summary, this is a great step forward but now we need to start developing techniques to learn these action space hierarchies instead of throwing more power at increasingly difficult games.