AlphaStar: Mastering the Real-Time Strategy Game StarCraft II (deepmind.com)
805 points by zawerf 24 days ago | 453 comments



This is really impressive, I didn't expect starcraft to be played this well by a machine learning based AI. I'm excited to read the paper when it comes out!

That said, I'm not sure I agree that it was winning mainly due to better decision making. For context, I've been ranked in the top 0.1% of players and beaten pros in Starcraft 2, and also work as a machine learning engineer.

The stalker micro in particular looked to be above what's physically possible, especially in the game against Mana where they were fighting in many places at once on the map. Human players have attempted the mass stalker strategy against immortals before, but haven't been able to make it work. The decisions in these fights aren't "interesting"--human players know what they're supposed to do, but can't physically make the actions to do it.

While they have similar APM to SC2 pros, it's probably far more efficient and accurate so I don't think that alone is enough. For example, human players have difficulty macroing while they attack because it takes valuable time to switch context, but the AI didn't appear to suffer from that and was extremely aggressive in many games.


In the mass stalker battles, the AI APM exceeded 1000 a few times, and no doubt that most of that was precisely targeted. Whereas a human doing 500 APM micro is obviously going to be far more imprecise.

I think a far more interesting limitation would be to cap APM at 150 or so, or to artificially limit action precision with some sort of virtual mouse that reduced accuracy as APM increased.


>I think a far more interesting limitation would be to cap APM at 150 or so, or to artificially limit action precision with some sort of virtual mouse that reduced accuracy as APM increased.

IIRC OpenAI limits the reaction time to ~200ms when playing DoTA2. AI employing better strategies than humans will always be more interesting than AI that can out click humans.


Even the 200ms reaction time seemed overly slanted towards the AI. I don't think that is the actual reaction time of top pros. In the matches the AI played, the human player would teleport in from complete invisibility and try to use an instant-cast spell, and the AI would have already teleported out. Yes, the AI may theoretically have been constrained to a 200ms reaction time, but in practice it was playing at a superhuman level. Even with that advantage in fights, the human team still demolished the AI. Oh well, lots of things to learn still.


Another advantage was that the AI is just reading the game state through an API, it doesn't have to look on the screen. The game can be difficult to watch from a pro's perspective since they have to constantly click around the map to see what's happening, but the AI has perfect knowledge of everything it is capable of seeing, all without having to physically move a mouse to click on the screen.


If you watch the 11th game, where the pro player wins (the prior result was a 10-0 shutout by AlphaStar), the AI actually lost because they rebuilt the agent to use the same forced camera perspective as the human, so there is absolute truth to this being a compelling advantage. The earlier version was able to micro multiple units in disparate areas by having far better spatial awareness. When they took that advantage away it seemed more even.


I don't know if we can absolutely claim that the limited viewport was the deciding factor in the 11th game, but it did seem to me that the Alphastar agent's blink stalker micro was somewhat compromised in that game compared to the seemingly superhuman blink micro in previous games.


You can see the alphastar perspective of that 11th game here: https://youtu.be/H3MCb4W7-kM?t=5195

It struggles with camera placement like real players :) And it uses popular divert-attention tactics, which shows it understands that part of the game - for example when it sends oracles to the mineral line at the same time as it attacks in front. Previous versions didn't do that, because they were trained playing vs a cheating AI - so there was no point diverting the attention of something that has instant access to any unit on the map :)

It also struggles to defend against adept harass because it has "tunnel vision" - it controls its oracle instead of defending probes at home. Mana actually managed his attention budget a lot better (this is a crucial pro-player skill in starcraft - harass is effective because it trades a little of your attention for a lot of the enemy's attention; it's a skill that becomes irrelevant when the opponent doesn't really have "attention" and can perceive and interact with all units on the map at once, like the previous version of alphastar).

This one is much more human, and much lower level. In my opinion it lost its unfair advantage, so the mistakes in its decision making are revealed. Previously it was never behind and never had to react to the human player's strategy - it rarely even scouted because what's the point - it wanted to build mass stalkers anyway.


Yeah, that's actually a huge point that I didn't even consider. Regardless of whether the AI itself is playing with a limited viewport, the fact that its opponent has a limited viewport opens up the opportunity to learn attention diversion tactics during the training process, which would otherwise be impossible.


What happens if a human tries to use the API with a custom UI of the human's own choosing? Such a UI might not exist yet, but are there ideas for more efficient UIs that could be built?


Yes, I am curious about this too. What happens if the human has a giant TV screen that can show the whole map at once?

Or, what if we slow down the game, so that the human can actually pause the game each second and consider what to do next. That's basically what the computer is allowed to do


It would be massively faster, I think.

Upgrade building 3 becomes available when you have enough resources.

A separate tab for insufficient resources gives you an overview of what you need to finish a, b, c.

A red alert appears when an enemy is spotted. You can click to have nearby units attack, or pick an FSM with the attack strategy.

A finished building will automatically be placed near the town center.

Idle farmers can search for resources.

A wall is suggested based on your current buildings; you can set a margin of e.g. 20 meters.

The question is how much programming the custom UI will need (and how deep) to make it a lot more efficient.


Stalker unit AI could be microing perfectly for you...


Giant tv:

Macro-wise, it would be like an unwieldy minimap, which already exists so people can get a sense of where the enemy is moving. With a giant screen, information is not focused on a small area, so you are limited by your FOV. A minimap which shows unit strength in terms of armor, hp or shields as well as placement would be the ideal information.

Micro-wise, it would be like sitting in front of a giant text display looking at a whole book. You still have to focus on a small section to read it.


> Or, what if we slow down the game, so that the human can actually pause the game each second and consider what to do next. That's basically what the computer is allowed to do

While this would make it more fair, it would just make the micro game more similar to chess or go. I don't think humans would necessarily win in the end.


That's a good insight and yes, humans would probably be overpowered eventually. However, this is just the consequence of the fact that all games are similar if you remove external limitations such as reaction time (or, alternatively, produce a more efficient "being" which is not as subject to these limitations as some other).

Starcraft is like chess in some sense. The largest fundamental difference is that it isn't a perfect information game.


Tbh starcraft and dota shouldn’t really be the test games atm; turn based strategies (or rather, grand strategies) would be the far more appropriate evolution after chess and go, since we’re clearly more interested in AI macro than micro, and too much of its learning process is in trying to push the AI beyond micro-oriented thinking (probably many rounds of the AI tournament are lost simply because one AI found a new micro strategy to abuse)

But ofc, there’s no tbs or grand strategy currently out there with a real tournament scene, so you can’t really count on the devs implementing an AI-API, or even properly balanced / bug-free (far more user-testing goes into sc2/dota2 than say civ, simply by virtue of its playerbase).


Yes but a turn based game drastically reduces the action space compared to a real time game, something the DeepMind folks pointed out as a particularly interesting problem they wanted to tackle.


>a turn based game drastically reduces the action space compared to a real time game,

That's the primary benefit imo. The bigger action space is largely composed of non-strategic elements, at least in the sense of long-term strategies, eg micro and mini-skirmish tactics, that I don't think are as interesting. Ofc it's clearly a conflict of interest, but my feeling was the most interesting aspect of Go/Chess is the AI making unintuitive discoveries that benefit the long-term. The human-collective machine is pretty good on its own at finding the shorter-term strategies; I don't think AI will make much significant impact in that space.

With games as a medium to study upcoming real-world applications (eg cars), RTS makes sense; but as a medium to study AI beating humans, TBS is more appropriate (their ability to explore large search-spaces is far more interesting/potentially impactful). Studying both would be ideal ofc, but in a pick-one situation, TBS is better imo. But only RTS are even really viable atm, which is disappointing.


Well, are we wanting to test the computers ability to strategize/plan, or their ability to out click humans?

The former is an interesting AI challenge/achievement, the latter is a space in which computers are already known to outperform humans.


Even allowing players to zoom out would give huge advantages; that's why no matter the screen size you have to play at the same zoom. There was a bug at one point that allowed players to play multiplayer zoomed out, and using it was forbidden in competitive games.


How about having multiple humans control the same faction, so one can focus on building, two on a couple of battle groups, another on scouting, etc.? Then they don't have to context switch nearly so much.


They actually have this game mode built in, it's called Archon mode.


Aha, nice, thanks. Let's see, two players per side... not a huge number but probably a big step up from one. Looks like people aren't playing it much; some people suggest it's because that requires a partner.

I would like to see a setup akin to that of Ender Wiggin, with one commander overseeing and recommending overall strategy, and, say, five others managing different areas or groups. That seems like the way to get the best human performance, and might be enough to beat the AIs—at least to nullify chunks of their advantage.


Yeah, put an eye tracker in a pro and you'll see that the eyes are constantly changing the focus point, if you can watch the entire scene with the same precision without the need to focus on it you're already at a nice advantage.

As an aside, a few pro gamers prefer to play on windowed mode for exactly this reason.


Just supporting this. I remember the uproar when 800x600 resolution was removed as an option in SC2 around 2012[1].

[1] https://eu.battle.net/forums/en/sc2/topic/6201040181


> the uproar

I'm not saying you're wrong, but 6 posts with no profanity is hardly "uproar" by blizzard forums standards.


Is the bit about reading the game through an api true? Earlier iterations of this same rl based agent that played Atari games would read just raw pixels not an api.


Yes, it’s true. A special PySC2 interface was created for AI. Also, it’s not only that the AI doesn’t need to parse information from the much more limited screen real estate, but also that the AI doesn’t have to use a controller that has physical constraints. So the AI has access to this superhuman controller, and it can decide to click on one extreme of the screen and then another within 200ms.


Any game that is specifically going out of its way to support these ai’s will naturally do it through an api, though I’m only aware of dota2 and sc2 (sc:bw also does, through a community-modified client that serves the api, iirc). For adhoc games, eg atari, pixel-parsing is the natural result, but no one would intentionally set it up like that


The game is difficult to watch, but does anyone honestly believe that an AI is going to have a difficult time parsing the scene if it is trained to do so? That to me just seems like a question of resources. We're pretty good at image recognition and segmentation now, and that's without the unlimited amounts of training data one could generate when using a controlled game environment with a limited range of possible animations and effects. This is why I find the prospect of the AI agent having to parse the screen entirely uninteresting.


For real-life applications, parsing the "scene" would matter, as it could only convey imperfect, partially retained information. In the game of starcraft the information is perfect wherever the fog of war has been removed; this, together with unlimited attention (camera viewport), helps action potential and macro planning. No player is ever going to be able to consider precise strategy on the whole map perfectly in their mind. If deepmind wanted to mimic human limitations perfectly they would have to provide imperfect information to AlphaStar, e.g. when providing the locations of objects, sample them from a probability distribution which represents each location imperfectly, and make that distribution wider the longer the AI's attention wanders from the object, both spatially and temporally. Of course the usefulness of having these limitations is purely to model maximum theoretical human mental capacity, and its use case could be to help explore strategies that work for actual humans.


There is another potential use: given these limitations, an AI might be able to learn to be better strategically, which could translate to an even greater advantage once the limitations were removed later on.


windowing the focus perhaps, yes, but I'd assume it's the opposite and the focus is applied more freely.


You talk about a static image, but navigating the camera requires strategy, attention, and adds to the focus. If you take that away, it's just a turbo charged pen-and-paper RPG with a time limit on rounds.

They could train against the API, reinforcing the AI trying to predict the state from vision. But with limited APM it would be pretty difficult for the AI to keep track of everything. And, potentially, it would still not be the same as a human looking at it. I'm not sure whether human attention is a particularly bad example of efficient resource allocation. I'm very biased to think it is still the gold standard. But the fact that deepmind didn't focus on this implies they were not finding it interesting enough, and/or too difficult.

Anyhow, (visual) exploration is a step up from mere image recognition


But on the other hand, an AI that beats humans using brute force in a game where it makes a ton of difference isn't very fair either.


> using brute force

"Brute force" in AI context is usually reserved for traversal of the entire search space. I think "superhuman micromanagement" is a better term. And before AlphaStar superhuman micro wasn't insurmountable obstacle for human players.


The funny thing is that once we're talking about the real world, which will come, that incentive actually reverses.

At that point the name of the game will be maximizing the advantage the body/infrastructure provides the AI, not minimizing it.

Weird.


Yes, since DeepMind chose SC2 for having the right characteristics for mapping to the real world, ie imperfect information and real time response, they should have had at least one run without any speed governors. And maybe another with the CPU limited to some level we might find in an embedded system of near future.


It's the same principle as a baseball player putting extra weights on the bat in practice.


I've recently watched a TED talk explaining how human perception has a lag of about a third of a second. Pro players might be better, but after noticing they also need to take an action.


My experience is that to beat 300ms requires there to be no conscious thought in the loop. It has to be muscle memory guided by higher level intent. It's like how the gunslinger waiting to shoot hits first, it's reflex instead of decision.


Getting sub 200ms on something like this benchmark is fairly easy [1]. While waiting for the color to change is different than processing a game like dota2 or sc2 a 200ms limit isn't too unreasonable to me.

I would love to see these AIs get handicapped even more like a full second and really force them to out think humans.

[1] https://www.humanbenchmark.com/tests/reactiontime


I think OpenAI would have been beaten by lots of humans, but they decided to train it with 5 unlimited, invulnerable couriers. (until the TI showmatches, in which they were beaten easily.)


The only way to truly have a fair fight would be to accurately model the limits of human capacities. How fast can humans move the mouse and at what accuracy? How fast can they type keyboard commands? How fast can they move their eyes? You could study those limits in a sports lab with high speed cameras, etc.

A simpler model would be to limit the bot to, say, one action per 250ms, introduce a slight delay in its reaction time, require it to move the camera to gain detailed information and take further actions, and have camera movements count as actions.
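
A rough Python sketch of that simpler model, just to make it concrete. The class and method names are made up for illustration, and the 100ms reaction delay is an assumed value (the comment only says "slight"). A real agent wrapper would look different.

```python
from dataclasses import dataclass


@dataclass
class ActionBudget:
    """Hypothetical limiter: at most one action every `min_interval_ms`."""
    min_interval_ms: int = 250     # from the comment: one action per 250ms
    reaction_delay_ms: int = 100   # assumed value for the "slight delay"
    _next_free_ms: int = 0         # earliest time the next action is allowed

    def try_act(self, now_ms: int, observed_at_ms: int) -> bool:
        """Return True if an action is allowed now.

        Camera moves should be passed through this method too, so that
        they count against the same budget as any other action.
        """
        reacted = now_ms - observed_at_ms >= self.reaction_delay_ms
        if reacted and now_ms >= self._next_free_ms:
            self._next_free_ms = now_ms + self.min_interval_ms
            return True
        return False
```

With these numbers the bot can never exceed 60000 / 250 = 240 APM, even in a burst, which is the main difference from a cap on average APM.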


Here's a graph of AlphaStar's APM versus a professional player's: https://i.imgur.com/TXeLkQK.png Evidently AlphaStar also has a similar Economy of Attention (where the player focuses) to a professional player, at around 30 screens per minute. Additionally, AlphaStar's reaction time is around 350ms, a significant disadvantage over a pro.

The skepticism in this thread is absolutely justified but I think it's important to note the lengths to which DeepMind has gone to address and assuage the fears of superhuman mechanical skills being employed in these games.


I watched all of the event live and I feel that that graph is deceptive. If a game is 15 minutes and has 3 main battles lasting 15 seconds each, and you use 100 average APM on non-battle time and 1000 APM during battles, your average APM will be 145 but you obviously have a superhuman advantage.
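
The arithmetic above can be checked in a few lines of Python. The function name is just for illustration; the numbers are the ones from the example (15 minute game, three 15 second battles, 100 APM outside battles, 1000 APM inside them).

```python
def average_apm(segments):
    """segments: list of (duration_minutes, apm) pairs."""
    total_actions = sum(minutes * apm for minutes, apm in segments)
    total_minutes = sum(minutes for minutes, _ in segments)
    return total_actions / total_minutes


battle_minutes = 3 * 15 / 60   # three 15-second battles = 0.75 min
game_minutes = 15
segments = [
    (game_minutes - battle_minutes, 100),   # non-battle play at 100 APM
    (battle_minutes, 1000),                 # battles at 1000 APM
]
print(average_apm(segments))  # 145.0
```

So an average of 145 hides a tenfold spike that covers only 5% of the game.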

This is compounded by the fact that almost all of AlphaStar’s actions are “useful” whereas a significant amount of the human actions are spammy.

You will typically see a human select a group of units, and fast-click a location in the general direction they want the units to move (to get them started moving that way), and then keep clicking to continuously update the destination to a more precise location. Every click counts as an action. An AI can be perfectly precise and “clicks” the right place the first time.


TLO seems to have a longer tail than AlphaStar in that graph though, so doesn't that imply that TLO peaked at an even higher APM, presumably during battles?

Fair point about humans needing minor adjustments though. Another comment also mentioned a bug in the APM measurement: https://news.ycombinator.com/item?id=18994350


TLO is a Zerg player, so he probably makes a lot more errors when playing Protoss. Also, every top player estimates when to do a sequence of actions and spams it a few times to maximize the chance of execution. Meanwhile Alphastar only has to do that once.


Yes, could that bug be the reason for the AI getting to 1000 APM?


Hm, should be interesting to force the AI to use input commands through a "filter", where it can only execute orders with human level precision. And something similar for input.


This graph is incredibly deceptive and I'm kind of upset they posted it. There are about 10-15 seconds of gametime where APM is incredibly important, and the AI boosted to 1000+ APM during those periods. During lulls it cruised at ~30 APM.

Meanwhile humans are literally spamming keys to keep their physical fingers loose and ready - they're not performing anything close to 400 useful APM on a regular basis (or in TLO's case - 1500 ... He kept walking his units straight into death while spamming keys).


How can it do 1000 APM if its reaction time is 350ms? (roughly 170/minute)


I believe you are conflating latency and throughput. It might take AlphaStar 350ms to perceive a threat, but once perceived, it might issue many commands at high speed to respond.
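
A toy Python illustration of the difference. The 350ms latency is the figure quoted in this thread; the 60ms gap between commands and the burst length are assumptions picked only to show the effect, not measurements of AlphaStar.

```python
def action_times_ms(event_ms, latency_ms, n_actions, gap_ms):
    """Timestamps of a burst that starts `latency_ms` after the event."""
    start = event_ms + latency_ms
    return [start + i * gap_ms for i in range(n_actions)]


def peak_apm(times_ms, window_ms=1000):
    """Highest action count in any sliding window, scaled to per-minute."""
    best = 0
    for i, t in enumerate(times_ms):
        in_window = sum(1 for u in times_ms[i:] if u < t + window_ms)
        best = max(best, in_window)
    return best * 60_000 // window_ms


burst = action_times_ms(event_ms=0, latency_ms=350, n_actions=20, gap_ms=60)
print(burst[0])         # 350: nothing happens until the latency has passed
print(peak_apm(burst))  # 1020: the rate is set by the gap, not the latency
```

Latency only fixes when the first command lands; the spacing between commands is what determines peak APM.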


Latency != Bandwidth


How many of those 500 actions are actually useful? I haven't watched competitive StarCraft games for years but back when I did, rates were more like 300APM and even then the players basically spam clicked the background or selected random units non-stop and were probably only doing 50-100 actual effective actions.


> How many of those 500 actions are actually useful?

Exactly, a human doing 500 APM during intense moments is going to be way different than an AI bursting 1000 APM with pixel-precision during the most crucial moment in a game.

TLO spent a ton of time at >1000 APM and walked his army directly into enemy shots all the time. MaNa had much better control at ~400 APM. So APM is really irrelevant to control - for humans.

I suspect the AI, on the other hand, makes each action precise & count for something.

This graph, which I think was supposed to show that the AI was being "human", IMO is pretty damning. We saw the APM spike to >1000 during a critical moment and we saw the APM at <30 during lulls, so we know it uses its APM at important moments, presumably with important pixel-precise actions.

https://deepmind.com/blog/alphastar-mastering-real-time-stra...


I suspect that once the AI becomes good enough it will be able to beat human players using a much lower total APM than human players. We're not quite there yet, but it just needs a little bit of time.

As a hopefully illustrative comparison, you could give any top player a day of play time per move against the top Chess AI being given a minute of play time per move and the AI will still win. That's how much better the AIs are than humans now. There's no reason in principle this won't be possible with StarCraft AI too.


The biggest issue with allowing the ai to have high APM is that it will inevitably learn optimal strategies that depend on that high APM, eg stalkers can take on far more immortals than we normally expect, and the AI will learn it this way, because the high APM allows a new stalker strategy (or rather, empowers an old one greatly) while not affecting immortals significantly. This also naturally means the AI leagues see a different game balance than the human leagues, leading to strategy divergence.

And then when you drop the APM limit, suddenly all the learned optimal ai strategies start falling apart, and the whole thing has to be relearned.

More annoyingly, there’s not much for human players to learn from innovative ai strategies that are based on inhuman accuracy of play (because we couldn’t possibly execute it).


What they're improving at right now isn't any specific AI model, it's how to train the AI models. It's meta-machine learning. I don't doubt that they can quickly train up a new model under different constraints now that they know how best to train up said models. It's not like they throw away all progress once they change some constraints; far from it.


I'm sure we'll get there too, I just think it's a little deceptive how they've measured the APM at the moment.

StarCraft is more random than chess, so I do think it's possible humans will always be able to take occasional games off of fairly constrained AIs just based off blind luck in picking counter builds, it will be interesting to see what % that is.


Such high actions per minute does not seem fun to me, and possibly a repetitive strain injury waiting to happen.


The 1000 APM thing is because of a bug in how APM is calculated in starcraft2. There is a hotkey to assign all your selected units to a new control group while also removing them from all other control groups, which TLO uses extensively, and while it is just one key combination to press, it is recorded as 1 action per selected unit. The real APM of pro players averages 250-400 and peaks at 600-700.


> a repetitive strain injury waiting to happen.

Yes, I have one from it and wasn't even playing that high (I averaged less than 100 apm). I understand that it's a common problem.


Was Starcraft the only/main game that you played?


Yes, basically the only one for several years at that point. A few hours here and there of other games but nothing at all substantial.


i already had some RSI, but playing SC2 made it a lot worse (i stopped playing when i got to plat as a zerg because it required enough APM to hurt)


It is why I stopped playing SC, and I was never any good anyways. Still fun, but it just hurt real bad.


I stopped playing SC competitively because it's too stressful, both physically and mentally. Hitting 300 APM continuously in a game for up to 60 minutes at a time makes your hands go numb. And the adrenaline rush makes you want to go running afterwards. With games like LoL/DoTA at least you have a chance to take a break after a gank/farming/team wipes. With starcraft every decision has a significantly higher compounding effect.


Hell I never played it competitively. I had to stop playing it even casually because it physically hurt.


Wait until you hear about stringed musical instruments? :)


From what I understand, the most common string instrument problems are with shoulders/neck/back, due to sitting for long periods of time with poor posture.

Most music should be playable without excessive risk of serious injury to arms / wrists / hands, but from what I understand very high notes on e.g. the violin are hard to play without using an over-flexed wrist, which is definitely a problem if playing music requiring such a position for long stretches of time, or many rapid switches between high and low notes.

Some of the string players with most risk are novices who have not been taught proper technique.

For professional PC game players, the design of the standard computer keyboard and furniture is absolutely terrible from an RSI perspective (worse than any common musical instrument, and without any of the design requirements of acoustic instruments as an excuse), and it is shocking to me that there has not been more effort to get more ergonomic equipment into players’ hands. The way game players typically use a computer keyboard is generally more dangerous than the way typists or e.g. programmers do. As someone who spent a few years thinking about computer keyboard design, I can think of at least a dozen straight-forward and fairly obvious changes that could be made to a standard computer keyboard to make it more efficient and less risky for game players. There is a lot of low-hanging fruit here.

Whether or not the equipment is changed, the most important single thing when using a computer keyboard (or any hand tool for that matter) is to avoid more than slight wrist flexion or extension, especially while doing work with the fingers. Excessive pronation and ulnar deviation of the wrist are also quite bad. Watching pro players, many of them have their wrists in an extremely awkward position while doing fast repetitive finger motions for hours per day without breaks, which is a guaranteed recipe for RSI.


Well I have heard of them, also looked up TLO mentioned above, he actually did get RSI and had to take months off.

"Liquid regretfully announces that Dario “TLO” Wünsch will be unable to play for the next few months due to the Carpal Tunnel Syndrome he experiences in both hands. He will however continue to be involved with E-Sports even as he takes a break from gaming to give his wrists time to heal. Sadly, this means that he will not be attending Dreamhack Summer or the Homestory Cup III as a player."


> artificially limit action precision with some sort of virtual mouse that reduced accuracy as APM increased

I like the idea of having action noise that's linearly related to APM
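
Something like this, as a minimal Python sketch. The function name and the `noise_per_apm` constant are invented for illustration; how much jitter per unit of APM would be "human-like" is exactly the part that would need measuring.

```python
import random


def noisy_click(x, y, apm, noise_per_apm=0.02, rng=random):
    """Jitter a click with Gaussian noise whose std dev grows linearly with APM.

    noise_per_apm (pixels of std dev per unit of APM) is a made-up constant.
    """
    sigma = noise_per_apm * apm
    return (x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))
```

At 150 APM this gives a std dev of 3 pixels, at 1000 APM it gives 20 pixels, so bursting costs accuracy instead of being free.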


There would be an entire new dimension of decision making, in addition to good macro, where you have to prioritize actions. Will be interesting to see.


I said so before, but is it really a big difference from controlling a unit that can also only do one thing at a time? The agent controls itself just like another unit, with a constraint on APM available to control other units. On the one hand, these APMs add a new parameter, if the constraint is implemented naively. On the other hand, if there are viable strategies against ultrahigh APM opponents, then the constraint is really rather limiting the dimensions of the decision space and to good effect, finding viable strategies that take less effort. Hence such things are called "hyperparameters" (I think that's something different, but you know what I mean). Likewise, the game isn't as fast as to need 100 screen switches per second, if good planning allows batching and bursting actions.


I understand the spirit of the proposal but that would be like limiting a computer to add at most two numbers per second. It's OK if we want an interesting contest against humans but it wouldn't be a fair estimate of a computer math capability. It's also not the point of using computers to do math instead of a room full of accountants. I'm OK with the AI going as fast as it can and play superhuman strategies because it can be that fast. After all we'll not limit AIs output rate when we'll let them manage a country's power grid.


The purpose of limiting speed isn't to make an interesting contest, it is to accurately compare the "math" instead of the speed the math is done at.

It isn't surprising that it's fast, the surprising part is that it can make human-like decisions. The only way to compare whether its thinking is human-like is to restrain it from "brute forcing" the contest through speed.

The model has likely learned that the faster it does things the better the outcome. What it needs to be measured on is strategy.


But isn't the competency of a Starcraft player also measured on his/her speed?

In that context, you can't really measure strategy without accounting for timing/speed because a lot of tactics and strategies only become viable once the player has the required speed to actually realize them aka "micro".


Exactly, and due to superhuman micro, the AI has cornered itself into learning a small subset of the strategy space. It’s not good at strategy because it’s optimized itself for just getting into micro-handled situations.

It’s not good at strategizing with all the options available to it given its micro ability, it has “one” strategy that leveraged the micro as much as it could, and when given a strategic challenge by mana, it didn’t know what to do.


Yes, but the ultimate goal is to make an AI as "smart" as, or "smarter" than, a human. That's why they keep making AIs play against human players in Chess, Go etc. It's not to prove computers are faster than humans. It's to prove computers can be smart like humans.

They want to make an AI that can teach new ideas to humans. New strategies that human bodies are physically capable of executing, but no human was "smart enough" to think of yet. An example is when the AI built a high number of probes at the start. That's "smart".

The only way to train an AI to be able to come up with new ideas, is to force it to be "slow". Otherwise, it will just always do the easiest way to win, which is out-micro. There is nothing interesting about a game like that. That only shows the AI is fast, but it won't be clear that it's "smart"


That's exactly why it's so important to try and constrain the system to as close to human parameters as possible. You can't compare strategic prowess if the two players are playing at a completely different level. It'd be the same as saying MaNa is better than say, Maru (who has just won 3 GSL Code S's in a row), because he has stronger strategies against ~30th percentile players. It makes no sense.


Speed is only interesting as part of fair human competition. It's trivial for the AI to win with speed and it doesn't have to be remotely smart about it. Serral (dominant world #1) was easily beaten by 3 far weaker humans controlling one opponent - it wasn't even close. It's just stupid to even claim victory in those situations.

Making an AI that wins by outsmarting humans, on the other hand, is what we are all interested in.


That would be right if AI and human player had the same opportunities for micro.

They don't, because AI doesn't use physical objects to move stuff in the game. AI just "thinks" that this stalker should blink and it blinks. Human player has to deal with inertia of his hand and of mouse.

If you want fair competition of micro - make a robot that watches the screen through its camera, moves the mouse and presses keys to play starcraft.

Then the bandwidth of the interface is the same for both players, and we can compare their micro.


You don't really need a real robot; just assign some "time cost" to each action, depending on spatial distance, the type of action, and whether it differs from the previous action. Humans are really fast when, for example, splitting a group of units, but performing multiple different actions in different areas of the screen, or even across multiple screens, takes a lot longer. They don't need to fully emulate human behaviour, but getting somewhat close would really show how strong the AI is tactically and strategically without superhuman micromanagement.
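
A minimal sketch of that "time cost" idea, in Python. Everything here is an assumption made for illustration: the `Action` fields, the three constants, and the linear distance term are hypothetical, not anything DeepMind has described. It only shows the shape of the rule: a base cost per action, plus cursor travel, plus a penalty when the action type changes.

```python
import math
from dataclasses import dataclass


@dataclass
class Action:
    kind: str   # hypothetical label, e.g. "move", "blink", "build"
    x: float    # screen position in pixels
    y: float


# Hypothetical constants, picked only to make the example concrete.
BASE_COST_MS = 50.0      # minimum time charged for any action
SWITCH_COST_MS = 100.0   # extra time when the action type changes
MS_PER_PIXEL = 0.2       # travel time of the virtual cursor


def action_cost_ms(prev, cur):
    """Time charged to the agent before `cur` is allowed to execute."""
    cost = BASE_COST_MS
    if prev is not None:
        # Farther from the previous click means more time.
        cost += MS_PER_PIXEL * math.hypot(cur.x - prev.x, cur.y - prev.y)
        # Changing action type costs extra.
        if cur.kind != prev.kind:
            cost += SWITCH_COST_MS
    return cost


def total_time_ms(actions):
    """Sum the cost of a sequence of actions, in order."""
    total, prev = 0.0, None
    for action in actions:
        total += action_cost_ms(prev, action)
        prev = action
    return total


# Three similar clicks close together (like splitting a group of units).
split = [Action("move", 100, 100), Action("move", 110, 100), Action("move", 120, 100)]
# Three different actions far apart on the screen.
mixed = [Action("move", 0, 0), Action("blink", 300, 400), Action("build", 0, 0)]
print(total_time_ms(split), total_time_ms(mixed))
```

With these made-up numbers the repeated nearby clicks come out far cheaper than the scattered mixed actions, which is the asymmetry the comment describes for human players.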


Let me try to make my point clearer.

If we want to measure strategy, I agree with you, and out of curiosity we might do it. But the goal is winning, so is strategy important as long as it wins? The AI can take every shortcut it finds IMHO. People do take shortcuts.

Cars and planes bring us across the world exactly because they don't walk like people and don't fly like birds. Wheels, fixed wings and turbofans are shortcuts and we're happy with them. We can build walking and wing flapping robots but they have different goals than what we need in our daily transportation activities.


The problem with starcraft is - interface overhead is a significant part of the game. The AI doesn't have to cope with that - every click is perfect, and moving the mouse from one edge of the screen to the other takes no time.

If you want to make it fair - place an AI-steered robot in front of the screen, and make it record the screen with camera, and actually move the mouse and press the keys.

Then I can agree it's fair :)

But then of course AI would be incredibly bad.

Right now the advantage doesn't come from faster thinking, but from the much higher bandwidth and precision that the AI has when controlling the game. It's anything but fair.

With chess it's not a problem, because interface overhead is negligible.


Those are different engineering problems. I'm pretty sure that they could eventually build a pixel perfect camera and a fast pixel perfect robot mouse. They'll be at least as good as human eyes and hands, probably better. Once that's done, they'll keep winning.

It's surely interesting technology with positive impacts in a lot of areas, but is that the important part of the experiment? Humans need keyboards and mice to interface with computers, computers don't (lucky them.)

Sorry to insist on that analogy, but it looks to me as if my car should be able to fit my shoes and walk before I admit that it goes to another city quicker than me walking.


No, these are not "just" engineering problems.

When you're trying to individually blink 30 stalkers at the perfect moment, when they have almost 1 hp - latency is everything.

A camera has latency. Depending on various factors, it takes milliseconds of exposure for a camera to gather enough light to register a clear image frame. The human eye works on a different basis, but also isn't instant. You cannot cut that in software, and a human player cannot train to lower it. But the AI doesn't need to do it - it has the image provided as a memory buffer.

Image recognition has latency (both in the brain and in a computer). Even something as simple as recognizing where the computer screen is as opposed to the background. It takes time. The AI doesn't need to do it.

Muscles (engines in robot hands) have latency.

Mice and hands have inertia and can't be moved instantly - they have to be accelerated and stopped, and even if you have an optimal algorithm that is 100% accurate - it takes time.

It's not only hard to implement, it's also physically IMPOSSIBLE to do without introducing significant delays.

AI that is controlling the ui directly doesn't have to deal with most of these tasks, so it has a huge advantage in a game like starcraft. It's not that AI is so much better, it's that AI is high-frequency trading and human player is sending requests to buy/sell by telefax. By the time your request is processed the other guy had opportunity to do 10 different things.

If you want to focus on the part of the job that is doable now - sure, go ahead. But then don't abuse the unfair advantages you have and announce you "won". It's a very low threshold to win in starcraft when your opponent has effectively 100 times the lag you do.

I'm sure someday we will have an AI that can beat a human player in starcraft without abusing this advantage, and I'm pretty sure the fastest way to get there isn't to put a real robot in front of a screen, but to limit the interface bandwidth of the AI to a level similar to that of human players.

> Sorry to insist on that analogy, but it looks to me as if my car should be able to fit my shoes and walk before I admit that it goes to another city quicker than me walking.

Let's remove the roads that we made specifically for cars and speak about this again :) Will your car move you through an untamed wilderness quicker than your legs? Possibly. Or not at all.

If I walk into a bullet train, slowly walk inside it, and walk out of it at the end of the route I will be even faster than the fastest car. Is it fair to say I'm faster than a car? After all it's not my fault the car doesn't fit inside that bullet train :)

We need to compare apples to apples, and comparing AI that doesn't need to deal with half the sources of latency with a human player that does, in a game where latency is very important - just isn't fair.


If you don't put any limits on the AI, it's not Starcraft any more.

You could make an AI which tries to hack the human computer to force a leave. That would also constitute a "win". Or one which hacks its own computer and displays "You win" immediately. Or one which tries to kill the human player, if we want to be really dramatic about it.


Chess and Go both limit computers to one move per human move, and they’re still very interesting games for AI. You’ll always have limitations. When you’re playing a game, the limitations are largely arbitrary, and you choose them to make the game better achieve whatever goal you’re after.


You are right, but the point here is to force it to win by pure decision making. Having an AI play a game was always about challenging ourselves to improve our understanding of intelligence. Limiting APM is just another way to force us to come up with new ideas.
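
For what "limiting APM" could look like mechanically, here is a rough sketch of a sliding-window limiter in Python. The class name, the `try_act` method and the window length are all hypothetical, and nothing here is taken from how AlphaStar actually throttles its actions; it just rejects any action that would exceed a fixed count inside the most recent window.

```python
from collections import deque


class ApmLimiter:
    """Allow at most `max_actions` within any window of `window_s` seconds.

    Hypothetical helper for illustration only.
    """

    def __init__(self, max_actions, window_s=60.0):
        self.max_actions = max_actions
        self.window_s = window_s
        self._times = deque()  # timestamps of accepted actions, oldest first

    def try_act(self, now_s):
        """Return True and record the action if it fits the budget."""
        # Forget actions that have slid out of the window.
        while self._times and now_s - self._times[0] >= self.window_s:
            self._times.popleft()
        if len(self._times) >= self.max_actions:
            return False  # over budget: the agent has to wait
        self._times.append(now_s)
        return True


# Example: at most 3 actions per 1-second window.
limiter = ApmLimiter(max_actions=3, window_s=1.0)
results = [limiter.try_act(t) for t in (0.0, 0.1, 0.2, 0.3, 1.0, 1.05)]
print(results)
```

One design note: a cap over a full minute still allows short bursts of very fast clicking, so a cap meant to constrain peak speed would probably want a much shorter window as well, or two limiters stacked.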


So, in some sense this is a limitation of starcraft. The goal of this project is presumably to have the AI play a game with high strategic depth. However, with sufficiently high micro, certain strategies that have low "macro depth" become unbeatable. So it's true the AI would win, but it plays in ways that do not expand our understanding of SC strategy; it is simply using a strategy that is simple to understand and impossible for a human to execute. Think of an aimbot in a shooting game: a human can try to play smart and attack from unusual angles/lay traps/crossfires, but if the AI can simply get instant headshots, it can run straight to the objective and win. It would be a winning play, and humans understand why it would be a winning play (boringly so), but it is outside of human execution.


But it's important to be clear about what's being measured. If the AI can take and successfully win engagements that no human could because of their superior micro, it's not necessarily winning via superior strategy (as is claimed).


but still, if you want to measure that then play a turn based game. if i could micro as good as the pros i'd be pretty damn good too.

hooking it up to a camera looking at a screen and a robot arm with a mouse would be more fair though.

edit: ok they did have a camera version, but i still want a robot arm.


In the showmatch where they made the computer look at a regular screen to control the game, the stalker micro was much less impressive - and MaNa won.


For now. Give them another month. This is like AlphaGo vs Fan Hui all over again -- people knocked that accomplishment at the time because he was just a master, not one of the top players in the world. Well, not much later, AlphaGo beat Lee Sedol, the best player in the world.

The ceiling here is going to be incredibly high, much higher than the level of play that people are capable of, even when restricted to a single window.


This doesn't nullify the observations that people are making here.

Part of the difficulty here is describing what a 'fair' match might be. Specifically, I think fairness has to do with a goal many people have for AI: to improve human play. The strategies in Chess or Go that were employed could conceivably be used by human players. There aren't any hard restrictions preventing humans from learning from that play, even if the AI is entirely superior.

It would follow that a 'fair' SCII match would employ strategies that humans could implement. Making extra workers, for instance, might be a real lesson from AlphaStar play. The insane stalker micro, however, could never be done by a human.

From this perspective, I think the important takeaways were:

* The AI leaned heavily on super-human stalker micro.

* The AI had some strategic blind-spots, namely the immortal harass.

* The APM comparison isn't terribly meaningful; a lot of human APM is spammy/twitchy button presses that doesn't do all that much, whereas the AI can presumably make each action count. There were also AlphaStar APM spikes that likely go along with the stalker-micro issue.


None of this really matters though. The AI is improving every day through training. Give it another few months of development and it'll be able to trounce humans under any "fair" set of handicaps you can think of, like limiting average and max APM throughout the game. We saw the same pattern with AlphaGo. There's no reason whatsoever to suppose that humans are fundamentally better at this game than an AI can be.

When AlphaGo first won, people said it wasn't fair because it was running on a whole cluster of computers. Well, within not much time at all, it was good enough to run on a single computer and still beat top humans. We are dealing with exponential progress here. The writing is on the wall.


It's tempting to assume the AI will just keep getting better and better, but that's not guaranteed, and I was happy to see that the Deepmind folks in the video clearly acknowledged this. In the game that MaNa won, it's possible that he did so by finding a strategy the AI agent had never encountered before, causing it to respond with nonsense (e.g. not building a Phoenix and pulling its entire stalker army back to deal with warp prism harassment). In a game with a strategy space as large as SC2, it's possible that an AI will never be able to saturate the space of viable strategies, and it will always be possible to find edge cases that the AI has no idea how to handle.


The point isn't that the AI won't improve or win with those conditions; I agree it likely will, and soon. The point is that the conditions of the match matter and that this one missed the mark.

It absolutely does matter whether the AI can use obviously super-human techniques, because then it's not nearly as interesting for human observers. I'd much rather watch an AI that was a strategic genius that won despite being hamstrung in terms of micro/techniques.

> There's no reason whatsoever to suppose that humans are fundamentally better at this game than an AI can be.

Who's claiming this?


Lee Sedol was not the best player anymore at that time (not saying it wasn't an impressive/important achievement, but overstating it doesn't help either - the "beat best human players part" came later in 2017).


Lee Sedol was still top 5, certainly no worse than top 10 at the time. By all means he wasn't the best and most dominant, but the difference with the top was tiny.


I don't understand who's downvoting you, this is accurate. While AlphaGo/Zero improved quickly to superhuman play, we are just in this thread comparing timelines, so that is relevant.


What kind of evidence is going into this analogical reasoning? Do we also extrapolate similarly for other things? We went to the Moon in 1960s. Was Mars a month, or a year, or a decade away? Then we sent robots to Mars. Did we yet send any robots to Alpha Centauri?

Different problems have different difficulties. Solving simple problems quickly doesn't mean we'd also be able to just as easily solve the hard problems. Often the comparably simpler problems have the best reward/effort ratio and thus make quick progress, which doesn't need to be the case for hard problems.


Going to the Moon is a completely different endeavor than making an AI better at a game that it's already quite good at. This is a red herring.

If you had bet against AIs reaching parity with top human players in any previous game, whether it be Checkers, Chess, Go, etc., you'd have lost. I see no reason why StarCraft II should be any different.

We can reconvene in the comments here a year from now and see where AlphaStar is then.


It's not the accomplishment that people knock. It's the spin, the inaccurate article titles and the hype.


It doesn't seem like hype to me -- it seems like a genuine, significant accomplishment. Sure, they might not be able to beat the best pro players consistently right now, but I suspect that is right around the corner. Would you rather they stay completely mum until they've reached that goal too? And why? I'd rather know now, and then be able to follow along as it gets better and beats higher and higher-ranked players.


Just FYI, Lee Sedol wasn't the best player in the world at that time (nor is he now). AlphaGo went ahead and beat the actual #1, Ke Jie, 3-0.


The AI lost because it completely messed up the response to the immortal drop, nothing to do with micro.


This was my read as well. It seems that Mana simply found a strategy that the AI had not found. Due to not having trained against it, the AI produced nonsense results. The commentators noted that the obvious response was to build a Phoenix and just completely shut down the harassment. The situation is similar to AlphaGo vs Lee Sedol match 4.

One of the hardest parts about these kinds of human vs AI exhibitions is making sure the AI has explored the full possibility space, so that it can handle all situations. The techniques at play lack the ability to perceive a completely new situation and formulate a good response. (Though anyone who's lost to cheese in games they later learned easy counters for knows that humans, while better than state of the art AI, aren't perfect here either.)


Mana got himself in the same situation where he was surrounded by stalkers on multiple sides, but this time the micro wasn’t so crazy that he couldn’t manage it, and he was able to take on one group at a time. The immortal drop, while unanswered, was not really that effectual.


But it was answered: AlphaStar pulled a huge stalker army that was about to hit MaNa's base all the way back home to (attempt to) answer the drop, repeatedly. If you have more complexity to your army but fewer army units, as MaNa did, a delay like that is how you win the game.


It’s funny because this works against the standard AI too.


That's what I said on Lobsters. They were always good at builds, micro, etc. The one thing they couldn't do was judge human intent, esp if they were being misled (esp time wasting). I was waiting for one of the players to try to screw with its head to see what it did. Mana showed two gaps: the back-and-forth thing, and that it ignored the observers giving away constant strategy information. Then, he got the first win.

Now, the questions are how many more such glitches will show up and can they eliminate them with better algorithms?


And against human players up to Masters 3 or so :) When you're still using the all-army hotkey, defending with a small and precise group isn't happening.


That time, the AI didn’t really even try to engage. In fact, the ending of the match was marked by the entirely absent group of stalkers as the natural was engaged.

It’s likely safer to say the AI was confused in general at that point, possibly related to the camera change, but we didn’t really get to see the quality of stalker micro that game.


"possibly related to the camera change, but we didn’t really get to see the quality of stalker micro that game "

In software, changes in assumptions can break what depended on them. There could be many assumptions in its neural net centered on full visibility. They should probably retrain all or just some from scratch with the camera change in from the beginning to see what happens. Then, it will be firmly encoded into the strategies over time.


They mentioned that they retrained after the camera change and it was equivalent to the AIs that beat Mana 5-0 by their metrics.


The immortal drop let him keep AlphaStar occupied while he built up a critical mass of immortals (it becomes harder and harder to effectively micro stalkers against immortals as numbers go up, probably even for an AI), then let him put AlphaStar in an awkward position when it was camping near where the warp prism was hiding.


The results are obviously impressive, but even then there is a lot of work to do as far as learning efficiency goes:

"The AlphaStar league was run for 14 days, using 16 TPUs for each agent. During training, each agent experienced up to 200 years of real-time StarCraft play. "

MaNa probably played less than 2-3 years of Starcraft in his whole life (by that I mean 24hr x 365d x 3), and was learning with a much less focused/rigorous methodology.
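
To put rough numbers on that comparison, a small back-of-the-envelope script. The 200 years and 14 days come from the quoted blog post and the 3 years is the upper bound guessed above; the variable names are made up here, and since the post says "up to 200 years", both results are upper bounds.

```python
DAYS_PER_YEAR = 365

agent_experience_years = 200   # "up to 200 years of real-time StarCraft play"
league_days = 14               # "The AlphaStar league was run for 14 days"
human_experience_years = 3     # upper bound guessed for MaNa (24h x 365d x 3)

# How much faster than real time each agent accumulated game experience.
speedup_vs_real_time = agent_experience_years * DAYS_PER_YEAR / league_days

# How much more game time the agent saw than the human, at most.
experience_ratio = agent_experience_years / human_experience_years

print(f"speed-up over real time: about {speedup_vs_real_time:.0f}x")
print(f"agent vs human experience: about {experience_ratio:.0f}x")
```

That works out to each agent gathering experience roughly 5,200 times faster than real time, and seeing on the order of 67 times more game time than the (generous) estimate for MaNa.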


Another way to think about it is that a human brain is mostly doing transfer-learning, on top of a 99%-baked deep net that was wired up during foetal development from our DNA, where that DNA-persisted model has "seen" hundreds of millions of years of training data.

Humans don't have to learn to process, recognize, and classify objects in visual sense-data, for example. We can do that from the moment we're born, because we already have hundreds of precisely-tuned "layers" laying around in our brains for doing just that. We just need to transfer-learn the relevant classes.


This is a widely underappreciated fact when it comes to comparing the 'training experience' of humans versus bots. And it extends far beyond processing 'sense data' - a human likely has some level of understanding of how the game works based on experience from other games they have played and from 'real life' - we know almost instinctively that 'high ground' is likely to give a combat advantage without having to test it in game.


Not only that, humans (and many other eusocial species) have an instinctual intuitional understanding of many aspects of game theory.

For example, humans, even from infancy, prefer games where it is possible to punish cheating (i.e. take revenge upon cheaters) to games where it is not. This isn't just "we're animals that have evolved to enact tit-for-tat strategies [by e.g. injustice triggering rage] because they lead to cooperation which leads to egalitarian utility"; this is actual analysis—instantaneous, intuitive analysis—of a system of rules, to notice, in advance of ever being slighted, whether you'll be likely to end up in an "unjust" social situation if you agree to the given ruleset. There is an "accelerated co-processor" of high-level abstract game-theoretic information—and layers to extract that information from sense-data—that ship as part-and-parcel of the human brain model. We never need to learn how to judge unfairness, any more than we need to learn how to see.


And perhaps worth noting that the great apes we evolved alongside have the same kind of outrage to unfair trades.


"humans, even from infancy, prefer games where it is possible to punish cheating...this is actual analysis—instantaneous, intuitive analysis—of a system of rules, to notice, in advance of ever being slighted"

[Citation needed]


All of our knowledge of how to play games and so on has come from our current lifetime. We do not have a "genetic memory" that means we have learnings from cavemen or some other such nonsense. Our DNA contains instructions on how to grow a human, it's not a mega hard drive with millions of years of collective memory.

If a 19 year old is good at Starcraft, he's good at Starcraft because he spent two or three years playing a shit load of Starcraft and we are much more efficient at learning higher level strategies than AIs are. These AI agents need to try damn near every possibility to adjust their weightings for various actions. Humans understand pretty much the first time something goes wrong: oh, better not do that OR similar things again.

It's incredibly impressive that a given human can become GM level at Starcraft within a few years and to take an AI to that level takes 200 years of training, as well as an inhuman reaction time, perfect micro/clicking, etc. It shows how amazing our learning skills are.


We may not have "genetic memory" but a ton of human capabilities are baked in at the DNA level. Sure, we need to practice in order to specialise those abilities for particular tasks, but that's more of a calibration phase on a fantastically capable machine, rather than a construction phase.

Totally agree with how impressive humans are, though. In fact, one of the most amazing things to me about robotics is finding out how close to global optimal some humans can actually get.


The GP is underselling the fact that in the human years of being a pro player they think through many more games and may even dream of it. I certainly went to bed after a lengthy session with images of the game still in front of me. Although that might be more about micro, the macro skills are somewhat transferable from other "games". RTS simulate economy, amongst other things, after all.

GP's claim, "99%-baked deep net that was wired up during foetal development from our DNA", is also unfounded, if not completely overblown. I am far from a student of biology, much less an expert, but intelligence is still seen as an emergent property. The real kicker might be that organizing thoughts might be a "game" in itself, one that is learned in development and constantly exercised. Talk about self-play.

I recently read a similar question about "inherent mathematical language", ie. capability, and the given opinion was that there is no consensus, except perhaps for basic addition, which I guess concerns vision, ie. seeing a set of things and knowing the count is +++++. That works only up to around +++++++ items at best, according to findings.


Perhaps a nit, but still fascinating: the human visual cortex finishes developing after birth. A newborn can't really distinguish between objects. The ability to differentiate, focus on and track objects is developed over the course of several months.


True. Humans are pretty unique in that regard, though; pretty much no other animal is like that. It's easier to understand human neonatal development if you just consider all humans to be born premature. (It'd be really interesting to know whether that's literally true: whether keeping a human baby in the womb for an extra few months would actually result in the same stages of mental development being passed that occur in a regular baby of that age who has been sensing and interacting with the world.)


I've read somewhere that we are basically born prematurely (as you said) because if we waited any longer then our enlarged head sizes would make delivery quite possibly fatal.


My brother was born a week or so after his due date; they induced labor for him for exactly this reason. Perhaps unsurprisingly, his head circumference was literally off the charts.


Maybe off-topic, but that's one side of the coin, and I suppose the other is that being exposed to more sensory input accelerates development, or makes it even possible (on higher levels of cognition). If this wasn't the case, why wouldn't we just be bigger and carry longer? Is size viz megafauna really that suboptimal for any more significant reasons than being hunted by human hunters? I would almost say that longer pre-natal development was suboptimal, because we'd either become bored, or supersmart, but anyhow superegoistic for lack of nurture.

Calling it premature is ironic, if we reach nominal maturity only after 10 or more years as far as fertility is concerned--the equivalent in AI would be the procreation of a neural net, perhaps after exploiting a bug in the game, breaking out to rewrite a better version of itself, or colluding with itself in self play. Yes, this is going off-topic.


> why wouldn't we just be bigger and carry longer?

The consensus in the evolutionary-anthropology community is that our hips (pelvic bones) have to be the size they are, in proportion to the rest of us, to make us able to walk upright. "Building bigger" doesn't really work, for the same reason that you can't make a giant robot—if you scale humans up, the pelvis would need to be made out of something stronger than bone to support the additional load.

The same is not as true, though, if you just make the person wider—because then you spread the same load over "more pelvis." (This is just a personal unfounded hunch of mine, but I think some human subgroups—e.g. midwestern Americans—who are at the genetic limits of baby head size, and who avoid C-sections, are currently selecting toward bigger-boned-ness.)

> I would almost say that longer pre-natal development was suboptimal, because we'd either become bored, or supersmart, but anyhow superegoistic for lack of nurture.

Keep in mind that we wouldn't be conscious for any of it. The development stage that "wakes you up" to the outside world would just occur later on, as occurs in animals with longer gestation periods (e.g. elephants, with a gestation period of 18-22 months.) This would give things like your ocular layers longer to finish developing, without really having an impact on the parts of your brain that learn stuff or think stuff.


Hypothesis:

Being born “prematurely” might allow for more flexible brain wiring. Adapting better to an environment quite distinct from ancient ones we had evolved in is possibly one of our key cognitive advantages compared to other animals.


Is there evidence for this? My mental model has been that DNA encodes more along the lines of hyperparameters: amount of gray matter vs white matter, locations of brain regions and folds, etc, but the connections between neurons, and their weights, were all learned. There isn't that much information you can stuff into DNA, after all.


Connections between neurons, the synapses, are encoded. So much so that they are given individual names. This is a fun one to read about to get an idea:

https://en.wikipedia.org/wiki/Calyx_of_Held


> Humans don't have to learn to process, recognize, and classify objects in visual sense-data

Do you have a citation for this? It doesn’t jibe with my understanding of development. For example, animals born paralyzed are blind: https://io9.gizmodo.com/the-seriously-creepy-two-kitten-expe...


The human genome isn't even a gigabyte of data. That's less than a byte per neuron and a big chunk of that data actually has to go into "how to make a kidney cell" and "which way to route veins". So while some basics have to be hard-coded, it can't be remotely close to "99% transfer from ancestors".
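
The arithmetic behind that estimate, as a quick Python sketch. The inputs are approximate outside figures, not numbers from this thread: roughly 3.1 billion base pairs per copy of the genome, 2 bits per base, and a commonly cited count of about 86 billion neurons. Treat all three as assumptions.

```python
# Approximate figures, treated as assumptions for this estimate.
BASE_PAIRS_PER_COPY = 3.1e9   # base pairs in one copy of the human genome
BITS_PER_BASE = 2             # four possible bases (A, C, G, T)
NEURONS = 86e9                # commonly cited neuron count for a human brain


def genome_bytes(base_pairs):
    """Raw, uncompressed size of a sequence at 2 bits per base."""
    return base_pairs * BITS_PER_BASE / 8


one_copy_gb = genome_bytes(BASE_PAIRS_PER_COPY) / 1e9
both_copies_gb = genome_bytes(2 * BASE_PAIRS_PER_COPY) / 1e9
bytes_per_neuron = genome_bytes(BASE_PAIRS_PER_COPY) / NEURONS

print(f"one copy:         {one_copy_gb:.3f} GB")
print(f"both copies:      {both_copies_gb:.3f} GB")
print(f"bytes per neuron: {bytes_per_neuron:.4f}")
```

On these assumptions one copy is under a gigabyte, counting both parental copies gives about 1.5 GB, and either way it is on the order of a hundredth of a byte per neuron.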


That's not how any of this works. We do not have "millions of years" of information encoded into DNA. DNA doesn't store that much data. In fact, it's about 1.6 gigabytes only! And most of that information is basically a ruleset for growing proteins which become our body.

All the stuff we've learned about games and so on has come from our current lifetime. I don't have caveman memory for how to fight a tiger.


I said "deep net" for a reason. A DNN model almost always turns out to be far, far smaller than the training data that was used to create it.

For one example: any smartphone's face-recognition feature. Each such feature is a DNN which took millions of hours of face data to train... but the resultant model fits on an ASIC.

Our DNA doesn't directly encode such a model, but it encodes a particular morphogenic chemical gradient, and set of proteins, that go together to make specialized neural "organs" (like your substantia nigra, or your basal ganglia, or your suprachiasmatic nucleus, etc.) which manage to serve the same function to your brain that access to a pre-trained "black box" DNN model would serve an untrained NN in achieving transfer learning.


Our DNA is NOT a trained deep net, nor is it a deep net period. Our DNA is a string of nucleotides that encodes proteins, which in turn carry out the series of tasks needed to create and operate all the structures of the brain and body.

The "training" of our deep net happens during our lifetime. We are not born with a trained deep net so your analogy that somehow we are born with a highly capable deep-net encoded into 1.6GB of DNA makes no sense.

Can you imagine how capable a human being would be if it was born into a world with no other humans or learning sources? Imagine a newborn baby born into a world with some accessible food/water close by so it wouldn't die from lack of nutrition or wild animals, but crucially without any other humans. It would be utterly fucking useless, no language/reading means no way of assimilating new knowledge. That baby would end up being a totally incapable human, regardless of the DNA or structure of the brain.

As far as we currently understand, if infants aren't exposed to language and communication at a very young age, they are either incapable or severely stunted in terms of communication for the rest of their life.

My point is, that we are very much dependent on the learning that we get from the point of birth ONWARDS. We get the amazing capacity to learn from the structure of our brain and body, but we'd be absolutely incapable idiots without other people to teach us, our books, language etc. We understand "games" and game theory from playing games with other kids, we're not born with "game theory" encoded into our DNA as one other commenter seemed to think, the same for language learning, and everything else.

Anyway, the point of this whole debate was that it's incredibly impressive that humans can learn to play a game as complex as SC2 in a tiny fraction of the time it takes a cluster of GPUs using a huge amount of energy and resources. Not forgetting that we also have to use a physical body to control our actions in the game, which adds a whole other level of complexity since we have to understand how to manipulate a mouse/keyboard etc, whereas the AI is essentially acting directly with the game, like a human with a neural link. The other kicker is that if you just changed one aspect, like picking a new map neither player had seen, the AI would be sent hurtling back to square one whereas the human would only be partially affected. This series of demos only makes me more impressed that even with the huge resources available to Google, they can only just about beat a human, and even then only after 200 years of training time and various other artificial advantages.


You are willfully missing the point. Animals have instincts. The complexity of humans does not make them an exception to this rule. There are in fact large amounts of brain function that are baked in at birth (or developed in a predictable timeline after birth -- humans are basically born premature). Humans are able to instinctively perform behaviors which are not taught, although the majority of critical behaviors in humans are socially learned. Feral children (like Genie) are functioning organisms with complex behaviors. They're just defective humans because humans rely on a distributed learning system called culture in order to do the work that biology cannot.

You are insisting that because humans do not have instincts at a certain level of abstraction (playing video games) that no part of these instinctive brain functions play a role in the development of skill at Starcraft. This is wrong. Abstract reasoning is not simply learned, but it is HONED by experience and neural development. An AI has to do an enormous amount of work in order to replicate functions that humans can already do. This is the basic visual problem in AI that stumped researchers in the 60s who thought that tasks like visual recognition, spatial rotation, etc would be trivial because they are trivial to evolved organisms.

You're relying on some kind of mental model where brains are just masses of neurons that form all of their connections and complexity after birth. This is ultimately a political idea, and it's wrong. No neuroscientist believes this. Brains have pre-defined areas (with fuzzy borders) and many behaviors do come baked into the template. Complex behaviors like language do not, perhaps, although even there, the underlying functionality that permits language is an evolved trait (which is why other animals can't learn language). Research the FOXP2 gene, as just an obvious example.

Edit: Your post contains "structures of the brain". What exactly do you think the structures of the brain are, if not evolved modular solutions to complex problems? Your visual center is somewhat trained after birth, but it already exists. The same goes for speech, motor control, and all of the other unconscious or semi-conscious processes that all humans (and other animals as appropriate) share.


One macro technique used by AlphaStar agents that is not used by human pros is building extra workers beyond currently exploitable capacity.

This gives them reserves when they are attacked and some workers are killed. They can also ramp up mining at a new base quickly by moving the extra workers there.

Apparently the benefits outweigh the costs for these workers for AlphaStar. It will be interesting to see if some pros decide to adopt the technique and if it improves human performance as well.

Disclaimer: I do not have much Starcraft experience.


Workers mine 40 minerals per minute and cost 50, taking... 15 seconds to build? I forget. Workers beyond 24 provide zero benefit (better to send them to the natural).

Let's say you make 4 extra at a cost of 200 minerals and then lose 4 workers to harassment. You are out 200 minerals in both cases, but in the prebuilt case the extra workers will mine an extra... 100 minerals? (40 + 30 + 20 + 10).
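
A rough sketch of that arithmetic, assuming the replacements are rebuilt one at a time from a single nexus and using my guessed numbers (40 minerals per minute, 15 seconds per worker). The function name and parameters are made up for illustration:

```python
def minerals_lost_while_rebuilding(workers_lost, income_per_min=40, build_seconds=15):
    """Mining lost while replacement workers are built one after another.

    Assumes a single production queue: the k-th replacement finishes
    after k * build_seconds, so its mining slot sits idle for that long.
    """
    # Minerals one worker would have mined during a single build time.
    per_build = income_per_min * build_seconds / 60
    return sum(per_build * k for k in range(1, workers_lost + 1))


# With the guessed numbers this is 10 + 20 + 30 + 40.
print(minerals_lost_while_rebuilding(4))
```

Plugging in other build times or mining rates only changes the two keyword arguments; the gain stays small next to the 200 minerals spent up front.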

This doesn't take chronoboost into account though. I don't know, the gain is marginal, and the opportunity cost is having a smaller army (2 zealots for example)

Please correct my numbers if I've made a mistake, I forget build times and haven't played since HotS.


The numbers you cite are close enough that your estimations are good to work with (12 seconds to build, closer to 60 minerals at full efficiency but down to 40 for probes #17-24, etc)

The extra workers aspect was the most interesting decision-based adjustment AlphaStar made to the conventional pro-level wisdom of "standard" play. There are a couple of factors in play that I trust the AI weighed (and more) and tested over many games for their long-term benefit to winning:

- every 8 probes you build require a pylon as well, for a total cost of 500 minerals (8 x 50 for the probes plus 100 for the pylon)

- workers are safer in the main than in an unoccupied natural (long distance mining) to harassment and pressure

- when your expansion completes, having 4 workers vs 8 workers vs 16 workers potentially has a huge impact on the immediate spike in income

- what you mention -- the prebuilt workers will dampen the impact of most worker harassment to purely the resource cost of the lost workers.

My guess was that well executed harassment by an opponent in practice games put AlphaStar in very limited situations with a crippled economy that it couldn't fight its way out of, so this was a catch-all harassment "counter" -- it's ok if you kill a few probes, at least it won't throw off my economy completely and I can still continue my overall gameplan.

After that I think the next most important aspect was planning ahead for a bigger income spike when their expansion was done without waiting to build out another 16 workers after the nexus was ready.


Yeah, it looked like Mana was copying this behavior somewhat in the live game.


I bet there's a sweet spot in-between that will come out of this, like saturating your natural to 24 workers minutes before expanding.


Yeah that stalker micro really showcases a particular advantage leveraged by the AI.

I'd love to watch the results of constraining the AI so instead of seeing the whole map at once it has to pan around the same way a human would to get updated information on each battle. Counting those "info-gathering" window pans against the actions tally might yield slightly fairer APM metrics. (EDIT: Turns out they built a new agent for game 11 to do just that)

One of my biggest beefs with strategy games of this genre occurred around the time sprites went 3D and the player viewports got smaller (presumably to showcase all the cosmetic detail, and since it became harder to distinguish between visuals when zoomed out farther). I always feel too constrained on the modern games - like I can't see enough of the map at once. In my opinion that "full size viewport" gives a multi-tasking edge to the engine that the player doesn't share (beyond the human cognitive overhead from context switching you already pointed out).

On the other hand I find it fascinating that our AIs have become strong enough at our games that we're having to handicap them to avoid players crying foul that they're not fair.


I agree. Most RTS games feel constrained because of the limited viewport. Supreme Commander has a nice feature where you can zoom all the way out at any time.


And a very important part to SupCom's zoom feature is that at a certain zoom level it switches to a rich visual overlay of unit icons and pending/queued orders.


I would agree with that. If you take a look at the exhibition match replay, there are some cases where it makes objectively suboptimal decisions. We couldn't see this during the live stream, but the double immortal warp prism caused AlphaStar to bring back its entire army from across the map, when a few units at home would have been enough to defend. It even kept trying to blink its stalkers to a place where the warp prism couldn't be reached. Perhaps this version with the limited viewpoint hadn't been trained with enough games?


Also worth noting that it starts by imitation learning from pros. I'd be curious to see if the macro can be learned without imitation; a much harder challenge. Also, playing with full visibility as was mostly the case in the demonstration is quite lame...


I'll bet you that AlphaStarZero comes out in a year and just learns from scratch.


I'll take you up on that bet; they started with a version that tried to learn from scratch and they seem to have scrapped that approach.


I bet the very early internal versions of AlphaGo learned from scratch and didn't work very well either.


Correct. They started with pure self-play and it didn't work at all.


It wasn't full visibility; AlphaStar still had fog of war. It just saw the whole map at the same time.


That's still a large advantage that humans don't have access to. Not just in the "pitiful humans can't take advantage of such a large viewing area" sense, but literally the game will not let human players zoom out that far.


Also I wonder how it handles invisible units. Because as a human player you can see the shimmer if you look closely. Can it see that or are they just totally invisible to it?


Presumably completely invisible, as it was looking at raw unit stats rather than the visuals.


I wonder if that would let you win with something like mass dark templar with phoenixes to snipe observers. You could run right past it, and it could never anticipate you.

Or better yet, imagine zerg where you can burrow every unit.


It would be the same as with a player: as soon as you do something with those invisible units, or imply that you have them (e.g. a DT shrine), that's enough to conclude that invisibility is in play and that appropriate tools should be used. It's not like you can do anything about dark templars even if you see the shimmer, if you have no detection, beyond body blocking.

Regardless, the article describes cheesing as the common tactic in early iterations, with economic-play being learned later — one of the described cheeses is dt rushes, which the AI apparently learned to deal with, so it should have some understanding of invisible units (alternatively it learned to ignore the dts and base trade or something).

I don’t think the shimmer is useful enough to be a significant loss for these prospective AIs’ quests for world (SC2) domination.


If you learn, why not learn from the best, the pros? These people have already spent years figuring out what works and what doesn't. Why not draw from that pool of knowledge instead of spending extra time going through the same motions?


Because then you don't know whether the AI learned by experimentation or by mimicking. To draw an analogy, imagine the difference between somebody reading and following an algorithm to solve a Rubik's cube, as opposed to somebody being handed a Rubik's cube and experimenting. If expert-level strategies can be reproduced without being explicitly shown to the person/AI, then it means something is going right in your methodology.


Two reasons I can think of:

An AI trained from human strategy might end up more limited than one that could learn from scratch. It could be stuck in a local maximum of play and be unable to escape.

An AI technique that requires a large dataset of pro play to learn will be much more limited in terms of applying it to other games.


It seems like in some cases at least it didn't have to move the camera (it had direct interfaces). In some of the stalker micro battles (especially in game 3 or 4?) the fights were larger than the screen space; it would not have been possible to micro that well if the control interface limited what you can control or where you can place units.


This is a great point, and something that seems a bit lost in the discussion:

In StarCraft 2, the game IS the interface. That is to say, the developers have constructed the game in such a way as to be difficult to control; and human mastery of the interface is a large percentage of the game. Strategy in the game is important, of course -- but this is not chess, where human beings are not limited by the interface of the game. In StarCraft, you are intentionally given a limited interface to monitor and control a gigantic game while under incredibly tight time controls.

And I should also note that Blizzard is extremely reluctant to add features that make it easier to control the game. I have a friend who works on the StarCraft 2 team. We talked at length about this one feature that he designed and proposed for the team to make a specific aspect of the game friendlier towards players. It was turned down for exactly the reasoning above -- the game is the interface. Making the game easier to control disrupts the entire experience; a StarCraft 2 that is easier to control is no longer StarCraft 2.


That would actually be an interesting thing for someone from Blizzard to do: get two similarly skilled high-level players, play two 7-game matches with each player getting one match with a 10% increased view size, and compare the win/loss rates to see what the impact is.

Essentially try to quantify the advantage of increased view area.


Yup, exactly. To add onto this, for people less familiar, there's a non-stupid reason for this: economy of attention.

Attention/APM is often called the "third resource" (after minerals and gas). Spending it wisely when several areas could use your attention at any given time is part of the strategic and tactical decision-making. For example, usually in a battle you wanna be paying most attention to the fight rather than your base, but sometimes it's actually better to jump back to your base to increase production or economy, and knowing which situation is which can be challenging.

Obviously, if you make the game mechanics too easy to control (letting the computer do more of the work), then this part of the game becomes less interesting, because you don't have to weigh trade-offs as much anymore.


Are there any bolt-on augmentation interfaces that utilize the same API the bots use to allow players to more effectively enter their intent?


It's a question of whether "played with human level latency and precision" should be a part of the rules of the game we are making the AI play.

I would say yes, because StarCraft was very clearly balanced for human players. We already saw some indication that when played with super-human micro, mass blink stalkers is a stronger strategy than when humans are in control. Without the active intervention of game balancing, RTS metas tend to devolve into "mass one or two units", which is what happened to every Command & Conquer game (and why SC is a respected eSport while C&C is not).

I suspect this will happen when you have agents playing with parameters that don't match what the game was balanced for. The strategic landscape will shrivel up and the game will cease to captivate us.


APM is one thing. I am curious what would happen if it could only see a limited view (as in the last game with MaNa, which it lost to him) and had physical click dynamics (i.e. clicking + gaussian noise as an action, instead of giving direct commands). That way there would be misclicks, preventing this super-efficient Stalker micro.
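
A minimal sketch of what "clicking + gaussian noise" could look like, assuming clicks are 2D pixel coordinates and the jitter is independent on each axis. The function names, the circular hit-box and the sigma values are all made up for illustration; this is not how AlphaStar or the game API models input:

```python
import random


def noisy_click(target_x, target_y, sigma_px, rng):
    """Return where the click lands: the intended pixel plus Gaussian jitter."""
    return (target_x + rng.gauss(0.0, sigma_px),
            target_y + rng.gauss(0.0, sigma_px))


def misclick_rate(unit_radius_px, sigma_px, trials=10_000, seed=0):
    """Estimate how often a click aimed at a unit's centre misses its hit-box."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        x, y = noisy_click(0.0, 0.0, sigma_px, rng)
        if x * x + y * y > unit_radius_px ** 2:
            misses += 1
    return misses / trials


# A steadier hand (smaller sigma) should miss far less often than a shaky one.
print(misclick_rate(unit_radius_px=10, sigma_px=5))
print(misclick_rate(unit_radius_px=10, sigma_px=20))
```

One could also make sigma grow with the current action rate, so that faster clicking means sloppier clicking, but that coupling is another assumption on top of this one.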


Also, these wins are not using the same inputs that humans receive (i.e. the on-screen image) or the outputs that humans are allowed. They instead use the PySC2 APIs, which have much more flexibility, perfect information and no constraints of limited screen real estate and pixels. There is a claim in that article that they have another version being trained that uses on-screen information only, but I still don’t know if the AI is allowed to bypass the physical constraints of a controller. So if the AI has access to a superhuman controller, you will see it performing superhuman actions like many commentators have described here.


Perfect information is a bit of a stretch. There was still fog of war. The AI just played as if the portion of the map visible and actionable at any point in time was the whole map. They retrained with a restriction to a given locus of attention that can change, akin to a screen the player is looking at and acting on.


The final game in the video has this limitation. It does affect the performance of the agent.


This is exactly what I think. I'd like to see how AlphaStar reacts to a "cannon rush" or other weird build orders where you need to be "smart" to counter them and can't just rely on insane / non-human micro.


This is how it responds to cannon rush. : ) https://www.youtube.com/watch?v=vYdWQjTWTFM


Isn't the point of a cannon rush to build the first cannons where they can't be seen?


Surprisingly not. The trick is usually to build pylons (or other cannons) such that they protect the cannons from being attacked by probes. Building them out of sight is usually too slow as a rush.

Still, he didn't do that either.


In the beginning of SCII I only saw people trying to hide it. But I guess the strategy evolved, interesting.


Sometimes you may see a photon cannon used to deny an enemy's natural expansion to try to gain an economic advantage. Depending on the map and matchup, it may also complicate the enemy's early attempts at scouting and aggression.

Typically, you don't see more than 1-2 photon cannons, because you don't usually want to "over-invest" and lose what advantage you gain.


This is not how a player would do a cannon rush; it needs to be hidden / at the edge of the opponent's view.


That's inaccurate. The best cannon rushers generally build them visibly, but not just anywhere. If you look at someone like QuasarPrintf as an example, a player that keeps a fairly high rank on an account that literally only cannon rushes (there is no anonymity, no pretense about what's going to happen), he wins despite people knowing what's going to happen and putting the cannons mostly well in view of opponents on a lot of maps.

Printf is part of a fairly small group of cannon rushers that don't simply see it as just another cheese, because what generally defines a cheese strat is that it can be easily countered if you know it's coming; not so with their cannon rushes.

Now, with that said, Printf (or any other "I always cannon rush" player) isn't winning tournaments, but that's partly because not many players decide that they want to stake their development on any one strat like that, and if they do, it'll likely be one that's deemed more legitimate by the community.


AlphaStar makes up for its slightly subpar macro with REALLY good micro. Thus, more micro-heavy counters like cheeses are unlikely to beat it.


The strategies might be subpar, but the economy sure isn’t. It consistently had better economy.


It was very good at microing its macro.


I am really impressed it learned when to pull probes in that game against Mana where the AI was pressured into his natural.

It was also extremely active with the stalkers, deciding to split them in three and not let Mana cross the map with his immortals.


> For context, I've been ranked in the top 0.1% of players and beaten pros in Starcraft 2, and also work as a machine learning engineer.

What's that hireability like?


What was your SC2 alias? I played at a similar level as you.


Mana tried to outblink an AI?

Damn I really need to watch these games :)


Totally. What would be interesting to see is a low APM bot that still beats human players. A lot of that macro was unbeatable.


And also, latency is lower


In a nutshell, AI micro was flawless, makes up for suboptimal macro?


The macro seemed fine -- AlphaStar usually had more workers than the human opponent, in every game, and was producing more army. The suboptimality seemed to be in army composition (blink stalkers) and strategic decision making (pulling all of a superior army back home to defend a single warp prism drop).


> While they have similar APM to SC2 pros

Wasn't the APM closer to half that of the pros?

https://storage.googleapis.com/deepmind-live-cms/images/SCII...


This is super deceiving and I'm kind of upset they posted this image, knowing it would mislead people not familiar with the game. The AI sits around during lulls at <30 APM - meanwhile MaNa and TLO were literally spamming keys to keep their fingers warm, not actually doing anything.

During the fights, the critical moments when MaNa would top out at ~600 humanly inaccurate APM (this is 10 inputs per second), the AI would jump up to over 1000 - we don't know exactly what it was doing, but it was presumably pixel-precise. Meanwhile the physical inertia of the mouse is a challenge for humans at that speed - imagine trying to click five totally different places with perfect precision in a single second.
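
To put those APM figures in per-second terms, here is a quick conversion sketch. The function name is my own; the only inputs are the ~600 and 1000 APM numbers from the fights:

```python
def apm_to_timing(apm):
    """Convert actions per minute into (actions per second, milliseconds per action)."""
    per_second = apm / 60
    ms_per_action = 60_000 / apm
    return per_second, ms_per_action


for apm in (600, 1000):
    per_second, ms = apm_to_timing(apm)
    print(f"{apm} APM = {per_second:.1f} actions/s, one action every {ms:.0f} ms")
```

So a sustained 1000 APM leaves about 60 ms per action, and that is before asking for any precision in where each click lands.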


Do you know why TLO's APM is sometimes so large? Did he actually peak at 2000, or is he using a repeater or something like that?


APM gets inflated when what is effectively a single action is counted as many separate actions. For example, a Zerg player may want to turn larvae into 30 Zerglings; they do this by pressing one button and holding it down as the UI repeats a separate action for each larva transformed.

By comparison selecting a single stalker, and having it jump to a new location is much more effort, but counts as fewer actions.


A thread from a few years back about TLO’s APM: https://www.reddit.com/r/starcraft/comments/4pnbv8/tlo_somet...


A huge part of a human's APM is meaningless spam, for example right-clicking the same unit multiple times to attack it, or setting the same waypoint thousands of times in the early game when there's nothing to do. The computer might be at double the human's effective APM, if only we had a credible way to measure that.
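
One crude way to approximate it, purely as an illustration: treat an action as spam if it exactly repeats the previous one. The (command, target) tuples are a hypothetical log format, not something the game or its replay files expose in this shape, and this heuristic would still miss plenty of spam:

```python
def effective_actions(actions):
    """Drop every action that exactly repeats the one immediately before it."""
    kept = []
    for action in actions:
        if not kept or action != kept[-1]:
            kept.append(action)
    return kept


def effective_apm(actions, minutes):
    """Actions per minute after consecutive duplicates are removed."""
    return len(effective_actions(actions)) / minutes


# Hypothetical log: 50 identical rally clicks, 3 identical attack clicks, 1 move.
log = [("rally", (10, 10))] * 50 + [("attack", 7)] * 3 + [("move", (3, 4))]
print(len(log), len(effective_actions(log)))
```

Over one minute that log would count as 54 raw APM but only 3 under this filter.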


There's a tail which shows that a small number of AlphaStar minutes had > 1k actions.


AlphaStar interacted with the StarCraft game engine directly via its raw interface, meaning that it could observe the attributes of its own and its opponent’s visible units on the map directly, without having to move the camera - effectively playing with a zoomed out view of the game

Additionally, and subsequent to the matches, we developed a second version of AlphaStar. Like human players, this version of AlphaStar chooses when and where to move the camera, its perception is restricted to on-screen information, and action locations are restricted to its viewable region.

I was really curious whether they would attempt moving the camera like a human. Sounds like it's still a work in progress, but very exciting! Even this isn't enough to make it fully like a human player, as I believe it is still getting numerical values for unit properties rather than having to infer them from the pixels on the screen. But it seems possible to fix that, likely at the cost of drastically increasing the training time.

The benefit of using pixels, of course, would be that the agent would become fully general. It would probably immediately work on Command & Conquer, for instance, while the current version would require deep integration with the game engine first. But I think the training time would be impractically long.


The live game that was just played was against this version of AlphaStar. Mana did win, but it was by exploiting some poor defense against drops and hard countering the stalkers he knew AlphaStar favours. The AI still looked very good and the developers claimed that this version of AlphaStar wasn't significantly weaker than the versions which didn't have to use the camera.


You aren't kidding about the stalkers. Check out the bar chart at the bottom of the page:

https://deepmind.com/blog/alphastar-mastering-real-time-stra...

I guess it makes sense that the AI would favor such a micro-heavy unit. I imagine it would be a nightmare to deal with perfect blinking.


Dealing with perfect blinking is basically impossible, since you can blink back your units right before they die. Stalkers are balanced around the fact that HUMANS have limits to how well they can micro.


While the "skill cap" on blink stalkers is extremely high, there are many hard counters that can stop even perfect blink micro. MaNa won because he went for one of these. Immortals are the perfect hard counter to stalkers because:

- cost-for-cost, they are more efficient in a faceoff (resources)

- immortals are space-efficient dps (damage per second) in a battle. In a given battle, an army of 4 immortals is far more likely to all be in range of an enemy and doing damage than an army of 8 stalkers bumping against each other trying to get to the priority target

- immortal shots do not have projectiles, but are instant. No matter how perfect your stalker control, once an immortal targets a stalker, it is guaranteed to take 30+% of its hitpoints in damage.

The last point is very important. Once MaNa had 3+ immortals, even with perfect blink micro, a little bit of target fire and timing micro on MaNa's part allowed him to slaughter the stalker army one stalker per volley, while it takes them longer to clean up the immortals (especially with shield battery support).

Another thing glossed over in this discussion -- AlphaStar did more than classic blink micro. It did a very technical maneuver (the casters briefly allude to it) of triggering the barrier on one immortal with a single laser, then focusing all fire on an immortal whose barrier was already down from a previous iteration of this tactic, and then walking away until the barrier has worn off (while blink-microing weakened stalkers). Repeat. This is a detail of increasing the efficiency of trading stalkers with immortals that humans don't often even think about, let alone execute (because good blink control is often more impactful). That AlphaStar came up with this shows that it's not just about perfect execution of micro, but also perfect understanding of micro.


I'm also excited to see the future of this bot when they demonstrate a terran AI with near-perfect marine/stim/medivac micro.


Perfect micro bots don't excite me much, because they've existed all along, and it's not an AI task.


There was a "perfect zergling micro vs siege tanks" bot some time ago that would micro lings away from the one that was being fired at by the tanks, thereby negating all the splash damage. The effect was insanely powerful.

But as you say, showing that a bot can have perfect micro is not very interesting. Of course a computer can have better control of well defined tasks like moving a unit away just before it dies, especially doing so for many different units concurrently. What is interesting is the wider strategy and how the computer deals with imperfect information.


Here’s that perfect zergling video: https://youtu.be/IKVFZ28ybQs


The interesting part to me is that, as far as I understand, the AI figured out this strategy by itself, basically deciding that it would be a good way for it to win games, rather than being specifically programmed to do it. That's actually pretty cool!

Other than that, I agree, and am also much more interested in what happens when you have a more level playing field (using camera movement rather than API, limiting reaction times and CPM, etc). I look forward to future matches where this happens.


I think there is some debate about what the neural net did and what was hardcoded. So far all starcraft AIs consist of hardcoded intelligent micro ruled by a neural net that picks one out of less than 100 possible hardcoded choices. And things like "expand", "scout", "group units", "micro" are hardcoded outside of the neural net, part of the API in fact. When the researchers said they only used 15 TPUs for 14 days on LSTM, this makes me think they really narrowed down the search space of the neural net and hardcoded a lot of the micro or at least trained separate micro nets.


Not really. The version which learned from scratch was scrapped as it didn’t work at all. This version learned by observing pros. So it didn’t learn by itself, it imitated and perfected pro players.


It was not programmed to do the thing, but all these tactics were in the seed replays from which the agent started its learning. So it didn't actually figure out the move _by itself_; it only found it useful.


I'm scared. Medivacs only healing the front line and perfect stimming only the backline will be SOOO broken.


I'm curious, would the AI be able to see cloaked units? In SC1 you could see them (I think SC2 is the same), but it was very difficult. How does the 'raw' interface expose that subtlety?


This is actually a great question. Like what does it mean for a unit to be cloaked?

If humans can, under ideal circumstances, see cloaked units... Maybe the only mechanic that shows up (like for bots or an API) is the inability to be targeted using an attack command (i.e. you can still be hit with splash damage from ground targeting)


My understanding is that the AI sees things via an API the game exposes, so presumably cloaked units are completely invisible to it until they're revealed.


Not sure but I think in the video they say the AI does not see cloaked units.


yeah I was disappointed to discover it worked this way.

don't get me wrong, it's a major accomplishment in AI regardless, but it's a significant advantage and it would be easier for me to appreciate the AI's skill if I didn't have to keep reminding myself that it can see the whole map at once. it's such an information advantage.


Actually, I would say this might be the strength of AI from another perspective: the ability to observe and monitor global information without losing attention. Or in other words, attend to the whole picture from get go without being overwhelmed.

While it is an unfair advantage in competitive gaming, in more realistic settings there is no requirement that an AI have only 2 eyes. It can have as many as it can handle, while humans can't scale the same way.


While that would be amazing if true, I'm pretty sure if you take away the stalker blink micro AlphaStar loses hands down to humans. This isn't taking away from Deepmind's victory at all, but I think micro was what made the AI come out ahead in this one. In many of the games, Mana had much better macro only to lose to blink stalkers.


You play the game as it's written. Come back with another version of StarCraft that isn't so micro-intensive and we can see how the AI does on that.

Chess and Go don't have any form of micro and AIs are nevertheless dominant there.

I'd say, give AI development another year and I wouldn't expect there to be any kind of game, in any genre, that humans can beat AIs at. Whether it's Chess, Go, other classical board games, Civilization, MOBAs, RTSes, FPSs, etc.


> Chess and Go don't have any form of micro and AIs are nevertheless dominant there.

Yes, but chess and go have a tiny problem space compared to something like Starcraft. People want to see an AI win because it’s smart, not because it’s a computer capable of things impossible for humans. If the goal was perfect micro they could write computer programs to do that 10 years ago.


Then maybe we need a better game than StarCraft to test this on? Some kind of RTS that's less micro-heavy, perhaps? Maybe even an RTS where you can't give orders to individual units at all, like the Total War series? You can't fault the AI for winning at the game because of the way the game itself works.

Even if you limit the AI to max human APM, it's still going to dominate in these micro-heavy battles because it's going to make every one of its actions count.


> Even if you limit the AI to max human APM, it's still going to dominate in these micro-heavy battles because it's going to make every one of its actions count.

right, and we saw that with the incredible precision with stalker blink micro. There are many ways you could make it more comparable to humans. They have already tried that by even giving it an APM.

> You can't fault the AI for winning at the game because of the way the game itself works.

But it does make the victory feel hollow when it wins using a "skill" that is unrelated to AI (having crazy high APM with perfect precision because it's a computer). Micro-bots have been around for decades, and they are really good. The whole point of this exercise is to build better AI, not prove that computers are faster than humans.

It would be like if they wanted robots to try and beat humans at soccer, and the robots won because they shoot the ball out of a cannon at 1000 KPH. They win, but not really by having the skills that we are trying to develop.


I just can't help but feel that nothing AI does will ever be good enough according to this mindset, i.e. true "intelligence" is by definition things that computers cannot do.

Beating the world champion in Chess was, at one point, considered an impossible achievement for computers. Now it's considered so routine it doesn't even count as AI according to many. And in a few months when AlphaStar is beating top human players without having to use APM or viewport advantages, what will the next goalposts be?


The point is, it's like being impressed by a calculator because it can multiply two massive numbers faster than we can... no shit, that's the whole reason we use computers, because they calculate faster than we can...

There's nothing impressive in coding something that can execute something far faster than a human, or be so accurate and beat a human. There were Quake 3 bots that could wreck any human alive 10 years ago because they react in milliseconds and shoot you in the head perfectly. So what? It's obvious a computer can do that. It's like being surprised that a bullet beats a human in a fight, that's by design.

I would be impressed if a computer learned from scratch without knowing anything about the game beforehand, about the controls, or anything else, with ordinary human limitations. Using vision processors to look at a screen to see the inputs and controlling a physical mouse and keyboard. That would be impressive. But watching a computer do perfect blink micro at 1500apm is just underwhelming, since that isn't new tech, you could hand code that without deep nets.


> The point is, it's like being impressed by a calculator because it can multiply two massive numbers faster than we can

Yeah, exactly. And when calculators first came out, people were very impressed by them. They upended entire industries and made new things possible that had simply never been possible before with manual calculation. When you're pooh-poohing the entire computational revolution you might want to take a step back and reconsider your viewpoint. It only seems not impressive now because we were born in a world where electronic calculation is commonplace and thus taken for granted.

If you don't find this achievement impressive, then go look at some turn-based game where reaction time is eliminated entirely that computers still dominate at, like Chess or Go. The AIs are coming. Or give it a few months and they'll come back with a version hard-limited to half the APM of the human players and it'll still dominate. It's clear which way the winds are blowing on this. People who bet against the continued progress of game-playing AIs invariably lose.

Go read the comments here for this exact same discussion: https://news.ycombinator.com/item?id=10981679


> Or give it a few months and they'll come back with a version hard-limited to half the APM of the human players and it'll still dominate.

And this is exactly what is being argued here. Let's see that in particular, not a demonstration that computers are faster than humans. Of course they are. Whoever argued that, ever? This has been known and envisioned even before calculators were invented.

What people here are arguing with you for is that we want human-level limitations of the controls for the AI so it can clearly win by better strategy.

Isn't that the goal here?


> I just can't help but feel that nothing AI does will ever be good enough

It can be good enough in a certain problem space, such as chess. But unlike chess or go, which are purely mental games, Starcraft has a large physical component (vision, APM, reaction time). That can make it hard to determine when it has “mastered” this RTS. Like you said, it may be a few more months (years?) before AlphaStar can master Starcraft on a “mental” level. The physical component is trivial for a computer, so mastering that is not much of a milestone.


Depending on how you define Chess, seeing the pieces and physically moving them is part of it as well. Chess-playing AIs haven't been required to have robot components because that's not the interesting part of the challenge of Chess. I'd argue the same is true of StarCraft, even more so, given that it's an innately computer-based game in a way that Chess is not. It seems arbitrary to require the presence of an electronic-to-physical bridge in the form of a robot only to then operate physical-to-electronic bridges in the form of a keyboard and mouse. Just let it run via the input devices directly. Give it some years and humans will be able to do this too.

In other words, this isn't an interesting handicap to apply.


> It seems arbitrary to require the presence of an electronic-to-physical bridge in the form of a robot only to then operate physical-to-electronic bridges in the form of a keyboard and mouse.

It's not at all arbitrary. An SC2 match is won by a combination of strategy, reflexes, and the physical quickness with which the actions are executed.

The whole point is to even the playing field in the area of the physical limitations so that only the strategy part is the difference. You know, the "Artificial INTELLIGENCE" part?


As I said before, you could just integrate the intelligent micro part of the AI into the game for humans to control.

For game design the problem is that the border to macro is not a straight line but fuzzy, so how far does it go?

For SC2 and this specific bot, the problem isn't there, if the AI merely controls a strategy over hard coded tactics.


"Yes but X has a tiny problem space compared to something like Y. People want to see an AI win because it's smart, not because it crunches numbers."

1980: X = Tic-tac-toe, Y = Chequers

1990: X = Chequers, Y = Chess

2000: X = Chess, Y = Go

2019: X = Go, Y = StarCraft

2030: X = Any video game, Y = ???


Is an AI that wins at Starcraft only because it has crazy high APM really going to help get to the next X? We could have built that 10 years ago. All it proves is that computers have faster reflexes than humans. That won’t help them become problem solvers for the future.


You seem to forget the way it learned to play every part of the game (not just micro fights). That is, not by having any developer code any rules, but simply by "looking" and "playing".

That's the great accomplishment and nothing like that could have been done 10 years ago.


What makes this interesting is if they can make a computer program better at Starcraft strategy than a human. How they did that is irrelevant. If having developers code rules makes a better AI than deep learning, then the former is the most impressive solution. What they did is a great accomplishment and the AI they created was amazing, but I feel like the faster-than-humanly-possible micro makes any accomplishment hollow, because that is really nothing new.


> How they did that is irrelevant.

Emphatically not.

If they beat human performance in this (non-AI-building) field by humans painstakingly coding rules for specific situations, then that's cool I guess but not groundbreaking, because the solution doesn't generalise.

If they beat human performance in a field heretofore intractable by software by throwing the basic rules and a ton of compute at an algorithm and then waiting for six weeks while the algorithm figures the rest out by itself, then that absolutely is qualitatively different.

The reason being, of course, that if they can find an algorithm that works like this across a wide enough problem space then eventually they'll find an algorithm which will work on the question of "build a better algorithm." After which, as we know, all bets are off.


If you think the how is irrelevant you are completely missing the point of this exercise. Maybe to you only the result matters, but for every other task and for humanity the how matters. Simply imagine taking on a different game next, like one version of the Anno series. If developers did it by hand, you would need 50 devs sitting there for probably a couple of months, figuring out the best rules and their sequence and putting them in. That is about $20 million just to get a similar AI for the next game. Compare that to downloading all available replays, requiring maybe 2-3 data scientists to get the data into shape, renting some compute in the Google cloud, and getting the same or a better result for probably half a million dollars.

Watch and learn from data alone is why modern machine learning is considered a revolution and novelty. Buying compute time in the cloud is in comparison (to devs and hand coding) dirt cheap and the results are often better.

Deepmind is not working on this problem for the benefit of gamers or the Starcraft community. Making the perfect bot is not the aim. Tackling the next hurdle, the next hardest problem in machine learning, is. It is a step on the way to becoming better at generalizing the learning algorithms.


Speed of play is a fundamentally important gameplay mechanic of any real-time game. One of the main reasons the pros are better than amateurs at these types of game is because they play and react faster.

And yes, of course computers are much better at doing things more quickly than humans. It's not even remotely close for us. The AIs are clearly better. It's not cheating either; they are legitimately better at it than us.

It sounds like you're simply objecting to pitting people up against computers in real-time games entirely.


So all they really proved is computers are faster than humans. I knew that before this started.

The Deepmind team knows the challenge isn’t to beat humans at Starcraft. That is trivially easy with the advantages you mentioned. The challenge is to be better at strategy than a human. That is why they tried to add artificial rules to make the AI have similar physical limitations to a human (emulated mouse, rate limited actions, emulated screen and visibility). There have been micro AI bots for years that could outperform any human. They knew they weren’t just trying to build another micro bot, because if they were it wouldn’t be much of an accomplishment.


> The Deepmind team knows the challenge isn’t to beat humans at Starcraft. That is trivially easy with the advantages you mentioned.

It's not trivially easy at all. No one had come close before. It took an entire team of ML experts at Google to pull it off. These hard-coded micro bots you're referring to didn't holistically play the entire game and win at it. They're more akin to an aimbot in FPSes, not a self-learning general game-playing AI.

This is yet another in a long string of impressive AI achievements being minimized through moving the goalposts. It's facile and it's boring.


>It's not cheating either; they are legitimately better at it than us.

This is not 100% true: the AI still skips the mechanical part (it doesn't have a mouse, keyboard and hands) in this particular case. That alone can introduce insane amounts of additional complexity, and would make the AI less than pixel-precise.


The APM of AlphaStar was about half of the professional player in this match.

Check out: https://youtu.be/cUTMhmVh1qs?t=3189


But when it counts, such as during micro-heavy battles, it's much faster and more precise than a human.


yup. you could have 200 apm, but as long as your clicks and button presses are perfect, you are going to win against someone who has 800 but is super imprecise.

blink stalkers are basically perfect for an AI because of the precision with which it can blink them around.


I sure hope so---then I could get a 4X AI that was worth a damn.


except Scrabble


I assume you’re joking, but just in case you aren’t, Scrabble bots have outperformed top humans for 20 years with little more than a basic Monte Carlo tree search.


they haven't; I'm a tournament Scrabble player and the best program beats the best players at most 50% of the time.


In the TLO matchup, the AI wins with an army of disruptors and unupgraded stalkers; ofc, TLO wasn't playing his best (in terms of micro or race), but it was still doing well with a micro-lacking unit (outside of blowing up its own army repeatedly)


Agreed. The micro was just too perfect to match. Can you imagine it with something like ravagers or reaper grenades?


You'll likely be happy to hear that this has been (is being) addressed.

I watched the live broadcast of this announcement where they did a recap of all 10 previous matches (against TLO and Mana) and they talked about this concern. During today's announcement they presented a new model that could not see the whole map and had to use camera movement to focus properly. The DeepMind team said it took somewhat longer to train but they were able to achieve the same levels of performance according to their metrics and play-testing against the previous version.

However...

They did a live match vs LiquidMana (6th match against Mana) against the latest version (with camera movement) and LiquidMana won! LiquidMana was able to repeatedly do hit-and-run immortal drop harassment in AlphaStar's base, forcing it to bring troops back to defend its base, causing it to fall behind in production and supply over time and ultimately lose a major battle.


It sounds to me like, although it could see the whole map at once, the fog of war was still applied. So the bot really just got as much information as the minimap would normally give a human player.

> it could observe the attributes of its own and its opponent’s _visible units_ on the map directly


No, not true. Just had an extended argument with a friend over this. Here are some of my arguments against what you're saying:

1. While it's true that a human player could see everything the AI is seeing, the human player has to spend time and clicks to go see those things, whereas the AI sees it all simultaneously without having to use any actions or spend any time to take it in.

2. Emphasis on the computer seeing it all simultaneously. The computer can see the state of two health bars on opposite sides of the map at the same time, or 100 healthbars in a hundred places at a time. A human cannot do that, and even trying to move the view around fast enough to do so would render it impossible to actually do anything else.

3. If it's true that seeing more at once is not advantageous, then it must also be true that seeing less at once is not disadvantageous. So by that reasoning a player playing on a 1 inch x 1 inch screen would not have any disadvantage, since after all they're getting just the same amount of information as long as they move the screen around enough! Reductio ad absurdum, a player with a 1 pixel x 1 pixel screen has no disadvantage either, because they have access to the same information as long as they move around quickly enough. It quickly becomes evident that smaller screens inhibit your knowledge of the game state, and therefore larger screens benefit your knowledge of the game state.
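
To put rough numbers on point 3, here is a minimal sketch that counts how many separate camera positions it takes to look at every part of the map once. The 200 x 200 map and the viewport sizes are made-up values for illustration, not real StarCraft II dimensions, and `views_needed` is just a name I picked.

```python
import math

def views_needed(map_w, map_h, view_w, view_h):
    """Count the non-overlapping viewport positions needed to cover the map once."""
    return math.ceil(map_w / view_w) * math.ceil(map_h / view_h)

# Hypothetical map size in grid cells (an assumption, not a real SC2 map).
MAP_W, MAP_H = 200, 200

# From "sees everything at once" down to the 1 x 1 screen of the reductio.
for view_w, view_h in [(200, 200), (40, 25), (10, 10), (1, 1)]:
    count = views_needed(MAP_W, MAP_H, view_w, view_h)
    print(f"viewport {view_w}x{view_h}: {count} camera positions")
```

Every camera position costs an action and some time, so the smaller the window, the staler the picture of the rest of the map gets.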


One thing they said early on in the ~2 hour video I was watching was that, while AlphaStar had access to the full data of everything within its fog of war, it seemed to need to partition its access to it, in a way that was similar to a human checking different screens, and did so about 30 (or was it 37?) times per minute.

This might be why changing to having to observe only one screenful at a time (rather than the zoomed out view) didn't seem to have as large an effect.


This is why a lot of competitive games have rightly decided not to support ultrawide monitors. Being able to observe more of the game map simultaneously is a huge advantage. The only fair way to support them would be to cripple the player, by cutting off the top and bottom of the viewable range, not by extending the left and right range.


> whereas the AI sees it all simultaneously without having to use any actions or spend any time to take it in.

Starcraft is a single-threaded game, so I would think that the AI ultimately still has to enumerate through each visible unit one-by-one to collect their information. Why is that so much different than enumerating through each visible screen and then enumerating through each unit on that screen? Either way, the AI could do it much faster than a human, whether it had to click through the screens manually or not. How would it be possible to eliminate this advantage? It seems to me that it's just part of the nature of AI.


You can eliminate that advantage by letting the AI only see the unit information for things on screen, like they did in the last game.


No, that doesn't eliminate the advantage -- that's what I'm trying to say. Even if you make the AI move the screen around manually and only let it enumerate units that are on-screen, that's still going to take roughly as long as just enumerating through all the units on the map in one go. It's just a matter of executing "foreach all_units" versus "foreach screens { foreach units_on_screen }". In either case a computer could do that much faster than a human.
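
A minimal sketch of the two loops I mean, assuming a toy map where each unit is just a name and a grid position. `Unit`, `enumerate_all` and `enumerate_by_screen` are hypothetical names for this illustration and have nothing to do with the real StarCraft II API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Unit:
    name: str
    x: int
    y: int

def enumerate_all(units):
    """'foreach all_units': visit every visible unit in one pass."""
    return [u.name for u in units]

def enumerate_by_screen(units, map_w, map_h, view_w, view_h):
    """'foreach screens { foreach units_on_screen }': tile the map into
    screens and visit the units inside each one.

    Returns the unit names seen and the number of screens visited.
    A real engine would index units by position; the linear filter here
    only keeps the sketch short.
    """
    seen = []
    screens = 0
    for top in range(0, map_h, view_h):
        for left in range(0, map_w, view_w):
            screens += 1
            for u in units:
                if left <= u.x < left + view_w and top <= u.y < top + view_h:
                    seen.append(u.name)
    return seen, screens

units = [Unit("stalker", 5, 5), Unit("immortal", 70, 20), Unit("probe", 60, 90)]
flat = enumerate_all(units)
tiled, screens = enumerate_by_screen(units, 100, 100, 50, 50)
print(sorted(flat) == sorted(tiled), screens)
```

Both versions end up touching the same units; the only thing the second one adds is the number of screens it has to step through.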

Let me put it the opposite way: If you gave the human player a real time list of every visible unit on the map and all of their information, such that they didn't have to move the screen around manually and could see everything at a glance just like AlphaStar can, would that take the advantage away from AlphaStar? No, it wouldn't because AlphaStar could still go through all that data much faster than any human ever could -- no matter how it's formatted or what you have to do to access it. To AlphaStar, checking all the visible screens is just as much work as scrolling through a list of units.


I get what you're saying. But screen movement is rate limited (meaning you can't loop through all possible screen positions in 1ms) so you have to actively choose where you want to focus, just like a human player. Think of it more like calls to a web server than "foreach screens".
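
Here is a rough sketch of what I mean by rate limited, assuming a camera that refuses to move again until a minimum interval has passed. The 2000 ms interval (30 moves per minute) is a number I picked for illustration, not AlphaStar's actual limit, and `RateLimitedCamera` is a made-up class.

```python
class RateLimitedCamera:
    """Toy camera that accepts at most one move per min_interval_ms."""

    def __init__(self, min_interval_ms):
        self.min_interval_ms = min_interval_ms
        self.last_move_ms = None
        self.position = (0, 0)

    def move(self, position, now_ms):
        """Move if the interval has elapsed; return True when the move is granted."""
        if self.last_move_ms is not None and now_ms - self.last_move_ms < self.min_interval_ms:
            return False
        self.position = position
        self.last_move_ms = now_ms
        return True

def simulate(min_interval_ms, duration_ms, tick_ms):
    """Request a camera move on every tick and count how many are granted."""
    camera = RateLimitedCamera(min_interval_ms)
    attempts = granted = 0
    for now_ms in range(0, duration_ms, tick_ms):
        attempts += 1
        if camera.move((now_ms, 0), now_ms):
            granted += 1
    return attempts, granted

# One simulated minute, asking for a new screen every 100 ms.
attempts, granted = simulate(min_interval_ms=2000, duration_ms=60_000, tick_ms=100)
print(attempts, granted)  # 600 attempts, 30 granted
```

However fast the loop asks, only 30 screens get looked at in that minute, so the agent has to pick which 30.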


Can't you click on the minimap to move the camera instantly anywhere on the map?

EDIT: I guess you would still have to wait for the next frame to get rendered, which could add up. True, that does change things a bit, but of course a computer could still do that way faster than a human.


They noted that the agent used around 30 viewport changes per minute, about the same as human players.


This sounds like a real advantage in the AI's favor though: It can focus its attention on a lot more things simultaneously. It's not just a UI difference; the AI is actually better at this, like how a pocket calculator is actually better at division than people. This latter bit we just accept; we don't defend humans by saying the calculator is cheating because it isn't writing out the calculation by hand.

Similarly, robots are physically stronger than people at any given task you can think of. That's a real advantage of them.


It is certainly a real advantage, but I think the argument is that it's not as interesting as an AI that could win on the strength of better decision-making, or the innovation of novel strategies, etc.


AI wins on the strength of better decision-making and novel strategies in Chess and Go, though. I have no doubt we'll see this in RTSes in the near future as well. For now we may not be quite there yet, as this is simply the first time it's beaten a pro player in any way. Compare with the AlphaGo match vs Fan Hui. A year later and it was dominant over all pro players.


> AI wins on the strength of better decision-making and novel strategies in Chess and Go, though. I have no doubt we'll see this in RTSes in the near future as well.

Yes, likely! I wasn't doubting it's possible or even likely. Only that seeing an AI do flawless 1000 APM stalker micro and macroing perfectly, while pretty cool, is not as exciting as seeing an AI use a novel strategy (edit: especially one that a human could theoretically execute)


I'm guessing that while there's a delay for decisionmaking, there's no delay between when it decides to move a camera somewhere else and when it does move the camera (direct API access), whereas humans need to move the mouse or hit a key, which is gonna take at least like 50-100ms where they're not doing anything else.

When they were talking about delay they were talking about delay between new information -> deciding/acting, which I think obscures the fact that humans have to do new information -> deciding -> acting, where acting takes non-zero time.


{{Delivered in the voice of the female British lady who would narrate Beyond 2000 series of shows - or the Modern Marvels narrator, to your mental predilection}}

After just decades in development, it is clear that the endeavors of those research scientists have finally borne fruit. And today it's in the form of:

Intent based modeling, augmented with AI, which provides the reality we see today in both gaming and weapons systems.

The user, who must be human, is provided a range of inputs based on the desired outcome of the interaction with the systems and the real world.

What results, is truly remarkable.

A human is capable of multi-dimensional abstract thought, in a way that a computer cannot. As such - their intent is wired over to a swarm of objects with the most amazing capabilities.

A user can direct a swarm of either virtual bots or physical drones to accomplish a task. She can also combine the efforts of both physical and virtual for even greater effect.

Here we see a swarm of bots who are thwarted by a physical barrier.

The human driver can then instruct his virtual AI bots to attack the security system of the building to allow his drones to have passage.

But she does this through merely the intent for the portal to be open. The bots do the rest.

All the while the user is updated with real-time information on how the operation is progressing.

So, in the future, you may soon see just such technology applied to your kitchen or living room, where bots will cater to your every waking need - and sometimes your non-waking needs as well.
