I don't think that's a wrong outcome, but framing it as OpenAI beating OG at their own game, along the same lines as Go, chess, or checkers, seems misleading. I think I'm still feeling sensitive after the outcomes of systems that got onto the press pipeline, like Watson. I wish there were more celebration of how they were able to create this ensemble learning environment, so I'm hoping for a more detailed analysis, especially of how they were able to adapt (and hopefully not throw away) the learning they did after some of the really large patches.
* OG would absolutely handily win against any non-professionally-contracted dota team under the exact same challenge conditions. (i.e.: OpenAI Five's test was about as good as they could give it at this stage.)
* OG's play within the scope of the game was novel, in that they were deliberately picking strategies that OpenAI Five didn't "understand" (e.g. creep skipping), and they still lost.
* OpenAI's play was novel, even within the constraints of the pool: No human team would pick the 4-core + CM lineup that OpenAI favoured. Even if you played with that limited hero meta, 4-core would likely not end up being a deliberate strategy picked by human teams with any frequency.
* OpenAI's play in all team fights was demonstrably better, playing exactly up to the limits of the heroes they were using and no further. (It was, in fact, OpenAI's teamfighting that won them the games.)
* A number of the reductions and restrictions in hero-pool were to allow for OG to have a fighting chance: All heroes with controllable illusions or clones were banned because OpenAI had too much of an advantage in "micro".
If you sum all of those things up, I think it looks painfully obvious that OpenAI Five, if trained against the full hero pool, would completely and utterly dominate gameplay at the professional human level. It wouldn't even be close.
Because OpenAI Five is mechanically better than OG.
Flawless reactions make a big difference (see euls/hex scripters for the impact of that on a game). Throw in perfect global state and zero mistakes, and it becomes very, very hard to out-skill.
The question that people actually care about is whether the AI can be strategically better. If you watch pro Dota you'll see teams that flash briefly as super strong and then fall off as better teams study them and learn to counter their play. OG themselves have had serious problems with CIS teams that play fast (arguably making them the worst-suited team against a fast-playing AI).
The hero pool is also important, as currently it's very limited in offering counters to deathball.
> OpenAI Five, if trained against the full hero pool, would completely and utterly dominate gameplay at the professional human level. It wouldn't even be close.
Fundamentally disagree. The mechanical advantage increases but the strategic disadvantage also increases. Time will hopefully tell.
We already know the answer to that; computers dominate humans strategically. There are two types of strategic decisions - those made using maths, and those made by guessing. Computers are perfectly capable of doing both, and better at at least one. Besides, Go is far more strategic than DotA and computers are better than us at Go.
The reason we are seeing research in videogames is purely tactical. Historically computers couldn't compete tactically with humans because there were too many variables and decisions to write up an if-then loop that made sense. DotA is 95% a game of mechanical domination, 5% strategic concerns. The strategic concerns could be modelled with some pretty basic Bayesian models and probably be world class; it is item-hero combinations, a few rules about location on the map and guesswork.
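As a concrete (and deliberately crude) illustration of the "pretty basic Bayesian models" idea, here is a sketch that scores drafts by Laplace-smoothed per-hero win rates under a naive independence assumption. All hero statistics below are made up for illustration, not real data.

```python
from math import log

# Hypothetical historical stats per hero: games played and games won.
games = {"Sniper": 100, "Viper": 80, "CM": 120, "Sven": 90, "Gyro": 110}
wins  = {"Sniper": 55,  "Viper": 36, "CM": 66,  "Sven": 44, "Gyro": 60}

def draft_log_odds(draft):
    """Sum smoothed per-hero log-odds of winning, treating picks as independent."""
    total = 0.0
    for h in draft:
        p = (wins[h] + 1) / (games[h] + 2)  # Laplace-smoothed win rate
        total += log(p / (1 - p))
    return total

# Compare two (hypothetical) partial drafts by their naive score.
print(draft_log_odds(["Sniper", "CM", "Gyro"]) >
      draft_log_odds(["Viper", "Sven", "Sniper"]))
```

A real model would obviously need item-hero combinations and map context as extra features, as the comment says, but the point stands that the machinery involved is simple.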
All the challenge is in taking that strategy and implementing it mechanically - doing all the stuff OpenAI Five is demonstrating regarding jukes, tactical positioning, spell timings and judging relative strength with incomplete information.
DotA isn't as fast as it looks either, it is quite a slow game in terms of reflexes. Reflexes tip close games, but positioning tips the average game. OG was losing because computers are just better at estimating margins of safety.
This one is not so certain. One of the OG players (Notail) mentioned after the game that the AI definitely had a few weaknesses such as:
* AI didn't bother checking the trees after the splitpushing hero disappeared from the map.
* AI doesn't handle invisible units well.
* AI has poor warding.
He said he regretted not exploiting these weaknesses more. There are certain splitpush ("rat Dota") heroes that potentially could have helped OG to the detriment of the AI, e.g. Antimage, Furion, Morphling, Juggernaut, etc. Instead, what we saw was OG taking a non-traditional splitpush hero (Viper) from the limited hero pool and trying to split push with it, which failed because the hero got ganked too often (Viper does not have a natural escape mechanism, unlike many of the splitpush heroes above, and he didn't itemize for splitpush, e.g. Shadowblade as a first item) and he doesn't push very fast. With these splitpush heroes and well-placed observer wards, I think a splitpush-based lineup could possibly take the game.
OpenAI's play was very impressive. I did not see this coming after OpenAI lost to some of the weaker pro teams at last year's TI. However, I would give OG even chances at winning a longer series (Bo5 or Bo7) over 5 or 7 days, where they could learn from OpenAI's playstyle and figure out strategies to exploit it, similar to how, in longer Dota 2 tournaments, strategies that succeed on Day 1 get countered by the final. Another analogy is the 1v1 bot that OpenAI released. When it came out, even pro players would lose to it. Shortly after, however, the combined intelligence of the Dota 2 community discovered an exploit: if you lead the enemy creepwave into the jungle, the bot glitches out and the human player eventually wins.
This isn't always true: the AI did check the trees near the Radiant bot T1 in the co-op match.
Also, I am hoping OG competes against OpenAI during this weekend and streams it. It would be exciting to watch.
Putting a Riki mid in the first game and picking Slark in the second doesn't really tell me a whole lot, since Open AI's team bought dust against Riki, and the hero pool says "early push and win or lose", which generally defeats a Slark that needs items this patch (47% win rate at this point). They clearly indicated in prior posts that buying dust and dealing with invisibility are part of the learned skills, and it's really not hard to end up with a rule of: while in an attacking/aggressive state, if you see Riki, use dust if available.
Many teams use a 4-core/hard-5 strategy below the top tier, and it's reasonably effective. The 4-defend-1 / 3-core strategy generally only works because Roshan/Aegis is in the game to let the min-maxed core that's ahead take additional chances with less risk, and because there are clear heroes that are just a tier above the rest in the current patch or meta. If you take that out, or remove Aegis planning from your team, 4-core is almost trivially the way to go. Even in my casual experience this is generally how I've ended up playing, so maybe it's less surprising to me. Once again I ask: given the format, was a professional team built around a 3-core strategy/mindset really the right team to play in this mode?
The two drafts for Open AI were Sniper, Gyro, CM, DP, Sven and CM, Gyro, Sven, WD, Viper. Both of these are push-to-win-by-20-minutes drafts, and they were played as such. I did not feel like OG executed enough of the ordinary/basic strategies that would carry them to the mid-late game, which is what their drafts indicated they wanted to do (ES, WD, Viper, Riki, SF and Sniper, ES, DP, Slark, Lion). They tried some clear gimmick strategies just to see if the bots were totally broken; if they had intended to do creep skipping they would have picked Axe, who was in the pool. It's hard for me to feel they played better when their drafts never reached their natural power spike: if Open AI had been held off, or a more reasonable hero had been chosen instead of Riki/Slark along with a more coherent strategy for what was provided, it would have been a better game at the 30-40 minute mark. Instead, their experiments with gimmicks didn't really pay off, and they lost to the hard-push team without a strong hero set to push out the lanes and create an opportunity if the gimmicks failed.
Also, the Open AI drafts are ones I feel like I was constantly playing in 2012-2013ish, until a lot of new heroes were added to the pool that made them less popular and relevant, since those heroes introduced movement/displacement mechanics that make the original hero pool less practical. With those new heroes taken out, this old form got put back in.
I really doubt that illusion control/micro control would have made a difference, and if it had I would have actually preferred to see that, because that is new and novel. Dota is very different from previous RTS game AIs, where every unit, if controlled properly, was of equal strength and could be maximized. Dota illusions are generally just worse in every dimension; they have value in disjointing spells, doing rat things, providing vision, and maximizing some items/abilities with fixed attack modifiers, but they are otherwise not a primary source of damage (100 damage from a 33%-damage illusion, after 15 armor, is about 17 damage, which is not a big swing when heroes regenerate constantly).

Compare this to mutalisk micro, where a perfect division of a group's attack can mean two or three units instantly killed instead of one, which would generally be the limit of human control. Divide-and-conquer with illusions for attacking purposes is, I would say, generally a bad strategy, because killing the hero in front of you before attacking the next one is still probably the best option. I find it hard to believe that perfectly controlled illusions would affect the game in a meaningful way compared to highly skilled players, and I would have loved to see it tried. Sure, some really great Meepo players can shine, but the game inherently has solutions to that problem and most others (echo slam, burst magic damage). In fact, that's the kind of thing I enjoy most about watching AI-driven gameplay: seeing them push the limits of what humans can even accomplish, and seeing if I can learn anything from it.
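The ~17 damage figure above checks out under the classic Dota armor formula (reduction = 0.06 * armor / (1 + 0.06 * armor)); the numbers here are the comment's hypothetical ones, not real hero stats:

```python
base_damage = 100
illusion_outgoing = 0.33   # illusion deals 33% of the hero's damage
armor = 15

# Classic armor formula: ~47.4% physical damage reduction at 15 armor.
reduction = 0.06 * armor / (1 + 0.06 * armor)
effective = base_damage * illusion_outgoing * (1 - reduction)
print(round(effective, 1))  # ~17.4 damage per illusion hit
```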
There are also plenty of real-world equivalents where changing the ruleset lets other people do better: speed limiting in NASCAR at Talladega, a football team that is better in the snow, or a hockey team that plays in a rink with different dimensions. I just wish we had ended up with a more reasoned view of what happened and could celebrate their actual achievements in context, rather than saying they beat a pro team at its own game, which is kind of where the headlines have been going. This is fundamentally different from what AlphaGo or Deep Blue did, where they soundly and completely conquered on the same playing field.
The real trick seems to be just massive amounts of self-play. This model had the human equivalent of 45,000 years of experience playing Dota 2, accumulating 250 years of experience per day.
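A quick sanity check on those reported numbers: 45,000 human-years of experience at 250 years per day implies a training run of roughly half a year of wall-clock time.

```python
total_years = 45_000   # reported human-equivalent years of play
years_per_day = 250    # reported experience accumulated per wall-clock day

training_days = total_years / years_per_day
print(training_days)  # 180.0 days of training
```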
So the large section is run for each unit that is known to the agent. For heroes, each modifier / ability / item that hero possesses is processed. Note that the max-pooling is elementwise: each output dimension keeps its maximum across the modifiers / abilities / items, so the pooled summary can combine information from several of them rather than selecting a single one. Later, the known units are grouped by their relation to the agent and max-pooled again, giving the agent one fixed-size summary vector per group (e.g. one for enemy non-heroes and one for enemy heroes). Also, what activation function is used for the FC (fully-connected) layers that aren't relu?
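A minimal sketch of what elementwise max-pooling over a variable-size set of embeddings does; the per-item vectors here are made up for illustration, not the network's real feature values:

```python
import numpy as np

# Hypothetical embeddings for one hero's three items (rows), each a
# 4-dimensional vector produced by shared fully-connected layers.
item_embeddings = np.array([
    [0.9, 0.1, 0.2, 0.3],   # item A
    [0.2, 0.8, 0.1, 0.4],   # item B
    [0.1, 0.3, 0.7, 0.2],   # item C
])

# Elementwise max across the set: each output dimension takes its max
# independently, so the summary mixes information from all items rather
# than "selecting" one of them.
pooled = item_embeddings.max(axis=0)
print(pooled)  # [0.9 0.8 0.7 0.4] -- no single row equals the output
```

The same works for any number of items, which is exactly why pooling is used: the downstream layers always see a fixed-size vector.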
An average human game lasts 30 to 45 minutes. If OpenAI simulated 1,000,000 games at even 30 minutes each, that'd be over 57 years of non-stop gameplay to a human.
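Converting simulated games to human wall-clock time at the stated 30-45 minute game lengths:

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # ignoring leap years
games = 1_000_000

years_at_30 = games * 30 / MINUTES_PER_YEAR
years_at_45 = games * 45 / MINUTES_PER_YEAR
print(round(years_at_30), round(years_at_45))  # roughly 57 and 86 years
```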
Obviously, this limited rule set is not pure DotA, and so the humans were at a slight disadvantage.
The results of letting anyone play against it will be the huge test - AI often fails against "cheese" strategies, and this will be a good test of whether any exist. I'm excited to see the results.
So by limiting the number of heroes, humans are clearly at a disadvantage. At the last International, 110 heroes from the pool were picked and 98 were banned.
The micromanagement claim seems fair when it comes to illusions.
With only 25 heroes allowed (of > 100) they were unable to get >5k MMR...nowhere near pro level. What are they at with the full hero pool? 3k?
I'm not trying to cast aspersions on their AI work - it's obviously an incredibly hard problem. But instead of admitting that they failed to make a competitive team, they instead play a stripped down game that's easy for their bots' limited strategies, and then publish triumphant marketing releases claiming mission accomplished.
The industry and researchers in particular should really be more careful with their claims, and honestly also try way harder to actually completely solve the challenges in question.
> OpenAI Five is the first AI to beat the world champions in an esports game, having won two back-to-back games versus the world champion Dota 2 team, OG, at Finals this weekend.
What they've achieved is actually impressive, too, so the deception is doubly annoying.
True, any given team will tend to strongly favor a subset of heroes, but that subset by necessity must allow for a wide range of play styles and strategies.
The limited pool for OpenAI largely favors 5-man "deathball" tactics, which OpenAI plays fantastically well (an amazing achievement in its own right), but also excludes all the heroes best suited to countering that strategy. Sniper, for example, is an extremely strong hero in just the right circumstances, but is generally considered fairly weak and readily countered (in the current patch) by heroes such as Phantom Assassin or Spirit Breaker, neither of whom are present in the OpenAI pool.
Aside from anti-deathball specific counter picks, heroes that favor a wildly different approach to the overall strategy (split push/"rat" dota) such as Tinker are also excluded.
Most importantly, being able to work out what kind of strategy the enemy team is going for during the draft and adapting your own picks and bans accordingly (and luring your enemy into useless bans or counter-picks) is a huge part of the game. Once the draft is over, teams then have to adapt their desired strategy to the reality of the team composition they landed on, and the best approach may change several times over the course of the game. I really want to see an AI that can realize that something isn't working in its current deathball strategy and that it needs to switch to playing both sides of the map at once, keeping the enemy occupied in one place while taking objectives in another (or vice versa).
To reiterate, I am incredibly impressed by what OpenAI has already shown, and I look forward to seeing where they go next, but I do think there are some very interesting problems in the full game that OpenAI has yet to demonstrate a solution for. I have a similar slight disappointment with the recent AlphaStar StarCraft II project: it was a similarly amazing piece of work, but it also seemed that each agent (swapped out after each match with the human player) was only really able to execute one strategy, with no ability to recognize that something wasn't working and adapt, as seen in its final (and only) loss to the human player.
So, do I think that human pro players would defeat an OpenAI trained on the full unrestricted game in a best-of-three match? No, almost certainly not, but I do suspect that the humans would win in, say, a best-of-seven, if they had the full suite of tools available to allow them to discover and adapt to what I see as fundamental shortcomings of the AI.
In essence, all of these game-playing models are playing by "instinct" (and playing very, very well), but still lack the ability to "jump out of the loop" to critique and adjust their own behavior on the fly. This is why the AlphaStar agent was vulnerable to being manipulated by the human player's drop harassment, and I believe OpenAI will demonstrate similar weaknesses once it opens to the public this week.
Machine learning has opened up all kinds of incredible advances that we can apply to the real world (see Boston Dynamics locomotion, or OpenAI applying the approach in their Dactyl robot hand project), but machine learning alone can't take us to general AI and I suspect it will always remain susceptible to this kind of deliberate manipulation.
It would be very ineffective if this AI had to input everything through a human body somehow. Now to be fair, it would be very difficult to not win on micro. There’s no real set of constraints I can think of that would make it a fair match.
Ideally they should probably be shifting to a game where micro just isn’t as big of a differentiator. Or, consider moving to a game where a strategic player issues commands and human players execute them. Off the top of my head I can’t think of anything that really fits that build.
Alternatively, SpyParty (an incredible game) could be an excellent candidate, as it requires deception and a strategic awareness of what the other player is thinking. And, at a high level of play, micro is practically flawless.
Chess is a strategy game with no micro. So is anything turn based.
Chess has no micro, but also has no hidden state. You don’t really need any concept of what your opponent is thinking to play optimally.
It's like the joke from Futurama about why humans don't watch the Robot Blernsball League (basically future-baseball).
Bender: Now Wireless Joe Jackson - there was a blern-hitting machine.
Leela: Exactly! He was a machine designed to hit blerns. Wireless Joe Jackson was nothing but a programmable bat on wheels.
Bender: Oh, and I suppose Pitch-o-Mat 5000 was just a modified howitzer!
how about this prediction: people will be suggesting that their machine learning models are a big step towards AGI, forever.
Clearly very exciting and a remarkable achievement but does anyone else marvel all the less when such colossal resources are needed?
Obviously it’s not entirely brute force but it seems much too brute force to be considered “intelligent”.