The cool thing is that bruteforcing computational power seems to get us decently close. I'm optimistic that with renewed interest in the reinforcement learning field, breakthroughs will be made on the algorithmic side within a matter of time.
It is interesting to consider that manually scripted behavior may have been responsible for some of the stranger moments in the matchups. They may have had hardcoded place_ward() routines that resulted in all of that clumped-together warding... or maybe even a check_roshan_pit() without adequate training data to identify Roshan's respawn patterns?
Between the two games, we saw Axe played by both the human team and the AI team. When played by the humans, blink-calls were completely shut down by the AI's superhuman counter-initiations. That made enough sense. When Axe was played by the AI, though, I don't recall him ever even attempting a blink-call. I'm curious whether this might be the result of the AI overfitting to itself: at AI reaction speeds, blink-calls are not a very useful maneuver, and so the AI learns not to perform them.
Against a group of humans though, Axe's blink-call initiations are arguably the hero's biggest selling point.
We didn't get to see most of the hero pool, but I wonder how much the AI overfitting to AI playstyles will hinder the bots against humans in the future.
Of course, the bots have many other issues which loom larger atm imo, but I was interested enough in this tidbit to point it out.
Dota is balanced around human reaction speeds, so it makes sense to give the bots an equivalent reaction distribution; otherwise it gets arbitrary, since a bot trained on 200 ms reactions may behave differently from a 20 ms bot.
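As a toy sketch of that point (the distribution shape and its parameters are pure guesses on my part, not anything OpenAI has published): instead of one fixed delay, you could sample each bot's reaction time from a human-like spread.

```python
import math
import random

def sample_reaction_ms(median_ms=250.0, sigma=0.25):
    """Sample a human-like reaction delay in milliseconds from a
    log-normal distribution. The median and spread here are
    illustrative guesses, not OpenAI Five's actual setup."""
    return random.lognormvariate(math.log(median_ms), sigma)

# A bot trained against one fixed delay (say, exactly 200 ms) may learn
# different habits than one trained across a spread of delays.
delays = [sample_reaction_ms() for _ in range(10_000)]
mean_delay = sum(delays) / len(delays)
print(f"mean simulated reaction: {mean_delay:.0f} ms")
```

A policy trained this way at least can't exploit one exact timing window the way a fixed-delay bot can.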
That's the thing I tried to understand. The blink-to-call is a very specific sequence: you'd have to blink into the middle of several enemy units and then call them. What are the chances of the bot doing that randomly and therefore understanding its value?
Especially if the bot has already learned that random blinks are a waste of resources, or that jumping into the middle of the enemy team is a bad strategy, the chances of it ever trying that with Axe, and therefore learning the combo, get even lower.
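A rough back-of-the-envelope for that intuition (every probability below is invented purely for illustration): if each step of the combo has to be chosen independently by random exploration, the joint probability collapses fast, and a learned penalty on any one step shrinks it further.

```python
# Toy illustration (all probabilities invented): the chance that pure
# random exploration strings together Blink -> land among the enemy
# clump -> Berserker's Call within the short window where it matters.
p_cast_blink = 0.01       # chance the policy tries Blink at all this step
p_good_position = 0.02    # chance a random blink target lands in the clump
p_call_in_window = 0.05   # chance Call is cast in the brief follow-up window

p_combo = p_cast_blink * p_good_position * p_call_in_window
print(f"per-opportunity chance: {p_combo:.0e}")  # 1e-05

# If the policy has already learned that random blinks waste resources,
# p_cast_blink drops, and the combo becomes even rarer to stumble into.
p_combo_after_penalty = (p_cast_blink * 0.1) * p_good_position * p_call_in_window
print(f"after learned blink penalty: {p_combo_after_penalty:.0e}")  # 1e-06
```

So the combo is hard to discover by accident, and the early lessons ("blinking randomly is bad") actively suppress the exploration needed to find it.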
How much training data were these bots given, though?
Use a unique name like “Galaxy” which doesn’t represent anything remotely close to something in the game: spell names, skills, etc. There is a huge amount of stuff going on in the game, and it’s such a heavy cognitive load for an outsider who doesn’t play Dota; it was annoying to keep checking whether they meant the name of the engine or the number five in a game of five vs. five. Or Five vs 5!!!? I’m so damn confused.
Same thing here: https://openai.com/five/
The #2 bullet point says “Defeat five of the world’s top professionals
Five will attempt this live at The International in Vancouver’s Rogers Arena this week!”
It is such a poor choice.
The numbers seem pretty arbitrary to me, that's probably what this blog post is talking about when it mentions why it lost.
I do think that if they'd used some more sophisticated RL algorithms, perhaps with intrinsic curiosity or some kind of hierarchical task learning, they might have been able to reduce their training time and tune their hyperparameters a bit more.
The game vs. Pain clearly demonstrated how humans can use wards to gain an information advantage over the bots that otherwise had a great chance of winning the game.
I don't know. I assume it's similar to how AlphaGo measures its ELO ranking. But the strange thing is, this is hundreds of years of self-play, not a public pool of humans playing against each other. How does MMR in simulation translate to MMR in real matches?
Before pointing out that an Elo-style rating is possible, consider that Dota MMR is a bit different: every game you win, you get +25; every game you lose, you get -25. This changes at the very high / very low extremes, or if the matchups are wildly imbalanced, but that's the general setup. Or it was, a few years ago.
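To make the contrast concrete, here's a toy side-by-side of the flat ±25 update versus a classic Elo update (the K-factor and ratings are just example numbers, not Valve's or OpenAI's actual parameters):

```python
def dota_update(mmr, won):
    """Old-style Dota ranked update: flat +/-25 regardless of opponent."""
    return mmr + 25 if won else mmr - 25

def elo_update(rating, opponent, won, k=32):
    """Classic Elo: the update scales with how surprising the result was."""
    expected = 1 / (1 + 10 ** ((opponent - rating) / 400))
    return rating + k * ((1 if won else 0) - expected)

# A 4000-rated player winning a game:
print(dota_update(4000, True))              # 4025, no matter the opponent
print(round(elo_update(4000, 5000, True)))  # 4032: big gain for an upset
print(round(elo_update(4000, 4000, True)))  # 4016: even match, half the K
```

Which is part of why mapping self-play ratings onto ladder MMR is murky: the two systems don't even move in the same way per game.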
Does anyone have a guess?
We don't claim this is perfect, which is why we label the chart "estimated MMR".
It might be worth hiring one of the top dota pros to coach OpenAI's bots: they could point out each small mistake that the bots made which pros wouldn't make. That might make it more tractable to beat the top team next year. (The actual dota coaches might be even better for this purpose than the pros, too.)
Looking forward to the outcome! The goal of "beat the top dota team" is one of those ambitious ones that few companies take on. It's really interesting to see the incremental progress.
The solutions would probably take the form of adjusting weights, adding additional dimensions, removing or refactoring existing signals, or deciding that extra training would naturally solve the problem.
I had an MMR of around 4.4k solo/team. I think the average player had 2.25k MMR, and a solidly above-average player had around 3.5k to 4k. Anything past 4k MMR made matches exponentially more intense. It's always been this way as far as I remember, even back in the DotA 1 days.
4k MMR in DoTA skillwise is about the equivalent of Platinum in rocket league, or Diamond in Overwatch
I had thought this as well, but the more I think about it the more I'm questioning how far along the bots actually are.
They're excellent at teamfighting. No doubt about it. But they seem to be inferior to humans at nearly everything else. In basically anything that requires strategic thinking, the bots pale in comparison to even a 3-4k MMR player. I'm really starting to suspect that the bots have been successful purely due to their tactical abilities, and I see very little reassuring progress in strategic abilities, tbh. For example, the highground push with no buybacks on the AI side that ended in a 5-man buyback and teamwipe from the humans was a glaring strategic error. The warding is bad. The Rosh play is terrible. Item choices (another enormous strategic element) are still scripted. Nearly all heroes with abilities that require significant strategic thinking are left out of the AI's hero pool. Axe was pushing Radiant top without a TP while his base was getting destroyed and all of his other teammates were dead.
On one hand, it's easy to feel like OpenAI Five is making good progress because it's legitimately challenging pros. Upon deeper analysis though, the bots haven't actually demonstrated ability at more than a 2.5k-mmr level on anything other than laning, teamfighting, and perhaps defensive map movement. Given that I haven't seen much evidence that the AI is making progress on the truly strategic elements of the game, I'm not entirely convinced that what we are seeing isn't close to a plateau already.
More breakthroughs are certainly possible, but I've seen enough to make me more skeptical than I was before I'd seen OpenAI Five play with my own eyes.
I could see the Captain AI having various modules to help it decide what should be done next. For example, a drafting module, a module to predict where the enemy players currently are (some kind of threat heatmap?), what the enemy players are likely to do next (i.e., where they will farm, smoking, roshing, etc.), and what items the team needs (i.e., centralized item purchase decisions). I imagine these modules would be trained in an entirely different way from the way the AIs are trained in-game.
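A minimal sketch of what that composition might look like (the class, module names, and interfaces here are entirely my invention, just to illustrate the idea of separately trained advisors feeding one captain):

```python
from dataclasses import dataclass, field

@dataclass
class CaptainAI:
    """Hypothetical 'captain' that composes independently trained
    advisor modules. Everything here is invented for illustration."""
    modules: dict = field(default_factory=dict)

    def register(self, name, module):
        self.modules[name] = module

    def advise(self, game_state):
        # Each module reads the shared game state and returns its advice;
        # the captain collects them into one high-level picture.
        return {name: module(game_state) for name, module in self.modules.items()}

captain = CaptainAI()
captain.register("threat_map", lambda s: f"enemies likely near {s['last_seen']}")
captain.register("objectives", lambda s: "contest Roshan" if s["aegis_up"] else "push mid")

advice = captain.advise({"last_seen": "dire jungle", "aegis_up": True})
print(advice["objectives"])  # contest Roshan
```

The appeal is that each module (drafting, threat prediction, item planning) could be trained or even hand-built on its own, then swapped in without retraining the in-game policies.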
Technically, I can't help but wonder if some kind of hybrid ANN-expert system architecture wouldn't work better. ANNs are rightfully popular due to their effectiveness, but they seem so inefficient in this case. A few hand-coded rules could eliminate the basic mistakes the AI makes, and for the rest they could evolve the rules using some type of genetic programming. The latter would still allow the AI to come up with its own novel approaches to the game.
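As a hedged sketch of that hybrid idea (the rule conditions, action names, and policy stand-in are all invented; this isn't a real Dota or OpenAI API):

```python
def hybrid_action(state, nn_policy, rules):
    """Hand-coded rules veto the network's choice when a 'basic mistake'
    condition fires; otherwise defer to the learned policy. The rule list
    itself could be evolved via genetic programming instead of hand-written."""
    for condition, action in rules:
        if condition(state):
            return action
    return nn_policy(state)

# Example rule: never keep pushing without a TP while your base is under attack.
rules = [
    (lambda s: s["base_under_attack"] and not s["has_tp"], "buy_tp_and_defend"),
]

nn_policy = lambda s: "push_top"  # stand-in for the trained network
print(hybrid_action({"base_under_attack": True, "has_tp": False}, nn_policy, rules))
# buy_tp_and_defend
```

The rule layer catches exactly the kind of glaring error people saw (Axe pushing top with no TP while the base fell) without constraining the network anywhere else.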
Without this they would get stomped.
I'd suggest Total War: Warhammer II (it has an interesting competitive scene, with both tactical combat and strategy gameplay). It's very different from DotA 2, but it would be super intriguing to see how Five performed and how fast it could learn.
If you built Five the right way, it should be able to learn almost any other game.
I can also imagine you offering this to gaming companies in the near future, so that they could provide a decent computer AI instead of the crap they usually offer now :)
(source: I used to be a videogamer in my teens, and occasionally still play some strategy games, not as often as I would like to :D)
DoTA has always been much more micro-focused, whereas StarCraft is both macro- and micro-focused. In fact, the genre's popularity stemmed from this: MOBA (the genre DoTA is based on) is essentially the "fun" part of StarCraft, namely controlling and leveling up your hero.
The MOBA genre comes from Aeon of Strife, a custom StarCraft 1 modded map. I played it a few times growing up.
When I hear about AI and video games, I get slightly confused though with this terminology. We've had hardcoded "AI" for 20+ years already that's actually been fairly decent. "AI" was also the official term used in many of these games.
When I hear of AI and video games now, I assume everyone is talking about neural networks, machine learning, etc.