
OpenAI Five: Goals and Progress - gdb
https://openai.com/five/
======
shawn
As someone who was once semi-pro in dota (4400 MMR, get rekt), it's _freaky_
watching these bots play. It's uncanny. Little things... Like, when the bots
are taking a tower, one of them will stand in front of the tower and tank the
creep wave, so that their creeps do more damage on the tower. They had to
learn this.

Insta-TPing right when an enemy wastes their stun and can't cancel their TP.

Grouping up as 5 at the beginning of the game and pushing into the enemy
jungle. _Pubs never do this._

The most interesting part is that OpenAI appears to be discovering new
knowledge in the dota scene. For example, they always take the ranged barracks
first, never the melee. This is exactly the opposite of what the pro scene
does. Therefore, the smartest pro team should study what the bot is doing and
trust that on average it's a better idea to always focus on the ranged
barracks first. After all, if it was a bad idea, they probably wouldn't do
that.

The most hilarious part was when OpenAI paused the game, then resumed it. This
illustrates that there is still some unexplainable randomness.

Question for OpenAI: Is it more accurate to think of the bots as 5 separate
minds, or a single mind controlling 5 heroes?

EDIT: By the way, TI is going on right now!
[https://www.twitch.tv/dota2ti](https://www.twitch.tv/dota2ti) If you're new
to the scene, take a peek. TI is always so high energy -- even if it's hard to
follow what's going on, listening to Tobi (the shoutcaster) go nuts during the
game is always a highlight.

And of course, /r/dota2 has the best memes anywhere, hands-down.
[https://www.reddit.com/r/DotA2/](https://www.reddit.com/r/DotA2/)

~~~
eertami
>semi-pro in dota

>4400 MMR

OP is being extremely satirical here by the way. He means he's not great but
knows how to play (and definitely not semi-pro) but that context might be lost
if you don't play Dota!

>Is it more accurate to think of the bots as 5 separate minds, or a single
mind controlling 5 heroes?

They answered this on the last stream, iirc it's 5 identical clones with the
same goals, but not sharing any knowledge, info, or decisions with each other.
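A minimal sketch of what "5 identical clones" could mean in code, assuming one shared set of trained weights instantiated independently per hero with no communication channel between instances (all names here — `Policy`, `act`, the observation dict — are made up for illustration, not OpenAI's actual API):

```python
# Five clones of the same policy: identical weights, independent instances,
# no shared memory or messaging between them.
import random

class Policy:
    """Stand-in for the trained network; every hero gets the same weights."""
    def __init__(self, seed):
        # Same seed = same "weights"; each clone still has its own state.
        self.rng = random.Random(seed)

    def act(self, observation):
        # Each clone decides from its OWN observation only.
        return max(observation["options"], key=lambda a: self.rng.random())

# One clone per hero -- nothing is shared after construction.
heroes = [Policy(seed=42) for _ in range(5)]

actions = [
    hero.act({"options": ["farm", "push", "retreat"]})
    for hero in heroes
]
# Identical clones given identical observations make identical choices,
# yet no clone ever consulted another.
print(actions)
```

Coordination then emerges only because each clone was trained toward the same team objective, not because they exchange information at play time.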

~~~
hokumguru
I think 4400 MMR places OP in at least Ancient 1, which is approximately
the 95th percentile. I'd call that at least semi-pro.

~~~
a_humean
Errrrr, at best a dedicated amateur.

I'm just 3.5k (I think that's 70th percentile), but I know lots of 4.5k
players. To describe the average 4.5k player: probably has regular groups of
people they play with at different skill levels (anywhere from 2.5-5k+),
regularly plays battle-cup on Saturdays, maybe played amateur JoinDota league,
maybe had a laugh and played open qualifiers only to lose in the first couple
of rounds, and probably logs 10-20 hours per week in the game.

4.5k players know how to play to a very good standard and beat the vast
majority of other players, but are miles away from the weakest of the
professional scene. 4.5k doesn't even appear on the leaderboards.

~~~
bkovacev
I definitely agree with your statement, however...

Solo, the captain of Virtus Pro, was at 4k for the longest time. There's
more to dota than just MMR.

There are players at 5.5-6k range that still do not understand the basics of
team play, but are just extremely mechanically gifted and are in great gaming
shape.

------
minimaxir
From a presentation standpoint, I am impressed by and appreciate the effort in
making the project process transparent and accessible, even to those without
an AI background (in contrast to recent AI literature which tends to
_obfuscate_ the secret sauce).

~~~
furi
There is nothing transparent about OpenAI. They have never released any of
their models to the public, despite the fact that their models play completely
different strategies from humans in an extremely heavily modified version of
the game (multiple updates out of date, 80%+ of the heroes turned off, many
core mechanics disabled or modified beyond recognition). Without them
releasing the models for people to practice against, there is absolutely no
way to tell the difference between AI superiority and the humans being
unfamiliar with the enemy tactics and even the very game they are playing.
Compare that to actual professional Dota, where pros have tens or hundreds of
matches played by their opponents to study, an ecosystem of thousands of
top-level players hashing out new strategies for each patch, and months to
practice that particular version of the game. This is not a test I would
call "open".

------
Leary
The same 18 heroes? While impressive, this is less of an improvement over the
August 5th match than I'd hoped, even if they beat the pro team.

I thought they'd at least remove more of the rules (5 couriers, no illusions)
or add some heroes.

~~~
nstart
Not sure if they can remove the no-illusions rule. The bots would
completely wreck the humans if they were allowed to use illusions. I don't
even want to think what would happen if they learnt how to use Phantom Lancer
or Nature's Prophet. Most people throw all their illusions/summons into one
bucket and the hero into another. An AI able to control each unit
perfectly would be terrifying.

~~~
evozer
But it would be fun to see NP bodyblock the entire enemy team with one set of
treants.

------
foobaw
I wonder if any updates have been made since the last match to remove more
restrictions. The most common complaint from users was the courier changes.

------
exabrial
I really want to see them play humans with no restrictions on the humans! I
get it, they're still in the learning phase, but I want to see the gloves
come off.

------
doctorpangloss
Is dealing with imperfect information a research goal?

Does the OpenAI team think there's a way to adapt the UX of DOTA 2 "Perfect
Information Edition" to communicate the game better to human players?

~~~
modeless
AFAIK the bot's "vision" is subject to fog of war, so it's not a perfect
information game in the usual sense. Yes, it gets precise numerical values for
hit points etc from the API, but only for visible units.
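A toy illustration of that point, assuming the API hands back exact stats but only for units your team can currently see (the data shapes and field names below are invented, not the real Dota 2 bot API):

```python
# The bot gets precise numbers (HP, etc.) -- but units hidden by fog of war
# simply never appear in its observation.
units = [
    {"name": "enemy_mid",    "hp": 512, "visible": True},
    {"name": "enemy_carry",  "hp": 301, "visible": False},  # in fog
    {"name": "ally_support", "hp": 450, "visible": True},
]

def build_observation(units):
    """Keep precise stats only for units not hidden by fog of war."""
    return [u for u in units if u["visible"]]

obs = build_observation(units)
print([u["name"] for u in obs])  # → ['enemy_mid', 'ally_support']
```

So the bot's information is exact but not complete, which is what makes it an imperfect-information game despite the numerical API.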

Honestly I think that it would not be much more difficult to train a bot that
looks at screen pixels and outputs keyboard and mouse events instead of using
the bot API. In fact it might be easier to code, but the problem is it would
require several orders of magnitude more processing power to train, which is
impractical. I am confident it would work if the processing power was
available, given the success of these techniques on other problems.

~~~
drexlspivey
This would require the bot to learn to point the screen to the right place

~~~
modeless
It would require the bot to learn a _lot_ of things. Perhaps a curriculum
learning method would be appropriate. I don't see any reason why it wouldn't
be possible though, given several orders of magnitude more compute power.

------
dbelchamber
I'm very excited about this. When I watch this new breed of AI play, I find
what they value really interesting, and I greatly enjoy speculating, in human
terms, as to why.

------
nstart
I watched OpenAI Five play against the "team" of pros at the calibration
match earlier this month. A couple of observations and takeaways.

The first is that the bot strategy currently revolves around the special rule
of 5 invulnerable couriers. Bots find microing lots of units effortless, so
the map constantly showed each bot's courier flying back and forth carrying
regen. The bots never really had to go back to base or to their shrines to
heal. This is important because it changes the meta of the game entirely. The
way the game is normally structured allows only one (very vulnerable) courier
per team. Usually this means that after a team fight, teams need to reset
since they've expended significant resources on the fight. But that meta was
nonexistent under the rules for matches against OpenAI Five. The humans had
trouble coping with this, as they weren't used to the idea of ferrying regen
constantly.

Takeaways here: I could go on about the nuances of a single courier, but
basically, the bots' gameplay will likely have to change once it comes down to
1 shared courier per team. Not sure how that will affect the architecture of a
"no shared mind". Also, humans will likely need to take a page out of this
gameplay and realise that couriers are a highly underutilized resource. Every
second the courier sits idle for no reason is just as bad as a hero not
doing anything.

The second observation comes from the last game of AI vs pro humans. This was
an interesting game where the audience picked a losing set of heroes for the
team. Despite a predicted chance of winning of less than 2% (iirc), the AI
could probably have won on account of being mechanically better than the
humans. But their insistence on sticking to a strategy of "push hard" found
them doing really strange things. The strangest of these was Slark running
ahead on its own to cut down creep waves in the lane. The human players knew
this would happen, and they kept forcing the Slark to go hide in the trees;
at some point they were always able to corner it and get the kill. Over and
over again. The Slark never changed.

Similar things happened around the map during this game.

What should have happened was that the AI should have adapted to its
disadvantage, and poured its efforts into first defending and then snowballing
later with its mechanical advantage. But that element of "intelligence" was
never there.

The takeaway is this: the AI will eventually beat the humans on account of
always being mechanically better. They need only very slight changes in their
strategy to win 99.9% of the time. They can be aggressive beyond any human
possibility because they can calculate everything to perfection: how long it
will take them to travel across the map versus how much longer it will take
for an opposing hero to have its ultimate ready, for example. There are a lot
of mechanical components to Dota in which the AI will always have an
advantage. But the AI will likely always reveal quirks that can be turned into
dumb winning strategies (aka cheese strats). Something like the whole team
fighting from the trees, for example, might just confuse the AI terribly. We
don't know, but every now and then someone will discover one, and the teams
working on the AI will have to "patch" the behaviour.

Final takeaway from all of that: I'm not sure if training the AI towards
"objectives" is really the best path towards making an intelligent bot. It
seems like what's instead happening is that we get software that has no
intelligence at adapting in the moment to things it's never seen, even if
they are brain-dead simple. But it'll get better at hiding that through
mechanical perfection.

Upside - We get AIs capable of doing increasingly complex things in a
seemingly perfect manner.

Downside - We get a scary future of AI filled with byzantine issues that need
to be "patched".

