
OpenAI Five Benchmark: Results - runesoerensen
https://blog.openai.com/openai-five-benchmark-results/
======
a_humean
I'm really excited to see the limits get lifted, in particular around items
and couriers.

The team of semi-pros/ex-pros/pros that played against the AI commented that
the AI was using the highly unusual 5 invincible couriers to enable a style of
play that isn't possible in normal dota. The AI's solution to dota was
unrelenting aggression once a small early game lead was established after ~10
minutes. This early aggression was possible because the AI was able to ferry
items (healing consumables) with such frequency that once the human team made
one mistake in a team fight it wasn't possible to recover.

Normally after winning an objective you are forced to reset as you have
expended a lot of resources in order to achieve your objective, and you are
actually most vulnerable to a counter play right after you win something big.
With a constant stream of healing this risk was significantly reduced.

Also, the drafting is a clear limitation that needs to get lifted. The AI was
essentially pursuing a "death-ball" strategy of grouping as 4 or 5 and pushing
right down a lane, which can be countered by picking mobile and fast heroes
that can put pressure on different parts of the map and slow down the "death-
ball" by forcing reactions. However, none of those heroes were draft-able, and
so the humans were forced to play the AI's strategy. The AI's strategy favors
team fight coordination (laying stuns correctly, and correctly calculating
whether damage through nukes is sufficient to kill any particular target) and
reaction times, at which the AI was clearly superior to the human team.

~~~
justicezyx
Great summary.

I'll add:

\- The coordination between the AI bots is clearly beyond human level, or at
least beyond what humans have demonstrated on a similar style of heroes.

(It might not appear too different from the show match, but based on my 2-3k
hours watching pro games, the coordination is noticeably better than that of
the best team in history, i.e. Wings Gaming, the 2016 TI champion).

I am not sure how such coordinations are modeled in dnn, which itself seems
the most valuable from this research.

\- In general I think with this show match, it pretty much sealed the doom of
human players in dota2.

As it shows that the general approach is scalable and capable to handle the
problem itself. As from laning to team fight, and item building, the AI did
not show weakness at all.

I was worrying about AIs general inefficiency in deriving the winning
strategy, laning stage, and team fight coordinations, which turns out to be
obviously superior to human players.

Drafting will probably be even more favorable to AIs. The challenge would be
whether they can train faster by observing the change log, i.e. finding a
winning strategy without training from scratch each time after a patch release.

I see no reason the AI would lose to VP/Liquid/LGD (the top 3 going into TI8).
The idea that split-pushing heroes can deal with the team fights seems to
underestimate the AI's discipline, which is clearly superior to the best humans.

\- Last is how much computing resources are used in the training and playing.
Hopefully Valve can team up with OpenAI to release a benchmark bot team for
calibration and a different ladder system for playing against different AI
strength levels.

~~~
iotb
As someone who plays Dota, the matches were a clear demonstration of sloppy
snowballing and brute force cheesing. Coordination 'appears' to be beyond
human level because the bots are collectively synced on a team goal value. If
a true 'pro' or pro team was allowed to observe and play multiple games
against this bot, I'm more than certain one could find an exploit of such an
unintelligent mathematical approach to action selection at the individual or
group level. In fact, this would be what you would use dark seer strategically
for or a whole host of other characters and functionality that is currently
banned.

Not everything can be calculated especially when tricks are intentionally done
to throw a bot off.

> I am not sure how such coordinations are modeled in dnn, which itself seems
> the most valuable from this research.

There's a tuned group/individual driver function centered on various
calculations. This is not actually a valuable part of the research as it's
dynamic and game dependent and can't cover all of the possibilities, which is
why someone broke their 1v1 bot (corner case).

> In general I think with this show match, it pretty much sealed the doom of
> human players in dota2.

If you are indeed a player and have viewed that many hours of dota2, I
question the nature of such a comment. A great player wouldn't look to see how
to 'beat' their bot, as it is a bot w/ no intelligence, the strategy would
instead be to try to break it and shove it into corner cases. It's not playing
the full range of characters that were intended to disrupt cheesy snowballing
so I wonder why you're making such an optimistic statement .. being that you
claim you have watched so many dota matches. Do you play much yourself? Maybe
that would change your opinion.

> As it shows that the general approach is scalable and capable to handle the
> problem itself. As from laning to team fight, and item building, the AI did
> not show weakness at all.

I'm starting to see a pattern with your commentary. The gameplay looks like
your typical "south" players. Hardly anything impressive : Aggressive
boneheaded tower diving and aggressive and cheesy snowballing.. If you can
last past 30min, you outwit and outplay such people in the mid/late game.

> superior to humans
> superior to humans
> superior to humans

More than 50% of the Dota 2 dynamics aren't even present and are restricted at
the moment. Are you getting paid for this post?

~~~
nopinsight
Your point regarding the fact that the bots ‘may’ not be adaptive to
surprising strategies is a good one. We do not know for sure in the case of
OpenAI Five as there are too few public games to look at.

AlphaGo Lee (the version which won 4-1 against Lee Sedol) did seem to get
thrown off track by Lee’s surprising move and lost that game.

However, AlphaGo Zero, which is based on some of the same principles/sets of
algorithms, was much stronger than AlphaGo Lee (more than 3 stones according
to DeepMind; three stones is about the difference between top pros and top
amateurs/beginning pros) and seemed like it would not be susceptible to any
surprises thrown its way by human experts.

The difference was that AlphaGo Lee learned from play records of human Go
experts while AlphaGo Zero did not and learned only via self-play. Dota 2
is clearly more complex than Go but if the same principles apply then an AI
trained from pure self-play would be adaptive to most surprises in the
domain, _if_ the system had explored those edge cases before (which depends in
turn on how the self-play was conducted during training).

(As a side note: OpenAI Five probably chose the “simple-minded” snowballing-
cheesing strategy because it determined from extensive experience that the
strategy is most likely to yield a win _given_ its capabilities (which are
superior to humans' in some respects, like instantaneous global information
observation, great coordination, consistency, etc). This is very different
from the reason some human players choose the strategy. Perhaps it is
precisely because Five bots don’t get sloppy that the strategy is so effective
for them.)

~~~
iotb
> We do not know for sure in the case of OpenAI Five as there are too few
> public games to look at.

Thus the nature of a canned showcase demo. We do know they have a slew of
restrictions. As an avid player, I know exactly why : because such combos
require much deeper and true intellect to play efficiently. Even as such,
given that I know I'd be up against an optimization algorithm, my strategy
would be to create as much chaos and uncertainty as possible. Information
theory is clear as to the impact this would have : It would be unstructured
noise that would be hard to optimize and likely not seen before or
significantly reflected in the AI's weighting system. This is the basis of
adversarial attacks. I'm sure with a decent amount of games I'd be able to
figure out a suitable one for 5 linked bots.

The perspective as to what's going on with this demo is much different if you
actually play the game. I've actually seen a number of games like this bot
exhibited. It's a strategy low skilled players engage in with the hope of
overwhelming opponents with brute force. The character restrictions favor it.
So it's not by accident that this all converged into a demo that favors an
unintelligent brute force optimization bot.

It favors something that can do range/hit point calculations
quickly/accurately. Snowballing is required because there is no broader
intelligence among the bots. When the bots snowball, it's essentially just one
big optimization function. When they're stretched apart, the calculations are
much harder.

Knowing what I know about the game and the fact that I'm up against a Weak AI
bot with an optimized model, I'd know exactly how to screw it up with an
adversarial attack. I'd train a team of people on that and show everyone
exactly what human intelligence is capable of and why it's superior. This
happens in your average dota 2 match constantly.. Low skill players attempt
brute force strategies just like these bots and you essentially wait them out
and pick them apart. This isn't a new and amazing style of gameplay or
something. There's already names for it.

When I used the term 'sloppy' I meant against the spirit and nature of the
game and w/o consideration of the 'way in which one wins'... Ambushing towers
at open 4v1 or 2 is some very hamfisted foolishness. Even in regular pub games
with upper avg. players, there'd be a sharp punishment for such bro-tier
gameplay. It usually results in an equally massive 'gank'. The way the human
players responded in these pressure scenarios really has me questioning the
whole event as I see avg. random players make far better decisions every day
in dota.

That's just my unfavorable two cents. I'm not impressed because I understand
how their bots are doing what they're doing, where the advantages lie, and I'm
aware of what restrictions they placed on the game in favor of their bot.

Elon claims he's worried about a dark future with AI, it's actually solutions
like this that are most scary because there is zero intelligence and a [by any
means possible so long as you achieve the object] steering function. If you
want to unleash chaos and destruction on the world and see a darker side to
human intelligence you've never seen before, start releasing such 'weak AI' to
manipulate people from the shadows. This is not strong AI or a path to it.
It's more of the same Weak AI provided with exclusive and insane amounts of
computer power/data and an objective to optimize for by any means necessary.
In cases where it dominates, it's almost certainly a reliance on finding
loopholes/flaws in a particular game not actual intelligence. You should see
the danger in this right away.

Funny because OpenAI originally opened with the spoopy terminator like dangers
of AI being so destructive we needed a group like them to save us... To now
openly sharing such unintelligent and dangerous weak AI optimization platforms
in the mainstream fear. Sort of like the 'Do no Evil' mantra that was just a
slogan.

I think this is a great engineering accomplishment that no doubt taught them a
lot. I don't see any broader 'safety' ideology underlying this... Just another
great team of people trying to achieve AI like everybody else utilizing
popularized approaches. It's better to just come out and say that. We can drop
the 'Save the world from AI'/'Safety' superman talk and get to the brass tacks
of what they are doing and how, if at all, it's different from what anyone else
is doing in the space.

~~~
nharada
My impression is that when rich and powerful people talk about "the dangers of
AI" what they really mean is "the dangers of AI (to me when it's not
controlled by me)"

~~~
red75prime
It is nothing new, or particularly bad. If we (good guys) will not have
(insert powerful technology), then bad guys will have it and everyone will be
worse off.

------
apeace
If anyone from OpenAI sees this:

The one thing that really surprised me yesterday was when OpenAI Five seemed
to pause the game. The commentators speculated that “it was learning” because
the humans had paused in game 1 due to lag.

I assume that’s not right, as OpenAI Five is not training itself as it’s
playing (wouldn’t make much sense to add one more game sample to the billions
it has already trained on).

I thought it was interesting that the commentators had this misconception, and
was wondering what led to OpenAI Five hitting pause.

~~~
gdb
A network blip caused all the players to drop from the game.

Incidentally, we'd just changed the code a few days earlier from
"automatically surrender when a human disconnects" to "do nothing if a human
disconnects; automatically pause if all humans disconnect". Had done a lot of
advance planning for what might break!
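
The disconnect rule described above can be sketched in a few lines. This is only an illustration of the stated behavior ("do nothing if a human disconnects; automatically pause if all humans disconnect"); the names `BotResponse` and `respond_to_disconnects` are hypothetical and not taken from OpenAI's code.

```python
from enum import Enum
from typing import Sequence


class BotResponse(Enum):
    """Hypothetical set of reactions the bot side can take."""
    CONTINUE = "continue"
    PAUSE = "pause"


def respond_to_disconnects(humans_connected: Sequence[bool]) -> BotResponse:
    """Pause only when every human player has dropped; otherwise keep playing.

    `humans_connected` holds one flag per human player (True = still connected).
    An empty sequence is treated as "nothing to react to".
    """
    if humans_connected and not any(humans_connected):
        return BotResponse.PAUSE
    return BotResponse.CONTINUE


if __name__ == "__main__":
    print(respond_to_disconnects([True, False, True, True, True]))  # one drop
    print(respond_to_disconnects([False] * 5))                      # all dropped
```

Under this reading, the network blip that dropped every player at once is exactly the case that triggers the pause.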

~~~
abhiminator
Off-topic question here -- Do you have any plans to pit OpenAI Five
against the winners of this year's 'The International 8' tournament scheduled
for later this month?

Folks in the DotA 2 community are going crazy over this possibility, would be
incredible if this happens.

~~~
Dibes
Looks like they are at least playing pros from TI8[0].

[0]: "These results give us confidence in moving to the next phase of this
project: playing a team of professionals at The International later this
month." from
[https://blog.openai.com/openai-five-benchmark-results/](https://blog.openai.com/openai-five-benchmark-results/)
at the bottom

------
crsv
In case anyone would like to watch a VOD of the event, it can be found here:
[https://www.twitch.tv/openai/video/293517383](https://www.twitch.tv/openai/video/293517383)

It was a really wonderfully done event overall and there's several interviews
throughout with different folks from the OpenAI team.

~~~
abhiminator
YouTube mirrors:

Game 1&2 [0]

Game 3 [1]

[0]
[https://www.youtube.com/watch?v=eaBYhLttETw](https://www.youtube.com/watch?v=eaBYhLttETw)

[1]
[https://www.youtube.com/watch?v=_QQYaVUODkE](https://www.youtube.com/watch?v=_QQYaVUODkE)

~~~
ufo
(These are unofficial highlights btw, not mirrors)

------
johncoogan
I wonder how long it will be for them to build a version that actually uses
the screen images as opposed to all the data from the Bot API. Seems like it
would be a lot harder when you add an image processing step and restrict
processing of information from across the map without actually scrolling over
there.

~~~
lawrenceyan
The point isn't to see if the model can process images. OpenAI's goal is to
see if they can recreate the ability to plan and strategize over a
partial-information, continuous, long-time-horizon environment.

You wouldn't want AlphaGo to have to input its commands using robotic hands
right? It's the same thing in that, sure it might be interesting, but that
isn't what we care about. Image processing and robotics controls are largely
solved. Showcasing that a model can gain the ability to plan and think is the
novel stuff here, and is the path to where "artificial intelligence" if any
appears. That's the ultimate goal in playing any of these games.

~~~
sgillen
>> Image processing and robotics controls are largely solved

Image processing and robotic control are very far from being solved problems.
I guess you are saying that in the case of AlphaGo it would not be a super
difficult step to have a camera and robotic hand physically move pieces
around, and that's probably true. But I think in the DOTA case there are new
image processing challenges that interact with the AI in interesting ways.

I'm mostly talking about the need to move the game's camera around to gain
more information. Say you don't see your ally on your screen and need to see
how they are handling a gank or something (full disclosure: I don't play DOTA
at all, so this could be a silly scenario). Then the AI would have to recognize
this and move the camera to the ally's location in order to gain that
information. So really the novelty here would be in the network to somehow
realize what information it needs and then further to learn how to gather that
information. I honestly think that sounds like an extremely difficult next
step.

~~~
nsomaru
This is something I noticed: a human initiator would get counter-initiated
almost instantly, every single time, by OpenAI. The blink dagger is much less
effective. Pro humans do this too, but not every single time with perfect
timing.

Humans don't concentrate on the whole screen; attention is directed...

~~~
jacoblambda
It would be interesting to take this project up a few levels later on and see
how it compares to direct API interaction.

I would love to see camera/mechanical interface like mentioned by others.
Similarly, like you said humans don't focus on the whole screen. I would love
to see how well the AI could perform if it was given something like blinders
where only a small portion of the screen is in focus at any one time much like
how human eyes work.

------
minimaxir
During the thread from the competition there were a few comments suggesting
that the neural network could easily be run on a personal computer.

Given the massive network architecture linked in the post ([https://s3-us-
west-2.amazonaws.com/openai-assets/dota_benchm...](https://s3-us-
west-2.amazonaws.com/openai-
assets/dota_benchmark_results/network_diagram_08_06_2018.pdf)), I am rather
curious what hardware was used to make predictions for the Benchmark match.
Especially due to the 2048 unit LSTM.

The model outputs are also interesting; they're all discrete actions (even
movements), no continuous outputs (aside from win probability).

~~~
bllguo
not sure I follow; what's so interesting there? Wouldn't a continuous output
ultimately have to be translated into a discrete action anyway?

~~~
minimaxir
The big ones are the X and Y offsets/moves; you'd expect the output to be
an explicit coordinate to move to/act upon (e.g. a specific value in the -400
to 400 space), but per the original announcement
([https://blog.openai.com/openai-five/](https://blog.openai.com/openai-five/)),
there's a 9x9 grid space (-400 to 400 by increments of 100 on each axis).

Although making discrete choices 2-3 times a second is indistinguishable from
continuous movement anyways.
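
To make the discretization concrete, here is a minimal sketch of an offset grid running from -400 to 400 in steps of 100, which gives 9 values per axis and 81 cells in total. The row-major index layout and the helper names `index_to_offset` and `nearest_index` are assumptions for illustration; the announcement does not say how the network orders its outputs.

```python
STEP = 100
MAX_OFFSET = 400
# 9 values per axis: -400, -300, ..., 300, 400
AXIS = list(range(-MAX_OFFSET, MAX_OFFSET + 1, STEP))


def index_to_offset(index: int) -> tuple:
    """Map a discrete action index (0..80) to an (x, y) offset.

    Assumes a row-major layout: x varies fastest, y slowest.
    """
    if not 0 <= index < len(AXIS) ** 2:
        raise ValueError(f"index {index} is outside the {len(AXIS)}x{len(AXIS)} grid")
    row, col = divmod(index, len(AXIS))
    return AXIS[col], AXIS[row]


def nearest_index(dx: float, dy: float) -> int:
    """Snap a continuous offset to the index of the closest grid cell."""
    def snap(value: float) -> int:
        clamped = max(-MAX_OFFSET, min(MAX_OFFSET, value))
        return round((clamped + MAX_OFFSET) / STEP)

    return snap(dy) * len(AXIS) + snap(dx)


if __name__ == "__main__":
    idx = nearest_index(130.0, -260.0)
    print(idx, index_to_offset(idx))
```

A continuous target is simply rounded to the closest cell, which is why a policy re-choosing a cell several times a second can still trace out smooth-looking movement.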

------
samfriedman
The included video of "growing pains" \- aberrant behaviors found during
training - is interesting. This kind of thing is fairly common when training
RL agents: some discussion of how the OpenAI team solved these issues would be
nice.

~~~
jaggederest
Video link on youtube:
[https://www.youtube.com/watch?v=xiyUSI-TEgo](https://www.youtube.com/watch?v=xiyUSI-TEgo)

------
fermienrico
Side comment: OpenAI's design is excellent and it doesn't get in the way of
reading their articles. They've taken a Stripe-like design approach without
going overboard. Well done!

~~~
joliv
There’s a reason their site is Stripe-like: they hired Stripe’s creative
director:) [https://dribbble.com/luddep](https://dribbble.com/luddep)

~~~
dclausen
and Stripe's former CTO:
[https://www.linkedin.com/in/thegdb/](https://www.linkedin.com/in/thegdb/)

------
YeGoblynQueenne
>> These results show that Five is a step towards advanced AI systems which
can handle the complexity and uncertainty of the real world.

Great results that deserve congratulations- but they show nothing of the sort.
The distance between a game and the real world is about twice as big as the
distance between a game and a simulation of the real world (which a game isn't
even). Winning at dota is nothing like negotiating the real world.

~~~
red75prime
I can see your comments going up to "Weak AIs designing, producing and
controlling agile robots is a great step, but it is a long way before they
learn enough human psychology to perform as well as a door-to-door
salesperson."

I'm not sure what will follow.

------
runesoerensen
Related discussion from yesterday's match:
[https://news.ycombinator.com/item?id=17693169](https://news.ycombinator.com/item?id=17693169)

------
simonebrunozzi
Prediction and hope: within 1-2 years, OpenAI will be able to offer a GREAT
AI-powered challenge to human players across many games.

I occasionally play strategy games and I am always frustrated by a very stupid
AI that gets challenging only by beefing up its stats. I would love to play
against a smart AI.

I don't do multiplayer because: I like to play offline and/or whenever I want,
and I play so rarely that most other human players wouldn't enjoy playing with
me.

OpenAI team: please listen!

~~~
sgillen
Yes an actual good AI in Civ would vastly improve that game for me.

------
iMuzz
Can someone ELI5 what the significance of this is?

Why was Dota chosen as the game for an AI to get good at?

Is Dota more difficult than Go? Why or why not?

~~~
yuuichi
Two major reasons:

The first being that Dota is a team game which requires teamwork to win. This
presents a new challenge of having actors work towards both personal and team
goals in a balance (much like real players) to be able to win, which is very
difficult to train.

The next is that Dota and Go are two very different kinds of games. Go is an
"information complete" game, where all players have access to the entire game
state at any given time. Dota, on the other hand, is an "information
incomplete" game, as teams are vision restricted: there's no guarantee on the
state of anything out of vision, meaning that the AI has to develop what most
players call "game sense" in order to be effective.

On a tangential note, it's also an interesting problem from a state space
perspective. Go is technically "solvable" to a point where you could (with a
currently unobtainable amount of computing power) find an optimal move, but
Dota is almost unfathomably more complex: 10 heroes picked from 115, each with
the ability to hold any combination of 9 items from about 150, with arbitrary
health and mana values, at arbitrary positions on a large map, not even
mentioning the non player units (creeps and neutrals). If Go's state space is
our solar system, already a difficult scale to comprehend, Dota's is the whole
galaxy.

[Typed on mobile, apologies for any typos.]
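
For a sense of scale, the hero and item part of that count can be worked out directly. This is a rough sketch under stated assumptions: the figures (115 heroes, about 150 items, 9 slots) come from the comment above, while treating the draft as an unordered pick of 10 heroes and each inventory as an unordered selection of 9 items with duplicates allowed is a simplification chosen here. It ignores health, mana, positions, and non-player units, which the comment lists as further multipliers.

```python
import math

HERO_POOL = 115       # figure quoted in the comment
HEROES_PICKED = 10
ITEM_POOL = 150       # "about 150" in the comment
ITEM_SLOTS = 9


def hero_drafts() -> int:
    """Unordered ways to pick 10 distinct heroes from the pool."""
    return math.comb(HERO_POOL, HEROES_PICKED)


def inventories_per_hero() -> int:
    """Unordered 9-item inventories with duplicates allowed: C(n + k - 1, k)."""
    return math.comb(ITEM_POOL + ITEM_SLOTS - 1, ITEM_SLOTS)


def draft_and_item_configurations() -> int:
    """Drafts times one independent inventory per picked hero."""
    return hero_drafts() * inventories_per_hero() ** HEROES_PICKED


if __name__ == "__main__":
    total = draft_and_item_configurations()
    # Report only the order of magnitude; the exact integer is very long.
    print(f"about 10^{len(str(total)) - 1} hero-and-item configurations")
```

This covers only the discrete draft-and-inventory slice; the continuous quantities (health, mana, map position) are what the comment argues push the full state space far beyond it.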

~~~
TulliusCicero
> The first being that Dota is a team game which requires teamwork to win.
> This presents a new challenge of having actors work towards both personal
> and team goals in a balance (much like real players) to be able to win,
> which is very difficult to train.

I don't think this is actually a significant problem for an AI, because each
AI 'player' will be the same copy of code, thinking the same way. They don't
need to explicitly communicate if they have the same thoughts and
expectations.

------
alkonaut
How much of an AI’s advantage is ”I/O” related and how much is actual strength
in strategy/gameplay?

Are the AI's actions rate-limited, delayed, and perhaps time randomized a bit
so that an AI doesn’t get an advantage because it can do actions faster or
better synchronized to exact times, or more responsive (say to avoid any
action under a certain reaction time, all actions are scheduled 50+rand(5)
after the AI queues them)?
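
The "50+rand(5)" idea in the question above could look something like the sketch below: every action the agent queues is held back by a base delay plus a small random jitter before it is released to the game. This is purely an illustration of the question's proposal, not a description of how OpenAI Five works; the class name `DelayedActionQueue` is hypothetical, and the time unit (milliseconds here) is an assumption since the comment does not state one.

```python
import heapq
import random

BASE_DELAY = 50.0  # from "50+rand(5)"; unit assumed to be milliseconds
JITTER = 5.0       # upper bound of the random extra delay


class DelayedActionQueue:
    """Holds queued actions until their randomized release time has passed."""

    def __init__(self, rng=None):
        self._rng = rng or random.Random()
        self._heap = []      # entries are (release_time, sequence_number, action)
        self._sequence = 0   # tie-breaker so actions themselves are never compared

    def submit(self, now: float, action: str) -> float:
        """Queue an action at time `now`; return the time it becomes executable."""
        release = now + BASE_DELAY + self._rng.uniform(0.0, JITTER)
        heapq.heappush(self._heap, (release, self._sequence, action))
        self._sequence += 1
        return release

    def pop_ready(self, now: float) -> list:
        """Return every action whose release time is at or before `now`."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready


if __name__ == "__main__":
    queue = DelayedActionQueue(random.Random(0))
    queue.submit(0.0, "cast stun")
    print(queue.pop_ready(49.0))  # still inside the enforced reaction delay
    print(queue.pop_ready(55.0))  # the jitter can add at most 5
```

The jitter also breaks exact synchronization between bots, since two actions queued at the same instant are released at slightly different times.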

------
exabrial
5 thousand lifetimes of play versus .45 lifetimes, I think the outcome was
predictable!

~~~
make3
yeah, but you gotta add hundreds of millions of years of evolutionary learning
to model the physical world and the actions of our peers, built in to our
(mammal) brains

------
sigi45
Why did I miss it?! :(

I think at first it was planned for the 28th; now I read about it one day later :/

~~~
ufo
It was originally scheduled for _July_ 28th, but was rescheduled to yesterday.
Well, at least you can still watch the twitch vods :/

That said, you still have a chance to catch the match against a professional
team that is going to happen sometime between the 20th and 25th.

------
exabrial
Is the ai forced to use voice communication and chat to coordinate bots?

~~~
make3
there is no communication at all between bots afaik

------
doctorpangloss
Would you be interested in developing an agent for Hearthstone?

~~~
ralfd
Already done, sort of. Blizzard was not amused, as botting makes the game less
fun, so it wasn't released.

[https://elie.net/blog/hearthstone/i-am-a-legend-hacking-
hear...](https://elie.net/blog/hearthstone/i-am-a-legend-hacking-hearthstone-
with-machine-learning-defcon-talk-wrap-up)

------
KaoruAoiShiho
Did you think the AI would win despite the adversarial draft?

~~~
backpropaganda
I'm curious to know if the AI can win with a 50-50 draft. It can be argued
that the main reason the humans lost was because they didn't have a feel for
the meta of this 18-hero version of Dota, while the bots did. What if you give
the humans an equal chance at the game wrt the draft? That would have made for
a much more interesting game than having the humans play a game they've never
played before in their lives.

~~~
KaoruAoiShiho
It seems likely that actual pros instead of a mishmash of commentators and
community stars would be able to win game 2 even with the draft favoring the
AI.

~~~
ufo
For what it's worth, those commentators are actually very good players. Every
year they team up in the qualifiers for the world championship and they
usually make it very far.

~~~
iotb
I'm an active player with enough hours logged to interpret the game. That
being said, the observations should also be obvious to even the most novice
player. I don't need a commercialized and obviously favorable and biased
stream of commentary clouding a much more sound analytical capability. I'm
pretty sure they're not going to invite commentators that point out the obvious
reality as to what is going on. Per the commentators, bot like behavior gets
extra positive humanized characterization and is marked as intelligent.
Whereas snowball-cheese is known as the lowest tier and least intelligent
strategy in the game. There's even a set of memes in the community for people
who attempt such strategies.

People need to start doing more critical analysis of their own and stop
relying on commercialized and biased information and commentary when settling
on a viewpoint. Watching others play only gets you so far in understanding.
When you play and you see what I'm saying for yourself, you can skim through
the provided clips and understand exactly what's going on.

The fact that this continues to get hyped up vs someone stating what's going
on is plain sad.

------
_0ffh
I think the third game would probably have been a lot more tense, if they'd
let the OAI5 make the last one or two picks by itself.

~~~
gpm
I bet they weren't quite sure what they had to do to make the game even. The
previous rounds were hugely in favor of the ai and it sounds like they don't
have much experience playing against humans at this level.

------
tehsauce
For a research organization, OpenAI sure seems to be putting a lot of effort
into PR. Beating players using a subset of the game is impressive but there is
no mention whatsoever of the fact that only a subset of the game is allowed.
The research is cool by itself; why do they feel the need to promote so
heavily that the AI is beating human players "in the first 17 minutes of"
rather than giving a realistic picture of their progress? This is how you
create an AI winter.

------
atx7
I like how their priority is not removing the significant restrictions of
couriers, items and heroes, rather playing a team of professionals on a
"custom game" based on dota heroes

~~~
apeace
If you watch the match or read their blog, you’ll see they frequently say
removing those restrictions is their top priority. They simply wanted to see
what level the AI is already at by playing top-level humans for the first
time. Hence the name “benchmark”. In fact, they have been removing
restrictions at a fairly rapid pace already.

