
How we built the unbeatable Dota 2 bot - sillysaurus3
https://blog.openai.com/more-on-dota-2/
======
crsv
By "unbeatable" they mean it lasted a few hours in the hands of the public
before its hardcoded behaviors were exploited and it was defeated
consistently. It turns out the hardcoded creep-blocking mechanic (a more
"creative" play style, but a fundamental element of lane control) is what
allowed the exploit. The bot did not teach itself to control the flow of
creeps to gain a laning advantage over its opponent.

Not to diminish the team's achievement. This was awesome. But again, it's just
more over-hyping of AI that should be called out for what it really is.

~~~
sambe
They neither call it unbeatable nor mention hardcoding anything. The
"unbeatable" is a quote from a professional tester, and they discuss the
losses, acknowledging the need for further improvements for more general play.

~~~
kevinwang
Looks like they did hardcode the creep-blocking behavior. Well, they hardcoded
a reward for creep blocking and trained the bot specifically to learn
behavior that maximizes that reward.
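In RL terms, that's reward shaping: adding a hand-designed bonus to the
environment's sparse reward so the desired behavior is easier to discover. A
toy sketch of the idea (the function name, the "creep delay" signal, and the
weight are all invented for illustration; OpenAI hasn't published their actual
reward function):

```python
def shaped_reward(env_reward, creep_delay_seconds, weight=0.5):
    """Combine the game's own reward signal with a hand-designed
    creep-blocking bonus: the longer the agent delays the creep
    wave, the larger the extra reward it receives."""
    return env_reward + weight * creep_delay_seconds
```

The agent then optimizes the shaped sum, so "learning to creep block" falls
out of maximizing a term a human chose to put there.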

~~~
visarga
... and used curriculum learning.

But that is not unfair. Humans receive plenty of curriculum training as well,
we're not supposed to figure out the world by bumping into walls. Even in
Dota2, the top players learned from observing each other how to deal with the
bot. In fact, efficient retraining to include new strategies on the spot would
be a very human-like learning ability.
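For what it's worth, "curriculum learning" here just means ordering training
tasks from easy to hard and advancing only once the current stage is mastered,
rather than sampling tasks uniformly. A toy sketch (the agent, tasks, and
threshold are invented for illustration):

```python
class DummyAgent:
    """Stand-in agent whose skill on each task improves with practice."""
    def __init__(self):
        self.skill = {}

    def evaluate(self, task):
        return self.skill.get(task, 0.0)

    def train_on(self, task):
        self.skill[task] = self.skill.get(task, 0.0) + 0.2


def train_with_curriculum(agent, tasks, success_threshold=0.8):
    """Train on each task in difficulty order, only advancing once
    the agent clears the current stage."""
    for task in tasks:  # assumed sorted from easy to hard
        while agent.evaluate(task) < success_threshold:
            agent.train_on(task)


agent = DummyAgent()
train_with_curriculum(agent, ["last-hitting", "creep blocking", "full 1v1"])
```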

~~~
DeepRote
Yeah, but humans can more easily make macro-decisions based on micro
situations. It's easier for us to look at a map and figure out which side is
winning; sometimes that's really hard for a computer to do.

I'm not a Dota2 player, but SC2, for example, is a game with LOTS of room
for AI improvement. I've always thought that some sort of APM limit
might actually encourage AI authors to adopt new and unique approaches to
macro-strats, but it doesn't seem to be on the horizon.
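An APM limit like that is easy to specify precisely: allow at most N actions
in any rolling 60-second window. A minimal sketch of such a limiter (the
interface is invented for illustration):

```python
from collections import deque

class APMLimiter:
    """Refuse actions once the agent has issued `max_apm` actions
    within the last 60 seconds (rolling window)."""
    def __init__(self, max_apm):
        self.max_apm = max_apm
        self.timestamps = deque()

    def try_act(self, now):
        # Drop timestamps that have fallen out of the 60-second window.
        while self.timestamps and now - self.timestamps[0] >= 60.0:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_apm:
            return False  # over budget: action refused
        self.timestamps.append(now)
        return True
```

A bot playing under such a cap would have to spend its limited action budget
where it matters, which is roughly the constraint humans already play under.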

When it comes to doing a small thing rapidly, I think bots are almost always
going to win.

When it comes to doing something large-scale with finesse, I think humans are
going to have an advantage for a LONG time.

I think part of what makes human agents so effective at certain tasks,
especially when up against another human, is that we can evaluate an event
and better understand the WHY of it relative to the player that played it.

If I see a player pull back a bit, sometimes I think to myself that maybe they
saw something they weren't expecting, or something they weren't quite sure how
to handle. When a computer sees the same move, a floating-point number among
millions changes slightly. I can try to figure out why they might be pulling
back: if I did something weird, or if I did something totally normal I might
suspect it's bait, etc. I can think all these things in a short period of
time, and while a large AI might have more FLOPS than me, it doesn't
understand what I'm doing, why I do it, etc.

My contention, I guess, is that curriculum learning isn't as effective for
bots as it is for humans.

Fair/unfair is a pointless observation when it comes to humans vs bots. The
diversity of human-based problem solving is the perfect friction to train AIs
against, imo.

------
_dps
This is a highly editorialized title that clearly doesn't follow the posting
guidelines:

> Otherwise please use the original title, unless it is misleading or
> linkbait.

> Please don't do things to make titles stand out ...

(the original title is the much more sedate "More on DOTA 2"; in case it is
changed by moderators, at the time of this comment it says "How We Built the
Unbeatable DOTA 2 bot", which is not even a quotation from the article)

The submitted title has certainly shaped the commentary (see people reacting
to "unbeatable", and whether it's an impressive achievement, vs reacting to
the technical content).

So, in a nutshell, I think we should avoid titles like this, and I believe the
guidelines transparently say so as well.

~~~
sclangdon
To be fair, "More on DOTA 2" is completely meaningless out of context. The
article could literally be about anything (in the DOTA 2 universe).

So whilst the new edited title is a bit click-baity, it is nevertheless more
meaningful than the original.

~~~
Ntrails
>So whilst the new edited title is a bit click-baity, it is nevertheless more
meaningful than the original.

It's also a lie, given that the bot was beaten multiple times by plenty of
different people.

------
flipgimble
This is an unquestionable technical achievement, but it's important to keep in
mind that the AI's view of the world is through the bot API, whereas a human
has to decode game state by observing the monitor. Not being an avid
DOTA2 player, I marvel at skilled players who can follow the game based on
what looks like a very busy and colorful mess to me.

At the same time most of the deep learning field has been focusing on
implementing super-human perceptual abilities (vision, hearing, translation)
that come instinctually to humans. The higher level reasoning and
memory/attention augmented machine learning is still cutting edge research. I
think DeepMind and OpenAI are driving research towards that end.

------
iMarv
As a developer I am certainly amazed by the capabilities of the bot, but as a
Dota player I am a bit more skeptical. The bot only wins in a certain scenario:
Shadow Fiend vs Shadow Fiend, no bottle, no runes, no jungle creeps, and so on.
This cuts down the possible playstyles a lot. This is stuff the pros
are used to having available and training with. The bot was trained without
it and has a fitting playstyle. For this particular type of 1v1, a pro would
have to adapt. I still think that if you give them enough time, they will
figure it out.

~~~
ryzawy
These are _exactly_ the rules of Dota 2 1v1. This scenario was not
specifically built for the bot. See here:
[http://wiki.teamliquid.net/dota2/Dota_2_Asia_Championships/2...](http://wiki.teamliquid.net/dota2/Dota_2_Asia_Championships/2017/Solo_Tournament)

~~~
iMarv
That is correct. What I was trying to say is that professional players'
playstyles aren't built around those rules. It takes them out of their
comfort zone and they have to adapt. I would love to see the bot matched up
against a high-MMR player who plays matches under these rules on a regular
basis.

~~~
ryzawy
While you are obviously right that it's not a "true" match, I have to disagree
that it takes them out of their comfort zone. Arteezy (who plays one of the
best Shadow Fiends) and Sumail in particular are extremely good mid players,
and both have actually participated in the solo 1v1 tournament I linked.
There's even an official 1v1 game mode in-game; I would argue that they both
play these matches "on a regular basis".

~~~
criloz2
I love the bot, but not, those matched are not played in regular basis, even a
great part of the community is not happy with tournaments when they include
those modes, the best example is the TI, the first version have a small
tournament to see which was the best mid player, it was not very popular, and
was removed.

------
sillysaurus3
This is making waves in the Dota 2 community over at
[https://www.reddit.com/r/DotA2/comments/6u2xvm/more_info_on_...](https://www.reddit.com/r/DotA2/comments/6u2xvm/more_info_on_the_openai_bot/)

My favorite part:

 _The first step in the project was figuring out how to run Dota 2 in the
cloud on a physical GPU. The game gave an obscure error message on GPU cloud
instances. But when starting it on Greg’s personal GPU desktop (which is the
desktop brought onstage during the show), we noticed that Dota booted when the
monitor was plugged in, but gave the same error message when unplugged. So we
configured our cloud GPU instances to pretend there was a physical monitor
attached.

Dota didn’t support custom dedicated servers at the time, meaning that running
scalably and without a GPU was possible only with very slow software
rendering. We then created a shim to stub out most OpenGL calls, except the
ones needed to boot._

~~~
gcp
These kinds of things are why board games are the preferred initial testbed
for a lot of this stuff, though :-)

------
itchyjunk
If you watch the vids, read the blog posts, then rewatch and reread them, you
end up with more and more questions. The bot learned to raze from the
fog of war based on playing a pro once or twice? It changed its build based on
losing once? (Sure, you say you whitelisted items, but it still had choices,
no?) Wand not being a common opening is also not true; it's common AGAINST
heavy casters. (I won't rush wand AS a Batrider, but I would AGAINST him. I
would AGAINST Shadow Fiend as well.) Was the animation canceling also taught
similarly to the creep block? Independent RL?

So many things don't fit... but that might be because I am neither good at
DOTA 2 nor at AI stuff. Good luck with the 5v5s, OpenAI. And if the AI bots
become public, maybe I'll get to play against them some day.

~~~
systoll
Perfect animation cancelling comes with the DOTA 2 API.

If you tell the API to make you attack a thing once, you're 'attacking' until
the projectile is created, and then you're 'idle' immediately afterward.

So long as you're feeding the hero actions, it'll always animation cancel.
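In other words, the bot gets perfect cancels for free just by re-issuing
orders every tick. A toy model of that loop (the state names are invented for
illustration; this is not the real Dota 2 bot API):

```python
def control_loop(hero, ticks):
    """Re-issue the attack order every tick. Because the API marks the
    hero idle as soon as the projectile spawns, the backswing animation
    is effectively always canceled. Returns the number of attacks."""
    attacks = 0
    for _ in range(ticks):
        if hero["state"] == "idle":
            hero["state"] = "attacking"  # order accepted
        elif hero["state"] == "attacking":
            attacks += 1                 # projectile created this tick
            hero["state"] = "idle"       # idle immediately: no backswing
    return attacks
```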

------
murukesh_s
To those who think that AI is being overly hyped and that demos like these are
not smart enough: think about it. Today's AI is smart enough, IMO, since the
potential real-world applications of AI are already vast; it just needs some
practical, polished implementations. Did the Dota 2 bot learn the game from
scratch by learning the rules all by itself? No, as is evident from the post
itself: "small amounts of 'coaching' with self-play".

But think about it: it need not be that smart to replace humans. IMO, many of
our jobs do not need very complicated self-learning. In real-world day-to-day
jobs like agriculture and factories, and to a great extent what we call
skilled labour like driving, management, etc., the work is often repetitive.
AI is already better than humans at recognising speech and vision, and to some
extent languages. Combining these skills with mechanical robots in a
meaningful way can easily replace what we, as humans, can do, except in niche
fields; even there, computers can offload our jobs in several ways. Robots/AI
can do it much, much faster and much more efficiently than us, and it will
only keep getting better with overall advancement in research and the amount
of data available.

So it's kind of interesting, and equally scary, that we are looking at a
potential future, possibly a near one, where most of the work can be handled
by a trained robot with humans only supervising in exceptional cases. Robots
would then only need to be smart enough to alert their human supervisors when
they reach a state where they cannot perform the job normally, e.g. the power
grid going down in a factory and the robots not being able to figure out what
to do; that might only be a 0.001% chance. The point is that AI need not get
as good as humans at learning complex scenarios to replace the majority of
us; it just needs to do 99% of what we repeatedly do, much better than us.

------
kevinr
Standard---maybe not objection, but complication: I'm waiting for the day AI
can do this on 2,100 kWh. (The human brain uses ~15 W continuous; 15 W * 24 h
* 365 days * 16 years.)
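The arithmetic behind that figure, for anyone checking:

```python
BRAIN_WATTS = 15          # rough continuous power draw of a human brain
HOURS_PER_YEAR = 24 * 365
YEARS = 16                # roughly a pro's "training time" since birth

# watt-hours -> kilowatt-hours
lifetime_kwh = BRAIN_WATTS * HOURS_PER_YEAR * YEARS / 1000
print(lifetime_kwh)
```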

------
simpx
I'm curious: how did the bot learn to creep block?

How does the bot understand the value of long-term strategy?

~~~
sockmeistr
It doesn't understand the value; from the article: "We also separately trained
the initial creep block using traditional RL techniques".

------
isseu
I was expecting more details about the architecture...

