
More on Dota 2 - darwhy
https://blog.openai.com/more-on-dota-2/
======
lvoudour
I know it has been mentioned a lot the past few days, but since the articles
keep flowing about it I'll mention it again:

It's a great feat and kudos to the openai team, but it is VERY unfair for the
human players who rely on a sensory interface vs a direct API connection.
That's unlike chess or go where the interface isn't important. The really
impressive feat will be an AI that uses the same sensory information to make
decisions (and I really hope that's where the openai will head next)

~~~
BaronSamedi
I completely disagree. An API is software's natural input mechanism just like
the senses are a human's natural input mechanism. Having the AI use human
senses unfairly handicaps it. More importantly, however, this is not the key
problem.

The key problem is teaching the AI strategy and tactics. What heroes to pick?
Where to lane them? When to rotate? What items to buy? What spells to level
up? What enemies to target with which spells and in which order? These are the
hard problems and they are very hard indeed. A 5v5 AI will have to become
expert at risk calculation, Pareto optimization, basic military principles,
and many more things. Compared to these problems, the choice of input
mechanism is trivial.

~~~
_dps
I'm almost entirely ignorant of DOTA 2, but the coverage suggests to me that
it is winning mainly through what I would call "micro" excellence (firing at
just the right range, careful maneuvering, and so on). It's clear that this
kind of precision control is greatly aided by having direct access to the
state of the game, rather than having to estimate it in real time from
vision data.

So, to me, based purely on the news coverage, it's not clear that it has
learned anything like "superhuman" levels of strategy. We already know that
computers have superhuman reaction times and precision calculation abilities,
so it seems to me the interesting question is whether an advantage would
remain after factoring those out.

~~~
skgoa
As someone who plays dota, this bot really isn't that impressive. The hero
they chose is seen as a very difficult one to master for humans, precisely
because judging distances, current life, current mana, damage etc. is so
difficult to do on the fly and even if you can keep track of all of that in
your head, you need extremely precise inputs to outplay your opponent. Yet by
going through the API, they handed all of that to their bot on a silver
platter. They pretty much let the bot sidestep the core challenge of this
hero, while it was kept in place for the human players.

On top of that they also reduced the complexity of the game quite
significantly by limiting items etc., which further reduced what humans could
do against the bot. Even then, the bot utterly failed once humans were allowed
to use a tiny bit of creativity.

And that's not even taking into account that this was not even close to the
complexity of a real dota match. The big challenge in dota is in the decision
making with incomplete information and in coordinating 5 people with only
voice and the ability to ping the mini map, in a giant "search space" created
by hundreds of different heroes, items and game mechanic interactions.

~~~
mabramo
In terms of Dota, I don't think it's that impressive, but it is still cool.
I'll be impressed when I see a 5v5 with bots that adapt to the opponents'
strategy. I do think that it currently could be an excellent tool for mid-
laners and cores to practice. OpenAI is also blowing the door open to Dota AI
development and we will soon see bot tournaments. Engineers will develop AI
and put them against one another in standard 5v5 matches with a pick/ban phase
and all.

In terms of AI, I don't /think/ there's anything groundbreaking here. Correct
me if I'm wrong, as I don't follow AI research, but this technology is nothing
we haven't already seen. I believe the development of AI for Dota is a
publicity move to get people excited about what AI could be for humanity. This
might be the way to introduce AI to non-technologists and get people excited
about it.

~~~
quiteawhile
> This might be the way to introduce AI to non-technologists and get people
> excited about it.

Yeah, I'm almost positive that this exhibition is intended mostly to raise
awareness and create this hype. Go and chess, for most people, are simple
games compared to Dota 2, so if Elon is worried about AI and wants people to
be more aware of the threat he perceives, it certainly helps to make this big
show and get all those impressions with a game that is considered by the
majority of people (especially younger ones) to be more complex/harder than
what has been done before.

------
grogenaut
I agree that it's a bit of an over-hype tactic in a toy situation, which might
be marketing to make people think the technique and bot are more capable
than they are. But I think people are overly down on it too. It's a demonstration
of a technique in a way the general public (and gamers) will understand.
OpenAI isn't going to make money off of building game bots... people wouldn't
watch. The human drama is a major ingredient in e-sports.

But we shouldn't be down on this when we were going gaga over a Lego sorter
done the same way a month or so back.

It looks like for some things we can almost have a plug-and-play AI solution.
E.g., as we are seeing with image classifiers, this doesn't take years of
doctoral research and game theory to build up a world-class bot, which is what
everyone used to do. This is moving some of these techniques into the "get
data set, get hardware, download library, train" plug-and-play type of
solution we're seeing more and more in other areas like image classification.
I.e., stuff anyone with a few years of experience can do, maybe not amazingly,
but better than they could by hand-coding the solution. The problem becomes
one of gathering good training sets or building an accurate simulation to
train in.

This means, I think, that you'll see way more of these types of ai solutions
where people would have balked at a hand coded solution before. This in turn
looks a lot like mobile's change to computing where things that were annoying
to do on your home pc became different just because you had a camera + gps +
computer + radio in your pocket.

I know my company has started using classifiers a lot more for things like
flagging bad user actions, instead of coding up huge rules engines. We may
not be as effective as several engineers with deep domain expertise writing
rules and doing data analysis, but instead we have 1 engineer per problem
space being about 70% as effective, which is still a huge win over not solving
the problems at all.

The funny thing is that this bot actually pulled off the stereotypical
Hollywood training montage: with just a few weeks of hard work it beat the
best in the world. Just get some sweet rock music in there and you've got it
all.

------
bdz
Props for the $12k donation to OpenDota. That's really awesome! Tho I
personally always preferred Dotabuff.

~~~
literallycancer
The features are slightly different so people usually use both.

------
rtpg
Is there a good self-contained example of how people set up learning in ways
where "the AI doesn't initially know the rules"?

I've heard this many times and conceptually I get the principle, but I have a
hard time understanding how you create a legitimate starting position or
measurement mechanism beyond "losing/winning".

~~~
sanxiyn
The linked article says "The bot received incentives for winning and basic
metrics like health and last hits". So apart from losing/winning, losing
health is bad, last hitting is good. You could add more, but apparently that's
all OpenAI used.
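A reward shaped along those lines can be sketched like this (the weights and
state fields here are hypothetical illustrations, not OpenAI's actual values):

```python
# Sketch of a shaped per-timestep reward, assuming a hypothetical `state`
# dict exposing the metrics the article mentions. Weights are made up for
# illustration; OpenAI's actual reward terms and values aren't public.
def shaped_reward(prev, curr):
    r = 0.0
    r += 1.0 * (curr["last_hits"] - prev["last_hits"])   # last hitting is good
    r += 0.01 * (curr["health"] - prev["health"])        # losing health is bad
    if curr["game_over"]:
        r += 10.0 if curr["won"] else -10.0              # winning dominates
    return r

prev = {"last_hits": 4, "health": 520, "game_over": False, "won": False}
curr = {"last_hits": 5, "health": 480, "game_over": False, "won": False}
print(shaped_reward(prev, curr))  # 1 last hit - 40 health lost -> 0.6
```

The point of the extra terms is to give the learner a dense signal long before
the sparse win/loss outcome arrives.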

~~~
mannykannot
It also says "We also separately trained the initial creep block using
traditional RL techniques." I have no idea how significant that is, but it
seems to be getting a fair amount of attention.

~~~
gcp
It's a highly specific procedure that happens before there is interaction with
the opponent, so without "handing over" the understanding that having creeps
on your high-ground is good, it's very hard for the learning to see through
the noise and discover this.

~~~
damnfine
Exactly why this is not impressive to me. The point is to be able to learn the
rules, but all I see is that not only some of the rules, but also the actions,
are already prespecified in many cases. Yes it's hard, and that's why humans
still rule the roost.

~~~
Twirrim
Surely all this RL process did was speed up what the computer would have
learned, by a large stretch? The "cost" factors they chose would have hit them
ultimately, regardless of whether the bot stood still in the base or wandered
off elsewhere.

~~~
mannykannot
My guess is that in general, many complex strategies are effectively
unreachable without something like analysis, on account of intermediate states
being disfavored, leading to algorithms being trapped by local minima in the
cost function (I don't know whether that would be an issue for this game,
specifically.)

------
kyberias
The classic reinforcement learning -based AI (from 1992) that beats humans
(maybe not top players though) in Backgammon:
[https://en.wikipedia.org/wiki/TD-Gammon](https://en.wikipedia.org/wiki/TD-
Gammon)

------
omarforgotpwd
I’m by no means an expert, but I’m fascinated by the idea that a neural net
playing against itself can substantially outperform a supervised learning
approach with a large training data set. I mean, gathering training data and
making sure it’s labeled correctly and all that is a huge hassle so if you
could eliminate that step or even reduce the amount or quality of training
data required that should be a big win for AI, right? Especially if doing this
not only makes things easier but also improves the performance of the model.

~~~
NamTaf
Reinforcement learning isn't a new idea - I did a Berkeley-based edX course on
it a few years ago now, and it was not state-of-the-art even then to my
knowledge. That had no deep-learning aspect to it; we just built a
reinforcement algorithm that utilised a good measure of performance
(specifically, it was Pacman, and the score value is pretty good for that) and
changed a few algorithm weighting variables at each iteration.

My understanding, from talking to a ML friend this morning, is that the latest
progress is taking reinforcement learning and applying deep learning
approaches (nets, etc.) to it. The key becomes finding the right scoring
algorithms to tweak the neural net correctly towards the desired outcome.

The self-play really is the reinforcement side of things at work. How you take
that 'score' and use it to correctly modify the input weightings - be they in
a neural net, traditional algorithm, etc. - is the key.
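As a minimal sketch of that loop - a score signal nudging value estimates
toward better play - here is textbook tabular Q-learning on a toy corridor
task (a generic illustration, nothing specific to Pacman or the Dota bot):

```python
import random

# Tabular Q-learning on a toy corridor: states 0..4, goal at state 4.
# Actions: 0 = left, 1 = right. Reward 1.0 only on reaching the goal.
random.seed(0)
N, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N)]     # Q[state][action] value table
alpha, gamma, eps = 0.5, 0.9, 0.1      # learning rate, discount, exploration

for _ in range(500):                   # episodes
    s = 0
    while s != GOAL:
        if random.random() < eps:
            a = random.randrange(2)                      # explore
        else:
            a = max((1, 0), key=lambda x: Q[s][x])       # greedy (ties -> right)
        s2 = max(0, s - 1) if a == 0 else min(N - 1, s + 1)
        r = 1.0 if s2 == GOAL else 0.0
        # Core update: move Q[s][a] toward reward + discounted best future value
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# After training, the greedy policy is "go right" in every pre-goal state
print([max((1, 0), key=lambda a: Q[s][a]) for s in range(GOAL)])  # [1, 1, 1, 1]
```

Deep RL replaces the `Q` table with a neural net, but the score-driven update
is the same idea.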

~~~
noway421
Why wouldn't the algorithm reach a local maximum when playing against itself,
or even degrade over time by opening itself up to unknown attacks?

~~~
gcp
One typically keeps pools of trained networks to combat this.
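The pool idea can be sketched like this (a generic self-play pattern, not
OpenAI's actual code): periodically snapshot the current agent and sample
training opponents from past snapshots, so the learner can't overfit to a
single latest rival:

```python
import random

# Generic self-play opponent pool: snapshot the learner periodically and
# sample opponents from past versions so training doesn't chase one rival.
random.seed(0)

class OpponentPool:
    def __init__(self, max_size=10):
        self.snapshots = []
        self.max_size = max_size

    def add(self, params):
        self.snapshots.append(params)
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)          # drop the oldest snapshot

    def sample(self):
        # Bias toward recent versions, but keep older ones reachable
        weights = [i + 1 for i in range(len(self.snapshots))]
        return random.choices(self.snapshots, weights=weights)[0]

pool = OpponentPool()
for version in range(5):
    pool.add({"version": version})         # stand-in for network weights
opponent = pool.sample()
print(opponent["version"] in range(5))     # True
```

Playing against a mixture of past selves keeps strategies that beat old
versions from being forgotten.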

------
debacle
I find the mechanism for learning + the timeline far more impressive than what
they accomplished. A series of 5 bots that can consistently compete with 5
humans at the 4.5k+ level would be a very impressive display of AI training.

~~~
distances
What I'd like to see is them implementing AI for a strategy game that benefits
from an overall vision, such as Civilization.

And then selling the AI to Firaxis.

Yes, Civ 6 AI still isn't anywhere close to what it should be.

------
jsnell
Dupe, more discussion at
[https://news.ycombinator.com/item?id=15031470](https://news.ycombinator.com/item?id=15031470)

~~~
aaron695
Unfortunately the HN title there, which still isn't correct at the time of
writing this, destroyed a proper conversation.

(Assuming the original article didn't fix their title)

------
hfsktr
I loved the different ideas to throw the bot off like pulling creeps in a way
that would not work for a human. Slacks' courier strategy was entertaining as
well.

I don't know enough (no more than a layperson) about AI to have any meaningful
comment there. Do they need to train the bot on every hero the same way or
does it only need to relearn the hero specifics (and not items/strategies)?

~~~
hfsktr
Another thing I noticed: others talked about it using the API etc., so that
means you can't visually trick it by stopping an attack mid-animation like
you can with human players, right?

------
tahw
The version they used for TI had a variety of rules that completely changed
the metagame of 1v1 (no bottle, for example). Even ignoring the obvious API
advantage, the match was unfair because the Pros had never trained under the
constraints that the AI team brought.

~~~
jdoliner
They played under standard 1v1 tournament rules:
[http://wiki.teamliquid.net/dota2/Dota_2_Asia_Championships/2...](http://wiki.teamliquid.net/dota2/Dota_2_Asia_Championships/2017/Solo_Tournament)

------
taormina
That's a good hero to lead off with. Soulstealer (that's the HoN name; I'm
blanking on the original Dota name) is a hero with very basic mechanics. Or
rather, there's a ranged AoE ability and an ultimate that boils down to "stand
in the middle and hit the button", but the rest of the mechanics boil down to
"last hit lane creeps well", which is a huge Dota 2 game mechanic. And this
hero does better as they succeed at last hitting.

~~~
yurrzz
Nevermore, the Shadow Fiend is the original Dota name.

------
Aron
I believe biological explanations might account partially for the openAI bot
outperforming human players.

------
CoffeeBob
Does anyone else have a problem with the line, "the graph is surprisingly
linear, meaning the team improved the bot exponentially over time"?

~~~
pycal
I think what they're pointing to is that the Elo-style "TrueSkill" rating
system Dota uses is log-scaled. To your point, I'm not sure that means player
skill improves along the distribution, but I think it does mean their
probability of winning increases exponentially.
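For reference, under the standard Elo model (the generic formula, not Valve's
exact matchmaking model) the expected win probability is logistic in the
rating gap, so a linear climb in rating means multiplicative growth in the
odds of winning:

```python
# Standard Elo expected score: logistic in the rating difference.
def win_prob(r_a, r_b):
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

# Each +400 of rating gap multiplies the odds of winning by 10.
for gap in (0, 400, 800):
    p = win_prob(1000 + gap, 1000)
    print(gap, round(p, 3), round(p / (1 - p), 1))  # odds grow 10x per +400
```

That is one reading of "linear rating growth = exponential improvement".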

------
Havoc
Really cool writeup. Enjoyed that thoroughly

------
misticdeveloper
So it appears their bot isn't cheating with vision, and has its action speed
capped to human levels. Interesting!

Sad that they had whitelisted item builds. I thought the whole point of a
machine learning bot was that it was supposed to learn these things itself.

5v5 full game is way more complex than Starcraft. Hope OpenAI are ready.

------
juskrey
AI players that use internal game calls could beat humans from the beginning
of gaming history.

------
kensai
It is amazing what Peter Thiel's OpenAI does! Congrats to his genius.

~~~
kensai
I sincerely don't understand the downvoting. It's Peter Thiel's as much as it
is Elon Musk's OpenAI.

