
AI bots trained for 180 years a day to beat humans at Dota 2 - f3f3_
https://www.theverge.com/2018/6/25/17492918/openai-dota-2-bot-ai-five-5v5-matches
======
TulliusCicero
Somewhat misleading title: they're playing a restricted version of Dota 2 (only
5 heroes available instead of 100+, some mechanics turned off), and some
important decisions are handled by hard-coded rules rather than the main AI.

------
protonimitate
> On one hand, this is a testament to the power of contemporary machine
> learning methods and the latest computer chips to process vast amounts of
> data. On the other, it’s a reminder of how fundamentally unintelligent AI
> agents are.

Forgive any ignorance, but how do changes to the game affect AI behavior? E.g.
new characters, new maps, balance changes, bug fixes, etc. Does the AI have to
be re-trained for a similar amount of time, or is it expedited? How adaptive
are the AI systems?

What I'm getting at is: in a real-world scenario (like the city transportation
example from the article), how quickly can the AI respond to unexpected
changes (natural disasters, accidents, etc.)? Will it be able to handle the
unexpected in a safe time frame?

~~~
relenzo
A reinforcement learning system would have to be re-trained for changes like
that, but not from scratch. Depending on the size of the change, adapting
would take much less time than the original training.

Several notes:

-The article notes that a lot of the hard parts of human video-game playing have been done 'for' the AI by hardcoding. It doesn't actually have to look at the screen to parse information--it has direct access to game variables that a player would have to access through menus. More relevant to your question, a lot of the strategic decisions like item purchasing, ward placement, and probably character placement were just picked by the programmers or set to be ignored. The whole thing is less impressive than the headline sounds; I think it was mostly learning where to run on the map and when to attack things.

-RL is based on deep learning, and there are fundamental issues with deep learning's ability to adapt to genuinely new scenarios. None of these systems can presently adapt to something genuinely unprecedented in a time frame you would consider 'safe'. They need at least several opportunities to observe how the world works after the changes and the consequences of their actions. To try to make this concrete--they can't reason about what implications a flood/power outage/landslide has for their traffic management. They can only learn from _trial and error_ <-(the important part) how traffic behaves during a disaster.
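
To make the trial-and-error point concrete, here is a toy sketch (my own example, nothing to do with OpenAI's actual setup): tabular Q-learning on an 8-state corridor. When the "world" changes (the goal moves), the agent keeps its learned table and re-trains rather than starting over--but it still has to stumble into the changed world by acting in it before its policy is any good.

```python
import random

# Toy illustration (mine, not OpenAI's): tabular Q-learning on an 8-state
# corridor. The agent starts at state 0 and is rewarded only on reaching
# the goal state -- it learns purely by trial and error.

N = 8                          # corridor states 0..7
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

def step(state, action, goal):
    """Move left (0) or right (1); reward 1.0 only upon reaching the goal."""
    nxt = max(0, min(N - 1, state + (1 if action else -1)))
    done = nxt == goal
    return nxt, (1.0 if done else 0.0), done

def train(q, goal, episodes):
    for _ in range(episodes):
        s = 0
        for _ in range(4 * N):                    # per-episode step limit
            if random.random() < EPS:             # explore
                a = random.randrange(2)
            else:                                 # exploit current table
                a = int(q[s][1] >= q[s][0])
            s2, r, done = step(s, a, goal)
            target = r if done else r + GAMMA * max(q[s2])
            q[s][a] += ALPHA * (target - q[s][a])
            s = s2
            if done:
                break
    return q

random.seed(0)
q = train([[0.0, 0.0] for _ in range(N)], goal=N - 1, episodes=300)
policy = [int(q[s][1] >= q[s][0]) for s in range(N)]   # 1 = move right

# Now "the world changes": the goal moves to state 0. We keep the learned
# table and re-train -- faster than from scratch, but the agent must still
# experience the new world by trial and error before it can exploit it.
q = train(q, goal=0, episodes=300)
```

The point of the sketch is the last two lines: nothing in the table "reasons" about the goal having moved; the old values only get corrected as the agent acts and observes the new consequences.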

~~~
xapata
> RL is based on deep learning

False. Reinforcement Learning is a category of algorithm. There are many
possible implementations, not only "deep learning". Further, you might be
surprised at how well a generalized, trained model might mimic "reasoning".
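
For what it's worth, here is a minimal RL example with no deep learning anywhere: an epsilon-greedy bandit that learns which of three arms pays best using nothing but running averages (the payout probabilities are made up for the demo).

```python
import random

# Epsilon-greedy multi-armed bandit: a classic reinforcement learning
# method implemented with plain lists -- no neural network involved.
# The payout probabilities below are invented for this demo.

random.seed(1)
PAYOUTS = [0.2, 0.5, 0.8]        # hidden win probability of each arm
values = [0.0, 0.0, 0.0]         # running estimate of each arm's value
counts = [0, 0, 0]

for t in range(2000):
    if random.random() < 0.1:                    # explore
        arm = random.randrange(3)
    else:                                        # exploit best estimate
        arm = values.index(max(values))
    reward = 1.0 if random.random() < PAYOUTS[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean

print(values.index(max(values)))   # index of the arm it judges best
```

Whether this counts as "mimicking reasoning" is debatable, but it is unambiguously reinforcement learning without deep learning.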

------
swarnie_
A great achievement, even for a relatively simple game like a MOBA. I'm still
waiting for AI to become competitive in Starcraft, which I think is a much
greater challenge.

~~~
izzydata
An impossible-to-beat Starcraft AI is already doable without all this AI
training, just by controlling every individual unit separately in
impossible-to-counter maneuvers far beyond any human's actions per minute.

You could basically get an AI to win every single time with a handful of
zerglings in a few minutes this way.

You could train an AI for 1 billion years a day to play pong or you could
hardcode it to never miss the ball.
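
The hardcoded Pong version really is about five lines (a sketch, assuming the paddle can move at least as fast as the ball drifts vertically):

```python
# Hardcoded "perfect" Pong paddle: no training, just track the ball's
# vertical position, capped at the paddle's maximum speed. Assuming the
# paddle is at least as fast as the ball's vertical drift, it never misses.

def paddle_move(paddle_y, ball_y, max_speed=5):
    """Return the paddle's next y: step toward the ball, capped at max_speed."""
    delta = ball_y - paddle_y
    return paddle_y + max(-max_speed, min(max_speed, delta))

# The paddle converges onto the ball's line and stays there:
y = 0
while y != 40:
    y = paddle_move(y, 40)
```

Compare that to the compute bill for learning the same behavior from pixels.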

~~~
brianwawok
Are you sure? I.e. has this been proven against pro-level SC players?

I think a smart player could tell the AI was going Zerg, wall in, and make
some decisions that render the Zerg micro less useful.

~~~
izzydata
Mostly speculation and observation from many hours of Starcraft. I don't think
anyone has really set out to create an AI whose only goal is to never lose. I
can imagine a program that takes one unit and keeps it 1 pixel beyond the
maximum range from which any enemy can shoot it, unless for some reason it
knows with 100% certainty it will win an engagement.
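
That kiting rule can be sketched in a few lines (pure speculation, like the comment itself--the attack range and step size are invented numbers):

```python
import math

# Speculative kiting rule: each frame, if any enemy could hit the unit,
# step directly away from the nearest threat until the unit sits just
# outside that enemy's attack range. Range and margin are made up.

def kite(unit, enemies, attack_range=6.0, margin=0.1):
    """Return the unit's new (x, y), retreating from the nearest in-range enemy."""
    ux, uy = unit
    threat = None
    for ex, ey in enemies:
        d = math.hypot(ux - ex, uy - ey)
        if d <= attack_range and (threat is None or d < threat[0]):
            threat = (d, ex, ey)
    if threat is None:
        return unit                       # nobody can shoot us; hold position
    d, ex, ey = threat
    # Step away along the enemy->unit direction, out to range + margin.
    scale = (attack_range + margin) / d if d > 0 else 1.0
    return (ex + (ux - ex) * scale, ey + (uy - ey) * scale)

pos = kite((3.0, 0.0), [(0.0, 0.0)])
print(pos)   # now just outside the enemy's reach
```

A real implementation would also need pathing, multiple threats, and terrain--but nothing about the rule itself requires learning.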

I'm not aware of any public API or hooks for controlling Starcraft, or it
would probably have been done by now. It would also be used for cheating, so
they can't release one anyway.

------
sctb
Main discussion:
[https://news.ycombinator.com/item?id=17392455](https://news.ycombinator.com/item?id=17392455).

------
AstralStorm
Please wake me up when an AI can play a serious grand strategy game with
hundreds of long-term options, without a mostly hardcoded reward function.

