OpenAI at the Dota 2 World Championships (openai.com)
510 points by frisco on Aug 11, 2017 | 213 comments

Really was quite interesting to watch (https://www.twitch.tv/videos/166172514?t=7h3m10s). Honestly, the bot played _extremely_ well, but I think its biggest advantages were its much faster reaction time and movements that were likely far more precise than a human's with a mouse.

I'm pretty interested in seeing their 5v5 results as well. It seems like that will have similar results as the bots can coordinate, but it's still a bit up in the air.

I'm really not sure how well a bot like this would do with 4 human teammates though. I would guess the bot would be a strong laner, but fairly weak overall due to its inability to communicate. Though if it's really good at learning how its current teammates are playing as it goes, it may do alright.

I'm also pretty curious about the match limitations: no shrines (regen), no soul ring (mana regen at expense of some health), no raindrops (fixed amount of magic damage block)

"No soul ring" has been a traditional rule for 1v1 competitions since at least TI3 [1]. It doesn't necessarily make sense for Shadow Fiend play, but it can make some other hero matchups, like Bane and Omniknight, drag out forever.

The shrine is inactive for the first five minutes of a normal game, so it's a non-factor in most 1v1 matchups.

[1]: https://www.reddit.com/r/DotA2/comments/1ka4bp/q_why_was_sou...

Looks like Bot was defeated at least 50 times - https://www.reddit.com/r/DotA2/comments/6t8qvs/openai_bots_w...

The method the reddit commenter (menohaxor) described to beat the OpenAI bot reminds me of the Go DeepMind showmatch, Game 4, Move 78, where Lee Sedol made a highly unorthodox move. I'm reminded of this because in any serious 1v1 mid SF match, no one would try to win by repeatedly baiting enemy creeps into the jungle and slowly chipping away at the T1 tower; it actually sounds like a joke on paper. So it seems that for the time being, in real-time games like Dota and Starcraft, unorthodox plays and cheese strats could be very effective.

It would be a fantastic side project to teach bots to learn to speak "Dota". Learning to communicate efficiently is likely much easier as the vocabulary and intentions behind them are constrained.

I was not thinking of bot-to-bot communication, but of communication in mixed teams, i.e. participating in a mixed team of bots and humans and communicating with teammates without them necessarily knowing that you are a bot. We are approaching a Turing test, but in this case it would be enough to talk "Dota", i.e. communicate efficiently about the game actions with your team. There is no need to be able to discuss arbitrary topics.

Side question: if we consider a team of 5 bots, would they need to communicate?

If you wanted them to play "within the rules", they would need some way to communicate via the game. Text chat would make the most sense, since adding in TTS / STT would seem unnecessary. They could even communicate in some kind of shorthand language that only the bots understand.

I think the grandparent is asking whether the bots would need to communicate at all, as opposed to just "knowing" what the other bots are "thinking".

I kind of envision bots communicating via some form of "side channel" like bees dancing (coded messages in micro-moves/emotes with their avatars)...

Don't humans usually use voice chat in competitive Dota? I.e. an out-of-game channel (as well as several in-game channels such as minimap pings).

Just off the top of my head, they would only need to communicate if computation time is enough of an issue and they wanted to share conclusions without all of them recomputing those conclusions, or if there was some randomness they wanted to share before it became apparent.

I think so. The slight difference in latency would push their decisions slightly out of sync.

Like if the opponent changed where one unit was moving within a lag delta and a different decision was made, but also seeing a teammate bot execute a decision first would alter the inputs for the others.

I'm very curious to see how a 5-AI team would "plan" smoke ganks. Really exciting time for technology as deep learning is moving into gaming.

I'm reminded of the games played in the movie 'Her'.

For the non-DoTA players out there, there's a lot of specific jargon in the above comments.

I've played a few dozen hours, and I understand most of them.

1) "smoke ganks" means that a team goes invisible, in order to surprise the opposing team, and kill as many as possible in a short time.

2) laner: one of the main 3 roles that one of the five players in a team can take (e.g. support, laner, etc).

3) and so on : "shrine", "regen", "soul ring", etc, are all very specific locations or objects or concepts (regen = regeneration).

DoTA is a really, really interesting game. Very strategic, great lore and graphics, and a huge fan base.

laner refers to the initial resource gathering period where you kill the minions spawned from each team's base

That's the laning stage. The laner is the person playing in the specific lanes during the laning stage.

The bot programming system for Dota is set up to have 3 levels of AI:

1) Team level, which dictates what the team as a whole is attempting to achieve

2) Mode level, which dictates what the character is trying to achieve individually

3) Action level, which dictates the specific responses being undertaken

Smoke ganks would be a case of the team level instructing the mode level to simultaneously group then seek a kill, which in turn would tell each of the action levels to move across the map, attack, etc.

For reference: https://developer.valvesoftware.com/wiki/Dota_Bot_Scripting#...
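To make the three levels a bit more concrete, here is a minimal sketch of how a team goal could flow down to individual actions for the smoke gank case. It is written in Python purely for illustration; Valve's actual bot scripting is Lua-based, and every name below (`Bot`, `team_level`, `mode_level`, `action_level`, the mode and action strings) is made up for this example rather than taken from that API.

```python
# Hypothetical three-level hierarchy (team -> mode -> action).
# None of these names come from Valve's bot scripting API.
from dataclasses import dataclass


@dataclass
class Bot:
    name: str
    mode: str = "laning"


def team_level(bots, smoke_ready):
    """Team level: decide what the team as a whole is trying to achieve."""
    return "smoke_gank" if smoke_ready else "farm"


def mode_level(bot, team_goal):
    """Mode level: turn the team goal into an individual objective."""
    bot.mode = "group_then_kill" if team_goal == "smoke_gank" else "laning"
    return bot.mode


def action_level(bot, grouped):
    """Action level: pick the concrete response for the current mode."""
    if bot.mode == "group_then_kill":
        return "attack_target" if grouped else "move_to_rally_point"
    return "last_hit_creeps"


bots = [Bot(f"bot{i}") for i in range(5)]
goal = team_level(bots, smoke_ready=True)
for b in bots:
    mode_level(b, goal)
actions = [action_level(b, grouped=False) for b in bots]
print(goal, actions[0])
```

The point of the layering is that only the team level needs to know a gank is happening; each bot's action level just sees "group, then kill".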

I'm actually curious to see how it would've done if Dendi had gone for a very unconventional play. I wonder if the AI is more trained towards the generic 1v1 matchup; let's be honest, a lot of the 1v1s play out very alike and come down to micro. But if he had done something completely different, would the bot know how to react?

All those limitations are standard rules for 1v1 matches. They're not specific to that match.

Saw this live and as someone that's played Dota for many years, it's not as impressive as people make it out to be. A Shadowfiend 1v1 is primarily a test of mechanical skill and not judgment.

A similar comparison is if a Counter-Strike or Quake bot would instantly snap to your head getting a headshot. Like, yeah, cool I guess, and it would beat 100% of human players 100% of the time, but it's not even remotely as impressive as Chess or Go. The razes in particular are "skill shots" that are sometimes hard for humans to estimate whereas for a robot it's just simple math.

I will say that the one impressive aspect of the bot was raze faking, but I think that it was one of those cases where it was "coached" (same as the creep blocking).

It pulled some interesting tactical moves that I was keeping an eye out for.

- In game 1, when the AI triggers its health-regen flask (which gets dispelled if you take hero damage), it zones Dendi out, knowing he's going to come in for an attempt to dispel the regen. It does this with lots of non-committal, pre-emptive shadowrazes. It doesn't do this any other time. It's aggressively predicting what Dendi may favor based on its own circumstances.

- In game 2, Dendi said he was going to try letting one creep ahead and see if it would give him a lane advantage. [very technical DOTA mechanics] What this does is if the opponent creep blocks better than you, their creep wave strikes your overextending creep without taking any damage, under their tower, thus significantly weakening your creep wave, making the tug of war swing very heavily in your favor. You counter this by also letting one of your creeps out ahead as soon as you realize it. The AI that was perfectly blocking all the way till the creeps met up in G1, as soon as it saw the leading creep let one of its creeps forward. A perfect response that is highly unlikely to have been coached but determined as the best choice to, what I'm guessing is to return the game to a familiar state.

- The bot cornered Dendi between some trees and his tower, and then with 0 fakeouts/hesitation shadowrazed him. It was not just a matter of knowing the target was in range, but also accounting for the probability that he could dodge it. Not 100% certain about the mechanics here, but I believe: Shadowraze has a ~.2s cast animation. It can be canceled during the first .1s (fake out), but after that it is committed to, you will complete the animation, spend the mana, and cast the spell. If the cast crosses this window and the opponent is capable of moving the short distance needed to dodge in this time, it is theoretically not something the AI can guarantee on pure mechanical skill. So cornering is an approach advanced players use to reduce the probability and room for dodging a skillshot, and I found it taking instant advantage of that opportunity very impressive.
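The timing argument in the last point can be put into a rough back-of-the-envelope check. This is only a sketch under the numbers guessed above (a ~0.2s cast with the first 0.1s cancelable, which the comment itself is not certain about); the movement speed and escape distances are hypothetical values chosen for illustration, and `can_dodge` is an invented helper, not anything from the game.

```python
def can_dodge(escape_distance, move_speed,
              cast_time=0.2, cancel_window=0.1, reaction_time=0.0):
    """Return True if the target can leave the raze area after the caster commits.

    escape_distance: distance the target must cover to get out (game units).
    move_speed: target speed in game units per second (hypothetical value).
    cast_time / cancel_window: the ~0.2s and ~0.1s figures guessed in the comment.
    reaction_time: how long the target takes to notice the committed cast.
    """
    committed = cast_time - cancel_window      # part of the cast that cannot be canceled
    time_to_move = committed - reaction_time   # what the target has left to act in
    if time_to_move <= 0:
        return False
    return move_speed * time_to_move >= escape_distance


# Open ground: only a short step is needed to leave the area.
print(can_dodge(escape_distance=20, move_speed=300))
# Cornered between trees and a tower: the only way out is much longer.
print(can_dodge(escape_distance=60, move_speed=300))
# Even in the open, a human-like 150 ms reaction leaves no time at all.
print(can_dodge(escape_distance=20, move_speed=300, reaction_time=0.15))
```

Cornering works by pushing `escape_distance` up, which is why the raze stops needing a fake-out once the target has nowhere short to go.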

Besides the obvious mechanical precision of a bot once it's made a decision, the other inhuman factor here is what Dendi mentions, that it has zero hesitation to take advantage of even the smallest opening it has available. A pro human who is even instantly aware of an opportunity, no matter how theoretically certain they are that they should take it, hesitates at least a little bit to think "is there a trap? Am I not considering something?"

Hesitation is an incredibly human trait.

Bots hesitate via thinking time. You see this in chess matches most obviously when a high level bot spends more time on some moves because it realizes the situation is unusually important or tricky.

> but I think that it was one of those cases where it was "coached" (same as the creep blocking).

What makes you think that? In the interview, they explicitly disclaim this.

No, actually they explicitly state that they also coached it on what they thought was good or bad (4:33 into the video). Having said that, I guess we still don't know which game mechanics were coached.

In the interview, they literally said they coached it to have certain behaviors.

They said "yes, it does that"/"it's been known to do that", not "it was coached to do that". Just as a "We've seen it do that before."

Yeah, just re-watched the clip, I think you're right. My mistake (the word "coached" confused me).

4:33 into the video: "[...] and coached it on what we thought was good or bad".

I didn't get that from the interview. It implied the bot learned the game itself from a blank slate.

In this video listing "learned bot behaviors", they show creep blocking as one such behavior: https://youtu.be/wpa5wyutpGc?t=25s. So it looks like the bot figured it on its own. (blog post: https://blog.openai.com/dota-2)

The developer interviewed claimed that there was no domain-specific knowledge and implied (though didn't explicitly state) there wasn't any training against non-OpenAI bots or players. (I'd love to know the reward function they used for whatever Q-learning variation they ran with.)

If this is accurate, one of the things I'm most impressed with is that the bot figured out creep-blocking. (I can't find a good GIF, but this is walking in a wiggly path in front of your own first wave of lane creeps, delaying their progress and pushing the lane towards you, which is good for ~reasons.)

Creep blocking isn't all that hard in dexterity--I am a terrible dota player and I can more or less do it. And it's one of the most common pieces of dota knowledge; every pro player does it and since it's relatively easy compared to a lot of pro micro, everyone else rapidly learns they should.

But nevertheless--the bot had enough games that it could randomly jump in front of the wave enough times that it noticed a win rate improvement for that slight wave push, and begin to do it intentionally? (And then get good at it?) Damn.

One thing I don't really know about Q-learning and the typical nets used for it: I am guessing it is likely that internally to the bot's evaluation functions, there is some learned feature whose activation correlates well to the location of the wave equilibrium (since that's a feature that correlates well with winning!) At that point, is it likely that the bot can learn in smaller increments--that is, it knows that pushing equilibrium towards itself is good, and thus randomly creep blocking a little becomes reinforced (rather than having to notice the creep block's effect on game wins?)
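For what it's worth, here is a toy illustration of why the answer is plausibly yes in a value-based setup. It is a minimal tabular Q-learning sketch on a made-up three-state game, not a claim about what OpenAI actually ran; the state names, rewards, `GAMMA`, and `ALPHA` are all assumptions. The only reward is at the end, yet the value of the intermediate "wave near own tower" state is what ends up reinforcing the earlier "block" action.

```python
# Toy deterministic game: state -> action -> (next_state, reward).
# All states, actions and rewards are hypothetical.
TRANSITIONS = {
    "start": {
        "block": ("wave_near_own_tower", 0.0),
        "no_block": ("wave_mid", 0.0),
    },
    "wave_near_own_tower": {"farm": ("end", 1.0)},  # treated as a win
    "wave_mid": {"farm": ("end", 0.0)},             # treated as a loss
}
GAMMA = 0.9  # discount factor
ALPHA = 0.5  # learning rate


def q_learning(sweeps):
    """Run the standard Q-learning update over every (state, action) pair."""
    q = {s: {a: 0.0 for a in acts} for s, acts in TRANSITIONS.items()}
    for _ in range(sweeps):
        for state, acts in TRANSITIONS.items():
            for action, (nxt, reward) in acts.items():
                # Terminal state "end" has no entry in q, so its value is 0.
                best_next = max(q[nxt].values()) if nxt in q else 0.0
                target = reward + GAMMA * best_next
                q[state][action] += ALPHA * (target - q[state][action])
    return q


q = q_learning(sweeps=50)
print(q["start"])
```

The "block" action never receives a reward directly; its value comes entirely from the learned value of the state it leads to, which is the "smaller increments" idea described above.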

I don't know that they specifically said there was no domain-specific knowledge. iirc they said they didn't "teach it the rules of dota" but they also said the training involved "coaching". I interpret that to include showing the bot useful techniques (like creep blocking) which the AI then learned leads to higher win rates, etc.

They also mentioned that in the beginning the bot figured out that the best way to win was to not play the game (aka hiding in its own base). Then it started running around wildly, dying to enemy towers in the wrong lanes of the map. So it definitely took some nudging to get it to do something more than being AFK in base.

It does not need any nudging.

The idea is that first generations run around dying so the optimal response is to just stay put and let the opponent die. This now has basically 100% win rate so both of them do it, lowering it to 50%.

The other part is that there are random variations to the strategy it uses. Thus it experiments: the bots won't stay in base forever, since that only gives a 50% win rate (based on whose creeps would push the tower first).

This random variation then means that the bot might go out, autoattack a creep wave and then come back to base. Or anything similar to that.

This new strategy now has almost 100% win rate and the stay in base is no longer viable.

If it went out and died to a tower, it doesn't matter, it can just not use it and there will be period of base AFKing again, but it only takes one improvement (pushing a wave or hitting tower couple times) to make the base AFK obsolete and force new progress.
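A tiny sketch of that progression, assuming made-up head-to-head win rates that match the story above (staying in base beats roaming and dying, pushing a wave beats staying in base). The strategy names, the table, and the "adopt the first variation that beats the current one" rule are all simplifications for illustration, not how the real training loop is known to work.

```python
# Hypothetical head-to-head win rates: WIN_RATE[(a, b)] = P(a beats b).
# Mirror matchups and anything unlisted default to 0.5.
WIN_RATE = {
    ("afk_in_base", "roam_and_die"): 1.0,
    ("push_wave", "afk_in_base"): 1.0,
    ("push_wave", "roam_and_die"): 1.0,
}


def win_rate(a, b):
    """Look up P(a beats b), using the reverse entry when only that is listed."""
    if (a, b) in WIN_RATE:
        return WIN_RATE[(a, b)]
    if (b, a) in WIN_RATE:
        return 1.0 - WIN_RATE[(b, a)]
    return 0.5


def self_play(start, variations, generations):
    """Each generation, adopt the first variation that beats the current strategy."""
    current = start
    history = [current]
    for _ in range(generations):
        for candidate in variations:
            if win_rate(candidate, current) > 0.5:
                current = candidate
                break
        history.append(current)
    return history


history = self_play(
    start="roam_and_die",
    variations=["roam_and_die", "afk_in_base", "push_wave"],
    generations=3,
)
print(history)
```

Variations that lose to the current strategy (like walking out and dying to a tower) are simply never adopted, so they cost nothing beyond the wasted games.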

> They also mentioned that in the beginning the bot figured out that the best way to win was to not play the game (aka hiding in its own base).

"A strange game. The only winning move is not to play."

- AI "Joshua" in "Wargames", 1983

How did it figure that? There's pretty much no way to win if you're just staying in base.

The idea is that by staying in base you don't die, while your opponent (which is still pretty bad) roams around and dies.

This highlights the importance of gradual adversarial training: if the opponent was perfect, there would simply be no way to win, and hence no signal on how to gradually improve to eventually find the optimal strategy.

This also shows a weakness in most training algorithms today: they lack, for example, the ability to watch an opponent closely and mimic its actions when it is the better player. Humans do this so quickly that you can see Dendi already trying to replicate the AI's strategy after one loss.

Barring faction imbalance, there's a 50% chance to win by staying in base. However, I'm not sure how running around and dying could negatively impact your win percentage unless the other bot is also outside of the base.

I don't understand what you mean. If you stay in base, the opponent won't, and will quickly win the game.

This is during the training phase; the opponent will be a similarly trained AI.

Ah, I see, thanks.

You might (if timed correctly) steal aggro from your creep wave which will lead you to win the game if the other hero is AFK.

You might also pull aggro from their creep wave, which I think should also win you the game in most circumstances.

I think "coaching" just means "curriculum learning" (which is basically as you've described).

But in the future coaching might indeed just be a human expert talking in natural language to teach it how to be better! https://www.youtube.com/watch?v=L_jGDV_5gPA&t=6

Not a whole lot, but some of this stuff can be seen in this video - https://www.youtube.com/watch?v=wpa5wyutpGc

I still can't quite tell if this behaviour was coached to the bot or it figured it out itself.

I think they either had a reward for creep blocking or they used the "learning from human preferences" paper to coach it for that behaviour. I would be very terrified if it learnt creep blocking on its own with no help.

I expect your conjecture is correct. That's the whole point of deep learning -- there are many layers that automate what would otherwise be human feature extraction.

The magic here doesn't come solely from deep learning, but also from having access to massive simulation. Simulation can make an almost average human-level deep neural net become better than the best human. It happened for Go as well, where AlphaGo learned by self playing millions of games.

I think there is a deep link between simulation and AGI. An AGI would need to be able to imagine how people and objects would act and react in any situation, which is the same as the ability to simulate the world, or to imagine. We might be able to create small simulations like Dota2, but the real world will be much harder.

Yes, I've often thought that too, and really as humans we are running simulations all the time in our head. That's kind of what imagination is.

We run through conversations, physical events / muscle memory, are constantly predicting the world around us - we often don't even remember moving through the environment on regular routes unless something unusual disturbs our running predictions.

It's super interesting to see some of that come through, sure in a more 'brute force' manner, but we randomly brute force bumped through the world as children too, until we pruned our selection trees.

> The developer

I was surprised to learn that he is the CTO of Open AI.

> If this is accurate, one of the things I'm most impressed with is that the bot figured out creep-blocking

Same. It takes such a long set of moves to get a perfect block. To have figured out such a long move-set as something advantageous is a very high level of planning from the bot.

So an interesting thing you could do as a game company with an "unbeatable" bot is to use it to balance a metagame. Let the bots all play each other and tweak character stats etc until they win a proportional amount of the time. (This presupposes that the bot learns the game the way OpenAI claims it is, without needing to learn from player replays or from playing in a very different way than human players do with tree search etc)
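As a rough sketch of what such a tuning loop might look like: the version below replaces "let the bots play each other" with a closed-form stand-in (win probability proportional to a single `power` number per hero), which is a big assumption made only so the example runs on its own. The hero names, starting values, `step`, and `tolerance` are all hypothetical.

```python
def win_rates(power):
    """Average win rate of each hero against every other hero.

    Stand-in for bot-vs-bot matches: P(a beats b) = power[a] / (power[a] + power[b]).
    """
    rates = {}
    for hero in power:
        others = [o for o in power if o != hero]
        rates[hero] = sum(
            power[hero] / (power[hero] + power[o]) for o in others
        ) / len(others)
    return rates


def balance(power, tolerance=0.02, step=0.02, max_rounds=1000):
    """Nerf the best hero and buff the worst until win rates are within tolerance."""
    power = dict(power)  # do not mutate the caller's stats
    for _ in range(max_rounds):
        rates = win_rates(power)
        strongest = max(rates, key=rates.get)
        weakest = min(rates, key=rates.get)
        if rates[strongest] - rates[weakest] <= tolerance:
            break
        power[strongest] *= 1 - step
        power[weakest] *= 1 + step
    return power, win_rates(power)


start = {"hero_a": 1.0, "hero_b": 1.5, "hero_c": 0.7}
tuned, rates = balance(start)
print(rates)
```

In a real setup the `win_rates` call would be the expensive part (retraining or at least re-evaluating the bots after every stat tweak), and a single `power` number is exactly the kind of "samey" knob that real balance work tries to avoid.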

I think balance is a bigger problem than people realize. It's much harder than you'd think, and the most common way of balancing things is to simply make them more "samey".

I was part of the Warcraft 3 beta. When it started all the races felt so unique. It's a bit foggy now but I think, for example, all the elves buildings - which appeared tree like - would automatically regenerate. Once they started balancing the game, all these special traits fell away. All the races were homogenized and despite still being different, all played much more similarly. Honestly, it ruined the game for me. I feel if I had just picked it up upon release I would have enjoyed it, but watching all these unique traits and play styles fade away made me realize what could have been.

Luckily, this is a problem Dota2 doesn't suffer from; we've been incredibly lucky to have Icefrog be the main force behind it for over a decade. It's an incredibly well balanced game: at this International (the World Championships for Dota2), 112 of 113 heroes have been picked and/or banned. The amount of diversity you have in strategy is amazing. There is a meta, obviously, but teams have very individual styles and heroes they prioritize.

As far as I'm concerned Dota2 is probably the most well balanced competitive video game out there.

It's also interesting to note that Icefrog's balance strategy revolves around strengthening strengths and making initial weaknesses more pronounced. I think this helps a lot with diversity.

> I think balance is a bigger problem than people realize. It's much harder than you'd think, and the most common way of balancing things is to simply make them more "samey".

I completely agree, but Dota is the very rare exception to that rule.

If one hero's ability is "too strong", most games would turn it down 20%, but Dota instead perhaps lowers the hero's armor by 1, or lowers their sight range, or base strength, or something even less intuitive and tangentially related.

What you get is 100+ very unique heroes that somehow manage to stay fairly balanced. It's quite an amazing game, if you can handle the timesink.

A big hero pool you can only pick one of each from is probably easier to balance than something like an RTS where there are a fixed set of units for just a few alternate races and the players typically choose between them once at the start of their career.

You have to consider how every single one of them interacts with all of the others and even less serious imbalances will be present in the games all the times for players that are unable to switch to playing anything else.

I don't think that's true at all, since you have to consider how each team of 5 heroes combines and synergises. That's a far larger part of balance than individual heroes. Players also play roles, but typically play more than a handful of heroes and so those combinations are in reality also very numerous.

On top of that, you then have items which change how heroes can play and, more recently, talent trees which can significantly alter how they play also. Dota's balance is such that a hero can play both a carry role and a support role depending on how the player builds and plays them, and there's a number of examples like that which do get done at the pro level.

That's an incredibly diverse scope to balance.

On the other hand limiting it to one hero of a kind per game helps quite a bit. Overwatch had the problem of very static team compositions early on because they didn't copy that particular feature of MOBAs. I haven't been following that closely but I assume changing that helped quite a bit. Of course they might still have some issues because of the small number of heroes available in the game.

Each composition doesn't have to be perfectly balanced against every other either and the system of picks and bans helps.

> It's much harder than you'd think, and the most common way of balancing things is to simply make them more "samey".

which is the wrong way to balance a competitive game

Homogenization is a huge problem with nearly every major online game I've played (WoW, LoL, Hearthstone, etc.)

You're looking at the Blizzard school of balance.

Think going to a theme park. It's very pretty and attractive but underneath it all, the experience is always on rails without any deviation.

That's one way to deal with complexity. I hate those games, because the moment you become human, and try to do something intuitive but undefined and unwanted by the programmers, you get penalized.

Dota doesn't do that, although the recent patch feels like a shadow of such design, but nothing like League or Blizzard.

> I hate those games, because the moment you become human, and try to do something intuitive but undefined and unwanted by the programmers, you get penalized.

That's what I loved about Ultima Online. Characters were basically "Here are 100 possible skills to learn and 5 base stats. You get 700 points to distribute among your skills. Good luck"

My fav part was that doing things that require strength made you stronger, doing dexterous tasks increased dexterity etc.

While fun, the game was also an absolute nightmare for beginners. You really needed to find friends that knew what they were doing to help you out. This of course created a wonderful community.

Have you tried Path of Exile? You might enjoy it.

+1 for PoE. The only reason I don't play is because I can't stop when I do.

Haha of course - and absolutely right. I loved it.

The recently released Albion Online feels a lot like UO (and also has some of the sandbox territory control of EVE)

Balance for humans and robots is different. Different play skills are easy or hard, and those skills matter different amounts on different heroes. Effectively, instead of balancing for fair heroes, you'd be balancing to maximize the performance of the heroes with small human-robot gaps. It would be very easy to make the game boring, because in some sense you're making the easy-to-play heroes the best ones, and I think many games aim to be biased slightly in the opposite direction.

Humans are not machines.

Games are mostly based on the skills of the players, so should games be balanced by assuming a human can do what a machine can? Of course not!

A bot can usually make computations way faster and more precisely than humans.

Let's say a bot plays against another bot, both with different characters. One character might seem really overpowered compared to the other, but that might only be because the bots can react to certain events thousands of times faster than a human. Without that reaction time the tables might turn and the other character might be clearly stronger.

So how fast should we assume a human can do a given action? How precise should we assume the player is at fast calculations?

I do see a lot of things it can balance, but letting bots learn by themselves with no limitations will definitely not work out well.

You can restrain the bot by emulating human limitations if the goal is to use bots to figure out the game balance.

Add latency. Limit its actions per minute including camera movements and maybe even inertial limitations of input methods.
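A minimal sketch of that kind of handicap, assuming a fixed reaction delay plus a cap on actions per minute. The class name, the 200 ms latency, and the 300 APM cap are placeholder values for illustration, not measurements of human players or anything OpenAI is known to use.

```python
from collections import deque


class HumanLikeThrottle:
    """Delay each bot action by a fixed latency and cap actions per minute."""

    def __init__(self, latency_s=0.2, max_apm=300):
        self.latency_s = latency_s
        self.min_gap_s = 60.0 / max_apm   # minimum spacing between released actions
        self.pending = deque()            # (time the action becomes ready, action)
        self.last_release = float("-inf")

    def submit(self, now, action):
        """The bot decides on an action at time `now` (seconds)."""
        self.pending.append((now + self.latency_s, action))

    def poll(self, now):
        """Return the next action the game may execute at `now`, or None."""
        if not self.pending:
            return None
        ready_at, action = self.pending[0]
        if now < ready_at or now - self.last_release < self.min_gap_s:
            return None
        self.pending.popleft()
        self.last_release = now
        return action


throttle = HumanLikeThrottle(latency_s=0.2, max_apm=300)
throttle.submit(0.0, "raze")
throttle.submit(0.0, "move")
for t in (0.1, 0.25, 0.3, 0.5):
    print(t, throttle.poll(t))
```

Camera movement and input inertia could be modeled the same way, as extra actions that consume the same budget, but that is left out here.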

And even if the bot is not a perfect imitation of a human it still might figure out surprising things that would take human players years of real time to figure out by chance since they can only spend so much time on playing and experimenting.

This approach seems like it would be more useful for bug discovery than game balancing. Actively collecting a slew of performance metrics, similar to how companies like Blizzard already do, would give you a better idea of how to proceed with balance changes because you can review the data as it applies to your actual playerbase and even organize it based on ranking categories to get an idea of how your decisions would affect players at different skill levels.

This was generally my interpretation as well. MMORPGs like World of Warcraft already struggle to balance classes with high skill ceilings for use by both the casual players and the hardcore players while keeping those classes enjoyable for both groups. In some ways you could call that the dev-meta, making a competitively balanced meta game while still offering an appealing level of fun for the various subgroups of the player-base.

> but that might only be because the bots can react to certain events thousands of times faster

AI bots such as this one are usually limited to emit actions every N frames, so they can't possibly react faster even if they can solve the perception problem. There is nothing interesting for AI advancement in competing for the best reflex speed - it's all about strategy.
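To put numbers on the "every N frames" limit, here is a small sketch. The frame rate and N are hypothetical (the comment does not give the values used for this bot), and both helper functions are invented for the example.

```python
def reaction_delay_ms(frames_per_action, fps=30):
    """Worst-case delay between an event and the bot's next chance to act."""
    return 1000 * frames_per_action / fps


def acting_frames(total_frames, frames_per_action):
    """Frames on which a bot limited to one action every N frames may act."""
    return [f for f in range(total_frames) if f % frames_per_action == 0]


print(reaction_delay_ms(frames_per_action=4, fps=30))
print(acting_frames(total_frames=10, frames_per_action=4))
```

With these assumed numbers the bot cannot respond sooner than roughly a tenth of a second after something happens, which is in the range of fast human reactions rather than "thousands of times faster".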

Aside from human vs machine style comments, the issue this would also have to avoid is what's referred to as "over tuned". ie when content is balanced to a point where variation has to be maximally efficient. See also http://nethack4.org/blog/strategy-headroom.html

Icefrog also balances the game for fun, not just for the theoretical limit of what's equally fair.

Except Dota is not balanced so that heroes have the same win rate (LOL does that and it's a boring uneventful game for large stretches of time). Dota is balanced also taking into account the map, hero synergies, item builds, match progression etc, and it takes a genius like Icefrog to make everything fun and interesting at the same time.

These types of MOBA games are a good precursor of military unit management. To be honest, I'm not totally sure whether to be happy or sad about this sort of thing. The strategic control of drone units on a combat field, without the loss of personnel (for the countries with this technology, anyways) should make me happy, and does, to a point. BUT the potential for abuse (and, no, I haven't even begun to extrapolate towards the Terminator nightmare scenarios) is rampant. Fewer people with a conscience on the battlefield or controlling the apparatuses of war may or may not be worth the cost of the lives of young men and women... but I think wars should be fought by old men and women with swords, anyway. If you're old and can look into the face of your enemy while you kill them, there's a better chance it's worth killing and dying for (by the numbers, anyways)... plus, the President, Congress and the Senate have to serve in combat positions, in my little fantasy scenario =)

These types of MOBA games are a good precursor of military unit management.

Not really.

There may be some limited use cases where this beats traditional sandboxing, wargaming or milsim. However, the primary distinction is that DOTA and the likes have fixed rule sets (albeit large and complex), and a near infinite number of simulations that can be run, wherein the actual "battle" will conform to those rule sets and play variability.

It may not seem like it, but we (military) only have a tiny slice of possible data from the past few hundred years, that would be appropriate to train a DNN on. Not only that, there are few wars that conform to previous war patterns. There are some constants, and we study those, but they don't really give you what you need for troop movements and tactics the same way as is done in DOTA.

Yes, this is perfect for a future of swarms of drones fighting it out against each other.

I think it's scarily close to happening, or already has. Just the other day I saw the Drone Racing League on one of ESPN's alternative channels, and all I could think about was how effective they would be as kamikaze bombs. If these people can race drones up to 150+ mph and turn sharp corners through obstacles, then the military is already two steps ahead.

DotA is not a MOBA and it's not a very good precursor for military unit management. I'd say Starcraft is a better comparison, and they are working on applying DeepMind to that right now [0].

[0] https://techcrunch.com/2017/08/09/blizzard-and-deepmind-turn...

Dota definitely is a MOBA. Dota is the MOBA from which all the others are devised [0].

[0]: https://en.wikipedia.org/wiki/Multiplayer_online_battle_aren...

Some people dislike that term. Many people in the dota community prefer ARTS over MOBA. Though yes, technically it’s a MOBA; many other games could be MOBAs because it’s fairly vague.

DotA and Dota 2 are the games that helped define the term MOBA. I'm not sure why you're saying it isn't one, especially since it's in the first sentence of both Wikipedia articles.

Sometimes people prefer to label dota as an ARTS: Action Real Time Strategy.

Some people dislike the MOBA nomenclature because it comes from Riot (maker of League of Legends; a clone of dota). In addition the term is pretty vague. Call of Duty, Starcraft, and DotA could all technically be “MOBAS”.

ARTS describes it a little better but I don’t think that term is great either.

In my opinion ARTS doesn't describe it better but introduces confusion since it implies these games have something in common with real time strategy games. I don't think they really have anything to do with RTS despite being originally conceived on the Warcraft 3 engine and assets. At least MOBA makes it clear that it's an entirely separate category.

> anything to do with RTS

Managing farming times and item spikes is part of the strategy. Of course it's not a true RTS, but I think that term is a bit better than MOBA; I'm also biased against Riot.

DotA is generally considered to have created the MOBA genre. That being said, you are right that MOBAs aren't as good a comparison for military unit management as games like Starcraft or Warcraft.

I think one of the biggest challenges in moving from 1v1 to 5v5 is the substantially increased possibility of creative strategies. In a 1v1, there isn't that much room for innovation, making it a perfect target for existing AI techniques, which are good at learning what is "in the data" but have not yet been shown to be capable of thinking outside the box. In contrast, in a 5v5 matchup, a much wider range of strategies is available. I'd imagine that in this more flexible setting, AIs would be highly vulnerable to cheese strategies, or in general strategies they have never seen before. Furthermore, if there are any exploitable quirks in the AI (which seems quite likely, given the established lack of robustness of neural nets), I have no doubt that clever human players will be quick to exploit them. On the flip side, I think that makes 5v5 a far more challenging and interesting goal that would be very exciting if achieved!

As a side note: I think the most impressive aspect of this AI is that it was trained without watching any human games. In contrast, AlphaGo was bootstrapped from a bunch of existing human games, which I imagine was crucial to its training. Being able to learn how to fake out the human player without having seen this action before is quite impressive.

Without even getting into misconceptions and inaccuracies here, the following quotes illustrate "the moving goalposts of AI": The almost-metaphysical "mystique" that many attribute (perhaps subconsciously) to human intelligence vs machine intelligence:

  > AI techniques, which are good at learning what is "in the data" 
  but have not yet been shown to be capable of thinking outside the box

  > AIs would be highly vulnerable

  > exploitable quirks in the AI

  > established lack of robustness of neural nets

  > clever human players will be quick to exploit them

Of course, it is important to identify pitfalls, so as to address them. But it is sad that no matter how good AI gets, there will always be pessimistic reasoning downplaying its success. With every step forward, there are the predictions of impassable future failures -- "just around the corner"!

Due to this mystical "cleverness" attributed exclusively to humans, I'm not convinced a majority of humans will ever allow machines to be considered 'intelligent', no matter how far the technology advances. I'd love to be proven wrong, though.

Gaming skill is often divided into mechanical skill and decision making. 1v1 like this is dominated heavily by mechanical skill (frame-perfect last hits/denies, the absurd creep block, etc). Generally, it's the sort of thing you can "download skill" in (cheat at) because computers are very good at it.

Decision making is how quake pros can beat aimbotters, and starts to become very important in 5v5 DotA, particularly at a higher level. You'd see it in terms of drafting, ganks, map vision/control, map positioning, sneaking rosh, and other unexpected meta strats. Considering the real-time computation requirements, this is almost certainly a much harder problem.

So while it's entirely possible bot-team would win just through insane mechanics snowballing the game (ganking 0ms reaction times would be fun), or through good-enough decision making + insane mechanics, they could also be outplayed.

"ganking 0ms reaction times would be fun"

wonder, against this type of AI, if it's even possible to land something like meat hook or sacred arrow?

on the note of good decision making, maybe even a failed hook could be enough to exploit the bot into death. for instance, hero X chases bot back toward tower. hook comes in predicting the bot's path. bot would have to either walk into the hook or walk back into the hero chasing it.

I will consider an A.I. truly intelligent when it passes the Turing test with flying colors. And as a native English speaker, too (not just barely passing by posing as a 13 year old Ukrainian boy with a poor grasp of English, which is what happened last time).

As far as I know, no A.I. developed has ever met this bar. It seems that the nuances, inconsistencies, and total freedom of language make it a totally different kind of problem to solve than a simple game-playing algorithm.

Or maybe I'm wrong and getting an A.I. to pass the Turing test is just another kind of game-playing algorithm with different parameters. At any rate, we're quite a way off.

Again, you're moving the goalposts forward whenever AI surpasses them. This time you move the goalposts forward by requiring a proof of native English proficiency. Next time it will be something else; that goalpost will be achieved, and when it is, you will move the goalpost forward yet again.

Perhaps you'll require the Turing+++ test be conducted with audio rather than text. Then it will be video. Then a physical android. Then it will need to prove itself better than the human in an IQ test. Then it will need to pass various real world success tests, because IQ tests are too synthetic. But those won't be good enough either, so it will have to live 80+ years as a human in the real world without being discovered. But that's not enough, because maybe its life was too easy -- the real human experience is characterized by pain and suffering as well as joy, so we'll also need it to somehow prove its experience is more real. And on and on and on. It's always something else -- with every step forward, the success will be dismissed and the goalpost will move forward.

I should say this isn't intrinsically a bad thing, since it pushes us to advance the field of AI more and more. But IMO this is too often promoted as a pessimistic criticism of the AI field, rather than an encouraging push of optimism.

Why must this notion of binary "true intelligence" exist, anyway? Can't we see that intelligence is a spectrum (most likely with multiple dimensions of proficiency), and AI is continuously advancing upwards?

The Turing test is hardly a new measure - it's been around for ~70 years and hasn't been passed yet. But otherwise I agree with what you said - this is just my personal measure of when we'll be in society-shaking territory.

It has been passed, but the definition of passed changes to make the test tougher each time. I think that is what the other poster is talking about.

Has it been passed? Turing's original paper assumes adults conversing intelligently. That is the spirit of the test, not a clever hack of the rules like posing as a child who doesn't speak English.

This is the $20000 long bet from 2002 between Kapor and Kurzweil:


> Ray Kurzweil maintains that a computer (i.e., a machine intelligence) will pass the Turing test by 2029. Mitchell Kapor believes this will not happen.

Level 4 autonomous vehicles with a limited operational domain will already be in society-shaking territory without showing general intelligence; they only have to fail safely on out-of-domain problems.

Most of those are quite logical though. If you want to say that a general AI has an intelligence equal to or exceeding human intelligence then it should perform reasonably similar to a human at all tasks and in all scenarios.

A human that can solve complex equations in a few seconds would probably be considered highly intelligent, but a machine can do the same in nanoseconds. Would you say a calculator is an intelligent AI?

Will you consider a set of AIs based on the same underlying architecture that can perform several different online jobs as well as their human counterparts truly intelligent, without any reservations regardless of the approach used [1]?

This would include the ability to communicate with their human hirers and collaborators in natural language.

[1] We can also think of the human brain as a massive structure of relatively simple neurons with methods of communications among them that can be translated to mathematical parameters.

In the interest of a "fairer" comparison, I wonder how much of a difference it would make to force the AI to simulate mouse/keyboard input and interpret the raw screen buffer output, rather than using direct APIs into the game's guts, to more faithfully emulate its human opponent. I'm guessing the peripheral inputs wouldn't be much of a hurdle, but the image processing step could be very interesting.

How do you know it's not doing that already? I was under the assumption that OpenAI agents interacted with the environments through VNC.

If you're interested, I did something similar to what you describe for MarioKart: http://kevinhughes.ca/blog/tensor-kart. It's a lot simpler than DOTA though.

Raw screen is not as big of a deal as it would seem; in DotA and Starcraft you can glean most of the information you can see on a screen from the minimap.

It seems to me that this is an oversimplification. Screen-based video processing is neither simple nor exact, especially where camera movements are independent of game dynamics, and fast frame classification has its own inherent difficulties. There has been some good work in this direction recently (YOLO, SSD), but it is far from perfect. The route DeepMind took, partnering with the game manufacturers to get inputs directly while receiving global information about game state, seems the simplest for their goal of training RL agents. We considered doing a project around this but figured it was a massive iceberg.

Oh I misread the "Direct access to game API section", I thought they meant making it so that the machine can only see "what's on their screen", so if someone were to cast a long range skill for instance they wouldn't get that information. Which is a distinct advantage for machines.

That is indeed by no means trivial. I do think it's pointless though; that's just another exercise for the sake of making it another exercise.

To be clear, 1v1 mid lane is harder than 1 unit v 1 unit in SC. But it is at least 3 levels behind 5v5 human play.

So first there's laning, which is what was shown in this game.

Then there's ganking, basically 2-3 players working together.

Above that is the 5-man team fight.

Then you have strategy planning in the ban/pick phase, and in-game movement coordination.

There are other things like item choices, in-game communication, etc.

I guess bots never tilt...

This 1v1 match is definitely a proof of the strength and maturity of modern AI.

It will be extremely interesting to see if they can train a bot to achieve the above intelligence. If so, I guess the game is pretty much losing a lot of its appeal.

The micromanagement task in StarCraft Broodwar (with TorchCraft), which a few people have used over the past year for RL research, has been tackled basically only with #units > 1, because getting a balanced 1v1 is possible only with certain combinations of units, and it's indeed pointless otherwise.

Kiting is maybe a notable exception, but it is also restricted to a relatively small group of unit matchups.

See for instance the following papers:

- https://arxiv.org/abs/1609.02993

- https://arxiv.org/abs/1702.08887

- https://arxiv.org/abs/1703.10069

- https://arxiv.org/abs/1705.08926

Full disclosure: I'm one of those researchers, and I have authored some of the papers on the topic.

I'm not sure about DOTA, but in LOL there is also champion select, which happens before the game starts, where you try to pick/ban/steal champions to get a good team comp and get the other team to have a bad team comp. It's a lot of knowing overall strategy and what your enemy likes, plus some game theory.

You can lose the game before your summoner even hits the field.

Main difference is all Dota heroes are viable in competitive. Also, there's no mirror match

Only 3 heroes not picked in TI, iirc. It hasn't always been this diverse though. This patch is particularly well balanced.

Picked heroes have been >80% since TI3. Compare that to LoL's ~50%


Out of 112 heroes which are currently available in Captain's Mode, 107 have been picked at least once during the group stage and main event. Two (Lion and Tiny) have each been banned once but never picked, and three (Bane, Spectre, and Skele-- er, Wraith King) have not been picked or banned.


Putting my war3 dota hat on... LoL is a simplified dota which is tailored to less-skilled players. The basic elements are all the same.

Yup, that's exactly the same in DOTA. The pick phase is considered by some commentators to be the most interestingly strategic part of the entire game!

(At a pro level, obviously. At the level I play at, not so much...)

Yep it's called Drafting

How does having a bot succeed cause appeal to be lost? First you're suggesting that the strategy will be "found" and never beaten (no room for improvement, not even in hero selection). Secondly.. Chess & Go have been around for thousands of years and they still are played en masse daily.

> How does having a bot succeed cause appeal to be lost?

You might think not beating bots is OK. Not for me, nor for many people who have been involved in the competitive scene. The moment a machine beats humans, a large part of the competition is gone. The essence is that you want to be the best. But if all you can do is beat some other inferior opponents, what's the point?

> Secondly.. Chess & Go have been around for thousands of years and they still are played en masse daily.

This is irrelevant.

Human Chess players lost against Deep Blue in 1997; people still play Chess. In fact, they've learned a lot of strategies from the strategically superior AI.

We've had forklifts for decades and plenty of people still enter power lifting competitions.

This will be HUGE for competitive Dota, even if they never take it further.

In much the same ways top chess players have learned from engines, I have to think top Dota players will practice against and study the hell out of this bot if OpenAI makes it available.

Not everything is replicable by humans... for instance I noticed it constantly animation-canceling razes and only finishing it if it was going to hit; a human will definitely mess this up. But other things can definitely be used. It was positioning somewhat strangely, for example.

It was streaming and it's over, so here's the VOD: https://www.youtube.com/watch?v=ac1getNs2P8

Wow, that was not even close either. And it only takes it two weeks to reach that level of play? OpenAI should create a team of five (as they said they would) and let it play in the qualifiers next year.

To be fair, the 1v1 Shadow Fiend matchup with these rules is A LOT about exact distances and timings, to which the bot has direct API access, while the human has to go through the whole GPU -> screen -> eyes -> brain -> muscles -> mouse -> CPU cycle.

Agreed, the bot will always be good at spell casting and last hitting, which is actually a major factor in winning (the sooner you reach a higher level, the sooner you can zone the other player).

Especially for Shadow Fiend, which, for those who don't know, gets extra damage for every last-hit. If you're good at it, you can have extra damage equating to a late game item as an early game character.

Shadow Fiend really comes into its own when blind shots happen, as with many other heroes, fights can be lost to line-of-sight. That requires knowing your opponent's mind and prediction. I can't watch the video just yet, but this is the human behavior you should look out for (such as AlphaGo demonstrated repeatedly in the most recent match).

5v5 is much more complex than 1v1 gameplay; it's not as simple as taking the current AI and making 5 of it.

I don't want to be "that guy", but I used to play a lot of LOL, and I feel like 5v5 is in a different ballpark in terms of difficulty than a 1v1. Even champion select is strategically intensive. TSM (one of the best US teams) lost to an amateur team because they got outdone in team selection.


It is extremely difficult. 1v1 is like beating humans at chess. 5v5 is like beating humans at Go. That's the kind of jump it is.

I would even take that to one more level of difficulty. There are so many choices that can be made in-game to gain advantages over the enemy: item build, gank/farm distribution, resource sharing, vision fights, choosing to defend or let go/trade towers/objectives...

If we can have an AI that beats pros at 5v5 in Dota, it would very likely be an AGI.

No. The value function for the AI will be way too specialized to look anything like AGI.

I'm not arguing that, just that it would be cool. The creators mentioned on stage that next year it would be a team of five bots. They also said it took the bot two weeks to reach the level it was at right now, so I'm curious to see how/if it'll evolve in the future.

also, it will only be fair if the bots are completely independent instead of a single bot controlling 5 heroes

Given how fast these things generally progress, I wouldn't be surprised if these 5 bots would win the championship in 2018. Or maybe in 2019 at the latest.

I wonder if this is a good move for Valve. How many people will lose interest in the game if next year OpenAI manages to pull off the 5v5 challenge, thus bringing DotA 2 to the same level as Go and chess for AI, i.e. beating the best human(s)? Has there been any study on the popularity of chess after AI moved in? How about prize pools, as a large driving force of DotA 2 is the competitive scene? I suppose Valve will find out soon enough, and it may be a significant event for studying the impact of AI, if only because of the scale (> 10 million players [1]) and the traceability of the metrics. What if the numbers do indeed plunge? Maybe such public displays of AI might be given second thoughts. And what of outside the gaming industry?

[1] http://www.criticalhit.net/gaming/dota-2-vs-league-legends-u...

Of course it's a good move. Millions watched AlphaGo, many of whom had never played Go before. Millions, almost every single gamer, will watch AI vs Dota; that will be amazing marketing for Valve.

I'm talking mid to long term here. To rephrase, is there a sizeable number of players who will stop playing, knowing that they cannot ever be the best? Because when AI beats one, I don't believe there is any coming back. For the concrete AlphaGo example, of course it was hugely popular, but I would be interested in the evolution of the number of players: how many of those viewers started playing, balanced by how many existing players lost interest following the matches (excluding Go researchers :) ).

I don't think it will matter at all. Literally, zero impact.

Chess computers crush the world's best humans, yet here we are in meatspace, still vying to win the human world chess championship.

Also, 99.99999% of players realize they'll never ever be close to being the best in the world, yet they still play the game because it is inherently fun.

the problem is that players will use those bots against other players, just like they do in online chess. The burden of reporting them will be too much for online play probably. It's the ultimate cheat.

imo that will have zero impact. For humans it is still much more entertaining to see other humans play, and seeing the best human players is far more exciting than a perfect AI. Chess did not lose any popularity just because no one can beat the machine. If anything, humans might derive new and interesting tactics and playing styles from AIs and use them as training partners.

Human entertainment is one of the few areas where I don't see AI taking jobs anytime soon, because that removes a lot of the entertainment value.

It is so weird reading all these comments. Almost half of them start out as "it's impressive.. but". Is it human nature or are HN commenters just so sceptical/negative of all the new tech?

One of the OpenAI guys mentioned that they could potentially use the same technique in real life applications like surgery. Surgery is not "just" run on a computer; it has a physical component too. Is that really the next step or was he just throwing out a random example people could understand?

HN culture strongly discourages unsubstantive comments like "wow, this is great" that don't add to the conversation. It's hard to comment positively and substantively on a story like this, because it requires you to understand the subject matter well enough to describe some interesting feature, or bring up related work. It's usually much easier to find a flaw, or something that looks like a flaw to fellow non-experts.

This comment section actually seems ok, but usually you'll see mostly neutral or negative comments.

In my years on HN I think I've only seen a handful of comment sections that are predominantly positive or laudatory. That's not to say they are negative, but usually critical or cynical. I think it's good, keeps people on their toes.

I haven't actually made an 'It's impressive.. but' comment yet but I've wanted to so here's why I'd say that.

Firstly, as with most things, a lot of the press releases have been pretty disingenuous. I'm pretty salty about Elon Musk's tweet:

> OpenAI first ever to defeat world's best players in competitive eSports. Vastly more complex than traditional board games like chess & Go.

1v1 SF mid is several orders of magnitude less complex than an actual game of dota and probably a lot less complex than chess & Go as well. More than that, 1v1 showcases haven't been a thing for quite a while now for good reasons, the last time they did a 1v1 showcase was at TI4 (in 2014). When they did these the format was best of 3 with the first two games being QoP VS Puck and then the decider would be SF VS SF. This was done because QoP VS Puck is a very mind gamey, strategic, match-up whereas SF VS SF heavily emphasizes mechanical skill where you can snowball off of last hits and denies, giving the match-up more of a sudden death feel. You'd expect a bot to be very good at the SF match-up but struggle with the QoP / Puck match-up where there is a lot more decision space.

To go back to what I said about 1v1 showcases not really being a thing anymore: Back in Dendi's day mid was actually a solo lane, but these days in a real game the mid has to constantly be aware of support rotations. The bot's super aggressive positioning would be heavily punished even in fairly low MMR pubs, so I suspect the pros have deeply ingrained positioning rules that make them hesitate rather than match the bot's stance.

All that said, the bot is really impressive and I can't wait for the 5v5 bot next year. It will be super interesting to see what strats it goes for and how much it emphasizes the bots' inhuman reaction times / micro skill.

>Is that really the next step or was he just throwing out a random example people could understand?

More of the latter than the former. He was talking more in broad strokes about AI research in general than this specific project.

Why not ask them directly, if you actually want to ask rather than make up?

For me it makes as much sense as comparing someone sewing pieces of clothes together with someone pulling a zipper. Did people also have races against cars all the time? Is anyone getting excited over some plastic not changing texture when submerged in water for days or even years, while humans get elephant skin rather quickly? I do find all of that highly interesting, but it's more a morbid curiosity than being amazed.

> One of the OpenAI guys mentioned that they could potentially use the same technique in real life applications like surgery.

Oh yeah, it will all be for benefiting the elderly and the poor, I'm sure. At some point, ever around the corner. It won't benefit the people who need to have surgery in the first place because the so called civilized world can't even deal with warmongers and power mad cops, can't rein in sheer greed and sociopathy -- but nobody who has excuses today will still have them when faced with fully automatic enforcement systems.

Generally, at least allow for the possibility that someone who is not utterly fascinated by something you like might not be less curious and progressive than you, but the opposite, and that what you think is the bigger picture being a fraction of what they see. At least until you actually asked the people whose comments you don't like.

> Why not ask them directly, if you actually want to ask rather than make up?

It would be annoying to ask the same question 10 different times in a thread, I think a top level comment is better. Also, I didn't see any made up explanation in the parent comment, it looks like a straightforward question to me.

> the so called civilized world can't even deal with warmongers and power mad cops, can't reign in sheer greed and sociopathy

Can't this be used to shut down literally any conversation? It's obviously not possible for any one innovation or idea to solve all these problems, so it's not really that useful to point out. Would you dismiss any story that doesn't say "war solved"?

> allow for the possibility that someone who is not utterly fascinated by something you like might not be less curious and progressive than you, but the opposite, and that what you think is the bigger picture being a fraction of what they see

Where are you getting this from? Nobody said this, who are you attacking? This comment is so absurdly negative, and it seems completely unprovoked.

As a longtime DotA player and someone who's following the pro scene, this is very impressive. Especially considering how it's beaten Sumail, widely regarded as one of the best 1v1 players in the world. Can't wait to see what OpenAI have in store a year from now for 5v5.

Sumail won once until they gave the bot insane creep blocking skills.


>The bot didn't recognize items on ground so he expended Mana picked up mango then killed. So yes, he won, but it was more gimping the bot.

That's insane that he figured out how to beat it so quickly. I feel there are other ways to cheese it too. Like maybe survive until 6 -> rush shadow amulet -> smoke -> activate and walk into lane during fade time -> ult when the wave/bot is on top of you. I bet the bot has never seen invisibility and wouldn't know what to do

Not that this isn’t very cool, but 1v1 Dota isn’t anything like the full game, it’s mostly a competition of who has better micro. If it can beat a team of pros at 5v5– which is where the imperfect information, short-vs-long term strategy and inter-agent communication challenges come into play— then I’ll be impressed.

Honestly, you don't even have to make all the AI heroes separate. We automatically assume that each would be controlled by a separate virtual player, but one virtual player could very well control all controllable units on a team. That'd be interesting to watch.

Is the "two weeks" of time in the Dota environment, or two weeks of training time running in parallel as with A3C or something?

From what the devs indicated on stream it sounded like processing hours (with a ratio of around 300 in-game hours to 1 processing hour).

Interesting. That means it accumulates the equivalent of 24 * 14 * 300 = 100,800 hours of experience. That's about double the amount of practice one gets for playing 10 hours a day for 14 years.
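A quick sanity check of that arithmetic, assuming the roughly 300:1 ratio quoted above and 14 days of round-the-clock training (both figures come from this thread, not from anything OpenAI has published), plus a hypothetical human who practices 10 hours every single day:

```python
# Figures taken from the thread; treat them as rough assumptions.
TRAINING_DAYS = 14   # wall-clock training time mentioned on stage
SPEEDUP = 300        # approx. in-game hours per processing hour

# Bot experience in in-game hours.
bot_hours = 24 * TRAINING_DAYS * SPEEDUP

# Hypothetical human: 10 hours a day, 365 days a year, for 14 years.
human_hours = 10 * 365 * 14

ratio = bot_hours / human_hours
print(f"bot: {bot_hours} h, human: {human_hours} h, ratio: {ratio:.2f}")
```

So "about double" holds up, as long as the 300:1 figure is in the right ballpark.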

Is it taking in raw pixels or reading from game memory? How did they gather data to train for this? (Did Valve give them an API?)


> Bot scripting in Dota is done via lua scripting. This is done at the server level, so there's no need to do things like examine screen pixels or simulate mouse clicks; instead scripts can query the game state and issue orders directly to units. Scripts have full access to all the entity locations, cooldowns, mana values, etc that a player on that team would expect to. The API is restricted such that scripts can't cheat -- units in FoW can't be queried, commands can't be issued to units the script doesn't control, etc.

I wouldn't be surprised if they had access to a less restricted API. Until we see more concrete information I remain sceptical that they are only using the official API.

Why? Everything it needed for that 1v1 was available. There was nothing in the FoW that needed to be known, it can keep timers about enemy cooldowns, gold and xp (approximations at least since it's somewhat random).

I'm not suggesting they had information that a human player wouldn't (i.e. cheating), I just suspect that they may have access to a richer api.

Which would give them what? The second clause of your sentence contradicts the first.

The statements are not contradictory - they can have a superior API which doesn't permit 'cheating'. For example they could expose a C++ API rather than a Lua API for performance reasons. Furthermore some of the API calls (e.g. GetNearbyCreeps) have arbitrary restrictions.

I'm sure they got some kind of official API since this seems to be a partnership between OpenAI and Valve.

Blizzard's Starcraft did the same thing for their recently announced AI project where Starcraft 2 got an API for the AI developers to work with:


Looks like it's also being hosted on Twitch.tv in addition to YouTube:


Twitch has been the official streaming partner for 7 years.

This was fun. It's our intern's last day on the ML infra team here. And he happens to be a competitive collegiate DOTA player. We were all crowded around watching the screen, shouting. Couldn't have planned a better send off. Nice job, OpenAI!

As a ML researcher and an avid dota fan, I'm jealous! And it must have been great with Dendi the legend there too.

If I remember correctly, DeepMind recently released a kit to train bots to play Starcraft 2. Would anyone be able to compare the differences and difficulty of playing dota and starcraft for a bot?

Elon posted these AI fear/regulation tweets right after he retweeted the OpenAI dota 2 blog post:



All this fancy stuff and Valve still can't detect people feeding chicken...

I wonder if some of the things the guy says are misleading.

In the video, we see the bot "creep blocking." For those unfamiliar with dota, players can use the model of the unit they control to obstruct the movement of allied computer controlled units in order to gain a favorable position.

I suppose it's possible that over millions and millions of matches played against itself, the OpenAI bot "invented" this behavior for itself. But it seems more likely to me that the programmers "built that behavior in."

It would be pretty much impossible for the programmers to "build the behavior in" to the neural network, unless you mean training on supervised data or something.

It's not impossible, it's called inverse reinforcement learning, where they learn a value function from an external demonstration. Then they use this value function for teaching the bot an action policy. Intuitively, the idea is to learn first what are a good state and a bad state, based on external demonstrations, then use that to teach the bot how to act.

This kind of learning is similar to GANs, where the discriminator learns from real data and the generator learns from the discriminator.
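To make the two-step idea concrete, here is a toy sketch; it is not what OpenAI did, and it is not a real IRL algorithm. IRL is usually described as recovering a reward function, and here a crude visitation-frequency score stands in for that step. The chain world, the demonstrations and every name below are hypothetical.

```python
from collections import Counter

N_STATES = 5          # hypothetical chain world: states 0..4
ACTIONS = (-1, +1)    # step left or step right
GAMMA = 0.9           # discount factor

def step(state, action):
    """Deterministic transition, clamped to the ends of the chain."""
    return min(max(state + action, 0), N_STATES - 1)

def score_states(demos):
    """Step 1: score each state by how often the demonstrator visits it."""
    counts = Counter(s for traj in demos for s in traj)
    total = sum(counts.values())
    return [counts[s] / total for s in range(N_STATES)]

def q_value(reward, values, state, action):
    """Score of taking `action` in `state` under the learned state scores."""
    nxt = step(state, action)
    return reward[nxt] + GAMMA * values[nxt]

def greedy_policy(reward, sweeps=200):
    """Step 2: value iteration on the learned scores, then act greedily."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        values = [max(q_value(reward, values, s, a) for a in ACTIONS)
                  for s in range(N_STATES)]
    return [max(ACTIONS, key=lambda a, s=s: q_value(reward, values, s, a))
            for s in range(N_STATES)]

# External demonstrations: an "expert" who always walks to state 4 and stays.
demos = [[0, 1, 2, 3, 4, 4], [1, 2, 3, 4, 4, 4], [2, 3, 4, 4]]
reward = score_states(demos)
policy = greedy_policy(reward)
print(reward)
print(policy)
```

The learned policy steps right from every state, i.e. it imitates the demonstrator without ever being told "go right" directly.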

Very interesting! Thanks for sharing -- I'll look more into this.

Given that they said explicitly that the bot invented all of its behavior for itself from scratch, it seems more likely to me that it did so.

MEH. I am not impressed. Restricting the number of parameters and variables in order to produce a bot that can do 1 sub-set of tasks really well is nothing special imo.

Complexity and simulations of what you have not yet encountered is something humans can do with ease. This is not something a bot can do in a complex environment like DOTA.

I'll be the first to admit that I was wrong and that AI is truly a thing once I see a bot like this one beat a pro-team in a 5v5. Until then, meh...

It is an improvement over the other bots, and it could be really good for training players; I would love to test it myself. But yeah, Dota is bigger than this: not only 5v5, but also 100+ heroes, and professional players can play a variety of heroes mid, each with different matchups and their nuances. The bot still has a long way to go to even dominate 1v1.

Even with a lot of restrictions, I can't imagine how many variables this bot has to take into account, and then generate the output within a sub-millisecond time period. The most interesting part was the bot's aggressive positioning, and its faking of spells to scare the shit out of Dendi (the pro player competing against it). Will definitely follow the progress of this project.

I really don't think this is a big accomplishment. Dota 2 is a team game where 5 players all work together. I went in thinking I would see a 5v5 against bots. They promised a 5v5 against bots next year so we'll see how that plays out.

I wonder what the input for the bot was. Just a screengrab like in other reinforcement learning examples (like Pong or GTA etc.), or did they use some Dota API to make their lives a bit easier?

Impressive nonetheless, like others said, knowing the reward function would be interesting.

How do things like reaction time, and actions per second work with something like this? Is it just an assumed advantage the ai gets, or does it simulate the limitations of a human? If it doesn't, how big of an advantage is it in an ai versus human competition?

That was pretty much the entire reason it won. It did some standard high-level 1v1 laning techniques, but if human "conditions"[1] were implemented it would look a lot different.

Just the blocking of the lane creeps at the start is already super-human and gives the bot a huge advantage.

[1] like lower actions per minute, latency, occasionally misclicking and not being 100% certain about distances
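For what it's worth, two of those "conditions" (latency and an action-rate cap) are easy to bolt onto any policy. The sketch below is purely hypothetical: `HumanLikeLimiter`, its parameters, and the tick-based timing are invented for illustration, and there is no indication of whether the OpenAI bot applied anything like it.

```python
from collections import deque


class HumanLikeLimiter:
    """Wraps a policy so it reacts to stale observations and acts at a capped rate.

    Hypothetical sketch: `policy` is any callable mapping an observation to an action.
    """

    def __init__(self, policy, reaction_ticks=2, min_gap_ticks=3):
        self.policy = policy
        self.reaction_ticks = reaction_ticks      # observation delay, in game ticks
        self.min_gap_ticks = min_gap_ticks        # minimum ticks between two actions
        self._pending = deque()                   # observations not yet "perceived"
        self._ticks_since_action = min_gap_ticks  # allow the very first action

    def act(self, observation):
        """Call once per tick; returns an action, or None when no action is allowed."""
        self._pending.append(observation)
        self._ticks_since_action += 1
        if len(self._pending) <= self.reaction_ticks:
            return None                           # nothing is old enough to react to
        delayed = self._pending.popleft()         # what the player "sees" right now
        if self._ticks_since_action < self.min_gap_ticks:
            return None                           # action-rate cap still in effect
        self._ticks_since_action = 0
        return self.policy(delayed)


# An "echo" policy makes the delay visible: the action is the observation it saw.
limited = HumanLikeLimiter(policy=lambda obs: obs)
actions = [limited.act(tick) for tick in range(8)]
print(actions)
```

With the echo policy, every returned action is an observation from `reaction_ticks` ticks earlier, and non-`None` actions are at least `min_gap_ticks` apart. Misclicks and distance uncertainty would need noise added on top.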

Seems like the AI was beaten at least 50 times.


This is really cool outreach on OpenAI's part. So many young people watch The International and I'm guessing that at least a few of them are more interested in CS and STEM from seeing this.

Where's the paper describing the exact methods used?

I love both Dota and ML, this is awesome. I would love to know whether they will release the source code for us to play around with.

It is OpenAI, after all.

OT: Is backdooring prevented by the game engine yet? Annoyed me that a useful tactic was "against the rules"

It's always been prevented in Dota 2. Buildings have backdoor protection, which makes them regenerate lost HP from recent attacks unless there are creeps nearby.

Not really "prevented"; backdoor protection only provides 90 HP/s of regeneration.

Backdoor protection also gives a substantial damage reduction: 25%, and much more against illusions.
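Putting the two numbers quoted in this subthread together (90 HP/s regeneration and 25% damage reduction, both taken from the comments here and likely patch-dependent), a rough sketch of what a backdooring hero actually achieves looks like this. The function names are made up for illustration, and the model ignores armor and illusions.

```python
def net_building_dps(raw_dps, damage_reduction=0.25, regen_per_sec=90.0):
    """Damage per second that actually sticks to a backdoor-protected building.

    Defaults are the figures quoted in the thread, not verified game constants.
    """
    return raw_dps * (1.0 - damage_reduction) - regen_per_sec


def break_even_dps(damage_reduction=0.25, regen_per_sec=90.0):
    """Raw DPS at which the building's HP stops moving in either direction."""
    return regen_per_sec / (1.0 - damage_reduction)


print(break_even_dps())          # raw DPS needed just to cancel the regeneration
print(net_building_dps(200.0))   # effective DPS for a hero dealing 200 raw DPS
```

Under those assumed numbers, anything below 120 raw DPS makes no progress at all, which is why "prevented" is close to true early in a game and much less so later.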

T1 towers don't have backdoor protection though.

This whole emphasis that OpenAI and DeepMind put on human/robot collaboration is a paper-thin PR move in my opinion. The robots will be better than any human, and soon humans will not be able to contribute a single thing. But they try to make us all feel safe by making it look like they benefit from our brains.

I know it's not the same type of AI, but in chess there's a whole scene for computer + human play. A chess engine on its own can have trouble seeing strategic ideas that humans can recognize (e.g. opposite-colored bishop endgames, certain closed positions, and fortresses), so an engine on its own will lose to that same engine assisted by a skilled human player. In other words, humans are still capable of contributing at least a little. That said, it's not much: I think a chess GM paired with an engine will probably only be able to beat an engine rated ~100 points higher than their own engine.
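For a sense of scale on that ~100-point figure, the standard Elo expected-score formula can be evaluated directly. This is just the textbook formula; the 100-point gap is the estimate from the comment above, not a measured result.

```python
def expected_score(rating_diff):
    """Elo expected score (win = 1, draw = 0.5) for the side that is
    `rating_diff` points stronger than its opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


print(round(expected_score(100), 3))  # expected score for a 100-point favorite
print(expected_score(0))              # evenly matched opponents
```

A 100-point edge corresponds to an expected score of roughly 0.64 per game, so a human who can offset that much is contributing something real, but not a lot.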

It will be interesting for deep learning, though, where the ideas are a bit more abstract. Perhaps humans will be useful for a while longer.

AI uses many simulations to train its network. That works for chess or Dota, but you can't do it for a real war, for example: there have not been enough wars to learn from, and you can't simulate war well enough. Or take making a business plan for Oracle Corporation: it's a unique situation, and you don't have millions of Oracles going bankrupt to learn from. Humanity has this knowledge, but it's encoded in books, teachers and experts. So AI will need to communicate with people to adapt to our society. It can offer advice to experts, but it needs to learn from those experts first.

Very impressive, even if there are some limitations. I look forward to more progress for a team of bots.

Well the presentation was a little meh but very impressive tech they built, excited to see the 5v5.

To OpenAI folks: are you planning to publish a paper with implementation details?

When will the match happen or did it already happen? I did not find any vods.

OK, it's happening right now; I saw the announcement from Machine, the host.

Weird website... No indication of whether a match is happening, has happened, or is going to happen, and no times.

I guess I missed it and the stream link is now just a general link to the TI stream, which confused me for a bit.

They probably didn't expect the Dota tournament to have such a wide reach...

It looks like it's over…? Does anyone have a replay of the relevant part(s)?

As a long time dota2 player, it was simply absurd just how good it was.

Is this the beginning of the end of online gaming? If bots outperform humans, what's to prevent someone from running bots in online games (even poker and chess) and making sure that everyone else loses?

Or, quite the opposite.

I played DotA a few dozen hours, mostly against the computer, to learn. Then I played only a few matches against people, but I generally didn't like them.

Why? Because there were expectations that I would know certain things, or that I would perform at a certain level, and the other players were, on average, a bunch of young kids with tons of time to play DotA and their own weird jargon and acronyms for me.

I would strongly prefer to play against bots only, and be able to adjust the difficulty, to make the game always challenging and interesting as I progress in skill level. It could also make the "competitor" in me happy by knowing what my (objective) ranking would be.

Is there a research paper on this?

this is pretty awesome, hopefully they'll release some more details about how it was implemented

Figuring teamplay out is going to be a lot more complicated, but a strong 1v1 bot is already very impressive. Props to OpenAI. It was a bummer that Universe didn't take off; it was a pretty cool project as well.

Anyone else interested in seeing this bot Vs itself?

