
Automated Testing for League of Legends - jsnell
https://engineering.riotgames.com/news/automated-testing-league-legends
======
swanson
A great anecdote in the 'we can't test this code' discussion -- for an
aggressive two-week release cycle of this massively popular game, this level
of testing has probably paid for itself many times over.

~~~
dimino
In this case? Probably. But the idea that catching a bug in production is a
huge problem is a myth, pushed by research that was later completely
withdrawn.

~~~
developer2
Depends on the bug. Sticking to League of Legends as an example, is it really
the worst thing if they have to disable a champion for a day or two to fix a
bug? Not really. It still happens fairly often, and the impact is inconvenient
but it's not the end of the world. Now how about a bug that prevents logins or
crashes the backend? That completely takes the game offline.

These sorts of downtimes hurt customer satisfaction and the bottom line. If
serious issues arise often enough, customers lose confidence and patience and
may leave your product for a competitor. And while you're offline trying
desperately to rush a fix, you're losing revenue.

If the service/product you're offering has competitors, quality matters.
Sometimes it's not even the measurable quality, but the perception your
customers have. Find me a business owner who thinks major bugs in production
are not "a huge problem".

~~~
dimino
> Find me a business owner who thinks major bugs in production are not "a huge
> problem".

"Move fast and break things" \- Mark Zuckerberg.

~~~
Ralfp
They moved away from that, didn't they?

~~~
developer2
Indeed. Quote from Mark:

>> We used to have this famous mantra... and the idea here is that as
developers, moving quickly is so important that we were even willing to
tolerate a few bugs in order to do it. What we realized over time is that it
wasn't helping us to move faster because we had to slow down to fix these bugs
and it wasn't improving our speed.

~~~
dimino
They grew as a company to a place where they could slow down.

------
derFunk
Wow, this is pretty sophisticated and commendable. I'd love to have the
resources to do something similar. In contrast to 'standard' application
development, automated testing is really rare in the non-AAA games industry.
At least in terms of logic/active-gameplay testing. It's pure luxury; you can
only do it if you can afford it. In a project-based, work-for-hire game shop
it's almost unthinkable, because you can't sell it to your
contractors/customers. They just won't pay for effort you don't put directly
into the game. The only thing you can do is develop your own automated testing
framework over time and across projects, which is tedious because you can
never really focus on it (it's not a first-class citizen in your project
schedule).

~~~
andrepd
And yet they can't rewrite their ridiculously inefficient launcher properly
:^)

~~~
quaunaut
Actually, the launcher was replaced less than 2 years ago. The matchmaking
client has entered alpha, accompanied by rewrites of a lot of Riot systems
to help it all work together.

------
amenod
Nice writeup! Sounds pretty similar to the way serious web apps are being
tested (using Selenium & co.). I find it interesting that they built their own
testing system though - couldn't they have used some existing framework?

~~~
badloginagain
A third-party solution would be as hard as, or harder than, developing an in-
house solution. The challenge is setting up the interactions between all the
gameplay elements \- it's going to be very specific to each game.

The amount you'd need to abstract to make it reusable for other developers
would make it nearly useless.

------
amleszk
Wow, this is the first time I've seen automated testing applied to a video
game. Two things struck me as really great:

\- Staging area for their tests. So many times I've lost confidence in our
test suite because of a flaky test, and we then delete that test forever.
Adding tests to a staging area to verify that they are stable is a great
compromise.

\- Each test is a class, with 'setup', 'execute', 'verify' \- makes the tests
a lot easier to read and refactor.
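A minimal sketch of that setup/execute/verify shape \- the `GameTest` base
class and the bounty numbers are illustrative assumptions, not Riot's actual
framework:

```python
# Illustrative sketch of "each test is a class with setup/execute/verify".
# GameTest and the fake game state are assumptions, not Riot's code.

class GameTest:
    """Runs the three phases in order and collects failed checks."""

    def setup(self):
        pass

    def execute(self):
        pass

    def verify(self):
        pass

    def check(self, condition, message):
        if not condition:
            self.failures.append(message)

    def run(self):
        self.failures = []
        self.setup()
        self.execute()
        self.verify()
        return self.failures


class MinionKillRewardTest(GameTest):
    """Verifies: killing a lane minion grants its gold bounty."""

    def setup(self):
        # Arrange a tiny fake game state (hypothetical values).
        self.champion = {"gold": 0}
        self.minion = {"alive": True, "bounty": 21}

    def execute(self):
        # 'Kill' the minion and pay out the bounty.
        self.minion["alive"] = False
        self.champion["gold"] += self.minion["bounty"]

    def verify(self):
        self.check(not self.minion["alive"], "minion should be dead")
        self.check(self.champion["gold"] == 21, "expected a 21 gold bounty")


if __name__ == "__main__":
    print(MinionKillRewardTest().run())  # [] means every check passed
```

Keeping each phase in a named method means a reader can see at a glance what
state a test arranges and what it actually asserts.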

------
yokohummer7
A release cycle of two weeks looks like a _miracle_ to an outsider like me. Good to
see what's going on internally. I loved the video showing automated champion
moves, show me more! Also,

> In Wood 5 we don't use wards anyway, so I see no problem with this critical
> failure

100% agreed. There's no point in switching to the lens because nobody buys wards!

------
kyberias
Are these all the automated tests they run? The article doesn't seem to mention
unit tests, for example. Do they write and run unit tests? The test class
example seems to assert a bunch of stuff that might be easier (and cleaner) to
test in a set of unit tests.

Update: In particular this is weird:

> "Tests make use of remote procedure call (RPC) endpoints exposed on the
> client and the server in order to issue commands and monitor game state. For
> the most part, tests consist of a fairly linear set of instructions and
> queries—existing tests cover everything from champion abilities to vision
> rules to the expected rewards for a minion kill. "

They test "rules for expected rewards" with an out-of-process Python test
program that connects to the client and server via RPC. That seems an
unnecessarily complex way to test specific game rules.

Is this a symptom of a separate QA team and no developer-written unit-tests?

~~~
badloginagain
It's useful as a framework for testing all game interactions. When you think
about the number of unique abilities across the hundred(?) unique characters
and how they interact with each other, the test matrix that results is
massive.

Additionally, you want to use production server/client code as much as
possible, while getting through the tests as quickly as possible (skip front-
end flow, matchmaking, etc.).

Using client/server endpoint calls is great because it allows these
integration tests to be a list of instructions (the same instructions used by
production code), creating a very deep test suite.

Integration testing is critical for games, where you have so many independent
units interacting and changing each other's states.
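A rough sketch of what such a "linear set of instructions and queries" might
look like. The endpoint names and the in-process `FakeRpcClient` stub are
assumptions made so the snippet is self-contained; the real harness talks to a
live client and server over the network:

```python
# Sketch of a test as a flat list of RPC commands and queries. Endpoint
# names and FakeRpcClient are assumptions, not Riot's actual API.

class FakeRpcClient:
    """Stands in for the game server's RPC surface."""

    def __init__(self):
        self.champions = {}

    def call(self, endpoint, **kwargs):
        if endpoint == "spawn_champion":
            self.champions[kwargs["name"]] = {"level": 1, "gold": 500}
        elif endpoint == "set_level":
            self.champions[kwargs["name"]]["level"] = kwargs["level"]
        elif endpoint == "query_champion":
            return dict(self.champions[kwargs["name"]])
        return None


def run_linear_test(rpc):
    # The test body reads as a flat sequence of commands and queries.
    rpc.call("spawn_champion", name="KogMaw")
    rpc.call("set_level", name="KogMaw", level=6)
    champ = rpc.call("query_champion", name="KogMaw")
    assert champ["level"] == 6, "set_level command did not take effect"
    return champ


if __name__ == "__main__":
    print(run_linear_test(FakeRpcClient()))
```

Because the test only issues commands and inspects state, the same script can
drive real production binaries without knowing anything about their internals.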

~~~
developer2
>> When you think about the number of unique abilities across the hundred(?)
unique characters and how they interact with each other, the test matrix
that results is massive.

Definitely. They're at 130 champions right now. Each with four spell cast
abilities, as well as a passive ability, and unique auto attack mechanics on
some of them. There are indeed many interactions with specific abilities
between champions. Add in interaction with the map and terrain itself, like
unit collision and NPC minions and monsters. Then deal with the 154 items (the
current count on the main map) players can add to their champion, many of
which also interact with abilities, and some of which introduce additional
abilities.

It must be both a nightmare and yet very interesting to handle creating a new
champion for the game. Once they get past the step of even deciding _how_ the
new champion's abilities should interact with other champions, they then need
to code it all and manage to verify that everything works according to
expectations. Every once in a while, there are still bugs in how one ability
interacts with another. You can plan it all out, but it has got to be easy to
miss something. Too many interactions! :)

~~~
badloginagain
That's why the BVS they've built becomes so important. The number of test
cases you have to satisfy goes beyond the human mind's ability to hold.

But if you have a farm that tests v1 of your new champ's abilities against all
the others, you know exactly where the 20% of problems will lie. This is where
designers earn their paycheck \- two abilities that won't resolve by design \-
and where the programmer puts 80% of their work.

The 80/20 rule scales with the product, I suppose :)

------
BuckRogers
With LoL being the most popular video game in the world right now, it's also
the most relevant game of all time. How they build and test it is fascinating.

I've played this game for 7+ years now, and as a programmer I love to hear
about all this. A Rioter said a couple of years ago that they use Erlang[0],
and as a longtime Erlang (and now Elixir) admirer and tinkerer it was great to
learn that my favorite game uses it.

[0] https://erlangcentral.org/scaling-league-of-legends-chat-to-70-million-players-by-michal-ptaszek/

------
cpeterso
I'd love to hear more about automated stress testing and LoL's release
schedule. Do they have beta releases or code freeze before the two week
release dates?

~~~
huac
They have a public beta environment that is set up for feature testing, rather
than stress testing. The PBE does see code rollbacks (if features are
incomplete) before releases, so I believe the game logic on the PBE represents
basically what will be moved to the live server.

------
Ameo
Oh man with that kind of control of the game they could /certainly/ create
some kind of sandbox mode.

The APIs that they have for testing alone are pretty incredible; I'd love to
be able to play in a mode where you could change level on-demand, create dummy
enemies, etc.

------
bruxis
This is beautiful.

------
mangix
too bad dota 2 is a better game

~~~
kawsper
Funnily enough, the latest patch was rushed out by Valve, and it contained a
lot of bugs and strange behavior.

A lot of the bugs in Dota 2 seem to stem from its reliance on manual testing.

~~~
errantspark
Not to say that DotA wouldn't benefit from more rigorous testing, but in
Valve's defense I'd say that the complexity of interactions in DotA is far
beyond LoL. Some of the bugs you're seeing might only arise in extremely rare
situations.

DotA is balanced with creativity in mind. When an exploitative strategy is
found, the game is not changed to remove it; instead, through some combination
of tweaks and modifications, often to seemingly unrelated aspects, it's made
into something that's only situationally viable.

The reason the game is so deep and interesting is that this philosophy has
been in place for many (10+) years, resulting in a game that's much more
complex than pretty much anything else out there. DotA 2 retains edge-case
behavior that was present in the original Warcraft 3 mod. Things that were
once engine limitations are now essential parts of gameplay. Preserving these
weird interactions, instead of attempting to make the game more
regular/comprehensible, is one of the things that allows such a high level of
creativity in competitive play. There's always a clever way to outplay your
opponent. In go parlance: there's always a new tesuji to be found.

All that being said, I think Riot's efforts here are quite laudable, and DotA
could certainly benefit from more rigorous testing.

------
finishingmove
We see the test checks three things:

    
    
      Verifies:
        - KogMaw deals less damage to non-lane minions
        - KogMaw deals percentile magic damage
        - KogMaw deals normal damage to lane minions
    

and the _verify_ method has three assertions. In my opinion, this should be
three tests, each with only one assertion.

~~~
d0100
They'd soon be drowning in individual tests. And it wouldn't necessarily make
sense to separate them, considering all three have to be tested on each skill
usage. One shot has to obey all three rules.

Making each an individual test would triple the test time, pollute the test
codebase, and not bring any advantage.

~~~
chrislloyd
The advantage in splitting up the assertions is that you get more visibility.
If one test fails and not the other two, that's extra information you can use
to debug, which you wouldn't have if all three were in one test.
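One middle ground, sketched here with Python's stdlib `unittest` (Riot's
harness may well differ), is `subTest`: one setup, but each rule's failure is
reported independently. The damage model below is entirely made up for
illustration:

```python
import unittest


def kogmaw_passive_damage(target_max_hp, is_lane_minion):
    # Toy damage model (assumption): 10% of max HP as magic damage,
    # halved against non-lane minions.
    damage = 0.10 * target_max_hp
    return damage if is_lane_minion else damage * 0.5


class KogMawDamageTest(unittest.TestCase):
    def test_passive_damage_rules(self):
        # One setup, three independently reported checks: a failure in
        # one subTest doesn't hide the results of the other two.
        with self.subTest(rule="less damage to non-lane minions"):
            self.assertLess(kogmaw_passive_damage(1000, False),
                            kogmaw_passive_damage(1000, True))
        with self.subTest(rule="percentile magic damage"):
            self.assertEqual(kogmaw_passive_damage(2000, True), 200.0)
        with self.subTest(rule="normal damage to lane minions"):
            self.assertEqual(kogmaw_passive_damage(1000, True), 100.0)


if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(KogMawDamageTest)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

Each failed `subTest` is listed separately in the runner's output, so you get
the debugging visibility of three tests for the runtime cost of one.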

~~~
chii
If that test fails, the dev will just run it locally with a debugger attached.
The cost of extra time in the test for the normal passing case isn't worth it.

