
Predicting outcomes for games of skill by redefining what it means to win - morelandjs
https://arxiv.org/abs/1802.00527
======
macromaniac
I like the way Elo is computed. Win/loss is the fairest metric because, by
definition, it doesn't overvalue any particular in-game statistic; everything
is weighted perfectly. Still a good read, though.

~~~
cjslep
I prefer Glicko and Glicko-2 over Elo. It's been 10 years since I implemented
them and played with the math, but if I remember right they are far more
accurate in games that have significant amounts of RNG leading to lucky
win/loss events. But if a player proves they weren't lucky by being consistent,
the rating is still just as malleable as Elo. So perfect for a game like
Hearthstone.

~~~
sukruh
Also relevant is TrueSkill[1] where the game is between teams of people who
are individually ranked.

[1] [http://trueskill.org](http://trueskill.org)

------
highd
Regarding this and previous discussions on this topic on HN, it seems to me
that one of the primary motivating factors when constructing a new ranking
system should be the possibility of cyclical dominance a la
rock/paper/scissors, which we should expect to see in many modern complex
games. If one wanted to solve that problem, I think it would require some form
of multidimensional ranking. That would add enough flexibility to
accurately predict winners from the ratings with some function f(x_1,x_2)
even in cyclical cases. This would be less interpretable, but it would have a
fair shot at actually modelling/predicting/depicting the dominance
relationships between players.
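
To make the f(x_1,x_2) idea concrete, here is a minimal sketch of one possible form, not something taken from the paper. It assumes each player's rating is a scalar skill plus a hypothetical 2-D "style" vector, and adds an antisymmetric cross term to the usual skill difference. The names `win_prob`, `style`, and `style_weight` are made up for illustration.

```python
import math

# Hypothetical rating layout: (scalar_skill, (u, v)). The 2-D "style" vector
# is an assumption for illustration only.
def win_prob(x_1, x_2, style_weight=2.0):
    """P(player 1 beats player 2) under one possible choice of f(x_1, x_2)."""
    s1, (u1, v1) = x_1
    s2, (u2, v2) = x_2
    # Antisymmetric cross term: swapping the players flips its sign,
    # so win_prob(a, b) + win_prob(b, a) == 1.
    logit = (s1 - s2) + style_weight * (u1 * v2 - v1 * u2)
    return 1.0 / (1.0 + math.exp(-logit))

def style(angle_deg):
    """Unit style vector at the given angle (degrees)."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

# Three players with identical scalar skill but styles 120 degrees apart.
rock = (0.0, style(0))
scissors = (0.0, style(120))
paper = (0.0, style(240))

matchups = [("rock", rock, "scissors", scissors),
            ("scissors", scissors, "paper", paper),
            ("paper", paper, "rock", rock)]
for name_a, a, name_b, b in matchups:
    print(f"P({name_a} beats {name_b}) = {win_prob(a, b):.2f}")
```

With equal scalar skill, each player is favored over the next one around the circle, which a single number per player cannot express. The cost is the one already mentioned: the style coordinates have no obvious interpretation.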

~~~
morelandjs
I agree this is a critical feature that Elo ratings omit. Elo ratings are
transitive, i.e. if I know the probability that A beats B and the probability
that B beats C, then I know the probability that A beats C. It's almost
certainly the case that non-linear "rock-paper-scissors" effects exist in
professional sports.

The tricky bit is trying to estimate these effects with a finite number of
samples or observations. Elo thrives on limited data.
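
A small sketch of the transitivity point, assuming the standard logistic Elo formula on the usual 400-point scale. The ratings and the helper names `elo_expected` and `implied_p_ac` are made up for illustration. Under that formula the win odds multiply along a chain, so P(A beats C) is fixed once the other two probabilities are known.

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Standard Elo expected score for A against B (logistic, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def implied_p_ac(p_ab: float, p_bc: float) -> float:
    """P(A beats C) implied by P(A beats B) and P(B beats C): the odds multiply."""
    odds = (p_ab / (1.0 - p_ab)) * (p_bc / (1.0 - p_bc))
    return odds / (1.0 + odds)

# Illustrative ratings only.
ratings = {"A": 1700.0, "B": 1600.0, "C": 1450.0}
p_ab = elo_expected(ratings["A"], ratings["B"])
p_bc = elo_expected(ratings["B"], ratings["C"])
p_ac = elo_expected(ratings["A"], ratings["C"])

# The two numbers agree up to floating-point error.
print(p_ac, implied_p_ac(p_ab, p_bc))
```

One consequence: if A is favored over B and B is favored over C, then A is necessarily favored over C, so a rock-paper-scissors cycle cannot be represented.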

------
everdev
For me the most frustrating part of predicting outcomes (elections, games,
etc.) is that, unless you're predicting with 100% confidence (which I've never
seen), essentially no one can call your prediction wrong.

If a prediction model were consistently accurate, its creators would be
breaking Vegas instead of publishing predictions on a blog.

~~~
richard___
No one's breaking Vegas because most games and payouts are designed
probabilistically for the house to win.

~~~
joosters
I think they are referring to sports betting rather than games of chance.
There’s no point calculating an Elo rating for number 5 on the roulette wheel.

------
carbocation
Since Elo and PageRank can sometimes (but not always) be used in similar
circumstances, is there a PageRank generalization that is similar to this
approach for Elo?

~~~
thanatropism
PageRank would weight all games the same -- unless graph weights are somehow
adjusted. Elo naturally down-weights older games.

Put differently: PageRank models stable dominance relations, while Elo better
reflects evolution in skill.
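
A rough sketch of the recency effect, assuming the standard Elo update with K = 32 against a hypothetical opponent whose rating stays fixed at 1500. Both are simplifications for illustration. The two made-up histories contain the same games in a different order, so any order-insensitive summary (such as a plain win count) treats them identically, while Elo does not.

```python
def expected(r: float, r_opp: float) -> float:
    """Standard Elo expected score (logistic, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((r_opp - r) / 400.0))

def run_elo(results, start=1500.0, opponent=1500.0, k=32.0):
    """Apply the Elo update game by game; each result is 1 (win) or 0 (loss).

    The opponent's rating is held fixed, which is an assumption made
    to keep the example small.
    """
    r = start
    for s in results:
        r += k * (s - expected(r, opponent))
    return r

# Same 50 wins and 50 losses, opposite order.
late_bloomer = [0] * 50 + [1] * 50   # loses early, wins recently
fading_star = [1] * 50 + [0] * 50    # wins early, loses recently

print("wins:", sum(late_bloomer), sum(fading_star))
print("late bloomer:", round(run_elo(late_bloomer)))
print("fading star: ", round(run_elo(fading_star)))
```

The player with the recent wins ends above the starting rating and the other ends below it, because each update pulls the rating toward the latest results and earlier games are progressively overwritten.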

~~~
carbocation
Succinct and clear. Thanks!

------
pokoleo
Cleaner version of the paper is on arxiv-vanity:

[https://www.arxiv-vanity.com/papers/1802.00527/](https://www.arxiv-vanity.com/papers/1802.00527/)

~~~
occamrazor
I get an error page (on mobile)

