
Strongest chess player, ever - gren
http://en.lichess.org/blog/U4mtoEQAAEEAgZRL/strongest-chess-player-ever
======
asdfologist
This is extremely impressive - it's cool that talented programmers are pushing
the limits of computer science to advance the state of the art of chess
engines.

However, I also wish that there were comparable efforts to create AIs that
train humans. Basically, figure out a way to systematically, efficiently, and
scalably train amateurs into masters. That IMO would be absolutely amazing
(and something I'd gladly pay for).

~~~
ronaldx
In my opinion, this would be a lot more difficult. I am slightly naive as to
exactly how Stockfish works, but:

Computers can do a lot of accurate brute-forcing; humans must see the position
in a more holistic, intuitive way.

Excellent human players and excellent computer players are presumably doing
completely different calculation tasks.

I would suggest that computers are _still_ bad at approaching the task in a
human-like way, but they will always be able to improve their method via
Moore's law (at a minimum), whereas humans are stuck at their current level.

Stockfish might be able to tell you what it was doing, but not in a way that
it would be reasonable for a human to follow.

What you are looking for is a teacher. You can pay for those ;) The closest we
have got to a teacher app is perhaps a MOOC: not much computational progress
has been made.

~~~
runeks
> Stockfish might be able to tell you what it was doing, but not in a way that
> it would be reasonable for a human to follow.

I (a non-chess player) just tried to play a game against the highest level AI
(and lost, obviously).

Analyzing the game afterwards, this is exactly what I experienced: I play
"f4" and I'm told (through the analysis tool) that the best move was "Nf3".

Now, the obvious question this leads to is: why? Why was this a better move? I
don't think that, as a human being, memorizing "best moves" is going to lead
to much improvement: we need to know WHY that move was the best move.

I'm sure there is a human-friendly way to explain why one move is the best
move, and why my move wasn't, but the computer probably doesn't know this
explanation, because it's approaching it from a brute-force perspective.

Surely, a chess computer can brute-force all possible combinations and deduce
that this was the best move. But when this is not possible for a human being,
just informing me that "what you just did was not the best move" doesn't
really do much to help me (as an amateur player).

The game, for reference:
[http://en.lichess.org/ehWjHnIc#0](http://en.lichess.org/ehWjHnIc#0)
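
For what it's worth, the engine can at least show the continuations behind
its verdict. A minimal sketch with the python-chess library (an assumption:
it's installed, with a Stockfish binary on the PATH) that prints the top
three lines and their scores; it still won't say "f4 weakens e4", but it
shows you _what_ it calculated:

```python
import chess         # python-chess, assumed installed
import chess.engine  # assumes a Stockfish binary on the PATH

engine = chess.engine.SimpleEngine.popen_uci("stockfish")
board = chess.Board()  # or load the position in question from a FEN

# Ask for the three best lines (multipv) at a fixed depth.
infos = engine.analyse(board, chess.engine.Limit(depth=20), multipv=3)
for info in infos:
    # Print the score (from white's point of view) and the first few
    # moves of each principal variation in human-readable notation.
    print(info["score"].white(), board.variation_san(info["pv"][:6]))

engine.quit()
```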

~~~
joverholt
"f4" creates a hole on "e4" in your pawn structure. This hole becomes a
valuable outpost where your opponent can position a knight (or other piece)
without it being harassed by one of your pawns. This means to get rid of it,
you need to trade a piece for it, and when your opponent recaptures, it will
give them a passed pawn.

"Nf3" protects your d4 pawn and also threatens the e5 square. Probably most
importantly, it clears a piece so you can castle kingside.

------
thejteam
What I would love to see is work on chess engines having better strength
levels for amateurs. For me, at least, there is a line: everything below a
certain level I can beat 95 percent of the time, and everything at or above
that level I lose to 95 percent of the time.

~~~
notacoward
One of my long-term goals is to write a chess program that plays more like a
lower-level human. Every program I've played seems to have the same pattern:
play like a master for ten moves, then make an insultingly obvious blunder to
let me catch up. Lather, rinse, repeat. It's neither realistic nor fun. Might
as well play with a handicap.

Real players at each level tend to have _characteristic_ weaknesses that are
expressible in terms of the same factors used in a program's evaluation
function. Consider some of the following, which players at a certain level
will exhibit and then get past as they improve.

* Bringing the queen out too early.

* Missing pins and discoveries.

* Failing to contest the center.

* Creating bad bishops.

* Bad pawn structure.

* Failing to use the king effectively in the endgame.

These are all quantifiable. They could all be used to create a more realistic
and satisfying opponent at 1200, 1500, 1800, etc. All it takes is some basic
machine learning applied to a corpus of lower-level games, and a way to plug
the discovered patterns into the playing engine.
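
A minimal sketch of that last step (Python, with entirely hypothetical
feature names and weights): bias the evaluation so a level-specific profile
undervalues exactly the factors players at that rating tend to miss, instead
of playing perfectly and then blundering on purpose.

```python
from dataclasses import dataclass

@dataclass
class LevelProfile:
    center_weight: float      # how much central control is valued
    structure_weight: float   # how much pawn-structure damage is felt
    tactic_awareness: float   # how reliably pins/discoveries are seen

# Hypothetical numbers; the idea is to fit these to a corpus of games
# at each rating band rather than guess them.
PROFILES = {
    1200: LevelProfile(center_weight=0.3, structure_weight=0.2, tactic_awareness=0.4),
    1500: LevelProfile(center_weight=0.6, structure_weight=0.4, tactic_awareness=0.7),
    1800: LevelProfile(center_weight=0.8, structure_weight=0.7, tactic_awareness=0.9),
}

def biased_eval(features: dict, level: int) -> float:
    """Score a position the way a player at `level` would: full credit
    for material, discounted credit for the subtler factors."""
    p = PROFILES[level]
    return (features["material"]
            + p.center_weight * features["center_control"]
            + p.structure_weight * features["pawn_structure"]
            + p.tactic_awareness * features["tactical_threats"])
```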

------
kristopolous
Note that it's just a chess "engine" and doesn't come with a human-usable
front-end.

But it's not hard to get the thing to use XBoard on Linux. This is what I use
to invoke it:

    xboard -fUCI -fcp stockfish -sUCI -scp stockfish &

It's very challenging - and in my experience, it will unfortunately peg a
core unless you explicitly pause the front end (the "P" button between the
two arrows in the upper right).

~~~
y4mi
What do you mean? Just click "play a game":

[http://en.lichess.org/](http://en.lichess.org/)

~~~
kristopolous
Yes that way also works!

But the xboard way allows me to modify the engine, get debug output, try
various different weights for pieces, put two engines against each other to
see how specific features affect game play...

It helps me demystify such sophisticated software.

~~~
y4mi
Heh, did you ever upload a match between two engines to the site and let the
online engine analyze it?

~~~
toxik
Very interesting idea!

------
basic_stat
_> During the final event, after playing 64 games against Komodo, Stockfish
won with the score of 35½-28½. No doubt is further allowed: Stockfish is the
best chess player ever!_

From a statistical point of view, this isn't actually significant, despite the
fact that draws help reduce the variance.

45 of those games are draws, leaving a 13-6 score in favor of Stockfish.
Considering a null hypothesis of a binomial distribution with n=19 and equal
chance of winning, the two-sided p-value for that score is 0.115. Unless you
already have strong evidence that Stockfish is better than Komodo, you
shouldn't conclude anything about which one is best.
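
A quick check of that figure (Python, standard library only): the quoted
0.115 matches the two-sided mid-p variant of the exact binomial test; the
plain exact test gives roughly 0.167. Either way it clears 0.05, so the
conclusion stands.

```python
from math import comb

# Two-sided binomial test for a 13-6 decisive score (n=19, p=1/2).
n, wins = 19, 13
pmf = [comb(n, k) / 2**n for k in range(n + 1)]
p_exact = 2 * sum(pmf[wins:])                        # ~0.167
p_mid = 2 * (sum(pmf[wins + 1:]) + 0.5 * pmf[wins])  # ~0.115
print(f"exact: {p_exact:.3f}, mid-p: {p_mid:.3f}")
```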

~~~
eterm
But it fundamentally _isn't_ binomial across 19 games, because of draws. You
can't just ignore draws in the analysis; to do so is a terrible application
of statistics.

~~~
basic_stat
Once you condition on the number of draws, you _do_ get that binomial
distribution.

Suppose you have a coin, which gives a random outcome X. But you can only
observe the outcome of X when another independent binary random variable Y is
true. How can you tell if X is biased? Since X and Y are independent, the
observations where Y is false are irrelevant since they don't tell you
anything about X. So you just keep the observations where Y is true, and from
there you can apply a binomial statistical test to the observations of X.

[ In case you're wondering whether applying statistical tests to variable
sample sizes is valid, the answer is yes: a p-value is a map from the set of
observables (augmented by an independent continuous randomization, since our
set of observables is discrete) to [0,1] that is uniformly distributed under
the null. Our p-value is a mixture of p-values on smaller sample sizes, so it
is still uniform. ]

This is exactly what happens here: consider a random outcome {win,lose,draw}.
If you don't have a draw, let Y be true and X be the outcome of the game. If
you have a draw, let Y be false and X be a random coin with the same
distribution as for non-drawn games. Then X and Y are independent random
variables and the above applies.

Informally: draws are not useful information in determining whether there are
more wins than losses.
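
A quick simulation of that argument (Python, with the draw probability set to
the 45/64 observed above): under a null where decisive games are fair coin
flips, the test that conditions on the number of decisive games rejects at no
more than the nominal rate, whatever the draw probability.

```python
import random
from math import comb

def conditional_p(wins, decisive):
    """Two-sided exact binomial p-value over decisive games only."""
    if decisive == 0:
        return 1.0
    k = max(wins, decisive - wins)
    tail = sum(comb(decisive, j) for j in range(k, decisive + 1))
    return min(1.0, 2 * tail / 2**decisive)

random.seed(0)
trials, rejections = 20000, 0
for _ in range(trials):
    # A 64-game match; win/loss equally likely, draw prob 90/128 = 45/64.
    games = random.choices("WLD", weights=(19, 19, 90), k=64)
    w, l = games.count("W"), games.count("L")
    if conditional_p(w, w + l) <= 0.05:
        rejections += 1
print(f"rejection rate at alpha = 0.05: {rejections / trials:.3f}")
```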

~~~
millstone
I'm not sure that discounting draws is the right thing to do either. For
example, Petrosian was not the strongest attacking player but was very, very
tough to beat.

This also calls into question the notion of "strongest chess player." Who is
stronger: the flashy attacking player who wins half the time and loses the
other half, or the stonewall who poses little threat but whom you can never
beat?

------
runeks
Has anyone else watched [http://en.lichess.org/tv](http://en.lichess.org/tv)?

If this is two humans playing against each other in real-time, that is really
impressive! It's so mind-bogglingly fast (mind you, I'm not a chess player).

~~~
doughj3
ChessNetwork on YouTube[1] has many videos of blitz chess tournaments where he
provides commentary as he plays. Very interesting to watch!

[1]
[https://www.youtube.com/watch?v=7YWYS209ydE](https://www.youtube.com/watch?v=7YWYS209ydE)

~~~
electrotype
Wow, very cool!

Can someone explain what's going on at 20:24?
[http://youtu.be/7YWYS209ydE?t=20m20s](http://youtu.be/7YWYS209ydE?t=20m20s)

He forks the king and the queen, and then his opponent moves his queen!?!
That's not a legal move, right?

~~~
doughj3
Had to re-watch a couple of times: on that move, white had already moved the
pawn from g2 to g3; the knight from d6 to e8 is a pre-move that happens very
quickly (you can see the red square). While white was setting up that
pre-move, black moved the queen.
Does that clarify or did I miss what you are asking?

~~~
electrotype
Got it, thanks!

------
anatoly
Can anyone comment on what makes stockfish different from other chess engines?
If I'm curious about state of the art in computer chess, is it worthwhile to
study its source? What interesting ideas should I expect to see there beyond
what I vaguely know to be the standard approach from introductory AI courses,
i.e. some sort of alpha-beta pruning search?

~~~
dfan
You should indeed study Stockfish's source, not because it is the strongest
computer chess engine, but because it is exceedingly well-written. Most
engines that are public are a big hacky mess, but the Stockfish maintainers
have made cleanliness a really high priority (sometimes proposed changes that
increase the engine's strength are rejected because they're too much code),
and it has clearly paid off.

~~~
icpmacdo
>sometimes proposed changes that increase the engine's strength are rejected
because they're too much code

that seems insane.

~~~
dfan
I know, doesn't it? But sometimes someone finds a way to simplify the patch so
it's less code and almost as effective, and sometimes future improvements are
made easier because the existing code is easier to work with. And it's hard to
argue with the results.

------
EGreg
I bet it would be unnerving for someone like Kasparov or Magnus Carlsen to
play this program, where it would have 1 minute on the clock and they could
have the whole day. It would make many of its moves in under a second, and
they'd still be better than the grandmasters' moves!

~~~
_flag
You can watch Magnus Carlsen play a chess program on his phone here:

[https://www.youtube.com/watch?v=pNvVWeHZG00](https://www.youtube.com/watch?v=pNvVWeHZG00)

As you can tell from his commentary, he thinks the machine is rather stupid
(and he is winning after the opening) but the machine is a much better
calculator than he is and when the situation becomes more concrete he has to
force a draw.

Of course, the computer on his phone is considerably worse than the best chess
engines, but top chess players generally consider computers to be excellent
calculators but dumb in terms of general strategy.

~~~
ars
> but top chess players generally consider computers to be excellent
> calculators but dumb in terms of general strategy.

Is computer-assisted chess a thing? Perhaps with standardized hardware, but
with any software allowed.

In chess I'm good at strategy but terrible at calculating, and I miss obvious
stuff all the time in my fight for strategy. I always thought I'd do great
with computer assistance to look for the obvious stuff, with me telling the
computer the long-term strategy.

~~~
pakitan
> In chess I'm good at strategy but terrible at calculating

Chess is 99% tactics/calculation. What we call "strategy" is just a set of
heuristics that we use to avoid having to do endless calculations. However, a
lot of those heuristics are already included in most chess-playing software.
So, if you're a weak player overall, even if you have some strategic acumen,
your contribution in an assisted-chess setting will be negligible. The
computer will be doing all the work anyway.

My rating is around 2000 and I have done some assisted chess playing, and I
can tell you that it's extremely hard not to just take the computer's
suggestion at every move. The chance that I'll come up with some brilliant
move that the computer missed is very slim.

~~~
ars
I think I misunderstood what "calculating" means in a chess setting. I thought
it meant checking the current position of the pieces and making sure you are
not about to be attacked.

But googling it suggests it's more about thinking of the value of each move
relative to others. If that's the case, I'm not actually bad at that.

> I can tell you that it's extremely hard to not just take computer's
> suggestion at every move.

Is that how it works? The computer just basically plays and shows you some
moves it likes?

That's not what I meant. I was thinking that you tell the computer something
like: I want to capture piece X using Y 10 to 20 moves from now, perhaps by
going in this direction. Tell me the best series of moves to get there while
avoiding traps.

Or even better give it 2 or 3 such scenarios and have it tell you how
dangerous each one would be so you can pick one.

Basically really narrow down the permutations the computer has to calculate.
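
Something close to that is already possible: UCI engines accept a restricted
list of root moves, so you can hand the engine only your candidate plans and
compare how each one scores. A sketch with the python-chess library (an
assumption: it's installed, with a Stockfish binary on the PATH):

```python
import chess
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("stockfish")
board = chess.Board()  # or any position from your game, via a FEN string

# Your two or three candidate "scenarios", as the first move of each plan.
candidates = [chess.Move.from_uci(u) for u in ("e2e4", "d2d4", "c2c4")]

for move in candidates:
    # root_moves restricts the search to this candidate only, so the
    # score tells you how dangerous that plan is after a deep search.
    info = engine.analyse(board, chess.engine.Limit(depth=18),
                          root_moves=[move])
    print(board.san(move), "->", info["score"].white())

engine.quit()
```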

------
pk2200
Stockfish 5 was released yesterday:
[http://stockfishchess.org/](http://stockfishchess.org/)

As mentioned elsewhere in this thread, Stockfish is just an engine - you must
install a GUI separately. XBoard is well known, but there are better
alternatives:

[http://chessx.sourceforge.net/](http://chessx.sourceforge.net/)

[http://scidvspc.sourceforge.net/](http://scidvspc.sourceforge.net/)

------
icambron
I always thought of the heuristics for evaluating a chess position as the
really hard part of building a chess engine; i.e. how do you capture all of
the positional subtleties in a number to feed into minimax? But looking at the
source, it's not really that complicated [1]. Can someone who knows more than
me comment on that? Is it that the innovations are elsewhere? That good chess
really can be boiled down to < 1000 LOC? That the numbers in this heuristic
are just super expertly tuned?

[1]
[https://github.com/mcostalba/Stockfish/blob/master/src/evalu...](https://github.com/mcostalba/Stockfish/blob/master/src/evaluate.cpp)
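
For a feel of how compact the core idea is, here is a toy evaluation in the
same shape as evaluate.cpp, just in Python with made-up weights: a weighted
sum of material plus a crude mobility term, which is already enough to feed
into minimax. Stockfish's file is that, times many more terms and vastly
better-tuned constants.

```python
import chess  # python-chess, assumed installed

PIECE_VALUES = {chess.PAWN: 100, chess.KNIGHT: 320, chess.BISHOP: 330,
                chess.ROOK: 500, chess.QUEEN: 900}

def evaluate(board: chess.Board) -> int:
    """Toy score in centipawns from white's point of view."""
    score = 0
    for piece_type, value in PIECE_VALUES.items():
        score += value * len(board.pieces(piece_type, chess.WHITE))
        score -= value * len(board.pieces(piece_type, chess.BLACK))
    # Crude mobility term: 10 centipawns per extra legal move, measured
    # by handing the opponent a (null) move in between.
    if not board.is_check():
        mobility = board.legal_moves.count()
        board.push(chess.Move.null())
        mobility -= board.legal_moves.count()
        board.pop()
        score += 10 * mobility if board.turn == chess.WHITE else -10 * mobility
    return score
```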

~~~
glinscott
Nearly all the evaluation terms in Stockfish have been extensively tuned.
Joona Kiiski used the SPSA[1] algorithm on a lot of them. Others have been
hand-tuned using tens of thousands of games per attempt on fishtest [2].
Fishtest actually just recently got support for running SPSA tuning as well.

There is also a strong bias towards simplification, so if an evaluation
feature is not proven to be an improvement it will be removed. Over the last
few Stockfish versions, the number of lines has actually decreased with each
release.

[1]
[http://en.wikipedia.org/wiki/Simultaneous_perturbation_stoch...](http://en.wikipedia.org/wiki/Simultaneous_perturbation_stochastic_approximation)
[2]
[http://tests.stockfishchess.org/tests](http://tests.stockfishchess.org/tests)
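
For the curious, SPSA itself fits in a dozen lines. A bare-bones sketch
(Python, with a toy objective standing in for "Elo gain over thousands of
games"): every iteration perturbs all parameters at once with a random +/-1
vector and estimates the whole gradient from just two noisy measurements,
which is what makes it affordable when each measurement is a batch of games.

```python
import random

def spsa(theta, measure, iterations=1000, a=0.1, c=0.1):
    """Maximize a noisy objective `measure(theta)` via SPSA."""
    theta = list(theta)
    for k in range(1, iterations + 1):
        ak, ck = a / k ** 0.602, c / k ** 0.101  # standard gain schedules
        delta = [random.choice((-1, 1)) for _ in theta]
        plus = measure([t + ck * d for t, d in zip(theta, delta)])
        minus = measure([t - ck * d for t, d in zip(theta, delta)])
        # Two measurements estimate the whole gradient at once.
        theta = [t + ak * (plus - minus) / (2 * ck * d)
                 for t, d in zip(theta, delta)]
    return theta

# Toy usage: recover the maximum of a noisy quadratic at (3, -2).
f = lambda th: -((th[0] - 3) ** 2 + (th[1] + 2) ** 2) + random.gauss(0, 0.1)
print(spsa([0.0, 0.0], f, iterations=5000))
```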

~~~
icambron
That's helpful, thanks.

------
DanBC
Are chess engines still trying to play against humans in an interesting way?
(I understand they beat human players, but that people feel computers play in
dull ways).

Is there a Turing Test for computer chess, where humans and computers play
each other and they, and commentators, analyse the play, but no-one knows who
is a computer or human until after the commentary is published?

And if we ignore humans, are people playing computers against other computers
for some kind of machine-learning play?

And how optimized for speed is the software? Do they really squeeze out all
the performance they can?

(Sorry for the barrage of questions, but I don't know enough about this space
to do efficient web searches.)

~~~
omaranto
> I understand they beat human players, but that people feel computers play in
> dull ways.

Without knowing anything about chess, I'd say that makes humans sound like
sore losers. I wonder if this will lead to tournaments where you don't get
points for winning matches but rather get judged on style by a panel of
(human) judges...

~~~
brownbat
In the 1800s, most chess was much more aggressive, and it was frowned upon to
reject an offered exchange of pieces. Consequently, games were faster, moves
were riskier, and everything was arguably more interesting.

In other words, modern humans have been playing in dull, more reliable ways
for over a century, so even if not sore loserish, it's at least a bit like the
pot calling the kettle black.

------
beefman
TCEC is cool but has less statistical power than many ratings lists out there,
which show that Houdini, Komodo and Stockfish are very closely matched, with
Houdini having a slight edge at long time controls and a moderate edge at
quick time controls. Stockfish does release more frequently and I'm not sure
which version competed in TCEC, but until the lists catch up this article is
fluff.

~~~
onestone
Houdini had a slight edge against Stockfish DD, which is a few months old. The
latest Stockfish version (v5, very similar to the version that competed in the
TCEC Superfinal) is ~60 Elo stronger, and demolishes Houdini at all time
controls. It should soon start hitting the more conservative rating lists.

~~~
beefman
"Demolishes"? Hardly. It is so far tied with H4 on IPON. Even if it winds up
on top, it will not be by a large margin.

------
jtth
(Side note: what the hell is that website trying to do to my computer? It's
trying to open so many connections on weird ports!)

~~~
showerst
Looks like they're using a bunch of websockets for something. Probably that
live player count up in the navigation.

------
meowface
I'm rather surprised at how relatively simple and small the codebase is:
[https://github.com/mcostalba/Stockfish/tree/master/src](https://github.com/mcostalba/Stockfish/tree/master/src)

~~~
QuantumChaos
Especially interesting to see everything hardcoded into C++. Quite different
from most modern codebases that use a mix of languages, configuration files,
etc.

------
alien3d
Just try having fun:
[http://en.lichess.org/SMVJP07p](http://en.lichess.org/SMVJP07p)
The strongest player... me? Kind of a noob...

------
rmbe
lichess doesn't seem to be giving it enough juice to perform at its stated
levels at the moment.

On my first try I managed to draw the highest AI level, rated at 2510, while
my rating is under 2000 IRL. (I was unable to find an "offer draw" button, so
I relied on the 50-move rule.) Against Stockfish running on my PC that would
be impossible.

[http://en.lichess.org/reDfuSvI](http://en.lichess.org/reDfuSvI)

~~~
ornicar
Have you actually read the article? It's all explained:
[http://en.lichess.org/blog/U4mtoEQAAEEAgZRL/strongest-chess-...](http://en.lichess.org/blog/U4mtoEQAAEEAgZRL/strongest-chess-player-ever)

~~~
rmbe
I did. I was complaining about the rated strength of the AI levels; they're
off by a wide margin.

------
arikrak
That site's pretty impressive. They're open source and give away for free what
Chess.com charges for.

------
arek2
I don't think that the present situation, in which the top chess-playing
program is free and open-source, is good for innovation.

I estimate that it would take me six months of work to get a program into the
top 20 in the world, and I don't see how I can justify that work to myself.

~~~
DanBC
I did not downvote you.

Your comment seems either ignorant of the amount of work involved or very
arrogant about your skill, but it does not provide any information for me to
judge.

Why do you think you could beat even Rybka with just six months of work?

~~~
arek2
Rybka is better than top-20, so I did not claim that.

I was a semi-pro ~10 years ago. I won the championship of my country. I wrote
an M.Sc. thesis on evaluation tuning. If I used the standard approach that I
already know, six months is a conservative estimate. The question is: what
for?

~~~
hollerith
>The question is - what for?

Uh, to win prize money?

