Hacker News new | past | comments | ask | show | jobs | submit login
Sunfish: A simple but strong chess engine in 111 lines of Python (github.com)
291 points by ff_ on Aug 4, 2014 | hide | past | web | favorite | 95 comments



Óscar Toledo G. has notably written various tiny, strong and highly obfuscated chess engines in C (winning the IOCCC twice) and JavaScript (2nd place in the first JS1k). He's even written a 170 page book to serve as a reference to the 1326-byte "Nanochess" program, his strongest small chess engine.

http://www.nanochess.org/chess.html


Amazingly, the Dutch program Micro-Max by Geert Muller

http://home.hccnet.nl/h.g.muller/max-src2.html

is even shorter and stronger than Toledo's.


That's crazy. Does look more like APL than C though...


Reading through Micro-Max takes a while, but the ideas in there are great! Like Sunfish it also doesn't do underpromotion: "which I considered a dull input problem".


Amazing indeed.


I was looking at Oscar Toledo's family website (http://www.biyubi.com/eng_principal.html) and it's rather interesting, apparently they assemble their own PC's have their own OS and Browser. Has anyone here tried any of these out?


Their OS is virus proof! I'm on board.


Pretty neat stuff. Looks like the move generator is missing a few features like underpromotion but it is very concise. I wrote a rudimentary engine + move generator[1] using bitboards[2] a couple of years ago. Unfortunately python poorly suited for bitmath optimizations because it doesn't support fixed width integers. Once I saw how slow my movegen went compared to a c version, I gave up on finishing up the minimax search to complete the engine. It does run quite a bit faster in pypy but still no easy way to force 64 bit integers.

[1] https://github.com/vishvananda/ivory [2] https://chessprogramming.wikispaces.com/Bitboards


You can either use numpy ints in pypy (I think they're nicely optimized) or there is __pypy__.intop module that does something. Come to #pypy on freenode we'll sort you out ;-)


Cython? Numba?


The problem with Cython, in this case, is that in order to get close to C/C++ speed, you end up having to write code that looks an awful lot like C/C++. All of the Python black magic that let's you write this in 111 lines goes away.


"class Position(namedtuple('Position', 'board score wc bc ep kp')):"

Whoa. I've literally never seen this idiom before.

But I immediately like it.


Mmmhmm! Just be aware of some fun things to watch out for when it comes to the __init__ and __new__ functions:

http://stackoverflow.com/questions/4071765/in-python-how-do-...


It's kind of a weird idiom because you end up with a class 'Position' inheriting from another class named 'Position'


I have not seen half the stuff here -- black magic! Awesome work Thomas!


If you are interested in this sort of thing, the chess programming wiki is an excellent resource:

http://chessprogramming.wikispaces.com/


Correct me if I'm wrong, but that looks like 388 lines of Python?

Still, very cool. I spent awhile trying to write my own and failed miserably. :)


Yes, that is 388 lines, however, the title makes reference to the chess engine itself. The rest of the code, such as setting up the board, handling moves, and ui, are not specific to the sunfish implementation, they can be used without the engine.


It would be much easier to read if the 111 lines where separated into their own file, though. Otherwise it's hard to discuss what is really important for the engine and what isn't.

Sidenote: independent from the discussions about the lines I consider it an awesome piece of work. Even in 388 lines it's still much better than what I could deliver.


The comments and empty lines amount to around 200 lines, OP is not counting those I suppose.


No:

     wget 'https://raw.githubusercontent.com/thomasahle/sunfish/master/sunfish.py'
     egrep -v '^$|^ *#' sunfish.py > sustrip.py
     wc -l *py
     388 sunfish.py
     263 sustrip.py
(or: equivalently:)

    egrep -c '^$|^ *#' sunfish.py 
    125 #Comments, empty lines
    egrep -c '^$|^ *#' sunfish.py  -v
    263 #Non-empty lines, non-comment lines
I believe the actual chess engine might be 111 lines, though.


I'm guessing it comes to 111 if you remove the comments.


Whether or not the constitute "Python" code is debatable. But it is common practice to not count blanks and comments when stating LoC.


I haven't played it yet, but I am curious, the variable pst seems to give the score of a piece given the square it is, independently of the position? How it that possible, am I missing something? (You only use pst to calculate the score of a move, if there is no check, checkmates or capture).

Having said that, the code is very neat and readable. Thank you for showing it here!


Correct, that's what it is doing. It is a very fast and surprisingly strong way of evaluating the position. Of course stronger engines do a hell of a lot more than this and gives weightings based on the positions of other pieces, but most of them have something like this as a baseline.


Exactly! Indeed some engines (the Micro-Max engine) just tries to move all pieces towards the center, and it still works pretty well! (http://home.hccnet.nl/h.g.muller/pcsqr.html) I wanted to add the table though, as I thought it made the games more interesting, and gave a nice handle on tweaking the engine.


PST stand for piece-square-table. You keep the static evaluation simple and fast, and let the search hash out the dynamic details. The search is surprising effective at its job, and trying to evaluate a position statically is surprisingly difficult.


Unfortunately this program doesn't play legal chess. When I played the program it did not move its king out of attack when checked. And yes, I've played enough chess to know an illegal move when I see one.


> And yes, I've played enough chess to know an illegal move when I see one.

It's not like Chess is some weird, esoteric game where there is a gray area of whether or not moves are legal or not.

Maybe I'm biased. I'm currently working on a toy chess engine myself.


As an experienced chess player, there are some situations in which people who know the rules still might get tripped up on. For example, consider the situation:

    K R r
    _ _ _
    k _ _
Can `k` move to the right? Someone who isn't familiar with this situation might think so, because `R` is pinned to `K` by `r`, but it turns out that even though `R` can't move to the square right of `k`, `R` can still give check to it, so the move would be illegal.

Then you also get people who don't know about rules regarding castling through check, etc.

Basically what I'm getting at is, there are moves that even people who casually play chess might misunderstand, so OP was implying that they were not one of these people.


I learned the rules of chess from a children's book in third grade, but have never played a full game as far as I can remember. Yet this situation seems extremely obvious to me. One of the most fundamental rules from my children's book was that you cannot make a move that puts yourself in check. Moving k to the right would clearly do that. And it's pretty obvious when you think through what would happen if k moved to the right. R would capture k, which would not put R in check, because the game would be over.


It only trips people up because the R can't actually move to that square. So if the k were say, a queen, the queen would be safe on that square. The opposing rook would not be allowed to capture it, because they would be putting their own king in check.

If one thinks about chess as if the goal were to capture the opponent's king before one's own king is captured, this makes perfect sense. But sometimes the concept of checkmate muddles peoples' intuitions.


It would only trip people up if they memorize the rule: "You can't make a move that exposes your king to check" but then, for some strange reason, forget that it applies to the other guy too.


Sorry, could you describe what the colors and pieces are? I'm unclear as to what you are illustrating.

Edit: Thanks for the clarifications. lowercase = one color, uppercase = other color. K = king, R = rook

The question is... does the rook R still provide attack/check even though it is pinned by rook r to King K?

The answer is yes.


I think he's saying there are two kings and two rooks, one color in capitals and the other in lower case.

His situation is correct, however I don't think any experienced chess player would not know that.


I use that example specifically because it came up on /r/chess the other day, and I've had newer players ask me once or twice. I agree that an experienced player should know, but casual players may not.


Lowercase and capitals, just like the python program in the post uses.


Yeah, this might be a stretch, I had never heard about the checking piece needing to be able to actually capture the King for the check to be valid.


If the King cannot escape AND he's not checked it's a draw.

It's a common mistake that is likely to be made in finals from beginners.


>It's not like Chess is some weird, esoteric game where there is a gray area of whether or not moves are legal or not.

Actually it is exactly like that.

If you play tournament chess, like USCF tournaments, then there's no doubt what the rules are. However if you play casual chess, like with some random guy in the park, then be prepared for some misunderstandings.

Some things casual players think are rules:

    Pawns move two squares only on the first move of the *game* not the first move of the pawn

    You can't promote a pawn the move after it arrives on the 7th rank, you have to wait a move

    Perpetual check is illegal (somewhat like the Ko rule in go)

    Capturing En Passant is generally not known

So when I'm playing a stranger I always ask if they've played any tournaments, not to gauge how strong they are, but to see if they'll know the rules of chess.


I don't know if I would consider people being blatantly ignorant of the rules evidence that the rules are a "grey area." I learned how pieces move and capture from a children's book in third grade, and I distinctly remember it covering those rules, except for the third, which I have never heard of.

As far as I know, these rules are extremely standardized in most or all areas of the world where chess is known. I have heard about some special rules for competitions that can differ, like timing limitations and end-game rules. I'm also aware of many variations of chess, but those are always clearly discussed as variants.


I haven't played in any chess tournaments, but I started doing this myself after my cousin captured my bishop with an en passant. The game was ruined by the ensuing argument...


You can only capture a pawn En passant and only if the pawn moves 2 squares in 1 go.


Perhaps an en passant lead to a position in which the bishop could not avoid being captured. If not, OPs cousin is indeed wrong.


Yeah, my cousin thought if his pawn was on the 5th row he could capture any piece next to it by performing an "en passant". Of course, it wasn't a proper en passant, but the way he moved his pawn was the same, it's just he was capturing a bishop instead of another pawn.


Not understanding the rules of something, doesn't make the rules a 'gray area'. That makes the understanding a gray area. The rules are very explicit.


> Pawns move two squares only on the first move of the game not the first move of the pawn

Can you elaborate on this? I've never seen any claim to this effect, only that a pawn on its starting rank has the option of moving two squares (but cannot use that to jump over or capture a piece) (e.g. as stated http://en.wikipedia.org/wiki/Pawn_(chess)). Are these different claims or am I misunderstanding?

> You can't promote a pawn the move after it arrives on the 7th rank, you have to wait a move

Same here. "The new piece replaces the pawn on the same square, as part of the same move" (e.g. http://en.wikipedia.org/wiki/Promotion_(chess))

I'm only showing wikipedia cites to indicate that if these are wrong they are widely spread, and hoping that you have better cites.


He was demonstrating a misconception. The truth is pawns can move 2 squares on their first move at any point in the game.


Yeah, but it isn't a grey area, the rules are well defined and actually pretty easy to understand. I've had people not know the rules before, but they wouldn't call themselves experienced.


> Capturing En Passant

I actually did not know this one, and I thought I played enough Chess to know all the rules.


I've known the rule since i was a kid, but i've never seen it in an actual game.


En passant is a bit funny in that, once you learn it, it's tempting to use it when it would actually be better not to.

When I played tournaments in high school, most players knew about it, but most players were still quite amateur, so one could offer up a position in which the opponent could perform an en passant, but the en passant was actually fairly detrimental, and more often than not the opponent will take it, because "hey cool I can do that weird pawn move".


Oh good, I'm glad I'm not the only one who did that in tournaments when I was in school!


This is a clear bug. I bet the author would appreciate if you open a ticket on the project page.


Reporting a bug, or a pull request with a failing test would help the author no doubt.

It would also add some credence to your argument which lacks any reproduction details.


Author here. I'm sorry things are not working out. I'd be very interested if you could open a report with the position involved. Sunfish move generation is tested pretty thoroughly, generating tens of millions of positions and comparing with other engines. But of course it's still young and glitches may happen :)


Doesn't that just mean that you win? Why is it important that it's not legal?


No. While in check, the only legal moves are those that take you out of check. It is checkmate (and you lose) if and only if there is no legal move that would take you out of check.

Also, a move that would place yourself in check is never legal.


Yes I understand that. But why is it an issue, can't you just take his king in the next move? Either way he's forfeit the game.


He is required to make a move that takes him out of check. If he tries to make some other move, you inform him that it's illegal and roll it back.

Also you never actually take the king, you just get the point where you could and it has no escape.


I've a VIC-20 here running chess. 3.5k RAM. And people even got it to run on the ZX-81, with 1k. Now that's impressive.



The greatest program ever written [1]

[1] https://www.kuro5hin.org/story/2001/8/10/12620/2164


I had a ZX-81 and it was not poorly documented. Also, the Z80 assembly language was quite advanced compared to others from the same era (fe 6502). It had 16 some bit registers and 2 register sets (with at that time plenty of registers) that could be swapped.

The machine had a Basic interpreter that used bytecodes (to save memory), so programmers regularly mixed adopted a mixed style : basic + asm .

Anyway, to write a chess program for that machine is however rather impressive.


Sorry, but holy shit.


I wrote a chess game on a PIC16c-something. It had 176 bytes of RAM and only a 8 level return stack. I implemented recursive minimax by managing my own data stack since the compiler claimed to not support recursion. Code size was IIRC around 2K. You could play it on a battery charge controller we were making via UART in a terminal program, it printed the board and accepted key strokes. It declared checkmate when it was a certainty but before it actually happened...


Yuck, just caught it out with:

  e2e4 g8f6
  e4e5 f6d5
  d2d4 b8c6
  f2f4 e7e6
  c2c4 d8h4?
  g2g3 f8b4
  c1d2      -- 2 pieces en-prise at this point.
       h4h6
  c4d5 e6d5
  d2b4 c6b4
  a2a3 b4c6
  b1c3 c6e7
Just a piece down with very little compensation.


Auch, I guess it tried to do too many things at once, given its limited search depth :)


Pretty cool. I once wrote a C++ program to play me in chess. It was ~5000 lines of code! I can appreciation this script.

I think a more interesting problem now is to create computer algorithms that can be "taught" the rules of a board game --- a problem that falls squarely in the domain of supervised learning. In the studies I've found, the algorithms were provided with a lot of prior knowledge about the specific board game, so there may be a lot of room for progress.


There are approaches. One is General Game Playing where the rules are given as a set of logic formulas that basically describe a state machine

http://en.wikipedia.org/wiki/General_game_playing

http://www.general-game-playing.de/literature.html

GGP is then about deriving knowledge about the game and its state evaluation using a) the rules directly, b) the represented state tree or c) past matches in that very game.

The other one I know of is a mechanism where you use reinforcement learning and assume one state in the beginning. With more info, you start splitting the state into several states using decision tree split criteria, such as cross entropy, and you end up obtaining a game state tree together with the knowledge to play reasonably well. Problem: I don't remember how it's called.


You might be interested in 'General Game Playing'. The idea is that players are presented with a machine-readable description of game rules (in a prolog-like locigal language), and they then play the game.

Since you can't use human-tuned heuristics for every game, you have to use more general meta-heuristics. There's also a lot of room for logically parsing the game, and trying to separate out individual components, obviously meaningless moves, etc.

see http://en.wikipedia.org/wiki/General_game_playing or https://www.coursera.org/course/ggp

an example game: http://ggpserver.general-game-playing.de/ggpserver/public/vi...

it basically defines an initial game state, the moves each player can make and how they alter the game state, and the end-conditions and goals of the game.


My ego is saved by its weak endgame but it plays a pretty decent opening all on its own.


I found it quite weak in the opening, well, just about as bad as any chess engine without an opening book to go on. 1. e2e4 g8f6 2. e4e5 f6d5 3. c2c4 d5f4 (weird), 4. d2d4 f4d6. This is probably some opening like the Snake, or the Vulture or something. but beat it quite solidly opening up the f-file and hitting f7 with the queen and rook.

Second game I decided to avoid opening theory and head into a King's Indian Attack. Black's opening moves were sensible, knight and bishops to the right squares, pawns on e5 and d5, but it really wasn't taking my queenside pawn expansion seriously enough, giving up a knight for a pawn. But it's quite resourceful, it's managed to win an exchange, although it's in a desperate position. Currently it's hanging on this position.. oh. No, it's finally crashed. This is the position it died on:

  r . . . r . . .
  . . . . . . k p
  . b q . . . p .
  . R p . p . . .
  P . Q p P . . .
  B N . P . . P B
  . . . . . P . P
  . . . . . . K .
White's last move: f5h3. Took about 10 minutes and made the move h8g7, and then crashed with

  Your move: Traceback (most recent call last):
    File "sunfish.py", line 388, in <module>
      main()
    File "sunfish.py", line 365, in main
      move = parse(crdn[0:2]), parse(crdn[2:4])
    File "sunfish.py", line 345, in parse
      fil, rank = ord(c[0]) - ord('a'), int(c[1]) -1
  ValueError: invalid literal for int() with base 10: ''

So, not strong, but surprisingly strong for the size of the code. Remarkable.


Yes, endgame knowledge is hard to encode succinctly. Clearly having separate piece tables for start- and end game would help a lot, piece tables too, but apart from that I'd be interested in ideas :)


I don't care that much for chess engines, but the code looks really pythonic. I learned a lot from reading it! I already started to believe that real pythonic projects don't exist.


I thought I understood "pythonic." Aren't his variable names the exact opposite of what one would expect in pythonic code?


Tried it. Typed e4. Got this:

    Traceback (most recent call last):
      File "sunfish.py", line 388, in <module>
        main()
      File "sunfish.py", line 365, in main
        move = parse(crdn[0:2]), parse(crdn[2:4])
      File "sunfish.py", line 345, in parse
        fil, rank = ord(c[0]) - ord('a'), int(c[1]) - 1
    IndexError: string index out of range


Try saying e2e4 instead.


Yeah, that's the correct way.


This is because of the way he coded. To move a pawn you have to append with a "P". (I think, I have not tried it yet to be sure)


As a fellow chess coder I find this very cool, but we should be careful with the use of the word "strong." What ELO (for the non-chess people, rating) does this play at? I very much doubt it's truly what most would consider "strong" nowadays. In fact I doubt it plays above even 1500 - that's not strong. Great example of a simple chess engine in Python though? Surely!


You are right of course :) Sunfish is nowhere the level of 'real' engines. It is only strong compared to fellow Python engines, which I get from Ruxy Sylwyka who 'promoted' it to play in the 'Java league' http://www.talkchess.com/forum/viewtopic.php?topic_view=thre... ;) Still it's probably less than 1100 ELO. Maybe 1200 with pypy.


Here's a simple but strong suicide chess engine in literally 162 lines of C.

http://www0.us.ioccc.org/2001/dgbeards.c


Mentioning "strong" without benchmarks seems subjective.


Python is the nicest language to read, right?


I've noticed the odd trend of people bragging about how few lines it took to do X. Seems like people are trying to equate less loc == better code, which is not usually the case (better/more robust error handing and correcting, more complicated AI logic, etc).

It just means you are using a more abstracted version of whatever.

In reality, a 111 lines of Python program is likely thousands of lines if you were to count all of the standard Python libraries and/or any 3rd party libraries used.

What if I took this program, wrapped it in say, 5 lines of Python, then said I implemented chess in 5 lines of Python? Did I really? Of course not... but it makes for a good attention grabber.


> Seems like people are trying to equate less loc == better code

No. It's a tradition that goes back a long way. When it started it was about making code better - more efficient; faster; smaller. Part of that was the limited hardware available.

Now, when huge computing power is available to almost anyone, it's about the fun of finding that shortcut or that weird trick to remove a few lines of code.


Ah, that does make sense. Like Code Golf. ;)


That's exactly what it is. Code Golf is pretty fun to look at, check out the questions here: http://codegolf.stackexchange.com/

There's some cool techniques different people use to solve some seemingly major problems in just a few characters.


The best code is the code you don't have to write. I like these short cuts and I played and lost a game against this beauty after reading the code. He did not use any chess-specific libs. Just pure and simple python.

I am proud to not contribute much to LOC counts on most of the teams I worked as I clean up a lot after my colleagues to keep code duplications low and pointless code out. Less code is normally easier to maintain as long as you follow some other coding practices and don't go and inline everything.


You are of course right that less is not better. Indeed Sunfish often takes a more lengthy approach than say Micro-Max, because Sunfish is also meant to teach chess programming. Examples include:

- Using an actual text representation of the board, rather than a binary. - Having separate functions for move generation, search and evaluation. - Using a python hashmap rather than a simple zobrist hash. - Using a direct piece square table rather than some compressed/generated nonsense.

I do however also think, that trying to shorten the move generation of some programs, which can be hundreds or thousands of lines, to something like this, is a good exercise in finding patterns, and has a certain beauty ;)


"My point today is that, if we wish to count lines of code, we should not regard them as "lines produced" but as "lines spent": the current conventional wisdom is so foolish as to book that count on the wrong side of the ledger." -- Dijkstra (https://www.cs.utexas.edu/users/EWD/transcriptions/EWD10xx/E...)

Clearly there's a line beyond which the code is no longer readable, but that is not to say that more lines is better, or that there is no elegance in implementing things more concisely.


It is an indicator about the Kolmogorov complexity of chess.

http://en.wikipedia.org/wiki/Kolmogorov_complexity


Why not offer a specific criticism of how this program leverages Python, though?


I wasn't attempting to criticize this code - but merely comment on the trend I was noticing... however as DanBC pointed out, this is really more of a Code Golf exercise, which now makes more sense!




Registration is open for Startup School 2019. Classes start July 22nd.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: