Is it possible to compare the skill of Bobby Fischer and Emanuel Lasker? Several years ago Jeff Sonas improved upon the Elo chess rating system and then created the wonderful chess statistics website Chessmetrics, but its ratings of the great players were still based on how they played against their contemporaries. Mr. Sonas and I briefly discussed over email whether it might be better to rate their play using computer chess engines. Back then, chess programs were not quite as strong as the best humans, but it would have been interesting. Well, time has passed, and now Houdini and Rybka are significantly stronger than the best human player, the young Norwegian Magnus Carlsen.

Other people had the same idea for comparing the best chess players of all time, and it seems that José Capablanca was one of the best, if not the best, chess player of all time. The best analyses of this type were done by Guid and Bratko in “Computer Analysis of World Chess Champions” (2006) and “Using Heuristic-Search Based Engines for Estimating Human Skill at Chess” (2011). They have also written several easy-to-read popular articles for ChessBase, including:
- “The quality of play at the Candidates” (April 2013)
- “A computer program to identify beauty in problems and studies” (2012)
- “Using chess engines to estimate human skill” (2011)
- “Computers choose: who was the strongest player?” (2006)
- “Computer analysis of world champions” (2006)
Here is how they describe their method of rating the players:
- The analysis of each game starts at move 12.
- The chess engine evaluates the best moves (according to the computer) and the moves played by the player.
- All engine’s evaluations are obtained at the same depth of search.
- The score is then the average difference between evaluations of the best moves and the moves played.
- If the player’s mistake (as seen by the engine) at particular move is greater than 3.00, the score for this particular move becomes 300 “centipawns” (to avoid unreasonably high penalties for gross mistakes).
- Moves where both the move played and the move suggested by the computer had an evaluation outside the interval [-2.00, 2.00], are discarded. (In clearly won positions players are tempted to play a simple safe move instead of a stronger, but risky one. Such “inferior” moves are, from a practical viewpoint, perfectly justified. Similarly, in lost positions players sometimes deliberately play an objectively worse move.)
- All the scores are given in “centipawns”.
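
To make the scoring rules above concrete, here is a minimal Python sketch of how such a score could be computed from engine output. It is not Guid and Bratko's code. The function name `average_loss`, the tuple layout, and the constants are my own assumptions. It also assumes that evaluations are already available in centipawns from the player's point of view, that move 12 itself is included, and that a move is discarded only when both evaluations fall outside [-2.00, 2.00], which is a literal reading of the rule.

```python
from typing import Iterable, Optional, Tuple

CAP_CP = 300      # a single move never costs more than 3.00 pawns
WINDOW_CP = 200   # evaluations beyond +/-2.00 count as clearly won or lost

def average_loss(
    moves: Iterable[Tuple[int, int, int]],
    first_move: int = 12,
) -> Optional[float]:
    """Average centipawn loss for one player.

    Each item is (move_number, best_eval_cp, played_eval_cp), with both
    evaluations taken at the same search depth and expressed from the
    player's point of view (hypothetical input format).
    """
    losses = []
    for move_number, best_cp, played_cp in moves:
        # Analysis starts at move 12 (assumed inclusive).
        if move_number < first_move:
            continue
        # Discard moves where both evaluations lie outside [-2.00, 2.00].
        if abs(best_cp) > WINDOW_CP and abs(played_cp) > WINDOW_CP:
            continue
        # Difference between the best move and the move played.
        # Clamping at zero is an assumption, not part of the description.
        loss = max(0, best_cp - played_cp)
        # Cap gross mistakes at 300 centipawns.
        losses.append(min(loss, CAP_CP))
    if not losses:
        return None  # nothing left to score
    return sum(losses) / len(losses)

# Made-up evaluations for illustration only.
sample = [
    (5, 30, -500),    # before move 12: ignored
    (12, 50, 20),     # loss of 30
    (13, 100, -400),  # loss of 500, capped at 300
    (14, 350, 300),   # both outside the window: discarded
    (15, 10, 10),     # best move played: loss of 0
]
print(average_loss(sample))  # (30 + 300 + 0) / 3 = 110.0
```

A lower score means the player's moves stayed closer to the engine's choices.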