Historic trends in chess AI

The Elo rating of the best chess program, as measured by the Swedish Chess Computer Association, contained no discontinuities of more than ten years of progress between 1984 and 2018. A four-year discontinuity in 2008 was notable in the context of otherwise regular progress.

Details

This case study is part of AI Impacts’ discontinuous progress investigation.

Background

The history of chess-playing computers is long and rich, partly because chess-playing ability has long been thought (by some) to be a sign of general intelligence.1 The first two ‘chess-playing machines’ were in fact fakes, with small human chess-players crouching inside.2 It was not until 1951 that a program was published (by Alan Turing) that could actually play the full game.3 There has been fairly regular progress since then.4

In 1997 IBM’s chess machine Deep Blue beat Garry Kasparov, the world chess champion at the time, under standard tournament time controls.5 This was seen as particularly significant in light of the continued popular association between chess AI and general AI.6 The event marked the point at which chess AI became superhuman, and it received substantial press coverage.7

The Swedish Chess Computer Association (SSDF) measures computer chess software performance by playing chess programs against one another on standard hardware.8

Figure 1: Deep Blue9

Trends

SSDF Elo Ratings

According to Wikipedia10:

The Swedish Chess Computer Association (Swedish: Svenska schackdatorföreningen, SSDF) is an organization that tests computer chess software by playing chess programs against one another and producing a rating list. […] The SSDF list is one of the only statistically significant measures of chess engine strength, especially compared to tournaments, because it incorporates the results of thousands of games played on standard hardware at tournament time controls. The list reports not only absolute rating, but also error bars, winning percentages, and recorded moves of played games.
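For intuition about how such ratings work: the Elo system maps a rating gap onto an expected score via a logistic curve, then moves ratings toward observed results. Below is a minimal sketch of a per-game Elo update in Python. The function names and the K-factor of 20 are illustrative choices of ours, not SSDF’s actual procedure, which fits ratings to thousands of games at once.

    def expected_score(rating_a, rating_b):
        """Expected score for player A against player B under the Elo model."""
        return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

    def update_elo(rating_a, rating_b, score_a, k=20):
        """Return both ratings after one game.

        score_a is 1 for an A win, 0.5 for a draw, 0 for a loss.
        k (the K-factor) controls the step size; 20 is a common choice,
        assumed here for illustration.
        """
        expected_a = expected_score(rating_a, rating_b)
        new_a = rating_a + k * (score_a - expected_a)
        new_b = rating_b + k * ((1 - score_a) - (1 - expected_a))
        return new_a, new_b

    # Example: a 2600-rated engine draws a 2800-rated engine.
    # The lower-rated engine gains points; the higher-rated one loses them.
    print(update_elo(2600, 2800, 0.5))

Under this model, a 400-point gap corresponds to an expected score of about 0.91 for the stronger program, which is why steady Elo gains by the best program translate into near-certain victories over older programs.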

Data

We took data from Wikipedia’s list of SSDF Ratings11 (which we have not verified) and added it to this spreadsheet. See Figure 2 below.

Figure 2: Elo ratings of the best program on SSDF at the end of each year.

Discontinuity measurement

Looking at the data, we assume a linear trend in Elo.12 Relative to that trend, there are no discontinuities of ten or more years.
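As a rough illustration of the calculation (our methodology page describes the actual procedure), one can fit a line to all records up to a given year and express how far the next record sits above that trend in years of progress at the fitted rate. The sketch below uses hypothetical numbers rather than SSDF data: steady progress of 50 Elo per year, plus one 200-point jump, which registers as about four years.

    import numpy as np

    def discontinuity_years(years, elos):
        """For each record, estimate how many years of progress it jumped
        ahead of a linear fit to all earlier records."""
        sizes = []
        for i in range(2, len(years)):
            # Fit past progress: best Elo as a linear function of time.
            slope, intercept = np.polyfit(years[:i], elos[:i], 1)
            predicted = slope * years[i] + intercept
            # Convert the gap above trend into years at the past rate.
            sizes.append((years[i], (elos[i] - predicted) / slope))
        return sizes

    # Hypothetical data: 50 Elo/year from 2000, with a 200-point jump in 2008.
    years = list(range(2000, 2010))
    elos = [2600.0 + 50 * (y - 2000) for y in years]
    elos[8] += 200  # the 2008 jump: 200 points ≈ 4 years at 50 Elo/year
    for year, size in discontinuity_years(years, elos):
        print(year, round(size, 1))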

Minor discontinuity

There is a four-year discontinuity in 2008. While this is below the scale of interest for our discontinuous progress investigation, it strikes us as notable in the context of otherwise very regular progress.13 We’ve tabulated a number of other potentially relevant metrics for this discontinuity in the ‘Notable discontinuities less than 10 years’ tab here.14

This jump appears to have been caused partly by the introduction of new hardware into the contest, and partly by software progress.15

Notes

  1. For a good history of chess-playing computers, see this article. It says: “It was in this context that Turing, Von Neumann, and Shannon posed an ancient question in a now modern guise, in what came to be called “Artificial Intelligence” in the coming decade: can a machine be made to think like a person? And the answer to the question—the question of machine intelligence—was from the start tied to the question of whether a machine could be made to play chess. Turing began the investigation of chess playing computers with a system written out with paper and pencil, where he played the role of the machine. Later Shannon extended Turing’s work in a 1949 paper, explaining about his interest in chess that: “Although of no practical importance, the question is of theoretical interest, and it is hoped that…this problem will act as a wedge in attacking other problems—of greater significance.” As became clear in later writing by the two computer pioneers, “greater significance” was no less than the quest to “build a brain,” as Turing had put it. The quest for Artificial Intelligence, then, began with the question of whether a computer could play chess. Could it?”
    Best_Schools. “A Brief History of Computer Chess.” TheBestSchools.org. September 18, 2018. Accessed July 18, 2019. https://thebestschools.org/magazine/brief-history-of-computer-chess/.

    Another example: The tenth Turing Lecture, available here, mentions chess 20 times and uses it as a central example of how the field of artificial intelligence has progressed over the years. Newell, Allen, and Herbert A. Simon. “Computer Science as Empirical Inquiry: Symbols and Search.” ACM Turing Award Lectures: 1975. doi:10.1145/1283920.1283930.

  2. 1769 – Wolfgang von Kempelen builds the Automaton Chess-Player, containing a human chess player hidden inside, in what becomes one of the greatest hoaxes of its period.
    1868 – Charles Hooper presents the Ajeeb automaton, which also conceals a human chess player inside.

    “Computer Chess.” Wikipedia. July 10, 2019. Accessed July 18, 2019. https://en.wikipedia.org/wiki/Computer_chess.

  3. “1951 – Alan Turing is first to publish a program, developed on paper, that was capable of playing a full game of chess (dubbed Turochamp).[1][2]”
    “Computer Chess.” Wikipedia. July 10, 2019. Accessed July 18, 2019. https://en.wikipedia.org/wiki/Computer_chess.
  4. See Wikipedia’s page on the history of computer chess.
    “Computer Chess.” Wikipedia. July 10, 2019. Accessed July 18, 2019. https://en.wikipedia.org/wiki/Computer_chess.
  5. “Deep Blue was then heavily upgraded, and played Kasparov again in May 1997.[1] Deep Blue won game six, therefore winning the six-game rematch 3½–2½ and becoming the first computer system to defeat a reigning world champion in a match under standard chess tournament time controls.[2]”

    “Deep Blue (Chess Computer).” In Wikipedia, June 26, 2019. https://en.wikipedia.org/w/index.php?title=Deep_Blue_(chess_computer)&oldid=903491291.
  6. “Computer scientists believed that playing chess was a good measurement for the effectiveness of artificial intelligence, and by beating a world champion chess player, IBM showed that they had made significant progress. After the loss, Kasparov said that he sometimes saw deep intelligence and creativity in the machine’s moves, suggesting that during the second game, human chess players had intervened on behalf of the machine…” “Computer Chess.” Wikipedia. July 10, 2019. Accessed July 18, 2019. https://en.wikipedia.org/wiki/Computer_chess.
  7. “The studio seated about five hundred people, and was sold-out for each of the six games. It seemed that the entire world was watching.”
    Best_Schools. “A Brief History of Computer Chess.” TheBestSchools.org. September 18, 2018. Accessed July 18, 2019. https://thebestschools.org/magazine/brief-history-of-computer-chess/.
  8. “The Swedish Chess Computer Association (Swedish: Svenska schackdatorföreningen, SSDF) is an organization that tests computer chess software by playing chess programs against one another and producing a rating list. […] The SSDF list is one of the only statistically significant measures of chess engine strength, especially compared to tournaments, because it incorporates the results of thousands of games played on standard hardware at tournament time controls. The list reports not only absolute rating, but also error bars, winning percentages, and recorded moves of played games.”

“Swedish Chess Computer Association.” In Wikipedia, April 9, 2019. https://en.wikipedia.org/w/index.php?title=Swedish_Chess_Computer_Association&oldid=891692663.

  9. From Wikimedia Commons: James the photographer [CC BY 2.0 (https://creativecommons.org/licenses/by/2.0)]
  10. “Swedish Chess Computer Association.” In Wikipedia, April 9, 2019. https://en.wikipedia.org/w/index.php?title=Swedish_Chess_Computer_Association&oldid=891692663.
  11. “Swedish Chess Computer Association.” In Wikipedia, April 9, 2019. https://en.wikipedia.org/w/index.php?title=Swedish_Chess_Computer_Association&oldid=891692663.
  12. See our methodology page for more details.
  13. See our methodology page for more details, and our spreadsheet for our calculation.
  14. See our methodology page for more details.
  15. “The jump perfectly corresponds to moving from all programs running on an Arena 256 MB Athlon 1200 MHz to some programs running on a 2 GB Q6600 2.4 GHz computer, suggesting the change in hardware accounts for the observed improvement. However, it also corresponds perfectly to Deep Rybka 3 overtaking Rybka 2.3.1. This latter event corresponds to huge jumps in the CCRL and CEGT records at around that time, and they did not change hardware then. The average program in the SSDF list gained 120 points at that time (Karlsson 2008), which is roughly the difference between the size of the jump in the SSDF records and the jump in records from other rating systems. So it appears that the SSDF introduced Rybka and new hardware at the same time, and both produced large jumps.” – Grace, Katja. Algorithmic Progress in Six Domains. Report. December 9, 2013. Accessed June 19, 2019. https://intelligence.org/files/AlgorithmicProgress.pdf, p. 19

We welcome suggestions for this page or anything on the site via our feedback box, though we will not address all of them.