The slow traversal of ‘human-level’

By Katja Grace, 21 January 2015

Once you have normal-human-level AI, how long does it take to get Einstein-level AI? We have seen that a common argument for ‘not long at all’ based on brain size does not work in a straightforward way, though a more nuanced assessment of the evidence might. Before we get to that though, let’s look at some more straightforward evidence (from our new page on the range of human intelligence).

In particular, let’s look at chess. AI can play superhuman-level chess, so we can see how it got there. And how it got there is via about four decades of beating increasingly good players, starting at beginners and eventually passing Kasparov (note that a beginner is something like level F or below, which doesn’t make it onto this graph):

Figure 1: Chess AI progress compared to human performance, from Coles 2002. The original article was apparently written before 1993, so note that the right of the graph (after ‘now’) is imagined, though it appears to be approximately correct.

Something similar is true in Go (where -20 on this graph is a good beginner score, and go bots are not yet superhuman, but getting close):

From Grace 2013.

Backgammon and poker AI seem to have progressed similarly, though backgammon took about two rather than four decades (we will soon post more detailed descriptions of progress in board games).

Go, chess, poker, and backgammon are all played using different algorithms. But the underlying problems are sufficiently similar that they could easily all be exceptions.

Other domains are harder to measure, but seem basically consistent with gradual progress. Machine translation seems to be gradually moving through the range of human expertise, as does automatic driving. There are fewer clear cases where AI abilities took years rather than decades to move from subhuman to superhuman, and the most salient cases are particularly easy or narrow problems (such as arithmetic, narrow perceptual tasks, or easy board games).

If narrow AI generally traverses the relevant human range slowly, this suggests that general AI will take some time to go from minimum minimum wage competency to—well, at least to AI researcher competency. If you combine many narrow skills, each progressing gradually through the human spectrum at different times, you probably wouldn’t end up with a much more rapid change in general performance. And it isn’t clear that a more general method should tend to progress faster than narrow AI.

However, we can point to ways that general AI might be different from board game AI.

Perhaps progress in chess and go has mostly been driven by hardware progress, while progress in general AI will be driven by algorithmic improvements or acquiring more training data.

Perhaps the kinds of algorithms people really use to think scale much better than chess algorithms. Chess algorithms only become 30-60 Elo points stronger with each doubling of hardware, whereas a very rough calculation suggests human brains become more like 300 Elo points better per doubling in size.

In humans, brain size has roughly a 1/3 correlation with intelligence. Given that the standard deviation of brain size is about 10% of the size of the brain (p. 39), this suggests that a doubling of brain size leads to a relatively large change in chess-playing ability. On a log scale, a doubling is about a 7 standard deviation change in brain size (log 2 / log 1.1 ≈ 7.3), which at a 1/3 correlation would suggest a ~2 standard deviation change in intelligence. It’s hard to know how this relates to chess performance, but in Genius in Chess Levitt gives an unjustified estimate of 300 Elo points. This is what we would expect if intelligence were responsible for half of the variation in performance (neglecting the lower variance of chess player intelligence), since a standard deviation of chess performance is about 2000/7 ~ 300 Elo. Each of these correlations is problematic but nevertheless suggestive.
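
The rough calculation above can be retraced in a few lines of Python. This is only a sketch: every input is a figure quoted in the text, the function and constant names are made up for illustration, and 'half of variation' is treated as a plain factor of 0.5 because that is the reading that reproduces the ~300 Elo figure. With unrounded inputs the result comes out somewhat above 300, which is within the precision the argument claims.

```python
import math

# Rough figures quoted in the text; none of them are measured here.
BRAIN_IQ_CORRELATION = 1 / 3    # brain size vs. intelligence
BRAIN_SIZE_SD_FRACTION = 0.10   # one SD of brain size is about 10% of the mean
ELO_RANGE = 2000                # numerator of the text's "2000/7" estimate
SDS_IN_ELO_RANGE = 7            # denominator of the same estimate
INTELLIGENCE_SHARE = 0.5        # assumption: "half" read as a linear factor
CHESS_ELO_PER_HARDWARE_DOUBLING = (30, 60)  # range quoted for chess programs


def elo_per_brain_doubling():
    """Return (SDs of brain size, SDs of intelligence, Elo gain) per doubling."""
    # On a log scale one SD is a factor of 1.1, so a doubling spans this many SDs.
    sds_of_size = math.log(2) / math.log(1 + BRAIN_SIZE_SD_FRACTION)
    # A correlation of 1/3 passes on about a third of that shift to intelligence.
    sds_of_intelligence = BRAIN_IQ_CORRELATION * sds_of_size
    # One SD of chess performance, following the text's 2000/7 estimate.
    elo_per_sd = ELO_RANGE / SDS_IN_ELO_RANGE
    elo_gain = INTELLIGENCE_SHARE * sds_of_intelligence * elo_per_sd
    return sds_of_size, sds_of_intelligence, elo_gain


if __name__ == "__main__":
    size_sds, iq_sds, elo = elo_per_brain_doubling()
    low, high = CHESS_ELO_PER_HARDWARE_DOUBLING
    print(f"SDs of brain size per doubling:   {size_sds:.1f}")
    print(f"SDs of intelligence per doubling: {iq_sds:.1f}")
    print(f"Elo per brain doubling:           {elo:.0f}")
    print(f"Ratio to chess programs:          {elo / high:.0f}x to {elo / low:.0f}x")
```

The comparison in the last line is the point of the exercise: under these assumptions a doubling of brain size is worth several times what a doubling of hardware is worth to a chess program, though each input is as shaky as the text says.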

If human intelligence in general scales much better with hardware than existing algorithms, and hardware is important relative to software, then AI based on an understanding of human intelligence may scale from sub-human to superhuman more quickly than the narrow systems we have seen. However these are both open questions.

(Image: The Chess Players, by Honoré Daumier)



  1. I guess you do not get the right perspective. The different and multiple abilities of Hum_Int are derived from the essence of human intelligence. Replicating them does NOT help you in understanding THAT VERY ESSENCE. For a better understanding of this paradoxical situation, let’s use an analogy: the study of the Sun. Man was able to emulate the ability of the Sun to emit infrared rays – heat – through fire. However, only after the discovery of the real nature of the atom was man able to understand how the Sun creates its heat. The fact that man was able to create heat didn’t help him a bit in understanding the essence of the Sun. In a similar way, what the AI scientists do is recreate different Hum_Int abilities in a computer, but that has nothing to do with Human Intelligence – just as a catapulted object has nothing to do with the human ability to jump over obstacles, other than the general laws of ballistics.

    • Dan’s comment is probably accurate for plenty of AI research, but let me express it somewhat differently to highlight the fact that there are multiple possible conclusions that can be drawn, and that his comment does not hold for all research: Creating an intelligent computer system, even if it has human or super-human intelligence (in either a narrow or broad area), does not NECESSARILY tell you anything about how the brain works. There is probably more than one architecture that will allow for intelligence, and different architectures could, at least hypothetically, be so different that one essentially tells you nothing about another.

      However, AI may tell you a great deal about how the brain works, depending on the nature of the techniques used to implement the AI. For example, one would be hard pressed to assert that the creation of an intelligent machine using neuromorphic chips, organized as the brain is organized, using in silico “synapses” as the lowest level computational elements (whether simulated or actual) instead of transistors, doesn’t tell us anything about the brain.

  2. Quick thought on this: It’s worth distinguishing the part of someone’s performance that comes from their general intelligence and the part that comes from task-specific training (what’s called “fine-tuning” in machine learning). It’s easy to find that humans who are intelligent AND well-trained will play much better chess than someone who lacks one or both of these. But that doesn’t mean that the human range of general intelligence is wide. It could be that the best humans are just much better trained at chess.

    The human chess data above could be generated as follows if the humans were neural nets instead: You pre-train a model on language prediction (a very general task). Then you fine-tune it to answer questions about British history. You compare before and after fine-tuning and find there is a large difference. In fact, maybe it took 20 years of AI progress to go from the performance of the general model to that of the fine-tuned one.

    This number tells you something about the distance from some arbitrary people to people who are very smart AND very trained. But you don’t know if most of that difference comes from task-specific training. And the AI progress we’re interested in doesn’t come from task-specific training. So I’d be more interested in how quickly you cross the human spectrum if you vary the quality of pre-training and keep fine-tuning constant (e.g. zero fine-tuning).

    (Side note: competitive human tasks like chess may be chosen so that you can get a lot better with training.)
