Accuracy of AI Predictions

It is unclear how informative we should expect expert predictions about AI timelines to be. Individual predictions must often be off by many decades, since they disagree with each other by that much. However, their aggregate may still be quite informative. The main potential reason we know of to doubt the accuracy of expert predictions is that experts are generally poor predictors in many areas, and AI looks likely to be one of them. However, we have not investigated how accurate ‘poor’ is, or whether AI really is such a case.

Predictions of AI timelines are likely to be biased toward optimism by an amount on the order of decades, especially if they are voluntary statements rather than survey responses, and especially if they come from populations selected for optimism. We expect these two factors to account for less than a decade and around two decades of difference in median predictions, respectively.

Support

Considerations regarding accuracy

A number of reasons have been suggested for distrusting predictions about AI timelines:

  • Models of areas where people predict well
    Research has produced a characterization of situations where experts predict well and where they do not. See table 1 here. AI appears to fall into several classes that go with worse predictions. However we have not investigated this evidence in depth, or the extent to which these factors purportedly influence prediction quality.
  • Expert predictions are generally poor
    Experts are notoriously poor predictors. However, our impression is that this reputation comes from their disappointing inability to predict some things well, rather than from across-the-board failure. For instance, experts can predict the Higgs boson’s existence, outcomes of chemical reactions, and astronomical phenomena. So the question falls back to where AI sits in the spectrum of expert predictability, discussed in the previous point.
  • Disparate predictions
    One sign that AI predictions are not very accurate is that they differ over a range of a century or so. This strongly suggests that many individual predictions are inaccurate, though not that the aggregate distribution is uninformative.
  • Similarity of old and new predictions
    Older predictions seem to form a fairly similar distribution to more recent predictions, except for very old predictions. This is weak evidence that new predictions are not strongly affected by evidence, and are therefore more likely to be inaccurate.
  • Similarity of expert and lay opinions
    Armstrong and Sotala found that expert and non-expert predictions look very similar.1 This finding is in doubt at the time of writing, due to errors in the analysis. If it were true, it would be weak evidence against experts having relevant expertise, since such expertise might be expected to make their opinions differ from those of laypeople. Note that it might not, however, if laypeople get their information from experts.
  • Predictions are about different things and often misinterpreted
    Comments made around predictions of human-level AI suggest that predictors are sometimes thinking about different events as ‘AI arriving’.2 Even when predictions are about the same event, ‘prediction’ can mean different things. One person might ‘predict’ the year by which they think human-level AI is more likely than not, while another ‘predicts’ the year by which AI seems almost certain.

This list is not necessarily complete.
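
The ‘Disparate predictions’ point above rests on simple arithmetic: if predictions span a wide range, some of them must be far from the truth whatever the truth turns out to be. The following is a minimal sketch of that reasoning in Python. The list of predicted years is hypothetical and chosen only for illustration, and the function names are our own, not part of any existing analysis.

```python
from statistics import median

# Hypothetical predicted arrival years, chosen only for illustration;
# they are not taken from any survey or dataset.
predictions = [2030, 2040, 2050, 2075, 2100, 2130]


def min_worst_error(years):
    """Lower bound on the largest individual error, whatever the true year is.

    The true year cannot be closer than half the spread to both the
    earliest and the latest prediction.
    """
    return (max(years) - min(years)) / 2


def median_abs_error(years, true_year):
    """Median absolute error of the predictions if true_year were correct."""
    return median(abs(y - true_year) for y in years)


def best_case_median_error(years):
    """Smallest median absolute error over candidate true years.

    Only integer years inside the predicted range are searched: a true year
    outside that range would make every individual error larger than it is
    at the nearest end of the range.
    """
    candidates = range(min(years), max(years) + 1)
    return min(median_abs_error(years, t) for t in candidates)


print(min_worst_error(predictions))  # 50.0
print(best_case_median_error(predictions))
```

Under these assumed numbers, at least one prediction is off by fifty years or more no matter when the event occurs. The bound concerns individual predictions only; it says nothing about whether the median or the overall distribution is informative.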

Purported biases

A number of biases have been posited to affect predictions of human-level AI:

  • Selection biases from optimistic experts
    Becoming an expert is probably correlated with independent optimism about the field, and experts make most of the credible predictions. We expect this to push median estimates earlier by less than a few decades.
  • Biases from short-term predictions being recorded
    There are a few reasons to expect recorded public predictions to be biased toward shorter timescales. Overall these probably make public statements less than a decade more optimistic.
  • Maes-Garreau law
    The Maes-Garreau law is a posited tendency for people to predict that important technologies will arrive not long before their own likely death. It probably doesn’t afflict predictions of human-level AI substantially.
  • Fixed period bias
    There is a stereotype that people tend to predict AI in 20-30 years. There is weak evidence of such a tendency around 20 years, though little evidence that we know of that this is due to a bias.

Conclusions

AI appears to exhibit several qualities characteristic of areas that people are not good at predicting. Individual AI predictions appear to be inaccurate by many decades in virtue of their disagreement. Other grounds for particularly distrusting AI predictions seem to offer weak evidence against them, if any. Our current guess is that AI predictions are less reliable than many kinds of prediction, though still potentially fairly informative.

Biases toward early estimates appear to exist, as a result of optimistic people becoming experts, and optimistic predictions being more likely to be published for various reasons. These are the only plausible substantial biases we know of.

  1. ‘Using a database of 95 AI timeline predictions, it will show that these expectations are borne out in practice: expert predictions contradict each other considerably, and are indistinguishable from non-expert predictions and past failed predictions.’ – Armstrong and Sotala 2012, p1
  2. For instance, in an interview with Alexander Kruel, Pei Wang says ‘Here by “roughly as good as humans” I mean the AI will follow roughly the same principles as human in information processing, though it does not mean that the system will have the same behavior or capability as human, due to the difference in body, experience, motivation, etc.’ Nils Nilsson interprets the question differently: ‘Because human intelligence is so multi-faceted, your question really should be divided into each of the many components of intelligence…A while back I wrote an essay about a replacement for the Turing test. It was called the “Employment Test.” (See: http://ai.stanford.edu/~nilsson/OnlinePubs-Nils/General_Essays/AIMag26-04-HLAI.pdf) How many of the many, many jobs that humans do can be done by machines? I’ll rephrase your question to be: When will AI be able to perform around 80% of these jobs as well or better than humans perform?’ These researchers were asked for their predictions in a context conducive to elaboration. Had they been surveyed more briefly (as in most surveys), or chosen not to elaborate, at least one would have been misunderstood. It is an open question whether 80% of jobs being automated will roughly coincide with artificial minds using similar information processing principles to humans.