We expect predictions that human-level AI will come sooner to be recorded publicly more often, for a few reasons. Public statements are probably more optimistic than surveys because of such effects. The difference appears to be less than a decade, for median predictions.
Below we outline five reasons for expecting earlier predictions to be stated and publicized more than later ones. We do not know of compelling reasons to expect longer term predictions to be publicized more, unless they are so distant as to also fit under the first bias discussed below.
Bias from not stating the obvious
In many circumstances, people are disproportionately likely to state beliefs that they think others do not hold. For example, “homeopathy works” gets more Google hits than “homeopathy doesn’t work”, though this probably doesn’t reflect popular beliefs on the matter. Making public predictions seems likely to be a circumstance with this character. Predictions are often made in books and articles which are intended to be interesting and surprising, rather than by people whose job it is to report on AI forecasts regardless of how far away they are. Thus we expect people with unusual positions on AI timelines to be more likely to state them. This should produce a bias toward both very short and very long predictions being published.
Bias from the near future being more concerning
Artificial intelligence will arguably be hugely important, whether as a positive or negative influence on the world. Consequently, people are motivated to talk about its social implications. The degree of concern motivated by impending events tends to increase sharply with proximity to the event. Thus people who expect human-level AI in a decade will tend to be more concerned about it than people who expect human-level AI to take a century, and so will talk about it more. Similarly, publishers are probably more interested in producing books and articles making more concerning claims.
Bias from ignoring reverse predictions
If you search for people predicting AI by a given date, you can get downwardly biased estimates by taking predictions from sources where people are asked about certain specific dates, and respond that AI will or will not have arrived by that date. If people respond ‘AI will arrive by X’ and ‘AI will not arrive by X’ as appropriate, the former can look like ‘predictions’ while the latter do not.
This bias affected some data in the MIRI dataset, though we have tried to minimize it now. For example, this bet (“By 2029 no computer – or “machine intelligence” – will have passed the Turing Test.”) is interpreted in the above collection as Kurzweil making a prediction, but not as Kapor making a prediction. It also contained several estimates of 70 years, taken from a group who appear to have been asked whether AI would come within 70 years, much later, or never. The ‘within 70 years’ estimates are recorded as predictions, while the others ignored, producing ’70 years’ estimates, almost regardless of the overall opinions of the group surveyed. In a population of people with a range of beliefs, this method of recording predictions would produce ‘predictions’ largely determined by which year was asked about.
The aforementioned bias arises from an error that can be avoided in recording data, where predictions and reverse predictions are available. However similar types of bias may exist more subtly. Such bias could arise where people informally volunteer opinions in a discussion about some period in the future. People with shorter estimates who can make a positive statement might feel more as though they have something to say, while those who believe there will not be AI at that time do not. For instance, suppose ten people write books about the year 2050, and each predicts AI in a different decade in the 21st Century. Those who predict it prior to 2050 will mention it, and be registered as a prediction of before 2050. Those who predict it after 2050 will not mention it, and not be registered as making a prediction. This could also be hard to avoid if predictions reach you through a filter of others registering them as predictions.
Selection bias from optimistic experts
Main article: Selection bias from optimistic experts
Some factors that cause people to make predictions about AI are likely to correlate with expectations of human-level AI arriving sooner. Experts are better positioned to make credible predictions about their field of expertise than more distant observers are. However since people are more likely to join a field if they are more optimistic about progress there, we might expect their testimony to be biased toward optimism.
Measuring these biases
These forms of bias (except the last) seem to us as if they should be much weaker in survey data than voluntary statements, for the following reasons:
- Surveys come with a default of answering questions, so one does not need a strong reason or social justification for doing so (e.g. having a surprising claim, or wanting to elicit concern).
- One can assess whether a survey ignores reverse predictions, and there appears to be little risk of invisible reverse predictions.
- Participation in surveys is mostly determined before the questions are viewed, for a large number of questions at once. This allows less opportunity for views on the question to affect participation.
- Participation in surveys is relatively cheap, so people who care little about expressing any particular view are likely to participate for reasons of orthogonal incentives, whereas costly communications (such as writing a book) are likely to be sensible only for those with a strong interest in promoting a specific message.
- Participation in surveys is usually anonymous, so relatively unsatisfactory for people who particularly want to associate with a specific view, further aligning the incentives of those who want to communicate with those who don’t care.
- Much larger fractions of people participate in surveys when requested than volunteer predictions in highly publicized arenas, which lessens the possibility for selection bias.
We think publication biases such as those described here are reasonably likely on theoretical grounds. We are also not aware of other reasons to expect surveys and statements to differ in their optimism about AI timelines. Thus we can compare the predictions of statements and surveys to estimate the size of these biases. Survey data appears to produce median predictions of human-level AI somewhat later than similar public statements do: less than a decade, at a very rough estimate. Thus we think some combination of these biases probably exist, and introduce less than a decade of error to median estimates.
Accuracy of AI predictions: AI predictions made in statements are probably biased toward being early, by less than a decade. This suggests both that predictions overall are probably slightly earlier than they would be otherwise, and surveys should be trusted more relative to statements (though there may be other considerations there).
Collecting data: When collecting data about AI predictions, it is important to avoid introducing bias by recording opinions that AI is before some date while ignoring opinions that it is after that date.
MIRI dataset: The earlier version of the MIRI dataset is somewhat biased due to ignoring reverse predictions, however this has been at least partially resolved.