An interesting thing about the survey data on timelines to human-level AI is the apparent incongruity between answers to ‘when will human-level AI arrive?’ and answers to ‘how much of the way to human-level AI have we come recently?’
In particular, human-level AI will apparently arrive in thirty or forty years, while in the past twenty years most specific AI subfields have apparently moved only five or ten percent of the remaining distance to human-level AI, with little sign of acceleration.
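To make the size of the gap concrete, here is a rough back-of-the-envelope sketch. It assumes (this reading is mine, not the surveys') that ‘five or ten percent of the remaining distance’ means a share of the distance that remained twenty years ago, and that progress continues at a constant rate. The function name `years_remaining` is just illustrative.

```python
def years_remaining(fraction_covered, period_years=20.0):
    """Years still needed at a constant rate of progress.

    fraction_covered: share of the distance that remained at the start of
    the period which was covered during the period (an assumed reading of
    the survey answers).
    """
    if not 0.0 < fraction_covered < 1.0:
        raise ValueError("fraction_covered must be strictly between 0 and 1")
    rate_per_year = fraction_covered / period_years
    # Whatever is left over gets covered at the same yearly rate.
    return (1.0 - fraction_covered) / rate_per_year


for fraction in (0.05, 0.10):
    print(f"{fraction:.0%} in 20 years: about {years_remaining(fraction):.0f} more years")
```

Under these assumptions, ten percent in twenty years implies roughly 180 more years, and five percent implies roughly 380, against the thirty or forty years given in answer to the direct question.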
Some possible explanations:
- The question about how far we have come has hardly been asked, and the small sample may happen to have hit slow subfields or hard-to-impress researchers, perhaps due to a different sampling of events.
- Hanson (the only person who asked how far we have come) somehow inspires modesty or agreement in his audience. His survey methodology is conversational, and the answers do agree with his own views.
- The ‘inside view’ is overoptimistic: if you ask a person directly when their project will be done, they tend to badly underestimate how long it will take. Taking the ‘outside view’ – extrapolating from similar past situations – helps to resolve these problems, and is more accurate. The first question invites the inside view, while the second invites the outside view.
- Different people are willing to answer the different questions.
- Estimating ‘how much of the way between where we were twenty years ago and human-level capabilities have we come?’ is hopelessly difficult, and the answers are meaningless.
- Estimating ‘when will we have human-level AI?’ is hopelessly difficult, and the answers are meaningless.
- When people answer the ‘how far have we come in the last twenty years?…’ question, they use a different scale from the one they use when they answer the ‘…and are we accelerating?’ question. For instance, they might think of where we are as a fraction of what is left to do in the first case, and expect steady exponential growth in that fraction, but not think of steady exponential growth as ‘acceleration’.
- AI researchers expect a small number of fast-growing subfields to produce AI with the full range of human-level skills, rather than expecting it to combine contributions from many subfields.
- Researchers have further information not captured in the past progress and acceleration estimates. In particular, they have reason to expect acceleration.
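For the explanations that rely on exponential growth or acceleration, one can ask how much growth would be needed to reconcile the two kinds of answer. The sketch below is a toy model of my own, not anything from the surveys: it assumes the rate of progress grows at a constant exponential rate `g` per year, that a given fraction of the distance was covered over the past twenty years, and that the rest must be covered within `years_ahead`. It then solves for `g` by bisection. All names are illustrative.

```python
import math


def required_growth_rate(fraction_covered, years_ahead, period_years=20.0):
    """Annual exponential growth rate g in the rate of progress such that the
    remaining distance is covered in years_ahead, given that fraction_covered
    of the total was covered over the last period_years (toy model).
    """
    if not 0.0 < fraction_covered < 1.0:
        raise ValueError("fraction_covered must be strictly between 0 and 1")
    if years_ahead <= 0:
        raise ValueError("years_ahead must be positive")

    # Future progress divided by past progress must equal this target.
    target = (1.0 - fraction_covered) / fraction_covered

    def ratio(g):
        # Integral of e^(g t) over the next years_ahead, divided by the
        # integral over the last period_years; the base rate cancels.
        return math.expm1(g * years_ahead) / -math.expm1(-g * period_years)

    # With no growth the ratio is years_ahead / period_years.
    if years_ahead / period_years >= target:
        return 0.0

    # ratio(g) increases with g, so bisection applies. The upper bound keeps
    # the exponent below the floating-point overflow threshold.
    lo, hi = 1e-12, 700.0 / years_ahead
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


for fraction in (0.05, 0.10):
    for horizon in (30, 40):
        g = required_growth_rate(fraction, horizon)
        print(f"{fraction:.0%} covered, {horizon} years ahead: g = {g:.3f} per year")
```

In this toy model, ten percent covered and a forty-year horizon requires the rate of progress to grow by a bit under five percent per year, a doubling time of roughly fifteen years. Whether respondents would describe that as ‘acceleration’ is exactly what is in question.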
Since the two questions have so far yielded very different answers, it would be nice to check whether the different answers come from the different kinds of questions (rather than e.g. the small and casual nature of the Hanson survey), and to get a better idea of which kind of answer is more reliable. This might substantially change the message we get from looking at the opinions of AI researchers.
Luke Muehlhauser and I have written before about how to conduct a larger survey like Hanson’s. One might also find or conduct experiments comparing these different styles of elicitation on similar predictions that can be verified sooner. There appears to be some contention over which method should be more reliable, so we could also start by having that discussion.