AGI-09 Survey

Baum et al. surveyed 21 attendees of the AGI-09 conference on AGI timelines, with and without extra funding. They also asked about other aspects of AGI development, such as social impacts and promising approaches.

Their findings include the following:

The median dates by which participants believe there is a 10%, 50% and 90% probability that AI will pass a Turing test are 2020, 2040, and 2075 respectively.
Predictions changed by only a few years when participants were asked to imagine $100 billion (or sometimes $1 billion, due to a typo) in funding.
There was apparently little agreement on the ordering of milestones (‘Turing test’, ‘third grade’, ‘Nobel science’, ‘super human’), except that ‘super human’ AI would not come before the other milestones.
A strong majority of participants believed ‘integrative designs’ were more likely to contribute critically to creation of human-level AGI than narrow technical approaches.

Details

Detailed results

Median confidence levels for different milestones

Table 1 shows median dates given for different confidence levels of AI reaching four benchmarks: able to pass an online third grade test, able to pass a Turing test, able to produce science that would win a Nobel prize, and ‘super human’.
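As a rough illustration of how such a median is obtained, the sketch below takes the median year across participants at each confidence level. The `responses` values and the `median_dates` helper are hypothetical and exist only for this example; they are not data or methods taken from the survey.

```python
from statistics import median

# Hypothetical responses for a single benchmark. These are made-up
# numbers for illustration, NOT data from the survey.
# Each participant gives a year for each confidence level.
responses = [
    {"10%": 2012, "50%": 2025, "90%": 2050},
    {"10%": 2018, "50%": 2035, "90%": 2080},
    {"10%": 2030, "50%": 2060, "90%": 2150},
]

def median_dates(responses):
    """Return the median year across participants for each confidence level."""
    levels = responses[0].keys()
    return {level: median(r[level] for r in responses) for level in levels}

print(median_dates(responses))
```

A median is less sensitive than a mean to a few very late or very early estimates, which may matter when some participants give dates centuries away.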

Best guess times for various milestones

Figure 2 shows the distribution of participants’ best guesses (which participants probably mostly interpreted as 50 percent confidence points) for the timing of these benchmarks, given status quo levels of funding.

Individual confidence intervals for each milestone

Figure 4 shows all participants’ confidence intervals for all benchmarks. Participant 17 appears to be interpreting ‘best guess’ as something other than fiftieth percentile of probability, though the other responses appear to be consistent with this interpretation.

Expected social impacts

Figure 6 illustrates responses to three questions about social impact. The participants were asked about the probability of negative social impact if the first AGI that can pass the Turing test is created by an open source project, by the United States military, or by a private company focused on commercial profit. The paper concludes that the experts lacked consensus.

‘Fig. 6. Probability of a negative-to-humanity outcome for different development scenarios. The three development scenarios are if the first AGI that can pass the Turing test is created by an open source project (x’s), the United States military (squares), or a private company focused on commercial profit (triangles). Participants are displayed in the same order as in figure 4, such that Participant 1 in figure 6 is the same person as Participant 1 in figure 4.’

Methodological details

The survey contained a set of standardized questions, plus individualized followup questions. It can be downloaded from here.

It included questions on:

When AI would meet certain benchmarks (passing third grade, Turing test, Nobel quality research, superhuman), with and without billions of dollars of additional funding. Participants were asked for confidence intervals (10%, 25%, 75%, 90%) and ‘best estimates’ (interpreted above as 50% confidence levels).
Embodiment of the first AGIs (physical, virtual, minimal)
What AI software paradigm the first AGIs would be based on (formal neural networks, probability theory, uncertain logic, evolutionary learning, a large hand-coded knowledge-base, mathematical theory, nonlinear dynamical systems, or an integrative design combining multiple paradigms)
Probability of strongly negative-to-humanity outcome if the first AGIs were created by different parties (an open-source project, the US military, or a private for-profit software company)
Whether quantum computing or hypercomputing would be required for AGI
Whether brain emulations would be conscious
The experts’ area of expertise

Participants

Most of the participants were actively involved in AI research. The paper describes them:

Study participants have a broad range of backgrounds and experience, all with significant prior thinking about AGI. Eleven are in academia, including six Ph.D. students, four faculty members, and one visiting scholar, all in AI or allied fields. Three lead research at independent AI research organizations and three do the same at information technology organizations. Two are researchers at major corporations. One holds a high-level administrative position at a relevant non-profit organization. One is a patent attorney. All but four participants reported being actively engaged in conducting AI research.

According to the website, the AGI-09 conference gathers “leading academic and industry researchers involved in serious scientific and engineering work aimed directly toward the goal of artificial general intelligence”. While these people are experts in the field, they are also probably strongly selected for optimism about the timing of human-level AI, which seems likely to bias the results toward earlier dates.

Meaning of ‘Turing test’

Several meanings of ‘Turing test’ are prevalent, and it is unclear what distribution of them is being used by participants. The authors note that some participants asked about this ambiguity, and were encouraged verbally to consider the ‘one hour version’ instead of the ‘five minute version’, because the shorter one might be gamed by chat-bots (p6). The authors also write, ‘Using human cognitive development as a model, one might think that being able to do Nobel level science would take much longer than being able to conduct a social conversation, as in the Turing Test’ (p8). Both of these points suggest that the authors at least were thinking of a Turing test as a test of normal social conversation rather than a general test of human capabilities as they can be observed via a written communication channel.
