Anthropic capture – A hypothesized control method in which the AI thinks it might be in a simulation, and so tries to behave in ways that will be rewarded by its simulators (Bostrom 2014, p134).
Associative value accretion – A hypothesized approach to value learning in which the AI acquires values using some machinery for synthesizing appropriate new values as it interacts with its environment, inspired by the way humans appear to acquire values (Bostrom 2014, p189-190).
Augmentation – An approach to obtaining a superintelligence with desirable motives that consists of beginning with a creature with desirable motives (e.g., a human), then making it smarter, instead of designing good motives from scratch (Bostrom 2014, p142).
Boxing – A control method that consists of constructing the AI’s environment so as to minimize interaction between the AI and the outside world (Bostrom 2014, p129).
Capability control methods – Strategies for avoiding undesirable outcomes by limiting what an AI can do (Bostrom 2014, p129).
Cognitive enhancement – Improvements to an agent’s mental abilities.
Collective superintelligence – “A system composed of a large number of smaller intellects such that the system’s overall performance across many very general domains vastly outstrips that of any current cognitive system” (Bostrom 2014, p54).
The common good principle – “Superintelligence should be developed only for the benefit of all of humanity and in the service of widely shared ethical ideals” (Bostrom 2014, p254).
Computation – The process of mathematical calculation.
Crucial consideration – An idea with the potential to change our views substantially, such as by reversing the sign of the desirability of important interventions.
Decisive strategic advantage – Strategic superiority (by technology or other means) sufficient to enable an agent to unilaterally control most of the resources of the universe.
Direct specification – An approach to the control problem in which the programmers figure out what humans value, and code it into the AI (Bostrom 2014, p139-140).
Domesticity – An approach to the control problem in which the AI is given goals that limit the range of things it wants to interfere with (Bostrom 2014, p140-141).
Emulation modulation – Starting with brain emulations with approximately normal human motivations (see ‘Augmentation’), and modifying their motivations using drugs or digital drug analogs.
Evolutionary selection approach to value learning – A hypothesized approach to the value learning problem which obtains an AI with desirable values by iterative selection, the same way evolutionary selection produced humans (Bostrom 2014, p187-188).
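A toy sketch of the iterative-selection idea (the numeric ‘target values’, population size, and mutation rate are all illustrative stand-ins, not anything from the book): candidates are scored against a fixed target, the fittest half survive, and survivors are copied with small random mutations.

```python
import random

def evolve_values(target, generations=200, pop_size=20, seed=0):
    """Toy iterative selection: candidate 'value vectors' are scored by
    closeness to a target (a stand-in for desirable values), the best
    half survive each generation, and survivors are copied with small
    mutations."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in target] for _ in range(pop_size)]

    def fitness(v):
        # Higher is better: negative squared distance to the target.
        return -sum((a - b) ** 2 for a, b in zip(v, target))

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        # Refill the population with mutated copies of random survivors.
        pop = survivors + [
            [g + rng.gauss(0, 0.05) for g in rng.choice(survivors)]
            for _ in range(pop_size - len(survivors))
        ]
    return max(pop, key=fitness)

best = evolve_values(target=[0.3, -0.7, 0.5])
```

Selection here is over an explicit fitness score; part of what makes the real proposal hard is that we have no such clean score for human values.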
Existential risk – Risk of an adverse outcome that would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential (Bostrom 2002).
First principal-agent problem – The well-known problem faced by a sponsor wanting an employee to fulfill their wishes (usually called ‘the principal-agent problem’).
Genie – An AI that carries out a high level command, then waits for another (Bostrom 2014, p148).
Human-level AI – An AI that matches human capabilities in virtually every domain of interest. Note that this term is used ambiguously; see our page on human-level AI for more information.
Human-level hardware – Hardware that matches the information-processing ability of the human brain.
Human-level software – Software that matches the algorithmic efficiency of the human brain, for doing the tasks the human brain does.
Impersonal perspective – The view that one should act in the best interests of everyone, including those who may be brought into existence by one’s choices (see Person-affecting perspective).
Incentive methods – Strategies for controlling an AI that consist of setting up the AI’s environment such that it is in the AI’s interest to cooperate; for example, a social environment with punishment or social repercussions often achieves this for contemporary agents (Bostrom 2014, p131).
Incentive wrapping – Provisions in the goals given to an AI that allocate extra rewards to those who helped bring the AI about (Bostrom 2014, p222-223).
Indirect normativity – An approach to the control problem in which we specify a way to specify what we value, instead of specifying what we value directly (Bostrom 2014, p141-142).
Instrumental convergence thesis – We can identify ‘convergent instrumental values’: subgoals that are useful for a wide range of more fundamental goals, and in a wide range of situations (Bostrom 2014, p109).
Intelligence explosion – A hypothesized event in which an AI rapidly improves from ‘relatively modest’ to superhuman level (usually imagined to be as a result of recursive self-improvement).
Macrostructural development accelerator – An imagined lever used in thought experiments that slows the large-scale features of history (e.g. technological change, geopolitical dynamics) while leaving the small-scale features the same.
Mind crime – The mistreatment of morally relevant computations.
Moral rightness (MR) AI – An AI which seeks to do what is morally right.
Motivational scaffolding – A hypothesized approach to value learning in which the seed AI is given simple goals, and these goals are replaced with more complex ones once it has developed sufficiently sophisticated representational structure (Bostrom 2014, p191-192).
Multipolar outcome – A situation after the arrival of superintelligence in which no single agent controls most of the resources.
Optimization power – The strength of a process’s ability to improve systems.
Oracle – An AI that only answers questions (Bostrom 2014, p145).
Orthogonality thesis – Intelligence and final goals are orthogonal: more or less any level of intelligence could in principle be combined with more or less any final goal.
Person-affecting perspective – The view that one should act in the best interests of everyone who already exists, or who will exist independent of one’s choices (see Impersonal perspective).
Perverse instantiation – A solution to a posed goal (e.g., make humans smile) that is destructive in unforeseen ways (e.g., paralyzing face muscles in the smiling position).
Principle of differential technological development – “Retard the development of dangerous and harmful technologies, especially the ones that raise the level of existential risk; and accelerate the development of beneficial technologies, especially those that reduce the existential risk posed by nature or by other technologies” (Bostrom 2014, p230).
Principle of epistemic deference – “A future superintelligence occupies an epistemically superior vantage point: its beliefs are (probably, on most topics) more likely than ours to be true. We should therefore defer to the superintelligence’s position whenever feasible” (Bostrom 2014, p226).
Quality superintelligence – “A system that is at least as fast as a human mind and vastly qualitatively smarter” (Bostrom 2014, p56).
Recalcitrance – How difficult a system is to improve.
Recursive self-improvement – The envisaged process of AI (perhaps a seed AI) iteratively improving itself.
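Bostrom models the rate of such improvement as optimization power divided by recalcitrance (see those entries above). A toy numerical sketch of that relation, under the illustrative assumption that optimization power grows with the system’s own intelligence while recalcitrance stays constant:

```python
def improvement_trajectory(steps=50, dt=0.1, initial_intelligence=1.0,
                           base_power=1.0, recalcitrance=1.0):
    """Toy discretization of the relation
        rate of change in intelligence = optimization power / recalcitrance.
    Illustrative assumption: once the system contributes to its own design,
    optimization power grows with intelligence itself, so the trajectory
    accelerates even though recalcitrance stays constant."""
    intelligence = initial_intelligence
    trajectory = [intelligence]
    for _ in range(steps):
        optimization_power = base_power + intelligence  # self-improvement term
        intelligence += dt * optimization_power / recalcitrance
        trajectory.append(intelligence)
    return trajectory

traj = improvement_trajectory()
```

With these made-up parameters the trajectory grows roughly exponentially; if recalcitrance instead rose faster than optimization power, the same loop would level off rather than explode.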
Reinforcement learning approach to value learning – A hypothesized approach to value learning in which the AI is rewarded for behaviors that more closely approximate human values (Bostrom 2014, p188-189).
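A toy sketch of this idea (the action set, approval scores, and epsilon-greedy rule are illustrative assumptions, not Bostrom’s proposal): the agent never sees the values directly, only a noisy reward signal, and it gradually favors the behavior that signal rates most highly.

```python
import random

def learn_from_approval(approval, actions, episodes=2000, epsilon=0.1, seed=0):
    """Toy reinforcement-learning value loading: estimate each action's
    average reward from noisy 'approval' feedback, mostly choosing the
    best-looking action while occasionally exploring."""
    rng = random.Random(seed)
    estimates = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)          # explore
        else:
            action = max(actions, key=lambda a: estimates[a])  # exploit
        reward = approval(action) + rng.gauss(0, 0.1)  # noisy feedback
        counts[action] += 1
        # Incremental update of the running average reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return max(actions, key=lambda a: estimates[a])

# Hypothetical stand-in for human feedback: 'cooperate' is most approved.
preferred = learn_from_approval(
    approval={"defect": 0.0, "ignore": 0.3, "cooperate": 1.0}.get,
    actions=["defect", "ignore", "cooperate"],
)
```

The sketch also illustrates the worry attached to this approach: the agent learns to maximize the reward signal, which only matches human values insofar as the signal does.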
Second principal-agent problem – The emerging problem of a developer wanting their AI to fulfill their wishes.
Seed AI – A modest AI which can bootstrap into an impressive AI by improving its own architecture.
Singleton – An agent that is internally coordinated and has no opponents.
Sovereign – An AI that acts autonomously in the world, in pursuit of potentially long range objectives (Bostrom 2014, p148).
Speed superintelligence – “A system that can do all that a human intellect can do, but much faster” (Bostrom 2014, p53).
State risk – A risk that comes from being in a certain state, such that the amount of risk is a function of the time spent there. For example, the state of not having the technology to defend from asteroid impacts carries risk proportional to the time we spend in it.
Step risk – A risk that comes from making a transition. Here the amount of risk is not a simple function of how long the transition takes. For example, traversing a minefield is not safer if done more quickly.
Stunting – A control method that consists of limiting the AI’s capabilities, for instance by limiting the AI’s access to information (Bostrom 2014, p135).
Superintelligence – Any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest (Bostrom 2014, p22).
Takeoff – The event of the emergence of a superintelligence, often characterized by its speed: ‘slow takeoff’ takes decades or centuries, ‘moderate takeoff’ takes months or years, and ‘fast takeoff’ takes minutes to days.
Technological completion conjecture – If scientific and technological development efforts do not cease, then all important basic capabilities that could be obtained through some possible technology will be obtained (Bostrom 2014, p127).
Technology coupling – A predictable timing relationship between two technologies, such that hastening the first technology will hasten the second, either because the second is a precursor or because it is a natural consequence (Bostrom 2014, p236-238). For example, brain emulation is plausibly coupled to ‘neuromorphic’ AI, because the understanding required to emulate a brain might allow one to more quickly create an AI on similar principles.
Tool AI – An AI that is not ‘like an agent’, but like a more flexible and capable version of contemporary software. Most notably perhaps, it is not goal-directed (Bostrom 2014, p151).
Value learning – An approach to the value loading problem in which the AI learns the values that humans want it to pursue (Bostrom 2014, p207).
Value loading problem – The problem of causing the AI to pursue human values (Bostrom 2014, p185).
Whole-brain emulation – Machine intelligence created by copying the computational structure of the human brain.
Wise-singleton sustainability threshold – A capability set exceeds the wise-singleton threshold if and only if a patient and existential risk-savvy system with that capability set would, if it faced no intelligent opposition or competition, be able to colonize and re-engineer a large part of the accessible universe (Bostrom 2014, p100).