Joscha Bach on remaining steps to human-level AI

Joscha Bach (from Wikimedia commons)


Last year John and I had an interesting discussion with Joscha Bach about what ingredients of human-level artificial intelligence we seem to be missing, and how to improve AI forecasts more generally.

Thanks to Connor Flexman’s summarizing efforts, you can now learn about Joscha’s views on these questions without the effort of organizing an interview or reading a long and messy transcript.

(It’s been a while since the conversation, but I checked with Joscha that this is not an objectionably obsolete account of his views.)

Here are the notes.

Here is Connor’s shorter summary:

  • Before we can implement human-level artificial intelligence (HLAI), we need to understand both mental representations and the overall architecture of a mind
  • There are around 12-200 regularities like backpropagation that we need to understand, based on known unknowns and genome complexity
  • We are more than reinforcement learning on computronium: our primate heritage provides most of the interesting facets of mind and motivation
  • AI funding is now permanently colossal, which should update our predictions
  • AI practitioners learn the constraints on which elements of science fiction are plausible, but constant practice can lead to erosion of long-term perspective
  • Experience in real AI development can lead to both over- and underestimates of the difficulty of new AI projects in non-obvious ways


Tom Griffiths on Cognitive Science and AI

Tom Griffiths


This is a guest post by Finan Adamson

Prof. Tom Griffiths is the director of the Computational Cognitive Science Lab and the Institute of Cognitive and Brain Sciences at UC Berkeley. He studies human cognition and is involved with the Center for Human Compatible Artificial Intelligence. I asked him for insight into the intersection of cognitive science and AI. He offers his thoughts on the historical interaction of the fields and what aspects of human cognition might be relevant to developing AI in the future.

The conversation notes are here (pdf).

What if you turned the world’s hardware into AI minds?

In a classic ‘AI takes over the world’ scenario, one of the first things an emerging superintelligence wants to do is steal most of the world’s computing hardware and repurpose it to running the AI’s own software. This step takes one from ‘super-proficient hacker’ levels of smart to ‘my brain is one of the main things happening on Planet Earth’ levels of smart. There is quite a bit of hardware in the world, so this step in the takeover plan is kind of terrifying.

How terrifying exactly depends on A) how much computing hardware there is in the world at the time, and B) how efficiently hardware can be turned into AI at the time. We have some tentative answers to A)—probably at least a couple of hundred exaFLOPS now, growing somewhere between not at all and very fast. However B) is harder, in the absence of any idea how to get the efficiency of hardware-to-general-AI conversions above zero. Nonetheless, I think there are a couple of interesting reference points we can look at.

The one I’ll discuss now is the efficiency of the human brain. What if we could use about as much hardware as the human brain represents (in some sense) to run AI about as smart as a human brain? This is an interesting point to look at for a few reasons. We know brains are somewhere in the range of efficiency with which hardware can produce intelligent behavior, because they are an instance of that. And looking at one datapoint in the range is better than none. Also, for some means of building artificial intelligence—most obviously, brain emulation—we might expect to get something roughly as efficient as a human brain, give or take some.

So, we can think of the human brain as representing a pile of (fairly application-specific) computing hardware. And we can estimate its computing power, in terms of FLOPS. People have done this, albeit very inaccurately: their estimates span roughly twelve orders of magnitude, but running through the calculation with such an uncertain number still seems informative. According to different sources, the brain seems to be worth between about 3 x 10^13 FLOPS and 10^25 FLOPS. The median estimate is 10^18 FLOPS.

So we can ask, if you turned all of the world’s two hundred exaFLOPS or more of computing hardware into brains, how many brains would you get?

This graph shows the answers over time, for a variety of assumptions about brain FLOPS, world FLOPS, and global computing hardware growth rates. Probably the most plausible line is the lower green one (brains median, world hardware high).


Figure: Projected number of human brains equivalent to global hardware under various assumptions. For brains, ‘small’ = 3 x 10^13, ‘median’ = 10^18, ‘large’ = 10^25. For world hardware, ‘low’ = 2 x 10^20, ‘high’ = 1.5 x 10^21. ‘Growth’ is growth in computing hardware; the unlabeled default used in most projections is 25% per annum (our estimate above), while ‘high’ = 86% per annum (the highest growth rate we know of for related hardware, that of ASIC hardware in around 2007, which does not plausibly persist).

The basic answer is: if you turned all of the world’s computing hardware into AI as efficient as human brains right now, you would get less than a hundred million extra brains, or under 1% of the population of the world, and probably a whole lot less. For the median estimate of brain computing power, you would get about a hundred to a thousand extra brains’ worth of AI.
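To make the arithmetic concrete, here is a minimal sketch of the conversion, using only the estimates quoted above (variable names and the exact world-hardware figure are illustrative, not from the original analysis):

```python
# Back-of-the-envelope conversion of the world's computing hardware
# into human-brain-equivalents, using the estimates quoted in the text.

WORLD_HARDWARE_FLOPS = 2e20  # "a couple of hundred exaFLOPS"

# Brain computing-power estimates span roughly twelve orders of magnitude.
BRAIN_FLOPS = {
    "small": 3e13,
    "median": 1e18,
    "large": 1e25,
}

for label, flops in BRAIN_FLOPS.items():
    brains = WORLD_HARDWARE_FLOPS / flops
    print(f"{label:>6} brain estimate: {brains:.3g} brain-equivalents")
```

The median estimate yields 200 brain-equivalents, and even the small (most generous) brain estimate yields under ten million, consistent with the “less than a hundred million” figure above.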

That means, for instance, that if we figured out how to make uploads right now, and they were roughly as efficient as the median brain estimate, and someone then acquired all of the hardware in the world for them, they would only gain about as many additional minds as could be employed by a project willing to spend a few hundred million dollars per year on wages, e.g. Facebook. Which would really be something. But not something overwhelmingly outscaling everything else going on in the world.

If you trust projections of hardware growth fifty years into the future at all (which you shouldn’t, but suppose you did), the most plausible lines (median brain size, low growth) don’t even reach the world population line by then, though they would certainly make for an incredible AI research project, if that were the direction to which the additional mental effort was directed.
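The same arithmetic can be extended forward in time under the default 25%-per-annum growth assumption from the figure. This is a sketch, not a reproduction of the original projections, and the starting figure is the low world-hardware estimate:

```python
# Project the median-brain line forward under 25% annual hardware growth
# and compare with world population (~7.5 billion).

WORLD_HARDWARE_FLOPS = 2e20   # low estimate of current world hardware
MEDIAN_BRAIN_FLOPS = 1e18
ANNUAL_GROWTH = 1.25          # 25% per annum
WORLD_POPULATION = 7.5e9

for years in (0, 10, 25, 50):
    hardware = WORLD_HARDWARE_FLOPS * ANNUAL_GROWTH ** years
    brains = hardware / MEDIAN_BRAIN_FLOPS
    print(f"year {years:2d}: {brains:.3g} brain-equivalents")
```

After fifty years the median line sits at roughly 1.4 x 10^7 brain-equivalents, still several orders of magnitude below world population.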

Remember, all of this is very sketchy and probably inaccurate and you should maybe think about it a bit more if your decisions depend on it much (or ask us nicely to). But I strongly favor sketchy projections over none.

Image: Planetary Brain, Adrian Kenyon, some rights reserved.

Friendly AI as a global public good

A public good, in the economic sense, can be (roughly) characterized as a desirable good that is likely to be undersupplied, or not supplied at all, by private companies. It generally falls to the government to supply such goods. Examples include infrastructure networks, or a country’s military. See here for a more detailed explanation of public goods.

The provision of public goods by governments can work quite well at the national level. At the international level, however, there is no global government with the power to impose arbitrary legislation on countries and enforce it. As a result, many global public goods, such as carbon emission abatement, disease eradication, and existential risk mitigation, are only partially provided or not provided at all.

Scott Barrett, in his excellent book Why Cooperate? The Incentive to Supply Global Public Goods, explains that not all global public goods are created equal. He develops a categorization scheme (Table 1), identifying important characteristics that influence whether they are likely to be provided, and what tools can be used to improve their likelihood of provision.

For example:

  • Climate change mitigation is classified as an “aggregate effort” global public good, since its provision depends on the aggregate of all countries’ CO2eq emissions. Provision is difficult, as countries each individually face strong incentives to pollute.
  • Defense against large Earth-bound asteroids is classified as a “single best effort” global public good, since provision requires actions by only one country (or coalition of countries). Providing this global public good unilaterally is likely to be in the interests and within the capabilities of at least one individual country, and so it is likely to be provided.
  • Nuclear non-proliferation is classified as a “mutual restraint” public good, since it is provided by countries refraining from doing something. Provision is difficult as many countries individually face strong incentives to maintain a nuclear deterrent (despite the associated economic cost).
| | Single best effort | Weakest link | Aggregate effort | Mutual restraint | Coordination |
| --- | --- | --- | --- | --- | --- |
| Supply depends on… | The single best (unilateral or collective) effort | The weakest individual effort | The total effort of all countries | Countries not doing something | Countries doing the same thing |
| Examples | Asteroid defense, knowledge, peacekeeping, suppressing an infectious disease outbreak at its source, geoengineering | Disease eradication, preventing emergence of resistance and new diseases, securing nuclear materials, vessel reflagging | Climate change mitigation, ozone layer protection | Non-use of nuclear weapons, non-proliferation, bans on nuclear testing and biotechnology research | Standards for the measurement of time, for oil tankers, and for automobiles |
| International cooperation needed? | Yes, in many cases, to determine what should be done, and which countries should pay | Yes, to establish universal minimum standards | Yes, to determine the individual actions needed to achieve an overall outcome | Yes, to agree on what countries should not do | Yes, to choose a common standard |
| Financing and cost sharing needed? | Yes, when the good is provided collectively | Yes, in some cases | Yes, with industrialized countries helping developing countries | No | No |
| Enforcement of agreement challenging? | Not normally | Yes, except when provision requires only coordination | Yes | Yes | No, though participation will need to pass a threshold |
| International institutions for provision | Treaties in some cases; international organizations, such as the UN, in other cases | Consensus (World Health Assembly) or Security Council resolutions, customary law | Treaties | Treaties, norms, customary law | Non-binding resolutions; treaties in some cases |

Table 1: Simple Taxonomy of Global Public Goods
Source: Scott Barrett (2010), Why Cooperate? The Incentive to Supply Global Public Goods (location 520 of Kindle edition)

Applying the Barrett framework to friendly AI

Artificial Intelligence (AI) technology is likely to progress until the eventual creation of AI that vastly surpasses human cognitive capabilities—artificial superintelligence (ASI). The possibility of an intelligence explosion means that the first ASI system, or those that control it, might possess an unprecedented ability to shape the world according to their preferences. This event could define our entire species, leading rapidly to the full realization of humanity’s potential or causing our extinction. Since “friendly AI”—safe ASI deployed for the benefit of humanity—is a global public good, it may be informative to apply Barrett’s global public good classification scheme to analyse the different facets of this challenge.

Since this framework focuses on the incentives faced by national governments, it is most relevant to situations where ASI development is largely driven by governments, which will therefore be the focus of this article. This government-led scenario is distinct from the current situation of technology industry-led development of AI. Governments might achieve this high level of control through large-scale state-sponsored projects and regulation of private activities.

As with many global public goods, the development of friendly AI can be broken down into many components, each of which may conform to a different category within Barrett’s taxonomy. Here I will focus on those that I believe are most important for long term safety.

Arguably, one of the most concerning problems in the government-led scenario is the potential for the benefits of ASI to be captured by some subset of humanity. Humans are unfortunately much more strongly motivated by self-interest than by the common good, and this is reflected in national and international politics. This means that, given the chance, leaders whose governments control the development of ASI might seek to capture the benefits for their country only, or for some subset of their country such as their political allies, or for other groups. This could be achieved by instilling values in the ASI system that favor such groups, or through the direct exertion of control over the ASI system. Protection against this possibility constitutes a “mutual restraint” public good, since its provision relies upon countries refraining from such capture. Failing to prevent it may, depending on the preferences of those that control ASI, cause an existential catastrophe, for example in the form of “flawed realization” or “shriek”.

Because of this, and given the current anarchical state of international relations, any ASI-developing country is likely to be perceived as a significant security threat by other countries. Fears that any country succeeding at creating ASI would gain a large strategic advantage over other countries could readily lead to an ASI development race. In this scenario, speed may be prioritized at the expense of safety measures, for example those necessary to solve the value-loading problem (Ch. 12) and the control problem (Ch. 9). This would compound the risks of misuse of ASI explored in the previous paragraph by increasing the possibility of humanity losing control of its creation. The likelihood of an ASI development race is somewhat supported by Chalmers 2010 (footnote, p. 29).

Further, given that ASI may only be achievable on a timescale of decades, the global order prevailing when ASI is within reach may be truly multi-polar. For example, this timescale may allow both China and India to far surpass the USA in terms of economic weight, and may allow countries such as Brazil and Russia to rival the influence of the USA. With a diverse mix of world powers with differing national values, attempts at coordination and restraint could easily be undermined by mistrust.

Another facet of the global public good of friendly AI is the aforementioned technical challenges, including the value-loading problem and the control problem, which currently receive much attention in discussions of long-term AI safety. In isolation, these technical challenges can be considered a “single best effort” global public good in Barrett’s taxonomy, similar to asteroid defense or geoengineering, where it is often in the interests of some countries to unilaterally provide the good. Therefore, a substantial attempt would probably be made to solve these challenges in the government-led scenario, if race dynamics were not present. In reality, any additional advance work on this technical front is likely to be highly beneficial.

What can be done?

Without aiming to present a robust solution, this section briefly explores some of the available options, informed by insights presented by Barrett regarding mutual restraint global public goods.

A “silver bullet” solution to these institutional challenges could be achieved through the emergence of a world government capable of providing global public goods. Although this may eventually be possible, it seems unlikely within the timeframe in which ASI may be developed. Supporting progression towards this outcome may help to provide the global public goods identified above, but such action is probably insufficient alone.

In relation to mutual restraint public goods generally, Barrett identifies treaties, norms and customary law as institutional tools for provision. If a treaty requiring the necessary restraint could be enforced—Shulman mentions (p. 3) some ways in which one might be—it could be effective. However, this would still rely on countries’ willingness to voluntarily join the agreement.

Norms and custom can help achieve mutual restraint. In his book, Barrett analyses (location 2506 of Kindle edition) an important example: the taboo on the use of nuclear weapons. Thanks to a strong aversion towards any destructive use of nuclear weapons, no such use has occurred since 1945, despite numerous situations in which it would have been militarily advantageous, e.g. when a nuclear power was at war with a non-nuclear state. In the presence of such attitudes, any benefits to a country from using nuclear weaponry must be weighed against the costs: a severe loss of international reputation or, in the extreme, the end of the taboo and consequent nuclear war.

The taboo on the use of nuclear weapons was not inevitable, but arose partly because of mutual understanding of the seriousness of the threat of nuclear war. If the potential effects of ASI are similarly well understood by all powers seeking to develop it, it is possible that a similar taboo could be created, perhaps with the help of a carefully designed treaty between those countries with meaningful ASI development capabilities. The purpose of such an arrangement would be not only to mandate the adoption of proper safety measures, but also to ensure that the benefits of ASI would be spread fairly amongst all of humanity.


To achieve positions of power, all political leaders depend heavily on their ability to amass resources and influence. Upon learning of the huge potential of ASI, such individuals may instinctively attempt to capture control of its power. They will also expect their rivals to do the same, and will strategize accordingly. Therefore, in the event of government-led ASI development, mutual restraint by ASI-developing nations would be needed to avoid attempts to capture the vast benefits of ASI for a small subset of humanity, and to avoid the harmful effects of a race to develop ASI.