Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
[Epistemic status: Argument by analogy to historical cases. Best case scenario it’s just one argument among many. Edit: Also, thanks to feedback from others, especially Paul, I intend to write a significantly improved version of this post in the next two weeks.]
I have on several occasions heard people say things like this:
The original Bostrom/Yudkowsky paradigm envisioned a single AI built by a single AI project, undergoing intelligence explosion all by itself and attaining a decisive strategic advantage as a result. However, this is very unrealistic. Discontinuous jumps in technological capability are very rare, and it is very implausible that one project could produce more innovations than the rest of the world combined. Instead we should expect something more like the Industrial Revolution: Continuous growth, spread among many projects and factions, shared via a combination of trade and technology stealing. We should not expect any one project or AI to attain a decisive strategic advantage, because there will always be other projects and other AI that are only slightly less powerful, and coalitions will act to counterbalance the technological advantage of the frontrunner. (paraphrased)
Proponents of this view often cite Paul Christiano in support. Last week I heard him say he thinks the future will be “like the Industrial Revolution but 10x-100x faster.”
In this post, I assume that Paul’s slogan for the future is correct and then nevertheless push back against the view above. Basically, I will argue that even if the future is like the industrial revolution only 10x-100x faster, there is a 30%+ chance that it will involve a single AI project (or a single AI) with the ability to gain a decisive strategic advantage, if they so choose. (Whether or not they exercise that ability is another matter.)
Why am I interested in this? Do I expect some human group to take over the world? No; instead what I think is that (1) an unaligned AI in the leading project might take over the world, and (2) a human project that successfully aligns their AI might refrain from taking over the world even if they have the ability to do so, and instead use their capabilities to e.g. help the United Nations enforce a ban on unauthorized AGI projects.
National ELO ratings during the industrial revolution and the modern era
In chess (and some other games) ELO ratings are used to compare players. An average club player might be rated 1500; the world chess champion might be rated 2800; computer chess programs are even better. If one player is rated 400 points above another, it means the first player would win with ~90% probability.
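The ~90% figure comes from the standard expected-score formula, E = 1 / (1 + 10^(-d/400)), where d is the rating difference. Here is a minimal Python sketch of it. The function name `expected_score` is my own, and strictly speaking the formula gives an expected score (a draw counts as half a win), so treating it as a win probability is an approximation.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score for the higher-rated side under the standard formula."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# Print the expected score for a few illustrative rating gaps.
for diff in (0, 300, 400):
    print(f"{diff:>3} points ahead -> expected score {expected_score(diff):.3f}")
```

A 400-point gap works out to 10/11, or about 0.91, which is where the "~90%" rule of thumb comes from.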
We could apply this system to compare the warmaking abilities of nation-states and coalitions of nation-states. For example, in 1941 perhaps we could say that the ELO rank of the Axis powers was ~300 points lower than the ELO rank of the rest of the world combined (because what in fact happened was the rest of the world combining to defeat them, but it wasn’t a guaranteed victory). We could add that in 1939 the ELO rank of Germany was ~400 points higher than that of Poland, and that the ELO rank of Poland was probably 400+ points higher than that of Luxembourg.
We could make cross-temporal fantasy comparisons too. The ELO ranking of Germany in 1939 was probably ~400 points greater than that of the entire world circa 1910, for example. (Visualize the entirety of 1939 Germany teleporting back in time to 1910, and then imagine the havoc it would wreak.)
Claim 1A: If we were to estimate the ELO rankings of all nation-states and sets of nation-states (potential alliances) over the last 300 years, the rank of the most powerful nation-state in a given year would on several occasions be 400+ points greater than the rank of the entire world combined 30 years prior.
Claim 1B: Over the last 300 years there have been several occasions in which one nation-state had the capability to take over the entire world of 30 years prior.
I’m no historian, but I feel fairly confident in these claims.
- In naval history, the best fleets in the world in 1850 were obsolete by 1860 thanks to the introduction of iron-hulled steamships, and said steamships were themselves obsolete a decade or so later, and then those ships were obsoleted by the Dreadnought, and so on… This process continued into the modern era. By “Obsoleted” I mean something like “A single ship of the new type could defeat the entire combined fleet of vessels of the old type.”
- A similar story could be told about air power. In a dogfight between planes of year 19XX and year 19XX+30, the second group of planes will be limited only by how much ammunition they can carry.
- Small technologically advanced nations have regularly beaten huge sprawling empires and coalitions. (See: Colonialism)
- The entire world has been basically carved up between the small handful of most-technologically advanced nations for two centuries now. For example, any of the Great Powers of 1910 (plus the USA) could have taken over all of Africa, Asia, South America, etc. if not for the resistance that the other great powers would put up. The same was true 40 years later and 40 years earlier.
I conclude from this that if some great power in the era kicked off by the industrial revolution had managed to “pull ahead” of the rest of the world more effectively than it actually did–30 years more effectively, in particular–it really would have been able to take over the world.
Claim 2: If the future is like the Industrial Revolution but 10x-100x faster, then correspondingly the technological and economic power granted by being 3 – 0.3 years ahead of the rest of the world should be enough to enable a decisive strategic advantage.
The question is, how likely is it that one nation/project/AI could get that far ahead of everyone else? After all, it didn’t happen in the era of the Industrial Revolution. While we did see a massive concentration of power into a few nations on the leading edge of technological capability, there were always at least a few such nations and they kept each other in check.
The “surely not faster than the rest of the world combined” argument
Sometimes I have exchanges like this:
- Me: Decisive strategic advantage is plausible!
- Interlocutor: What? That means one entity must have more innovation power than the rest of the world combined, to be able to take over the rest of the world!
- Me: Yeah, and that’s possible after intelligence explosion. A superintelligence would totally have that property.
- Interlocutor: Well yeah, if we dropped a superintelligence into a world full of humans. But realistically the rest of the world will be undergoing intelligence explosion too. And indeed the world as a whole will undergo a faster intelligence explosion than any particular project could; to think that one project could pull ahead of everyone else is to think that, prior to intelligence explosion, there would be a single project innovating faster than the rest of the world combined!
This section responds to that by way of sketching how one nation/project/AI might get 3 – 0.3 years ahead of everyone else.
Toy model: There are projects which research technology, each with their own “innovation rate” at which they produce innovations from some latent tech tree. When they produce innovations, they choose whether to make them public or private. They have access to their private innovations + all the public innovations.
It follows from the above that the project with access to the most innovations at any given time will be the project that has the most hoarded innovations, even though the set of other projects has a higher combined innovation rate and also a larger combined pool of accessible innovations. Moreover, the gap between the leading project and the second-best project will increase over time, since the leading project has a slightly higher rate of production of hoarded innovations, but both projects have access to the same public innovations.
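To make the toy model concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the function name `accessible_innovations`, the constant innovation rates, the idea that every project hoards the same fraction of its output, and the treatment of innovations as interchangeable units.

```python
def accessible_innovations(rates, hoard_fraction, years):
    """Return how many innovations each project can use after `years`.

    rates[i] is project i's innovations per year. Each project keeps
    `hoard_fraction` of its output private and publishes the rest.
    All numbers are illustrative assumptions, not estimates.
    """
    # Everyone can use everything that anyone has published.
    public_pool = sum(r * (1 - hoard_fraction) * years for r in rates)
    # Each project adds its own hoarded innovations on top of the public pool.
    return [public_pool + r * hoard_fraction * years for r in rates]


# One leader slightly ahead of nine equal rivals (hypothetical rates).
rates = [1.2] + [1.0] * 9
for years in (1, 5, 10):
    totals = accessible_innovations(rates, hoard_fraction=0.5, years=years)
    runner_up = max(totals[1:])
    gap = totals[0] - runner_up
    print(f"year {years:>2}: leader {totals[0]:.1f}, runner-up {runner_up:.1f}, gap {gap:.1f}")
```

In this setup the leader's rate (1.2) is far below the combined rate of the other nine projects (9.0), yet its gap over the runner-up still grows linearly with time, because the public pool is shared equally and only the hoarded innovations differ.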
This model leaves out several important things. First, it leaves out the whole “intelligence explosion” idea: A project’s innovation rate should increase as some function of how many innovations they have access to. Adding this in will make the situation more extreme and make the gap between the leading project and everyone else grow even bigger very quickly.
Second, it leaves out reasons why innovations might be made public. Realistically there are three reasons: Leaks, spies, and selling/using-in-a-way-that-makes-it-easy-to-copy.
Claim 3: Leaks & Spies: I claim that the 10x-100x speedup Paul prophesies will not come with an associated 10x-100x increase in the rate of leaks and successful spying. Instead the rate of leaks and successful spying will be only a bit higher than it currently is.
This is because humans are still humans even in this soft takeoff future, still in human institutions like companies and governments, still using more or less the same internet infrastructure, etc. New AI-related technologies might make leaking and spying easier than it currently is, but they also might make it harder. I’d love to see an in-depth exploration of this question because I don’t feel particularly confident.
But anyhow, if it doesn’t get much easier than it currently is, then going 0.3 to 3 years without a leak is possible, and more generally it’s possible for the world’s leading project to build up a 0.3-3 year lead over the second-place project. For example, the USSR had spies embedded in the Manhattan Project but it still took them 4 more years to make their first bomb.
Claim 4: Selling etc. I claim that the 10x-100x speedup Paul prophesies will not come with an associated 10x-100x increase in the budget pressure on projects to make money fast. Again, today AI companies regularly go years without turning a profit (DeepMind, for example, has never turned a profit and is losing something like a billion dollars a year for its parent company) and I don’t see any particularly good reason to expect that to change much.
So yeah, it seems to me that it’s totally possible for the leading AI project to survive off investor money and parent company money (or government money, for that matter!) for five years or so, while also keeping the rate of leaks and spies low enough that the distance between them and their nearest competitor increases rather than decreases. (Note how this doesn’t involve them “innovating faster than the rest of the world combined.”)
Suppose they could get a 3-year lead this way, at the peak of their lead. Is that enough?
Well, yes. A 3-year lead during a time 10x-100x faster than the Industrial Revolution would be like a 30-300 year lead during the era of the Industrial Revolution. As I argued in the previous section, even the low end of that range is probably enough to get a decisive strategic advantage.
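The arithmetic here is just linear scaling, which is itself a simplifying assumption: a year of lead in a world moving k times faster is treated as k years of lead at Industrial Revolution speed. A small Python sketch, with hypothetical helper names `equivalent_lead` and `required_lead`:

```python
def equivalent_lead(lead_years, speedups=(10, 100)):
    """Convert a lead in the fast world into Industrial-Revolution-equivalent years."""
    return tuple(lead_years * s for s in speedups)


def required_lead(threshold_years, speedups=(10, 100)):
    """Lead needed in the fast world to match `threshold_years` at the old pace."""
    return tuple(threshold_years / s for s in speedups)


print(equivalent_lead(3))  # a 3-year lead at 10x and at 100x
print(required_lead(30))   # lead needed to match a 30-year gap at 10x and at 100x
```

The first call gives the 30-300 year range, and the second gives the 3 to 0.3 year range needed to match the 30-year gap from the previous section.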
If this is so, why didn’t nations during the Industrial Revolution try to hoard their innovations and gain decisive strategic advantage?
England actually did, if I recall correctly. They passed laws and stuff to prevent their early Industrial Revolution technology from spreading outside their borders. They were unsuccessful–spies and entrepreneurs dodged the customs officials and snuck blueprints and expertise out of the country. It’s not surprising that they weren’t able to successfully hoard innovations for 30+ years! Entire economies are a lot more leaky than AI projects.
What a “Paul Slow” soft takeoff might look like according to me
At some point early in the transition to much faster innovation rates, the leading AI companies “go quiet.” Several of them either get huge investments or are nationalized and given effectively unlimited funding. The world as a whole continues to innovate, and the leading companies benefit from this public research, but they hoard their own innovations to themselves. Meanwhile the benefits of these AI innovations are starting to be felt; all projects have significantly increased (and constantly increasing) rates of innovation. But the fastest increases go to the leading project, which is one year ahead of the second-best project. (This sort of gap is normal for tech projects today, especially the rare massively-funded ones, I think.) Perhaps via a combination of spying, selling, and leaks, that lead narrows to six months midway through the process. But by that time things are moving so quickly that a six months’ lead is like a 5-50 year lead during the era of the Industrial Revolution. It’s not guaranteed and perhaps still not probable, but at least it’s reasonably likely that the leading project will be able to take over the world if it chooses to.
Objection: What about coalitions? During the industrial revolution, if one country did successfully avoid all leaks, the other countries could unite against them and make the “public” technology inaccessible to them. (Trade does something like this automatically, since refusing to sell your technology also lowers your income which lowers your innovation rate as a nation.)
Reply: Coalitions to share AI research progress will be harder than free-trade / embargo coalitions. This is because AI research progress is much more the result of rare smart individuals talking face-to-face with each other and much less the result of a zillion different actions of millions of different people, as the economy is. Besides, a successful coalition can be thought of as just another project, and so it’s still true that one project could get a decisive strategic advantage. (Is it fair to call “The entire world economy” a project with a decisive strategic advantage today? Well, maybe… but it feels a lot less accurate since almost everyone is part of the economy but only a few people would have control of even a broad coalition AI project.)
Anyhow, those are my thoughts. Not super confident in all this, but it does feel right to me. Again, the conclusion is not that one project will take over the world even in Paul’s future, but rather that such a thing might still happen even in Paul’s future.
Thanks to Magnus Vinding for helpful conversation.
By Daniel Kokotajlo