AI Impacts talked to economist Robin Hanson about his views on AI risk and timelines. With his permission, we have posted and transcribed this interview.
Participants
- Robin Hanson — Associate Professor of Economics, George Mason University
- Asya Bergal – AI Impacts
- Robert Long – AI Impacts
Summary
We spoke with Robin Hanson on September 5, 2019. Here is a brief summary of that conversation:
- Hanson thinks that now is the wrong time to put a lot of effort into addressing AI risk:
- We will know more about the problem later, and there’s an opportunity cost to spending resources now vs later, so there has to be a compelling reason to spend resources now instead.
- Hanson is not compelled by the existing arguments he’s heard for why we need to spend resources now:
- Hanson famously disagrees with the theory that AI will appear very quickly and in a very concentrated way, which would suggest that we need to spend resources now because we won’t have time to prepare.
- Hanson views the AI risk problem as essentially continuous with existing principal agent problems, and disagrees that the key difference—the agents being smarter—should clearly worsen such problems.
- Hanson thinks that we will see concrete signatures of problems before it’s too late; he is skeptical that there are big things that have to be coordinated ahead of time.
- Relatedly, he thinks useful work anticipating problems in advance usually happens with concrete designs, not with abstract descriptions of systems.
- Hanson thinks we are still too far away from AI for field-building to be useful.
- Hanson thinks AI is probably at least a century, perhaps multiple centuries away:
- Hanson thinks the mean estimate for human-level AI arriving is long, and he thinks AI is unlikely to be ‘lumpy’ enough to happen without much warning:
- Hanson is interested in how ‘lumpy’ progress in AI is likely to be: whether progress is likely to come in large chunks or in a slower and steadier stream.
- Measured in terms of how much a given paper is cited, academic progress is not very lumpy in any field.
- The literature on innovation suggests that innovation is not lumpy: most innovation is lots of little things, though once in a while there are a few bigger things.
- From an outside view perspective, the current AI boom does not seem different from previous AI booms.
- We don’t have a good sense of how much research needs to be done to get to human-level AI.
- If we don’t expect progress to be particularly lumpy, and we don’t have a good sense of exactly how close we are, we have good reason to think we are not, say, five years away rather than only halfway there.
- Hanson thinks we shouldn’t believe it when AI researchers give 50-year timescales:
- Rephrasing the question in different ways, e.g. “When will most people lose their jobs?” causes people to give different timescales.
- People consistently give overconfident estimates when they’re estimating things that are abstract and far away.
- Hanson thinks AI risk takes up far too large a fraction of people thinking seriously about the future.
- Hanson thinks more futurists should be exploring other future scenarios, roughly proportionally to how likely they are with some kicker for extremity of consequences.
- Hanson doesn’t think that AI is that much worse than other future scenarios in terms of how much future value is likely to be destroyed.
- Hanson thinks the key to intelligence is having many not-fully-general tools:
- Most of the value in tools is in more specific tools, and we shouldn’t expect intelligence innovation to be different.
- Academic fields are often reduced to a few simple essences, but real-world things like biological organisms and the industrial world progress via lots of little things, and we should expect intelligence to be more like the latter.
- Hanson says the literature on human uniqueness suggests cultural evolution and language abilities came from several modest brain improvements, not clear differences in brain architecture.
- Hanson worries that having so many people publicly worrying about AI risk before it is an acute problem will mean it is taken less seriously once it becomes one, because the public will have learned to think of such concerns as erroneous fear-mongering.
- Hanson would be interested in seeing more work on the following things:
- Seeing examples of big, lumpy innovations that made a big difference to the performance of a system. This could change Hanson’s view of intelligence.
- In particular, he’d be influenced by evidence for important architectural differences in the brains of humans vs. primates.
- Tracking of the automation of U.S. jobs over time as a potential proxy for AI progress.
- Hanson thinks there’s a lack of engagement with critics from people concerned about AI risk.
- Hanson is interested in seeing concrete outside-view models people have for why AI might be soon.
- Hanson is interested in proponents of AI risk responding to the following questions:
- Setting aside everything you know except what this looks like from the outside, would you predict AGI happening soon?
- Should the reasoning behind AI risk arguments be compelling to people outside of AI?
- What percentage of people who agree with you that AI risk is big, agree for the same reasons that you do?
- Hanson thinks even if we tried, we wouldn’t now be able to solve all the small messy problems that insects can solve, indicating that it’s not sufficient to have insect-level amounts of hardware.
- Hanson thinks that AI researchers might argue that we can solve the core functionalities of insects, but Hanson thinks that their intelligence is largely in being able to do many small things in complicated environments, robustly.
Small sections of the original audio recording have been removed. The corresponding transcript has been lightly edited for concision and clarity.
Transcript
Asya Bergal: Great. Yeah. I guess to start with, the proposition we’ve been asking people to weigh in on is whether it’s valuable for people to be expending significant effort doing work that purports to reduce the risk from advanced AI. I’d be curious for your take on that question, and maybe a brief description of your reasoning there.
Robin Hanson: Well, my highest level reaction is to say whatever effort you’re putting in, probably now isn’t the right time. When is the right time is a separate question from how much effort, and in what context. AI’s going to be a big fraction of the world when it shows up, so it certainly at some point is worth a fair bit of effort to think about and deal with. It’s not like you should just completely ignore it.
You should put a fair bit of effort into any large area of life or large area of the world, anything that’s big and has big impacts. The question is just really, should you be doing it way ahead of time, before you know much about it at all, or have many concrete examples, or even know the structure or architecture, how it’s integrated into the economy, what the terms of purchase are, what the terms of relationships are.
I mean, there’s just a whole bunch of things we don’t know about. That’s one of the reasons to wait–because you’ll know more later. Another reason to wait is because of the opportunity cost of resources. If you save the resources until later, you have more to work with. Those considerations have to be weighed against some expectation of an especially early leverage, or an especially early choice point or things like that.
For most things you expect that you should wait until they show themselves in a substantial form before you start to envision problems and deal with them. But there could be exceptions. Mostly it comes down to arguments that this is an exception.
Asya Bergal: Yeah. I think we’re definitely interested in the proposition that you should put in work now as opposed to later. If you’re familiar with the arguments that this might be an exceptional case, I’d be curious for your take on those and where you disagree.
Robin Hanson: Sure. As you may know, I started getting involved in this conversation over a decade ago with my co-blogger Eliezer Yudkowsky, and at that point, the major argument that he brought up was something we now call the Foom Argument.
That argument was a very particular one, that this would appear under a certain trajectory, under a certain scenario. That was a scenario where it would happen really fast, would happen in a very concentrated place in time, and basically once it starts, it happens so fast, you can’t really do much about it after that point. So the only chance you have is before that point.
Because it’s very hard to predict when or where, you’re forced to just do stuff early, because you’re never sure how early is early enough. That’s a perfectly plausible argument given that scenario, if you believe that it shows up at one time and place all of a sudden, fully formed and no longer influenceable. Then you only have the shot before that moment. If you are very unsure when and where that moment would be, then you basically just have to do it now.
But I was doubting that scenario. I was saying that that wasn’t a zero probability scenario, but I was thinking it was overestimated by him and other people in that space. I still think many people overestimate the probability of that scenario. Over time, it seems like more people have distanced themselves from that scenario, yet I haven’t heard as many substitute rationales for why we should do any of this stuff early.
I did a recent blog post responding to a Paul Christiano post and my title was Agency Failure AI Apocalypse?, and so at least I saw an argument there that was different from the Foom argument. It was an argument that you’d see a certain kind of agency failure with AI, and that because of that agency failure, it would just be bad.
It wasn’t exactly an argument that we need to make an effort early, though. Even that argument wasn’t per se a reason why you need to do stuff way ahead of time. But it was an argument for why the consequences might be especially bad, I guess, and therefore deserving of more investment. And then I critiqued that argument in my post, saying he was basically claiming that the agency problem, which is a standard problem in all human relationships and all organizations, is exacerbated when the agent is smart.
And because the AI is, by assumption, very smart, then it’s a greatly exacerbated agency problem; therefore, it goes really bad. I said, “Our literature on the agency problem doesn’t say that it’s a worse problem when agents are smart.” I just denied that basic assumption, pointing to what I’ve known about the agency literature over a long time. Basically Paul in his response said, “Oh, I wasn’t saying there was an agency problem,” and then I was kind of baffled, because I thought that was the whole point of his post that I was summarizing.
In any case, he just said he was worried about wealth redistribution. Of course, any large social change has the potential to produce wealth redistribution, and so I’m still less clear why this change would have bigger wealth redistribution consequences than others, or why it would happen more suddenly, or require more early effort. But if you guys have other particular arguments to talk about here, I’d love to hear what you think, or what you’ve heard are the best arguments aside from Foom.
Asya Bergal: Yeah. I’m at risk of putting words in other people’s mouths here, because we’ve interviewed a bunch of people. I think one thing that’s come up repeatedly is-
Robin Hanson: You aren’t going to name them.
Asya Bergal: Oh, I definitely won’t give a name, but-
Robin Hanson: I’ll just respond to whatever-
Asya Bergal: Yeah, just prefacing this, this might be a strawman of some argument. One thing people are sort of consistently excited about is– they use the term ‘field building,’ where basically the idea is: AI’s likely to be this pretty difficult problem, and if we do think it’s far away, there’s still meaningful work we can do in terms of setting up an AI safety field, with an increasing number of people who have an increasing amount of–the assumption is–useful knowledge about the field.
Then there’s sort of another assumption that goes along with that: that if we investigate problems now, even if we don’t know the exact specifics of what AGI might look like, they’re going to share some common subproblems with problems that we may encounter in the future. I don’t know if both of those would count as field building in people’s lexicon.
Robin Hanson: The example I would give to make it concrete is to imagine in the year 1,000, tasking people with dealing with various of our major problems in our society today. Social media addiction, nuclear war, concentration of capital and manufacturing, privacy invasions by police, I mean any major problem that you could think of in our world today, imagine tasking people in the year 1,000 with trying to deal with that problem.
Now the arguments you gave would sound kind of silly. We need to build up a field in the year 1,000 to study nuclear annihilation, or nuclear conflict, or criminal privacy rules? I mean, you only want to build up a field just before you want to use a field, right? I mean, building up a field way in advance is crazy. You still need some sort of argument that we are near enough that the timescale on which it takes to build a field will match roughly the timescale until we need the field. If it’s a factor of ten off or a thousand off, then that’s crazy.
Robert Long: Yeah. This leads into a specific question I was going to ask about your views. You’ve written, based on AI practitioners’ estimates of how much progress they’ve been making, that an outside view calculation suggests we probably have at least a century to go, if not a great many centuries, at current rates of progress in AI. That was in 2012. Is that still roughly your timeline? Are there other things that go into your timelines? Basically, in general, what’s your current AI timeline?
Robin Hanson: Obviously there’s a median estimate and a mean estimate, and then there’s a probability per-unit-time estimate, say, and obviously most everyone agrees that the median or mean could be pretty long, and that’s reasonable. So they’re focused on some, “Yes, but what’s the probability of an early surprise?”
That isn’t directly addressed by that estimate, of course. I mean, you could turn that into a per-unit-time probability if you just thought it was a constant per-unit-time thing. That would, I think, be overly optimistic; it would give you too high an estimate. I have a series of blog posts, which you may have seen, on lumpiness. A key idea here is that we’re getting AI progress over time, and how lumpy it is is very directly relevant to these estimates.
For example, if it was maximally lumpy, if it just shows up at one point, like the Foom scenario, then in that scenario you kind of have to work ahead of time, because you’re not sure when. If, say, the mean is two centuries, that means in every year there’s a 1-in-200 chance; there’s a half-a-percent chance next year. Half a percent is pretty high, so I guess we better do something, because what if it happens next year?
Okay. I mean, that’s where extreme lumpiness goes. The less lumpy it is, the smaller the variance around that mean. It’s just going to take a long time; it’ll take 10% less or 10% more, but it’s basically going to take that long. The key question is how lumpy it is reasonable to expect these sorts of things to be. I would say, “Well, let’s look at how lumpy things have been. How lumpy are most things? How lumpy has even computer science innovation been? Or even AI innovation?”
I think those are all relevant data sets. There’s general lumpiness in everything, and lumpiness of the kinds of innovation that are closest to the kinds of innovation postulated here. I note that one of the best or most concrete measures we have of lumpiness is citations. That is, we can take, for any research idea, how many citations the seminal paper gets, and ask, “How lumpy are citations?”
Interestingly, citation lumpiness seems to be field independent. Not just time independent, but field independent. It seems to be a general feature of academia. You might have thought lumpiness would vary by field, and maybe it does in some more fundamental sense, but as translated into citations, it’s field independent. And of course, it’s not that lumpy, i.e. most of the distribution of citations is papers with few citations, and the few papers that have the most citations constitute a relatively small fraction of the total citations.
That’s what we also know from other kinds of innovation literature. The generic innovation literature says that most innovation is lots of little things, even though once in a while there are a few bigger things. For example, I remember there’s this time series of the best locomotive at any one time. You have that from 1800 or something, and you can just see it in speed, or energy efficiency.
It’s not an exactly smooth graph. On the other hand, it’s pretty smooth. The biggest jumps are a small fraction of the total jumpiness. A lot of technical, social innovation is, as we well understand, a few big things, matched with lots of small things. Of course, we also understand that big ideas, big fundamental insights, usually require lots of complementary, matching, small insights to make it work.
That’s part of why this trajectory happens this way. That smooths out the overall pace of progress in most areas and makes it effectively less lumpy. It seems to me that the most reasonable default assumption is to assume that future AI progress looks like past computer science progress, and even past technical progress in other areas. I mean, the most concrete example is AI progress itself.
I’ve observed that we’ve had these repeated booms of AI concern and interest, and we’re in one boom now, but we saw a boom in the 90s. We saw a boom in the 60s, 70s, we saw a boom in the 30s. In each of these booms, the primary thing people point to is, “Look at these demos. These demos are so cool. Look what they can do that we couldn’t do before.” That’s the primary evidence people tend to point to in all of these areas.
They just have concrete examples that they were really impressed by. No doubt we have had these very impressive things. The question really is: do we have any evidence that now is different, as opposed to evidence that there will be a big difference in the future? So if you’re asking, “Is now different,” then you’d want to ask how different in degree the signs people point to now, i.e. AlphaGo, say, as a dramatic, really impressive thing, are from the comparable things that have happened in the past.
The more you understand the past, the more you see how impressed people were back then with the best things that happened then. That suggests to me that, I mean, AlphaGo is, say, a lump; I’m happy to admit it looks out of line with a smooth attribution of equal research progress to all teams at all times. But it also doesn’t look out of line with the lumpiness we’ve seen over the last 70 years, say, in computer innovation.
It’s on trajectory. So if you’re going to say, “And we still expect that same overall lumpiness for the next 70 years, or the next 700,” then I’d say it’s about how close we are now. If you just don’t know how close you are, or how much is required, then you’re still going to end up with a relatively random “When do we reach this threshold where it’s good enough?”
The more you think you have an idea of what’s required and where you are, the more you can ask how far along you are. Then if you say you’re only halfway, you could say, “Well, if it’s taken us this many years to get halfway,” then the odds that we’re going to get all the rest of the way in the next five years are much less than you’d attribute by just randomly assigning, say, “It’s going to happen in 200 years, therefore it’ll be one in two hundred per year.” I do think we’re in more of that sort of situation. We can roughly guess that we’re not almost there.
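[Editor’s illustration, not from the interview: the arithmetic Hanson sketches above can be made concrete with a toy calculation. The 200-year mean, the 70 years of progress so far, and the roughly 10% spread below are assumed numbers, chosen only to show the contrast between a memoryless “maximally lumpy” model and a smooth-progress model in which we are about halfway there.]

```python
# Illustrative toy calculation only; all numbers are assumed for illustration.
import math

# 1. Memoryless ("maximally lumpy") framing: a 200-year mean arrival time with
#    a constant per-year hazard gives about a 0.5% chance in any given year.
mean_years = 200
p_per_year = 1 / mean_years
p_5yr_lumpy = 1 - (1 - p_per_year) ** 5  # chance of arrival within 5 years

# 2. Smooth-progress framing: suppose ~70 years of steady progress has gotten
#    us roughly halfway, so roughly 70 more years remain, give or take ~10%.
#    Model the remaining time as Normal(mean=70, sd=7) and ask about 5 years.
remaining_mean, remaining_sd = 70.0, 7.0
z = (5 - remaining_mean) / remaining_sd
p_5yr_smooth = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # normal CDF at 5 years

print(f"Memoryless framing:      P(arrival within 5 years) = {p_5yr_lumpy:.1%}")
print(f"Smooth-progress framing: P(arrival within 5 years) = {p_5yr_smooth:.1e}")
```

[Under the memoryless framing, even a long mean leaves a non-trivial near-term chance (roughly 2.5% over five years), which is what pushes toward early effort; under the smooth framing, the same long timeline leaves essentially no near-term chance.]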
Robert Long: Can you say a little bit more about how we should think about this question of how close we are?
Robin Hanson: Sure. The most reliable source on that would be people who have been in this research area for a long time. They’ve just seen lots of problems, they’ve seen lots of techniques, they better understand what it takes to do many hard problems. They have a good sense of where we are, and ultimately of where we have to go.
I think when you don’t understand these things as well, by theory or by experience, et cetera, you’re more tempted to look at something like AlphaGo and say, “Oh my God, we’re almost there.” Because you just say, “Oh, look.” You tend more to think, “Well, if we can do human level anywhere, we can do it everywhere.” That was the initial attitude; it’s what people in the 1960s said: “Let’s solve chess, and if we can solve chess, certainly we can do anything.”
I mean, something that can do chess has got to be smart. But they just didn’t fully appreciate the range of tasks, and problems, and problem environments that you need to deal with. Once you understand the range of possible tasks, task environments, obstacles, issues, et cetera, once you’ve been in AI for a long time and have just seen a wide range of those things, then you have more of a sense of “I see, AlphaGo, that’s a good job, but let’s list all these simplifying assumptions you made here that made this problem easier”, and you know how to make that list.
Then you’re not so much saying, “If we can do this, we can do anything.” I think pretty uniformly, the experienced AI researchers have said, “We’re not close.” I mean I’d be very surprised if you interviewed any person with a more broad range of AI experience who said, “We’re almost there. If we can do this one more thing we can do everything.”
Asya Bergal: Yeah. I might be wrong about this–my impression is that your estimate of at least a century or maybe centuries might still be longer than a lot of researchers’ estimates–and this might be because there’s this trend where people will just say 50 years about almost any technology, or something like that.
Robin Hanson: Sure. I’m happy to walk through that. That’s the logic of that post of mine that you mentioned. It was exactly trying to confront that issue. So I would say there is a disconnect to be addressed. The people you ask are not being consistent when you ask similar things in different ways. The challenge is to disentangle that.
I’m happy to admit that when you ask a lot of people how long it will take, they give you 40, 50 year sort of timescales. Absolutely true. The question is, should you believe it? One way to check whether you should believe that is to see how they answer when you ask them different ways. I mean, as you know, one of those surveys interestingly asked, “When will most people lose their jobs?”
They gave much longer timescales than for when computers will be able to do most everything, like a factor of two or something. That’s kind of bothersome. That’s a pretty close consistency relation. If computers can do everything cheaper, then they will, right? Apparently not. I’ve done some writing on this psychology concept called construal-level theory, which really emphasizes how differently people think about things conceived abstractly and broadly versus narrowly.
There’s a consistent pattern there, which is consistent with the pattern we are seeing here: in the far mode, where you’re thinking abstractly and broadly, you tend to be more confident in simple, abstract theories that have simple predictions, and you tend to neglect messy details. When you’re in the near mode and focused on a particular thing, you see all the messy difficulties.
It’s kind of the difference between will you have a happy marriage in life? Sure. This person you’re in a relationship with? Will that work in the next week? I don’t know. There’s all the things to work out. Of course, you’ll only have a happy relationship over a lifetime if every week keeps going okay for the rest of your life. I mean, if enough weeks do. That’s a near/far sort of distinction.
When you ask people about AI in general and on what timescale, that’s a very far-mode version of the question. They are aggregating, and they are going on very aggregate sorts of theories in their heads. But if you take an AI researcher who has been staring at difficult problems in their area for 20 years, and you ask them, “In the problems you’re looking at, how far have we gotten since 20 years ago?,” they’ll be really aware of all the obstacles they have not solved or succeeded in dealing with, all the things we have not been able to do for 20 years.
That seems to me a more reliable basis for projection. I mean, of course we’re still in a similar regime. If the regime would change, then past experience is not relevant. If we’re in a similar regime of the kind of problems we’re dealing with and the kind of tools and the kind of people and the kind of incentives, all that sort of thing, then that seems to be much more relevant. That’s the point of that survey, and that’s the point of believing that survey somewhat more than the question asked very much more abstractly.
Asya Bergal: Two sort of related questions on this. One question is, how many years out do you think it is important to start work on AI? And I guess a related question is, even given that you think it’s super unlikely now, what’s the ideal number of people working on or thinking about this?
Robin Hanson: Well, I’ve said many times in many of these posts that it’s not zero at any time. That is, whenever there’s a problem that it isn’t the right time to work on, it’s still the right time to have some people asking if it’s the right time to work on it. You can’t have people asking a question unless they’re kind of working on it. They’d have to be thinking about it enough to be able to ask the question if it’s the right time to work on it.
That means you always need some core of people thinking about it, at least in related areas, such that they are skilled enough to be able to ask the question, “Hey, what do you think? Is this the time to turn and work on this area?” It’s a big world, and eventually this is a big thing, so hey, a dozen could be fine. Given how random academia and the intellectual world are, of course, the intellectual world is not at all optimized in terms of number of people per topic. It really isn’t.
Relative to that standard, you could be not unusually misallocated even if you were still pretty random about it. Beyond that, it’s more just: for the other purposes for which academic fields exist and perpetuate themselves, how well is it doing on those other purposes? I would basically say, “Academia’s mainly about showing off credentialed impressiveness.” There are all these topics that are neglected because you can’t credential and impress very well via them. If AI risk happened to be a topic that was unusually well suited to being impressive with, then it would be an unusually suitable topic for academics to work on.
Not because it’s useful, just because that’s what academics do. That might well be true for ways in which AI problems bring up interesting new conceptual angles that you could explore, or push on concepts that you need to push on because they haven’t been generalized in that direction, or just involve doing formal theorems in a new space of theorems.
Like pushing on decision theory, right? Certainly there’s a point of view from which decision theory was kind of stuck, and people weren’t pushing on it, and then AI risk people pushed on some dimensions of decision theory that people hadn’t… people just did different decision theory, not because it’s good for AI. How many people? Again, it’s very sensitive to that, right? You might justify 100 people if it not only was about AI risk, but was really more about pushing on these other interesting conceptual dimensions.
That’s why it would be hard to give a very precise answer there about how many. But I actually am less concerned about the number of academics working on it, and more about the percentage of altruistic mind space it takes, because it’s a much higher percentage of that than it is of actual serious research. That’s the part I’m a little more worried about, especially the fraction of people thinking about the future. Just in general, very few people seem to be willing to think seriously about the future. As a percentage of that space, it’s huge.
That’s where I most think, “Now, that’s too high.” If you could say, “100 people will work on this as researchers, but the rest of the people who talk and think about the future can talk and think about something else,” that would be a big win for me. There are tens and hundreds of thousands of people out there on the side just thinking about the future, and so many of them are focused on this AI risk thing when they really can’t do much about it, but they’ve just told themselves that it’s the thing they can talk about, and they really shame everybody into saying it’s the priority. Hey, there’s other stuff.
Now of course, I completely have this whole other book, Age of Em, which is about a different kind of scenario that I think doesn’t get much attention, and I think it should get more attention relative to a range of options that people talk about. Again, the AI risk scenario so overwhelmingly sucks up that small fraction of the world. So a lot of this of course depends on your base. If you’re talking about the percentage of people in the world working on these future things, it’s large of course.
If you’re talking percentage of people who are serious researchers in AI risk relative to the world, it’s tiny of course. Obviously. If you’re talking about the percentage of people who think about AI risk, or talk about it, or treat it very seriously, relative to people who are willing to think and talk seriously about the future, it’s this huge thing.
Robert Long: Yeah. That’s perfect. I was just going to … I was already going to ask a follow-up just about what share of, I don’t know, effective altruists who are focused on affecting the long-term future you think it should be? Certainly you think it should be far less than this, is what I’m gathering?
Robin Hanson: Right. First of all, things should be roughly proportional to probability, except with some kicker for extremity of consequences. But I think you don’t actually know about extremity of consequences until you explore a scenario. Right from the start you should roughly rank scenarios by probability, and then devote effort in proportion to the probability of scenarios.
Then once you get into a scenario enough to say, “This looks like a less extreme scenario, this looks like a more extreme scenario,” at that point you might be justified in adjusting some effort in and out of areas based on that judgment. But that has to be a pretty tentative judgment, so you can’t go too far there, because until you explore a scenario a lot, you really don’t know how extreme… basically it’s about extreme outcomes, times the extreme leverage of influence at each point along the path, multiplied together, in hopes that by thinking about it earlier you could be doing things that produce that outcome. That’s a lot of uncertainty to multiply through to get this estimate of how important a scenario is as a lever to think about.
Robert Long: Right, yeah. Relatedly, I think one thing that people say about why AI should take up a large share is that there’s the sense that maybe we have some reason to think that AI is the only thing we’ve identified so far that could plausibly destroy all value, all life on earth, as opposed to other existential risks that we’ve identified. I mean, I can guess, but you may know that consideration or that argument.
Robin Hanson: Well, surely that’s hyperbole. Obviously anything that kills everybody destroys all value that arises from our source. Of course, there could be other alien sources out there, but even AI would only destroy things from our source relative to other alien sources that would potentially beat out our AI if it produces a bad outcome. Destroying all value is a little hyperbolic, even under the bad AI scenario.
I do think there’s just a wide range of future scenarios, and there’s this very basic question, how different will our descendants be, and how far from our values will they deviate? It’s not clear to me AI is that much worse than other scenarios in terms of that range, or that variance. I mean, yes, AIs could vary a lot in whether they do things that we value or not, but so could a lot of other things. There’s a lot of other ways.
Some people, I guess some people seem to think, “Well, as long as the future is human-like, then humans wouldn’t betray our values.” No, no, not humans. But machines, machines might do it. I mean, the difference between humans and machines isn’t quite that fundamental from the point of view of values. I mean, human values have changed enormously over a long time, we are now quite different in terms of our habits, attitudes, and values, than our distant ancestors.
We are quite capable of continuing to make huge value changes in many directions in the future. I can’t offer much assurance that because our descendants descended from humans that they would therefore preserve most of your values. I just don’t see that. To the extent that you think that our specific values are especially valuable and you’re afraid of value drift, you should be worried. I’ve written about this: basically in the Journal of Consciousness Studies I commented on a Chalmers paper, saying that generically through history, each generation has had to deal with the fact that the next and coming generations were out of their control.
Not just that, they were out of their control and their values were changing. Unless you can find some way to put some bound on that sort of value change, you’ve got to model it as a random walk; it could wander off arbitrarily far. That means, typically in history, people, if they thought about it, would realize we have relatively little control over where this is all going. That’s just been a generic problem we’ve all had to deal with all through history; AI doesn’t fundamentally change that fact, and people are just focusing on the way it could happen with AI, too.
I mean, obviously when we make our first AIs, we will make them corresponding to our values in many ways; even if we don’t do it consciously, they will be fitting into our world. They will be our agents, so they will have structures and arrangements that achieve our ends. So then the argument is, “Yes, but they could drift from there, because we don’t have a very solid control mechanism to make sure they don’t change a lot, so they could change a lot.”
That’s very much true, but that’s still true for human culture and its descendants as well: they can also change a lot. We don’t have very much assurance. I think some people just say, “Yeah, but there’s some common human nature that’ll make sure it doesn’t go too far.” I’m not seeing that. Sorry. There isn’t. That’s not much of an assurance. People can change people, even culturally, and especially later on when we can change minds more directly, start tinkering, share minds more directly; even today we have better propaganda, better mechanisms of persuasion. We can drift a long way off in many directions.
Robert Long: This is sort of switching topics a little bit, but it’s digging into your general disagreement with some key arguments about AI safety. It’s about your views on intelligence. So you’ve written that there may well be no powerful general theories to be discovered revolutionizing AI, and this is related to your view that most everything we’ve learned about intelligence suggests that the key to smarts is having many not-fully-general tools. Human brains are smart mainly by containing many powerful, not-fully-general modules and using many modules to do each task.
You’ve written that these considerations are one of the main reasons you’re skeptical about AI. I guess the question is, can you think of evidence that might change your mind? I mean, the general question is just to dig in on this train of thought; so is there evidence that would change your mind about this general view of intelligence? And relatedly, why do you think that other people arrive at different views of what intelligence is, and why we could have general laws or general breakthroughs in intelligence?
Robin Hanson: This is closely related to the lumpiness question. I mean, basically you can not only talk about the lumpiness of changes in capacities, i.e., lumpiness in innovations. You can also talk about the lumpiness of tools in our toolkit. If we just look in industry, if we look in academia, if we look in education, just look in a lot of different areas, you will find robustly that most tools are more specific tools.
Most of the value of tools–of the integral–is in more specific tools, and relatively little of it is in the most general tools. Again, that’s true of things you learn in school, it’s true of things you learn on the job, it’s true of things that companies learn that help them do things. It’s true of the advantages that nations have over other nations. Again, just robustly, if you look at what you know and how valuable each thing is, most of the value is in lots of little things, and relatively little is in a few big things.
There’s a power-law distribution, with mostly small things. It’s a similar sort of lumpiness distribution to the lumpiness of innovation, and it’s understandable: if tools have that sort of lumpy distribution, then if each innovation improves a tool by some percentage, even some distribution of percentages, most of the improvements will be to small things, and therefore most of the improvements will be small.
Few of the improvements will be big things; even a big improvement in a big thing will still be a small part of the overall distribution. So lumpiness in the size of tools, or the size of the things that we have as tools, predicts that, in intelligence as well, most of the things that make you intelligent are lots of little things. It comes down to, “Is intelligence different?”
Again, that’s also the claim about, “Is intelligence innovation different?” If, of course, you thought intelligence was fundamentally different in there being fewer and bigger lumps to find, then that would predict that in the future we would find fewer, bigger lumps, because that’s what there is to find. You could say, “Well, yes. In the past we’ve only ever found small lumps, but that’s because we weren’t looking at the essential parts of intelligence.”
Of course, I’ll very well believe that, related to intelligence, there are lots of small things. You might believe that there are also a few really big things, and that the reason that in the past computer science or education innovation hasn’t found many of them is that we haven’t come to the mother lode yet. The mother lode is still yet to be found, and when we find it, boy, it’ll be big. The belief that you’ll find that in intelligence innovation is related to a belief that it exists, that it’s a thing to find, which in turn is related to believing that, fundamentally, intelligence is simple.
Fundamentally, there’s some essential simplicity to it, such that when you find it, the pieces will be … each piece is big, because there aren’t very many pieces, and that’s implied by it being simple. It can’t be simple unless … if there are 100,000 pieces, it’s not simple. If there are 10 pieces, it could be simple, but then each piece is big. Then the question is, “What reason do you have to believe that intelligence is fundamentally simple?”
I think, in academia, we often try to find simple essences in various fields. So there’d be the simple theory of utilitarianism, or the simple theory of even physical particles, or the simple theory of quantum mechanics, or … so if your world is thinking about abstract academic areas like that, then you might say, “Well, in most areas, the essence is a few really powerful, simple ideas.”
You could kind of squint and see academia in that way. You can’t see the industrial world that way. That is, we have much clearer data about the world of biological organisms competing, or firms competing, or even nations competing. We have much more solid data there to say, “It’s really lots of little things.” Then you might say, “Yeah, but intelligence. That’s more academic.” Because your idea of intelligence is sort of intrinsically academic; you think of intelligence as the sort of thing best exemplified by the best academics.
If your model is ordinary stupid people, they have a poor intelligence, but they just know a lot, or have some charisma, or whatever it is. But Von Neumann, look at that. That’s what real intelligence is. Von Neumann must’ve had just five things that were better. It couldn’t have been 100,000 things that were better; it had to be five core things that were better, because, you see, he was able to produce these very simple, elegant things, and he was so much better, or something like that.
I actually do think this account is true, that many people have these sort of core emotional attitudinal relationships to the concept of intelligence. And that colors a lot of what they think about intelligence, including about artificial intelligence. That’s not necessarily tied to sort of the data we have on variations, and productivity, and performance, and all that sort of thing. It’s more sort of essential abstract things. Certainly if you’re really into math, in the world of math there are core axioms or core results that are very lumpy and powerful.
Of course, even there, the distribution of math citations follows exactly the same distribution as all the other fields. By the citation measure, math is not more lumpy. But still, when you think about math, you like to think about these core, elegant, powerful results, seeing them as the essence of it all.
Robert Long: So you mentioned Von Neumann and people have a tendency to think that there must be some simple difference between Von Neumann and us. Obviously the other comparison people make which you’ve written about is the comparison between us as a species and other species. I guess, can you say a little bit about how you think about human uniqueness and maybe how that influences your viewpoint on intelligence?
Robin Hanson: Sure. On that, we have literatures that I just defer to. I mean, I’ve read enough to think I know what they say, and that they’re relatively in agreement, and I just accept what they say. So the standard story is that humans’ key difference was an ability to support cultural evolution. That is, human mind capacities aren’t that different from a chimpanzee’s overall, and an individual [human] who hasn’t had the advantage of cultural evolution isn’t really much better.
The key difference is that we found a way to accumulate innovations culturally. Now obviously there’s some difference, in the sense that it does seem hard; even though we’ve tried to teach culture to chimps today, and we’ve had some remarkable success, it’s still plausible that there’s something they don’t have that’s quite good enough yet to let them do that. But then the innovations that made a difference have to be centered around that in some sense.
I mean, obviously most likely in a short period of time, a whole bunch of independent unusual things didn’t happen. More likely there was one biggest thing that happened that was the most important. Then the question is what that is. We know lots of differences of course. This is the “what made humans different” game. There’s all these literatures about all these different ways humans were different. They don’t have hair on their skin, they walk upright, they have fire, they have language, blah, blah, blah.
The question is, “Which of these matter?” Because they can’t all be the fundamental thing that matters. Presumably, if they all happened in a short time, something more fundamental caused most of them. The question is, “What is that?” But it seems to me that the standard answer is right: it was cultural evolution. And then the question is, “Well, okay. But what enabled cultural evolution?” Language certainly seems to be an important element, although it also seems like humans, even before they had language, could’ve had somewhat faster cultural evolution than a lot of other animals.
Then the question is, “How big a brain difference or structure difference would it take?” Then it seems like well, if you actually look at the mechanisms of cultural evolution, the key thing is sitting next to somebody else watching what they’re doing, trying to do what they’re doing. So that takes certain observation abilities, and it takes certain mirroring abilities, that is, the ability to just map what they’re doing onto what you’re doing. It takes sort of fine-grained motor control abilities to actually do whatever it is they’re doing.
Those seem like just relatively modest incremental improvements on some parameters; chimps weren’t quite up to that, and humans could be more up to that. Even our language ability seems like, well, we have modestly differently structured mouths that can more precisely control sounds, and chimps don’t quite have that, so it’s understandable why they can’t make as many sounds as distinctly. The bottom line is that our best answer is that it looks like there was a threshold passed in the sort of abilities supporting cultural evolution, which included the ability to watch people, the ability to mirror them, the ability to do it yourself, and the ability to tell people through language or through other things like that.
It looks roughly like there was just a threshold passed, and that threshold allowed cultural evolution, and that’s allowed humans to take off. If you’re looking for some fundamental, architectural thing, it’s probably not there. In fact, of course people have said when you look at chimp brains and human brains in fine detail, you see pretty much the same stuff. It isn’t some big overall architectural change, we can tell that. This is pretty much the same architecture.
It looks like there are some tools we are somewhat better at, and plausibly those are the tools that allow us to do cultural evolution.
Robert Long: Yeah. I think that might be it for my questions on human uniqueness.
Asya Bergal: I want to briefly go back to, I think I sort of mentioned this question, but we didn’t quite address it. At what timescale do you think people–how far out do you think people should be starting maybe the field-building stuff, or starting to actually do work on AI? Maybe number of years isn’t a good metric for this, but I’m still curious for your take.
Robin Hanson: Well, first of all, let’s make two different categories of effort. One category of effort is actually solving actual problems. Another category of effort might be just sort of generally thinking about the kind of problems that might appear and generally categorizing and talking about them. So most of the effort that will eventually happen will be in the first category. Overwhelmingly, most of the effort, and appropriately so.
I mean, that’s true today for cars or nuclear weapons or whatever it is. Most of the effort is going to be dealing with the actual concrete problems right in front of you. That effort, it’s really hard to do much before you actually have concrete systems that you’re worried about, and the concrete things that can actually go wrong with them. That seems completely appropriate to me.
I would say that sort of effort is mostly, well, you see stuff and it goes wrong, deal with it. Ahead of seeing problems, you shouldn’t be doing that. You could today be dealing with computer security, you can be dealing with hackers and automated tools to deal with them, you could be dealing with deep fakes. I mean, it’s fine time now to deal with actual, concrete problems that are in front of people today.
But thinking about problems that could occur in the future, where you haven’t really seen the systems that would produce them, or even the scenarios that would play out, that’s much more the other category of effort: just thinking abstractly about the kinds of things that might go wrong, and maybe the kinds of architectures and kinds of approaches, et cetera. That, again, is something you don’t really need that many people to do. If you have 100 people doing it, that’s probably enough.
Even 10 people might be enough. Again, it’s more about mind space in altruistic futurism: you don’t need very much of that mind space devoted to it at all, really. That’s more the thing I complain there’s too much of. Again, it comes down to how unusual the scenarios will be where the problems start. Today, cars can have crashes, but each crash is pretty small, happens relatively locally, and doesn’t kill that many people. You can wait until you see actual car crashes to think about how to deal with car crashes.
Then the key question is, “How far do the scenarios we worry about deviate from that?” I mean, most problems in our world today are like that. With most things that go wrong in systems, things go wrong on a small scale pretty frequently, and therefore you can look at actual instances of things that have gone wrong to inform your efforts. There are some times where we exceptionally anticipate problems that we never see, even institutional problems that we never see, or worry that by the time the problem gets here, it’ll be too late.
Those are really unusual scenarios and problems. The big question about AI risk is what fraction of the problems we will face with AI will be of that form, and then, to what extent can we anticipate those now? Because in the year 1,000, it would’ve been pretty hard to figure out the unusual scenarios that might bedevil military hardware purchasing or something. Today we might say, “Okay, there are some kinds of military weapons we can build where yes, we can build them, but it might be better, once we realize they can be built, to have a treaty with the other guys so that neither of us builds them.”
Sometimes that’s good for weapons. Okay. That was not very common 1,000 years ago. That’s a newer thing today, but 1,000 years ago, could people have anticipated that, and then what usefully could they have done other than say, “Yeah, sometimes it might be worth having a treaty about not building a weapon, if you figure out it’d be worse for you if both of you have it”? I’m mostly skeptical that there are these big things that you have to coordinate ahead of time, that you have to anticipate, where if you wait it’s too late, and where you won’t see actual concrete signatures of the problems before you have to deal with them.
Even today, large systems, you often tend to have to walk through a failure analysis. You build a large nuclear plant or something, and then you go through and try to ask everything that could go wrong, or every pair of things that could go wrong, and ask, “What scenarios would those produce?,” and try to find the most problematic scenarios. Then ask, “How can we change the design of it to fix those?”
That’s the kind of exercise we do today where we imagine problems that most of which never occur. But for that, you need a pretty concrete design to work with. You can’t do that very abstractly with the abstract idea. For that you need a particular plan in front of you, and now you can walk through concrete failure modes of all the combinations of this strut will break, or this pipe will burst, and all those you walk through. It’s definitely true that we often analyze problems that never appear, but it’s almost never in the context of really abstract sparse descriptions of systems.
Asya Bergal: Got you. Yeah. We’ve been asking people a standard question, which I think I can maybe guess your answer to. But the question is: what’s your credence that, in a world where we didn’t have these additional EA-inspired safety efforts, AI poses a significant risk of harm? I guess this question doesn’t really get at how much efforts now are useful; it’s just a question about general danger.
Robin Hanson: There’s the crying wolf effect, and I’m particularly worried about it. For example, space colonization is a thing that could happen eventually. And for the last 50 years, there have been enthusiasts who have been saying, “It’s now. It’s now. Now is the time for space colonization.” They’ve been consistently wrong. For the next 50 years, they’ll probably continue to be consistently wrong, but everybody knows there’s these people out there who say, “Space colonization. That’s it. That’s it.”
Whenever they hear somebody say, “Hey, it’s time for space colonization,” they go, “Aren’t you one of those fan people who always says that?” The field of AI risk kind of has that same problem where again today, but for the last 70 years or even longer, there have been a subset of people who say, “The robots are coming, and it’s all going to be a mess, and it’s now. It’s about to be now, and we better deal with it now.” That creates sort of a skepticism in the wider world that you must be one of those crazies who keep saying that.
That can make things worse: when we really do have the possibility of space colonization, when it really is the right time, we might well wait too long after that, because people just can’t believe it, because they’ve been hearing this for so long. That makes me worried that this isn’t a positive effect: calling a lot of attention to a problem, and then having people experience it as not a problem, when it looks like you didn’t realize that.
Now, if you just say, “Hey, this type of nuclear power plant could break. I’m not saying it will, but it could, and you ought to fix that,” that’s different than saying, “This pipe will break, and that’ll happen soon, and you’d better do something.” Because then you lose credibility when the pipe, as usual, doesn’t break.
Robert Long: Just as a follow-up, I suppose the official line for most people working on AI safety is, as it ought to be, there’s some small chance that this could matter a lot, and so we better work on it. Do you have thoughts on ways of communicating that that’s what you actually think so that you don’t have this crying wolf effect?
Robin Hanson: Well, if there were only the 100 experts, and not the 100,000 fans, this would be much easier. That does happen in other areas. There are areas in the world where there are only 100 experts and there aren’t 100,000 fans screaming about it. Then the experts can be reasonable, and people can say, “Okay,” and take their word seriously, although they might not feel too much pressure to listen and do anything. You can say that about computer security today, for example: the public doesn’t scream a bunch about computer security.
The experts say, “Hey, this stuff. You’ve got real computer security problems.” They say it cautiously and with the right degree of caveats that they’re roughly right. Computer security experts are roughly right about those computer security concerns that they warn you about. Most firms say, “Yeah, but I’ve got these business concerns immediately, so I’m just going to ignore you.” So we continue to have computer security problems. But at least from a computer security expert’s point of view, they aren’t suffering from the perception of hyperbole or actual hyperbole.
But that’s because there aren’t 100,000 fans of computer security out there yelling with them. AI risk isn’t like that. AI risk, I mean, has the advantage of all these people pushing and talking, which has helped produce money and attention and effort, but it also means you can’t control the message.
Robert Long: Are you worried that this reputation effect or this impression of hyperbole could bleed over and harm other EA causes or EA’s reputation in general, and if so are there ways of mitigating that effect?
Robin Hanson: Well again, the more popular anything is, the harder it is for any center to mitigate whatever effects there are of popular periphery doing whatever they say and do. For example, I think there are really quite reasonable conservatives in the world who are at the moment quite tainted with the alt-right label, and there is an eager population of people who are eager to taint them with that, and they’re kind of stuck.
All they can do is use different vocabularies, have a different style and tone when they talk to each other, but they are still at risk for that tainting. A lot depends on the degree to which AI risk is seen as central to EA. The more it’s perceived as a core part of EA, then later on when it’s perceived as having been overblown and exaggerated, then that will taint EA. Not much way around that. I’m not sure that matters that much for EA though.
I mean I don’t see EA as driven by popularity or popular attention. It seems it’s more a group of people who– it’s driven by the internal dynamics of the group and what they think about each other and whether they’re willing to be part of it. Obviously in the last century or so, we just had these cycles of hype about AI, so that’s … I expect that’s how this AI cycle will be framed– in the context of all the other concern about AI. I doubt most people care enough about EA for that to be part of the story.
I mean, EA has just a little, low presence in people’s minds in general, that unless it got a lot bigger, it just would not be a very attractive element to put in the story to blame those people. They’re nobody. They don’t exist to most people. The computer people exaggerate. That’s a story that sticks better. That has stuck in the past.
Asya Bergal: Yeah. This is zooming out again, but I'm curious, kind of around AI optimism, but also just in general around any of the things you've talked about in this interview: what sort of evidence do you think we could get now, or might plausibly see in the future, that would change your views one way or the other?
Robin Hanson: Well, I would like to see much more precise and elaborated data on the lumpiness of algorithm innovations and AI progress. And of course data on whether things are changing different[ly] now. For example, forgetting his name, somebody did a blog post a few years ago right after AlphaGo, saying this Go achievement seemed off trend if you thought about it by time, but not if you thought about it by computing resources devoted to the problem. If you looked at past levels of Go ability relative to computing resources, then it was on trend; it wasn't an exception.
In any case, that's relevant to the lumpiness issue, right? So the more that we could do a good job of calibrating how unusual things are, the more we would be able to talk about whether we are seeing unusual stuff now. That's often the way this conversation goes: "Is this time different? Are we seeing unusual stuff now?" In order to do that, you want to be able to calibrate this progress as clearly as possible.
Certainly, if you could make some metric for each AI advance, such that you could talk about how important it was, with some relative weighting across fields, across kinds of advances, and across metrics for advances, then you could have statistics tracking the size of improvements over time and whether that was changing.
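[Note: as a rough illustration of the kind of calibration Hanson describes, here is a minimal sketch of one way to quantify how "lumpy" a stream of advances is, namely the share of total impact accounted for by the top few items. The impact scores below are randomly generated placeholders, not real data.]

```python
# Minimal sketch: how concentrated ("lumpy") is a stream of advances?
# We measure the share of total impact contributed by the top 1% of items.
import numpy as np

def top_share(impacts, top_fraction=0.01):
    """Fraction of total impact contributed by the top `top_fraction` of items."""
    impacts = np.sort(np.asarray(impacts, dtype=float))[::-1]
    k = max(1, int(round(top_fraction * len(impacts))))
    return impacts[:k].sum() / impacts.sum()

# Placeholder data: 10,000 "advances" with heavy-tailed impact scores.
rng = np.random.default_rng(0)
impacts = rng.pareto(a=1.5, size=10_000) + 1

print(f"Share of total impact from the top 1% of advances: {top_share(impacts):.2f}")
```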
I mean, I'll also make a pitch for the data thing that I've just been doing for the last few years, which is data on automation per job in the US, and the determinants of that, how that's changed over time, and its impact over time. Basically there's a dataset called O*NET, which breaks jobs in the US into about 800 job categories, and for each job in the last 20 years, at some random times, some actual people went and rated each job on a one to five scale of how automated it was.
Now we have those ratings. We are able to say what predicts which jobs are how automated, and whether that has changed over time. The answer is, we can predict pretty well: just 25 variables let us predict half the variance in which jobs are automated, and they're pretty mundane things, not high-tech, sexy things. It hasn't changed much in 20 years. In addition, we can ask, when jobs get more or less automated, how does that impact the number of employees and their wages? We find almost no impact on those things.
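[Note: a minimal sketch, in Python, of the kind of regression described here: predicting a one-to-five automation rating per job from a set of job descriptors and reporting the variance explained. The file name and column names are hypothetical stand-ins, not the actual O*NET schema.]

```python
# Minimal sketch: predict per-job automation ratings from job descriptors and
# report how much variance they explain (an R^2 near 0.5 would correspond to
# "predict half the variance"). File and column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("onet_jobs.csv")  # hypothetical table, one row per job category

predictors = [c for c in df.columns if c not in ("job_title", "automation_rating")]
X, y = df[predictors], df["automation_rating"]

scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print(f"Mean cross-validated R^2: {scores.mean():.2f}")
```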
A data series like that, if you kept tracking it over time, if there were a deviation from trend, you might be able to see it, you might see that the determinants of automation were changing, that the impacts were changing. This is of course just tracking actual AI impacts, not sort of extreme tail possibilities of AI impacts, right?
Of course, this doesn’t break it down into AI versus other sources of automation. Most automation has nothing to do with AI research. It’s making a machine that whizzes and does something that a person was doing before. But if you could then find a way to break that down by AI versus not, then you could more focus on, “Is AI having much impact on actual business practice?,” and seeing that.
Of course, that’s not really supporting the early effort scenario. That would be in support of, “Is it time now to actually prepare people for major labor market impacts, or major investment market impacts, or major governance issues that are actually coming up because this is happening now?” But you’ve been asking about, “Well, what about doing stuff early?” Then the question is, “Well, what signs would you have that it’s soon enough?”
Honestly, again, I think we know enough about how far away we are from where we need to be, and we know we’re not close, and we know that progress is not that lumpy. So we can see, we have a ways to go. It’s just not soon. We’re not close. It’s not time to be doing things you would do when you are close or soon. But the more that you could have these expert judgments of, “for any one problem, how close are we?,” and it could just be a list of problematic aspects of problems and which of them we can handle so far and which we can’t.
Then you might be able to, again, set up a system that when you are close, you could trigger people and say, “Okay, now it’s time to do field building,” or public motivation, or whatever it is. It’s not time to do it now. Maybe it’s time to set up a tracking system so that you’ll find out when it’s time.
Robert Long: On that cluster of issues surrounding human uniqueness, other general laws of intelligence, is there evidence that could change your mind on that? I don’t know. Maybe it could come from psychology, or maybe it could come from anthropology, new theories of human uniqueness, something like that?
Robin Hanson: The most obvious thing is to show me actual big, lumpy innovations that made a big difference to the performance of the system. That would be the thing. Like I said, for many years I was an AI researcher, and I noticed that researchers often created systems, and systems have architectures. So their paper would have a box diagram for an architecture, and explain that their system had an architecture and that they were building on that architecture.
But it seemed to me that in fact, the architectures didn't make as much difference as they were pretending. In terms of the performance of the system, most systems that were good were good because they just did a lot of work to make that whole architecture work. But you could imagine doing counterfactual studies where you vary the effort that goes into filling out the concept of a system and you vary the architecture. You quantitatively find out how much architecture matters.
There could even already be existing data out there, in some form or other, where somebody has done the right sort of studies. So it's obvious that architecture makes some difference. Is it a factor of two? Is it 10%? Is it a factor of 100? Or is it 1%? I mean, that's really what we're arguing about. If it's 10%, then you say, "Okay, it matters. You should do it. You should pay attention to that 10%. It's well worth putting the effort into getting that 10%."
But it doesn't make that much of a difference in when this happens and how big it is. Right? Whereas if architecture is a factor of 10 or 100, now you can have a scenario where somebody finds a better architecture and suddenly they're a factor of 100 better than other people. Now that's a huge thing. So one way to ask the question, "How much of an advance can a new system get relative to other systems?," would be to ask, "How much of a difference does a better architecture make?"
And that's a thing you can actually study directly by having people make systems with different architectures, with different amounts of effort put into them, et cetera, and see what difference it makes.
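[Note: a minimal sketch of the kind of counterfactual comparison Hanson suggests: given performance scores for systems built with different architectures and different levels of effort, estimate the multiplicative factor that architecture contributes. All numbers are arbitrary placeholders, not measurements.]

```python
# Minimal sketch: how big a multiplicative factor is architecture, versus effort?
import numpy as np

# Rows: two architectures; columns: low / medium / high engineering effort.
performance = np.array([
    [1.0, 2.5, 4.0],   # architecture A (placeholder scores)
    [1.2, 3.0, 5.0],   # architecture B (placeholder scores)
])

# Geometric means separate the multiplicative contributions of each factor.
arch_means = np.exp(np.log(performance).mean(axis=1))
effort_means = np.exp(np.log(performance).mean(axis=0))

print(f"Architecture factor (B vs A): {arch_means[1] / arch_means[0]:.2f}")
print(f"Effort factor (high vs low): {effort_means[2] / effort_means[0]:.2f}")
```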
Robert Long: Right. And I suspect that some people think that Homo sapiens is such a data point, and it sounds like you disagree with how they've construed that. Do you think there's empirical evidence waiting to change your mind, or do you think people are just sort of misconstruing it, or are ignorant, or just not thinking correctly about what we should make of the fact of our species dominating the planet?
Robin Hanson: Well, there’s certainly a lot of things we don’t know as well about primate abilities, so again, I’m reflecting what I’ve read about cultural evolution and the difference between humans and primates. But you could do more of that, and maybe the preliminary indications that I’m hearing about are wrong. Maybe you’ll find out that no, there is this really big architectural difference in the brain that they didn’t notice, or that there’s some more fundamental capability introduction.
For example, abstraction is something we humans do, and we don't see animals doing much of it, but this construal-level theory thing I described, and standard brain architecture, say that actually all brains have been organized by abstraction for a long time. That is, we see a dimension of the brain which runs from the abstract to the concrete, and we see how it's organized that way. But we humans seem to be able to talk about abstractions in ways that other animals don't.
So a key question is, “Do we have some extra architectural thing that lets us do more with abstraction?” Because again, most brains are organized by abstraction and concrete. That’s just one of the main dimensions of brains. The forebrain versus antebrain is concrete versus abstraction. Then the more we just knew about brain architecture and why it was there, the more we can concretely say whether there was a brain architectural innovation from primates to humans.
But everything I've heard says it seems to be mostly a matter of relative emphasis of different parts, rather than some fundamental restructuring. But even small parts can be potent. One way to think about it is that most ordinary programs spend most of their time in just a few lines of code. So if you have 100,000 lines of code, there could still be just 100 lines where 90% of the time is being spent. That doesn't mean those 100,000 lines don't matter. When you think about implementing code in the brain, you realize that because the brain is parallel, whatever code 90% of the time is spent in, that's going to be 90% of the volume of the brain.
Those other 100,000 lines of code will take up relatively little space, but they're still really important. A key issue with the brain is that you might find out that you understand 90% of the volume as a simple structure following a simple algorithm, and still hardly understand anything about the total algorithm, because it's all the other parts that you don't understand, where stuff isn't executing very often, but it still needs to be there to make the whole thing work. That's a very problematic thing about understanding brain organization at all.
You're tempted to go by volume, because volume is visible first, and to try to understand whatever volume you can opportunistically understand, but you could still be a long way off from understanding. Just like if you had any big piece of code and you understood 100 lines of it out of 100,000 lines, you might not understand very much at all. Of course, if that was the 100 lines being executed most often, you'd understand what it was doing most of the time. You'd definitely have a handle on that, but how much of the system would you really understand?
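[Note: a toy Python illustration of the point above: nearly all execution time concentrates in a small "hot" part of the code, while the many rarely-run parts still have to exist for the program to work. The functions and numbers here are arbitrary, for illustration only.]

```python
# Toy illustration: profiling shows nearly all time inside one small "hot"
# function, while the rarely-taken cold path still has to exist for correctness.
import cProfile
import random

def hot_inner_loop(x):
    # Stands in for the "100 lines" where ~90% of the time is spent.
    return sum(i * x for i in range(1000))

def rare_special_case(x):
    # Stands in for the many cold paths handling infrequent situations.
    return x * x - x

def run():
    total = 0
    for step in range(10_000):
        total += hot_inner_loop(step)
        if random.random() < 0.001:   # cold path: almost never taken
            total += rare_special_case(step)
    return total

cProfile.run("run()")  # report shows nearly all time inside hot_inner_loop
```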
Asya Bergal: We’ve been interviewing a bunch of people. Are there other people who you think have well-articulated views that you think it would be valuable for us to talk to or interview?
Robin Hanson: My experience is that I’ve just written on this periodically over the years, but I get very little engagement. Seems to me there’s just a lack of a conversation here. Early on, Eliezer Yudkowsky and I were debating, and then as soon as he and other people just got funding and recognition from other people to pursue, then they just stopped engaging critics and went off on pursuing their stuff.
Which makes some sense, but these criticisms have just been sitting and waiting. Of course, what happens periodically is they are most eager to engage the highest status people who criticize them. So periodically over the years, some high-status person will make a quip, not very thought out, at some conference panel or whatever, and they’ll be all over responding to that, and sending this guy messages and recruiting people to talk to him saying, “Hey, you don’t understand. There’s all these complications.”
Which is different from engaging the people who are the longest, most thoughtful critics. There’s not so much of that going on. You are perhaps serving as an intermediary here. But ideally, what you do would lead to an actual conversation. And maybe you should apply for funding to have an actual event where people come together and talk to each other. Your thing could be a preliminary to get them to explain how they’ve been misunderstood, or why your summary missed something; that’s fine. If it could just be the thing that started that actual conversation it could be well worth the trouble.
Asya Bergal: I guess related to that, is there anything you wish we had asked you, or any other things sort of you would like to be included in this interview?
Robin Hanson: I mean, you sure are relying on me to know what the main arguments are that I’m responding to, hence you’re sort of shy about saying, “And here are the main arguments, what’s your response?” Because you’re shy about putting words in people’s mouths, but it makes it harder to have this conversation. If you were taking a stance and saying, “Here’s my positive argument,” then I could engage you more.
I would give you a counterargument, you might counter-counter. If you’re just trying to roughly summarize a broad range of views then I’m limited in how far I can go in responding here.
Asya Bergal: Right. Yeah. I mean, I don’t think we were thinking about this as sort of a proxy for a conversation.
Robin Hanson: But it is.
Asya Bergal: But it is. But it is, right? Yeah. I could maybe try to summarize some of the main arguments. I don’t know if that seems like something that’s interesting to you? Again, I’m at risk of really strawmanning some stuff.
Robin Hanson: Well, this is intrinsic to your project. You are talking to people and then attempting to summarize them.
Asya Bergal: That’s right, that’s right.
Robin Hanson: If you thought it was actually feasible to summarize people, then what you would do is produce tentative summaries, and then ask for feedback and go back and forth in rounds of honing and improving the summaries. But if you don’t do that, it’s probably because you think even the first round of summaries will not be to their satisfaction and you won’t be able to improve it much.
Which then says you can’t actually summarize that well. But what you can do is attempt to summarize and then use that as an orienting thing to get a lot of people to talk and then just hand people the transcripts and they can get what they can get out of it. This is the nature of summarizing conversation; this is the nature of human conversation.
Asya Bergal: Right. Right. Right. Of course. Yeah. So I’ll go out on a limb. We’ve been talking largely to people who I think are still more pessimistic than you, but not as pessimistic as say, MIRI. I think the main difference between you and the people we’ve been talking to is… I guess two different things.
There’s a sort of general issue which is, how much time do we have between now and when AI is coming, and related to that, which I think we also largely discussed, is how useful is it to do work now? So yeah, there’s sort of this field building argument, and then there are arguments that if we think something is 20 years away, maybe we can make more robust claims about what the geopolitical situation is going to look like.
Or we can pay more attention to the particular organizations that might be making progress on this, and how things are going to be. There's a lot of work around assuming that maybe AGI's actually going to look somewhat like current techniques. It's going to look like deep reinforcement learning and ML techniques, plus maybe a few new capabilities. Maybe from that perspective we can actually put effort into work like interpretability, like adversarial training, et cetera.
Maybe we can actually do useful work to progress that. A concrete version of this: Paul Christiano has this approach, that I think MIRI is very skeptical of, addressing prosaic AI, that is, AI that looks very similar to the way AI looks now. I don't know if you're familiar with iterated distillation and amplification, but it's sort of treating the AI system as a black box, which is a lot of what it looks like if we're in a world that's close to the one now, because neural nets are sort of black box-y.
Treating it as a black box, there's some chance this approach works, where we basically take a combination of smart AIs and use that to sort of verify the safety of a slightly smarter AI, and then repeat that process, bootstrapping. And maybe we have some hope of doing that, even if we don't have access to the internals of the AI itself. Does that make sense? The idea is sort of to have an approach that works even with black box sorts of AIs that might look similar to the neural nets we have now.
Robin Hanson: Right. I would just say the whole issue is how plausible it is that within 20 years we'll have broad human-level AI on the basis of the techniques we see now. Obviously the higher probability you think that is, the more you think it's worth doing that. I don't have any objection at all to his strategies, conditional on that assumption. It would just be, how likely is that? And it's okay for him to work on that; it's just more a question of how big a fraction of mind space that takes up among the wider space of people worried about AI risk.
Asya Bergal: Yeah. Many of the people that we’ve talked to have actually agreed that it’s taking up too much mind space, or they’ve made arguments of the form, “Well, I am a very technical person, who has a lot of compelling thoughts about AI safety, and for me personally I think it makes sense to work on this. Not as sure that as many resources should be devoted to it.” I think at least a reasonable fraction of people would agree with that. [Note: It’s wrong that many of the people we interviewed said this. This comment was on the basis of non-public conversations that I’ve had.]
Robin Hanson: Well, then maybe an interesting follow-up conversation topic would be to say, “what concretely could change the percentage of mind space?” That’s different than … The other policy question is like, “How many research slots should be funded?” You’re asking what are the concrete policy actions that could be relevant to what you’re talking about. The most obvious one I would think is people are thinking in terms of how many research slots should be funded of what sort, when.
But with respect to the mind space, that’s not the relevant policy question. The policy question might be some sense of how many scenarios should these people be thinking in terms of. Or what other scenarios should get more attention.
Asya Bergal: Yeah, I guess I’m curious on your take on that. If you could just control the mind space in some way, or sort of set what people were thinking about or what directions, what do you think it would look like?
Robert Long: Very quickly, I think one concrete operationalization of “mind space resource” is what 80,000 Hours tells people to do, with young, talented people say.
Robin Hanson: That’s even more plausible. I mean, I would just say, study the future. Study many scenarios in the future other than this scenario. Go actually generate scenarios, explore them, tell us what you found. What are the things that could go wrong there? What are the opportunities? What are the uncertainties? Just explore a bunch of future scenarios and report. That’s just a thing that needs to happen.
Other than AI risk, I mean. AI risk is focused on one relatively narrow set of scenarios, and there are a lot of other scenarios to explore. So that would be a sense of mind space and career work: to just say, "There are 10 or 100 people working in this other area, I'm not going to be that …"
Then you might just say, concretely, the world needs more futurists. The future is a very important place, but we're not sure how much leverage we have over it. We just need more scenarios explored, including, for each scenario, asking what leverage there might be.
Then I might say we’ve had a half-dozen books in the last few years about AI risks. How about a book that has a whole bunch of other scenarios, one of which is AI risk which takes one chapter out of 20, and 19 other chapters on other scenarios? And then if people talked about that and said it was a cool book and recommended it, and had keynote speakers about that sort of thing, then it would shift the mind space. People would say, “Yeah. AI risk is definitely one thing, people should be looking at it, but here’s a whole bunch of other scenarios.”
Asya Bergal: Right. I guess I could also try a little bit to zero in… I think a lot of the differences in terms of people’s estimates for numbers of years are modeling differences. I think you have this more outside view model of what’s going on, looking at lumpiness.
I think one other common modeling choice is to say something like, “We think progress in this field is powered by compute; here’s some extrapolation that we’ve made about how compute is going to grow,” and maybe our estimates of how much compute is needed to do some set of powerful things. I feel like with those estimates, then you might think things are going to happen sooner? I don’t know how familiar you are with that space of arguments or what your take is like.
Robin Hanson: I have read most all of the AI Impacts blog posts over the years, just to be clear.
Asya Bergal: Great. Great.
Robin Hanson: You have a set of posts on that. So the most obvious data point is that maybe we're near the human-equivalent compute level now, but not quite there. We passed the mouse level a while ago, right? Well, we don't have machines remotely capable of doing what mice do. So it's clear that merely having the computing-power equivalent is not enough. We have machines that went past the cockroach level long ago. We certainly don't have machines that can do all the things cockroaches can do.
It’s just really obvious I think, looking at examples like that, that computing power is not enough. We might hit a point where we have so much computing power that you can do some sort of fast search. I mean, that’s sort of the difference between machine learning and AI as ways to think about this stuff. When you thought about AI you just thought about, “Well, you have to do a lot of work to make the system,” and it was computing. And then it was kind of obvious, well, duh, well you need software, hardware’s not enough.
When you say machine learning, people tend to have more hope: well, we just need some general machine learning algorithm, and then you turn that on, and then you find the right system, and then the right system is much cheaper to execute computationally. The threshold you need to execute the search is a lot more computing power than the human brain has, but it won't necessarily be that long before we have a lot more.
Then now it’s an issue of how simple is this thing you’re searching for and how close are current machine learning systems to what you need? The more you think that a machine learning system like we have now could basically do everything, if only it were big enough and had enough data and computing power, it’s a different perspective than if you think we’re not even close to having the right machine learning techniques. There’s just a bunch of machine learning problems that we know we’ve solved that these systems just don’t solve.
Asya Bergal: Right.
Robert Long: So on that question, I can't pull up the exact quote quickly enough, but I may insert it in the transcript, with permission. Paul Christiano has said more or less, in an 80,000 Hours interview, that he's very unsure, but he suspects we might be at insect-level capabilities: if people took it upon themselves to devote the compute and resources that we have, we could do what insects do.1
He’s interested in maybe concretely testing this hypothesis that you just mentioned, humans and cockroaches. But it sounds like you’re just very skeptical of it. It sounds like you’re already quite confident that we are not at insect level. Can you just say a little bit more about why you think that?
Robin Hanson: Well, there's doing something a lot like what insects do, and then there's doing exactly what insects do. And those are really quite different tasks, and the difference is in part how forgiving you are about a bunch of details. I mean, there are some tasks, maybe image recognition or something, or even Go… Cockroaches are actually managing a particular cockroach body in a particular environment. They're pretty damn good at that.
If you wanted to make an artificial cockroach that was as good as cockroaches at the things cockroaches do, I think we're a long way off from that. But you might think most of those little details aren't that important, that they're just a lot of work, and that maybe you could make a system that handled what you think of as the essential core problems similarly.
Now we're back to this key issue of the division between a few essential core problems and a lot of small messy problems. I basically think the game is in doing them all; you're not done until you've done them all, and doing them all includes a lot of the small messy things. So that's the idea that your brain is 100,000 lines of code, and 90% of the brain volume is 100 of those lines, and then there's all these little, small, swirly structures in your brain that manage the small, swirly tasks that don't happen very often, but when they do, that part needs to be there.
What percentage of your brain volume would be enough to replicate before you thought you were essentially doing what a human does? I mean, that is sort of an essential issue. If you thought there were just 100 key algorithms and once you got 100 of them then you were done, that’s different than thinking, “Sure, there’s 100 main central algorithms, plus there’s another 100,000 lines of code that just is there to deal with very, very specific things that happen sometimes.”
And evolution has spent a long time searching in the space of writing that code and found these things, and there's no easy learning algorithm that will find them without being in the environment that you were in. This is a key question about the nature of intelligence, really.
Robert Long: Right. I'm now hijacking this interview to be about this insect project that AI Impacts is also doing, so apologies for that. We were thinking maybe you can isolate some key cognitive tasks that bees can do, and then in simulation have something roughly analogous to that. But it sounds like you're not quite satisfied with that as a test of the hypothesis, unless it can do all the little bee things and control a bee body and wiggle around just like bees do and so forth?
Robin Hanson: I mean, if you could attach it to an artificial bee body and put it in a hive and see what happens, then I’m much more satisfied. If you say it does the bee dance, it does the bee smell, it does the bee touch, I’ll go, “That’s cute, but it’s not doing the bee.”
Robert Long: Then again, it just sounds like how satisfied you are with these abstractions, depends on your views of intelligence and how much can be abstracted away–
Robin Hanson: It depends on your view of the nature of the actual problems that most animals and humans face. They’re a mixture of some structures with relative uniformity across a wide range; that’s when abstraction is useful. Plus, a whole bunch of messy details that you just have to get right.
In some sense I’d be more impressed if you could just make an artificial insect that in a complex environment can just be an insect, and manage the insect colonies, right? I’m happy to give you a simulated house and some simulated dog food, and simulated predators, who are going to eat the insects, and I’m happy to let you do it all in simulation. But you’ve got to show me a complicated world, with all the main actual obstacles that insects have to surviving and existing, including parasites and all sorts of things, right?
And just show me that you can have something that robustly works in an environment like that. I’m much more impressed by that than I would be by you showing an actual physical device that does a bee dance.
Asya Bergal: Yeah. I mean, to be clear, I think the project is more about actually finding a counterexample. If we could find a simple case where we can’t even do this with neural networks then it’s fairly … there’s a persuasive case there.
Robin Hanson: But then of course people might a month later say, “Oh, yeah?” And then they work on it and they come up with a way to do that, and there will never be an end to that game. The moment you put up this challenge and they haven’t done it yet–
Asya Bergal: Yeah. I mean, that’s certainly a possibility.
Robert Long: Cool. I guess I’m done for now hijacking this interview to be about bees, but that’s just been something I’ve been thinking about lately.
Asya Bergal: I would love to sort of engage with you on your disagreements, but I think a lot of them are sort of like … I think a lot of it is in this question of how close are we? And I think I only know in the vaguest terms people’s models for this.
I feel like I’m not sure how good in an interview I could be at trying to figure out which of those models is more compelling. Though I do think it’s sort of an interesting project because it seems like lots of people just have vastly different sorts of timelines models, which they use to produce some kind of number.
Robin Hanson: Sure. I suppose you might want to ask the people you talk to after me about the relative status of inside and outside arguments, and who has the burden of proof, with respect to which audiences.
Asya Bergal: Right. Right. I think that’s a great question.
Robin Hanson: If we’ve agreed that the outside view doesn’t support short time scales of things happening, and we say, “But yes, some experts think they see something different in their expert views of things with an inside view,” then we can say, “Well, how often does that happen?” We can make the outside view of that. We can say, “Well, how often do inside experts think they see radical potential that they are then inviting other people to fund and support, and how often are they right?”
Asya Bergal: Right. I mean, I don’t think it’s just inside/outside view. I think there are just some outside view arguments that make different modeling choices that come to different conclusions.
Robin Hanson: I’d be most willing to engage those. I think a lot of people are sort of making an inside/outside argument where they’re saying, “Sure, from the outside this doesn’t look good, but here’s how I see it from the inside.” That’s what I’ve heard from a lot of people.
Asya Bergal: Yeah. Honestly my impression is that I think not a lot of people have spent … a lot of people when they give us numbers are like, “this is really a total guess.” So I think a lot of the argument is either from people who have very specific compute-based models for things that are short [timelines], and then there’s also people who I think haven’t spent that much time creating precise models, but sort of have models that are compelling enough. They’re like, “Oh, maybe I should work on this slash the chance of this is scary enough.” I haven’t seen a lot of very concrete models. Partially I think that’s because there’s an opinion in the community that if you have concrete models, especially if they argue for things being very soon, maybe you shouldn’t publish those.
Robin Hanson: Right, but you could still ask the question, “Set aside everything you know except what this looks like from the outside. Looking at that, would you still predict stuff happening soon?”
Asya Bergal: Yeah, I think that’s a good question to ask. We can’t really go back and add that to what we’ve asked people, but yeah.
Robin Hanson: I think more people, even most, would say, “Yeah, from the outside, this doesn’t look so compelling.” That’s my judgement, but again, they might say, “Well, the usual way of looking at it from the outside doesn’t, but then, here’s this other way of looking at it from the outside that other people don’t use.” That would be a compromise sort of view. And again, I guess there’s this larger meta-question really of who should reasonably be moved by these things? That is, if there are people out there who specialize in chemistry or business ethics or something else, and they hear these people in AI risk saying there’s these big issues, you know, can the evidence that’s being offered by these insiders– is it the sort of thing that they think should be compelling to these outsiders?
Asya Bergal: Yeah, I think I have a question about that too. Especially, I think–we’ve been interviewing largely AI safety researchers, but I think the arguments around why they think AI might be soon or far, look much more like economic arguments. They don’t necessarily look like arguments from an inside, very technical perspective on the subject. So it’s very plausible to me that there’s no particular reason to weigh the opinions of people working on this, other than that they’ve thought about it a little bit more than other people have. [Note: I say ‘soon or far’ here, but I mean to say ‘more or less likely to be harmful’.]
Robin Hanson: Well, as a professional economist, I would say, if you have good economic arguments, shouldn’t you bring them to the attention of economists and have us critique them? Wouldn’t that be the way this should go? I mean, not all economics arguments should start with economists, but wouldn’t it make sense to have them be part of the critique evaluation cycle?
Asya Bergal: Yeah, I think the real answer is that these all exist vaguely in people’s heads, and they don’t even make claims to having super-articulated and written-down models.
Robin Hanson: Well, even that is an interesting thing if people agree on it. You could say, “You know a lot of people who agree with you that AI risk is big and that we should deal with something soon. Do you know anybody who agrees with you for the same reasons?”
It’s interesting, so I did a poll, I’ve done some Twitter polls lately, and I did one on “Why democracy?” And I gave four different reasons why democracy is good. And I noticed that there was very little agreement, that is, relatively equal spread across these four reasons. And so, I mean that’s an interesting fact to know about any claim that many people agree on, whether they agree on it for the same reasons. And it would be interesting if you just asked people, “Whatever your reason is, what percentage of people interested in AI risk agree with your claim about it for the reason that you do?” Or, “Do you think your reason is unusual?”
Because if most everybody thinks their reason is unusual, then basically there isn’t something they can all share with the world to convince the world of it. There’s just the shared belief in this conclusion, based on very different reasons. And then it’s more on their authority of who they are and why they as a collective are people who should be listened to or something.
Asya Bergal: Yeah, I agree that that is an interesting question. I don’t know if I have other stuff, Rob, do you?
Robert Long: I don’t think I do at this time.
Robin Hanson: Well, I, perhaps compared to other people, am happy to do a second round should you generate more questions.
Asya Bergal: Yeah, I think it’s very possible, thanks so much. Thanks so much for talking to us in general.
Robin Hanson: You’re welcome. It’s a fun topic, especially talking with reasonable people.
Robert Long: Oh thank you, I’m glad we were reasonable.
Asya Bergal: Yeah, I’m flattered.
Robin Hanson: You might think that’s a low bar, but it’s not.
Robert Long: Great, we’re going to include that in the transcript. Thank you for talking to us. Have a good rest of your afternoon.
Robin Hanson: Take care, nice talking to you.
- The actual quote is, “Things like, right now we’re kind of at the stage where AI systems are … the sophistication is probably somewhere in the range of insect abilities. That’s my current best guess. And I’m very uncertain about that. … One should really be diving into the comparison to insects now and say, can we really do this? It’s plausible to me that that’s the kind of … If we’re in this world where our procedures are similar to evolution, it’s plausible to me the insect thing should be a good indication, or one of the better indications, that we’ll be able to get in advance.” from his podcast with 80,000 Hours.
Robin, if I understand you correctly you are saying: There is only a small difference between the brains of humans and other primates. The difference between human and e.g. chimp brains is not lumpy, it’s not a fundamental new architecture or capability; instead, it’s a gradual difference, human brains and perhaps other parts of the human body are just a bit better at some of the things chimps can in principle do as well.
But you don’t seem to dispute that this has made humans *much* better at cultural evolution, or at least that it has led to an outcome *much* more favorable for humans than chimps. So while there is no lumpiness in the “inputs” — algorithms, architecture etc. — there seems to be some lumpiness in the kind of outcomes we care about from a moral perspective.
Conditional on the biology facts being correct, I agree that your appeal to human-primate brain differences is an argument against lumpiness on the path to better AI *if lumpiness is measured by looking at what AI systems look like (algorithms, architecture etc.)*. But ultimately you are arguing for a stronger claim: that we won't be surprised by catastrophic AI failures, i.e., ones that are so big that we cannot wait to deal with them until they arise. It seems to me that to argue for this conclusion you need more than AI progress being non-lumpy when looking at the inputs/AI systems themselves: you also need to argue that we'll be able to anticipate thresholds that, when crossed by AI's gradual progress, will lead to large differences in relevant capabilities.
In more concrete terms, the argument from human-primate brain differences should make us less worried that, one day, someone will show up and be able to say “I’ve developed this radically new neural net architecture and now can build AGI even though you all thought AGI was far away”. But it doesn’t seem to be an argument against the possibility that e.g. some leading lab by gradually improving their AI system will one day unwittingly cross some capability threshold (similar to the “cultural evolution threshold” crossed by the human brain) that sets in motion a process (which might itself be gradual, or take a long time to unfold) that will ultimately end with AI systems controlling the future in a way humans don’t want. — Just like human brains crossing the cultural evolution threshold set in motion a gradual process that ultimately led to humans dominating chimps, and in a way that chimps were powerless to stop.
I'm curious whether you agree with this take, and if so, whether you think there are arguments that either there are no relevant thresholds on the path to AGI, or that we'd be able to anticipate such thresholds early.