Robert Wiblin: Hi listeners, this is the 80,000 Hours Podcast, the show about the world’s most pressing problems and how you can use your career to solve them. I’m Rob Wiblin, Director of Research at 80,000 Hours.
I’ve got nothing to add today, except that if you like this episode, please pass it on to someone who could benefit from it!
Robert Wiblin: Today, I’m speaking with Allan Dafoe. Allan is [Senior Research Fellow in the International Politics of AI] and Director of the Center for the Governance of AI at the Future of Humanity Institute at Oxford University. His research seeks to understand the causes of world peace and stability, and his current research focuses on helping humanity navigate safely through the invention of superhuman artificial intelligence. Thanks for coming on the podcast, Allan.
Allan Dafoe: My pleasure. Thanks for having me.
Robert Wiblin: We’ll get to talking about how listeners can potentially have a career in AI strategy themselves, but first, tell me about the Center for the Governance of AI at FHI in Oxford.
Allan Dafoe: Great. The Future of Humanity Institute has for a long time been thinking about transformative artificial intelligence and superintelligence, most notably with Nick Bostrom’s Superintelligence, but a lot of other scholars there have also been working on and thinking about this. Recently, we’ve come together in a group called the Center for the Governance of AI, and almost a dozen of us are working on various aspects of the governance problem.
Robert Wiblin: What kinds of specific questions are you guys looking at?
Allan Dafoe: I have a research agenda in the works, or a research landscape document that I’ve been working on, that systematizes the questions. This might be getting into a big area, but it can be broken up into three big categories.
The first category we call the technical landscape. That refers to all the questions related to trends in AI and to the properties of advanced AI, their sort of political and strategic properties, how difficult safety is, so we can talk about that.
The second really bulky category we call AI politics. That looks at all the dynamics around different groups, corporations, governments, the public as an actor, pursuing their interests as they understand it, given their capabilities, and trying to understand how that all plays out and how we can address and mitigate the worst potential risks from that and seize the opportunities.
Then, the third category we call AI governance, which asks the question, “If we’re able to coordinate and cooperate, what should we cooperate to build? What are the institutions, the constitutional foundation of a global governance system? What are the challenges that it needs to address? What are the values that we want encoded into the governance regime?”
Robert Wiblin: So, you’ve got about a dozen people in one research group here, and my impression was that about two years ago, very few people were thinking about this. Is it fair to say the interest in this topic is growing very fast?
Allan Dafoe: Absolutely. There’s tremendous interest in this space. I just came from lunch after my talk at EAG London, and the table is filled with very interested, talented people wanting to find the way that they can best contribute. Currently, we’re struggling with building a career pipeline and a community that most efficiently directs talented individuals to the problems that they can best address, but that’s one thing we’re working on.
At the Beneficial AI conference at Asilomar, several people commented that AI strategy, as this space is sometimes called, seems like it is in the place where AI safety was two years previously. AI safety, at the Puerto Rico conference, was barely an explicit research commitment in the AI research community. There’s this one story that is told about a PhD supervisor and a PhD student meeting at the Puerto Rico conference, not realizing they were both interested in AI safety, so it was almost kind of a, I don’t know, taboo or secret interest.
Now, at the Asilomar conference, it’s large and flourishing. We have research agendas. We have top labs working on this. AI strategy seems like it’s at that beginning stage such that in two years, we’ll probably have a really much more substantial research community, but today we’re just figuring out what are the most important and tractable problems and how we can best recruit to work on those problems.
Robert Wiblin: What do you think has caused people to get so interested in this topic now? Is it AlphaGo and big improvements in machine technology, or perhaps Bostrom’s Superintelligence book has raised the alarm about the issues here?
Allan Dafoe: I think AI strategy, as within the community we’ve historically called it … We can also call it AI governance or AI politics or other terms … It follows directly out of concern about transformative AI, and so I think the causes of its growth are the same as the causes of the growth in AI safety interest. It’s just lagged by two years, probably because the people who first realized that AI safety is something they should be working on were AI researchers, and so they were more likely to see the problem right away, whereas social scientists, politicians, policy makers, are more removed from this phenomenon.
Absolutely, Superintelligence had a huge impact on the world’s awareness of the dangers of advanced AI. AlphaGo had a tremendous impact on the world’s awareness of transformative AI [crosstalk 00:04:40] which is a result we could talk about. Also AlphaGo, we could talk about, is a bit of my history of how I came online.
Robert Wiblin: Do you see your work as just a continuation of previous research that’s presumably been done on transformative technologies or potentially militarized technologies, like nuclear weapons? The academics must have thought about how destabilizing new technologies could be and how they could be made safe.
Allan Dafoe: Yeah. It’s part of that conversation, but it’s not just a continuation in the sense that the challenges posed by AI are radically different than the challenges posed by nuclear weapons. Part of our research today is engaging with those older literatures. One thing that many of us read up a lot on were the political efforts to have international control of nuclear weapons, the Baruch Plan or the Acheson and Lilienthal Report, which is really the inspiration behind that.
That was a very unique historical moment, when we almost had strong global governance of a powerful technology. That didn’t work out, and I would largely blame Stalin and the Soviet Union for that. It wasn’t a clean, natural experiment on the viability of truly binding international control over dangerous technology.
Robert Wiblin: Yeah. What do you think are the distinctive characteristics of artificial intelligence from a political or an international governance point of view?
Allan Dafoe: Yeah, so several. Perhaps the most distinctive is that it’s so dual-use. Compared to nuclear weapons and nuclear energy … Nuclear energy is useful, but it wasn’t crucially useful, whereas AI seems like it’s on track to be like the new electricity or the new industrial revolution in the sense it’s a general-purpose technology that will completely transform and invigorate the economy in every sector of the economy. That’s, I guess, one problem or one difference is that its economic bounty and the gradient of incentives to develop it are so much more substantial than most other dual-use technologies we’re used to thinking about governing.
Then, the second is the ease of separating the dangerous capabilities from the commercial and beneficial capabilities. In the Acheson and Lilienthal Report, they talked about what components of the nuclear cycle are intrinsically dangerous. These are the parts that you really … A country shouldn’t be engaging in this activity unless it’s thinking about building a nuclear weapon. Those are the sites you want to control, and hopefully you have several of those sites so you have redundant control. Whereas with AI, it’s a general-purpose technology. A powerful AI agent for some task can often easily be deployed for another task that is considered a risk.
Robert Wiblin: Some people recently have started talking about artificial intelligence in terms of an arms race, I guess, either between companies or even potentially between countries. Is that the right way to think about it?
Allan Dafoe: Yeah, so this language attracts a lot of attention. Some of the discussion of an AI arms race is not accurate in that some of the quotes are talking about a race for talent. They’re using the arms race as a loose analogy or a race between companies, but it’s really not militaristic between Google and, say, Facebook. That’s just hyperbole or just using a colorful metaphor.
There’s another discussion of an AI arms race that, for example, most prominently was stimulated by the quote that Putin gave of, “Whoever leads in AI will rule the world.” This quote garnered a huge amount of attention from media around the world and also from serious thinkers on national security. That matters because one property of an arms race is that, in many ways, all it takes to generate the possibility of one is the perception that the other side believes an arms race is taking place.
If I think that you believe this technology is strategic, even if I personally don’t believe it to be strategic, then I need to now worry about how your beliefs might shape your behaviors and your willingness to take risks. This quote, “Whoever leads in AI will rule the world,” it was not an official statement of Russian foreign policy. It was not a summary of a report that the Russian military produced. Rather, it was what seems to be an extemporaneous comment that Putin made in the context of giving children feedback on their science projects.
It was this televised show where he was talking to Russian students about their various science projects and congratulating them all and saying how this is the future of Russia, space travel, or materials science and whatnot. What he had to say about AI was the most emphatic of all the technologies, but it was just one statement in this long, Putin-meets-students-of-Russia event. However, that one quote was pulled out by the media, by AP, and seized upon by the world.
I think that teaches us the lesson that we have to be deliberate and thoughtful when talking about the possibility of an arms race, again, because it has this property that merely talking about it can increase the probability that it takes place. There will be tremendous risks from a racing dynamic, and so we ought to be very thoughtful and careful before engaging in just sort of knee-jerk reflection on what’s taking place.
I’d like to quote Demis Hassabis on this. “The coordination problem is one thing we should focus on now. We want to avoid this harmful race to the finish where corner-cutting starts happening and safety gets cut. That’s going to be a big issue on global scale, and that’s going to be a hard problem when you’re talking about national governments.” I think that captures the crux of the challenge without being inflammatory and likely to instill fear in other national security communities.
Robert Wiblin: It would be kind of a fitting irony for the craziness of humanity if we end up effectively destroying the world because someone made an off-the-cuff remark at a school event, just praising their science projects, that that could be the end of us. Actually, the fact that he said that almost suggests to me that they’re not really racing to produce artificial intelligence. Because if that was the plan of the Russian government, they probably wouldn’t randomly announce it while he’s visiting a high school.
Allan Dafoe: Yes, and part of what he said is … Putin actually said because a race would be dangerous, that is why Russia is going to give away its AI capabilities to the world. He almost had sort of an OpenAI, early OpenAI, stance on how Russia would be a benevolent AI power.
Robert Wiblin: Interesting. I guess that didn’t get reported so much. It wasn’t as interesting to people, obviously.
Allan Dafoe: That’s right.
Robert Wiblin: There might be some people listening who think superintelligence isn’t going to come anytime soon, and others who think it might come soon, but it’s not going to be that risky, things are going to be fine, people are fretting over nothing. What would you say to skeptics like that?
Allan Dafoe: To the first perspective, that … Let’s forget about superintelligence. Let’s just say that transformative artificial intelligence, this is artificial intelligence that could radically disrupt wealth, power, or world order … Some people hold a perspective that if it’s coming, it’s coming in many decades. The easiest response is to say that you should have a portfolio of policy investments that focus on different risks, and if you cannot with confidence rule out this potentially large risk entirely, then some research and policy effort should be spent on it.
But I also think we can more directly address the question. I would point out that even with just today’s AI technology, if there were no more structural improvements in AI performance, we just collected more data, built more sensors, more computing capability, but we stopped Moore’s Law in its tracks, and we stopped algorithmic scientific improvement in AI, I think we can already see some extreme systemic risks that could emerge from AI. To give you four, and I’m happy to talk about these, but one would be mass labor displacement, unemployment, and inequality. It’s plausible that today’s AI and the AI revolution that will be generated from the application of AI to industry could dramatically increase unemployment and inequality.
Two, it’s plausible that AI is what is called a strategic industry, or it’s a natural global oligopoly in that various properties of AI services mean that an incumbent company is able to basically retain a monopoly in that service in that sector. If you think about Facebook, if you think about Google, if you think about Amazon, each of these is a company that it’s very hard to compete with in its core industry. If that’s the case, if AI services in general have this property, which could arise from the digitization of services so there’s a zero marginal cost or a low marginal cost in what they’re providing, as well as from other sources of increasing returns to scale, such as this virtuous cycle in AI that once you have some consumers, they provide data, which allows your algorithm to become better, which gets you more consumers and support …
This would mean, or this could mean, a move away from sort of liberal economic world order and logic that has governed the past decades where free trade increases all countries’ wealth, and countries have really sort of embraced this logic of cooperation and economic productivity. You could have a reversion to economic nationalism, protectionism, mercantilism, where countries are each trying to build up their own AI champion in the way that has been done in the past with oil companies or automobiles or commercial aircraft.
You can see that today where most of the AI companies are in the U.S., and the main rival comes from China, which has basically been protectionist with its market. It’s excluded American companies and cultivated its own sort of AI champions. Okay, that was the second extreme possibility. Maybe I’ll just go quickly through three and four. Three is surveillance and control. You can imagine ubiquitous sensors, algorithms that can identify individuals through face recognition, through gait, through other kinds of behavioral signatures.
Robert Wiblin: We’re basically already there, I think. We’re all carrying around location detectors in our pockets every day.
Allan Dafoe: I agree. I don’t think it would be that hard to build a full system to monitor the public, track what they’re saying and doing, profile, identify individuals who might be a challenge to some political objective or an ally, have tailored persuasion. What much of the ad research money is going to is figuring out how you can persuade individuals to buy some product or feel differently about something. Lastly, robotic repression. Many coups failed or authoritarian governments fell because the army was not willing to fire on the citizens. If you have autonomous weapons that have a good independent chain of command that doesn’t go through a human, then citizens might lose that protection of human decency.
Robert Wiblin: Yeah. This really only requires the willingness to build these capabilities as a major state.
Allan Dafoe: Yes, and if the public has power, their failure to defend themselves from such a system.
Robert Wiblin: Yeah.
Allan Dafoe: Then, there’s a fourth that … Again, to clarify, all of this, I think, comes from AI today. We don’t need to concoct a science-fiction story of new technology to generate this capability. The fourth is strategic stability. It may be that imagery intelligence from satellites, from subsea sensors, passive sensors, or submersibles that are tracking, say, submarine ballistic missiles, to analysis of social network behavior, would be sufficient to reveal most of the submarine ballistic missiles and mobile ballistic missiles that are currently the two most secure arms of the nuclear triad.
If that were to happen, couple that with hypersonic missiles that shorten the time from launch to when a decision needs to be made about retaliating, it could lead to a much more unstable world. These are just four possibilities that are generated or exacerbated by narrow AI as we see it today that we need to think about a lot more, and none of that required speculating that we could have superhuman intelligence in various domains. We can also talk about why such a possibility is not improbable.
Robert Wiblin: How urgently do we need to figure out solutions to these problems? When can we expect artificial intelligence to be dramatically better than today?
Allan Dafoe: The problems I previously mentioned that go through mostly with 2017 AI, we need to obviously start thinking about those urgently. There’s then a whole other set of governance challenges that comes from human level and superhuman level AI in various domains or in general. It could be that you just get … To walk it back a bit, you could have superhuman AI just on some strategic domains, so it could be, say, just in cyber, or just in profiling, or social network mapping and manipulation, or math, or science and technology.
Or, you could have human level or superhuman level AI across the board or in all domains that are strategically relevant. It’s hard to understand what those different possibilities would mean for the economy, for strategic stability, for political order. In terms of when, how much time we have, basically, we don’t know. We don’t understand intelligence well enough. In history, we’ve never observed an intelligence explosion or the emergence of machine intelligence before. We don’t have other civilizations that we can look at to get a good frequentist estimate.
I’ll say a bit about how to have an informed belief. One thing we’ve done is survey AI experts, and that shouldn’t be thought of as really authoritative. They are experts at building AI. They are not necessarily experts at forecasting progress in AI. Current research that’s being done in the community is to take forecasting more seriously, to get more data, to calibrate, see how well people are doing, and use the other tools of forecasting to try and have improved forecasts.
However, taking the survey of AI experts as we had it, you see a range of opinions. These are published researchers at the top two conferences in AI. Some think it could happen very soon, where “it” is human-level machine intelligence, defined as when machines are better than humans at every task. Some researchers think it would not happen for many decades or even a hundred years. However, if you take the average or the median of those perspectives, there is enough probability mass on the near term, say, in 10 years there’s about 10% probability, in 25 years there’s about 30% probability, that if we just take that as a reasonable basis for our credence, that is a sufficiently high probability that we ought to be taking it seriously.
Now, I can say a bit more about why we might think it could come quickly. One argument by sort of anecdote is that it’s happened before. AlphaGo is illustrative of this. We were surprised by progress in Go, not necessarily because there was a technical breakthrough, but because the company deployed a lot of talent and resources towards a problem that we didn’t realize they were doing. That caught us, the world, off guard as it were, but they might have anticipated that it’s possible, given that investment.
There are other examples. To give one, the Atari game, Frostbite … If you look at the EFF metrics page, it has this lovely figure where it was basically highly subhuman from 2013 through to the middle of 2015. It could barely play the game. Then, if you were to extrapolate from that data, you might conclude 2050 to 2100 would be about when it’s going to reach human level. But by the end of 2015, some publications came out that showed a dramatic improvement to human level and then superhuman level over the course of one year.
Other games show a similar quick change in the rate of progress. Then, in other games, we see a more steady rate of progress. Part of what we will be researching with others is how to have more informed forecasts. I can also give you some more technical, more inside-view answers for why progress in the capability of AI, and especially towards AGI, could be much faster than we expect.
First, we really don’t understand intelligence well enough. We don’t know what are the core missing pieces and how independent they are. It’s plausible that there could be some common factor to many of the capabilities that are currently being worked on. Once we crack that common factor, the rest of the pieces will either fall into place or become less important for solving these general intelligence tasks. An example of a common factor is a good world model. If a machine is capable of developing a knowledge graph of the world, of people, of history, of technology, so that it can then integrate new knowledge by … If it reads a physics textbook, it can incorporate that knowledge into its world model, and it can integrate various kinds of narrow AI systems together, that could be really transformative.
As has been discussed before, for example in Superintelligence, the possibility of highly recursive self-improvement could generate rapid changes if there’s a narrow AI capability that permits self-improvement in AI research. There’s another whole class of reasons why we might think progress could be surprisingly fast, and that is if there’s what’s called overhang in various inputs to AI. Most talked about is hardware overhang. If we’re in a world where we already have so much computing power out there that once we devise this AGI algorithm, artificial general intelligence algorithm, if it can run on a laptop, then you have many laptop equivalents that it could run on. So, you go from a world with no AGIs to many millions.
We could have what’s called insight overhang. If there are key breakthroughs in algorithmic design that the human scientists just missed … They’re kind of waiting there to be plucked, and then they will add efficiency to machine learning. There’s a nice example of this by Bellemare et al. of DeepMind, where they show that if you just use the full distribution of the value function rather than take the expectation, you get dramatically improved performance, really quite substantial, more than any other algorithmic improvements over the span of many months.
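As a rough illustration of the point about keeping the full distribution rather than the expectation, here is a minimal Python sketch. It is a toy example and not the algorithm from the Bellemare et al. paper: the two actions, their payoffs, and the function names `sample_return` and `learn` are all hypothetical, chosen only to show what information is lost when returns are collapsed to a single average.

```python
import random
from collections import Counter

def sample_return(action, rng):
    """Hypothetical environment: both actions have the same expected return (5.0)."""
    if action == "safe":
        return 5.0                                   # always pays 5
    return 10.0 if rng.random() < 0.5 else 0.0       # "risky": pays 10 or 0

def learn(action, episodes=10_000, seed=0):
    """Track a running mean (expectation only) and an empirical return distribution."""
    rng = random.Random(seed)
    mean = 0.0
    counts = Counter()
    for n in range(1, episodes + 1):
        g = sample_return(action, rng)
        mean += (g - mean) / n                       # incremental average of returns
        counts[g] += 1                               # histogram of observed returns
    dist = {value: c / episodes for value, c in sorted(counts.items())}
    return mean, dist

for action in ("safe", "risky"):
    mean, dist = learn(action)
    print(f"{action}: expected return ~ {mean:.2f}, distribution = {dist}")
```

An expectation-only learner sees the two actions as nearly identical, while the distribution keeps the fact that one of them is a coin flip between a high and a zero payoff. The claim in the conversation is that learning with this extra structure improved performance in practice; the sketch only shows what the extra structure is.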
This is an insight that, in retrospect, seems like something that someone could have seen at the time, but we didn’t. So, it’s plausible that there’s other such insights just waiting to be found that, if you had an AI researcher, a machine intelligence AI researcher, that it could potentially pluck those. Then, there’s what we can call data overhang, so there’s an overhang of data waiting to be analyzed by a system capable of analyzing it. The best example, I think, for this is the Internet, the corpus of works of Shakespeare and history, strategy, strategic thinking, physics textbooks, social books, basically any kind of book that has encoded knowledge that at some point, a machine capable of interacting with a human at a high intellectual level could make sense of the knowledge in there.
Robert Wiblin: And potentially read very fast if it has enough hardware.
Allan Dafoe: Exactly. Read very fast, never forget what it’s read once it integrates it into the massive knowledge graph that it’s building. Then, there’s one last argument for why you might see really rapid and broad, or in this case rapid development of systems. This is the train-to-execute ratio of computing costs. In current machine learning systems, it costs around one to 100 million times more computing power to train an algorithm than it does to deploy that same algorithm. What that means is if you were to, say, train up your first AGI, and you then want to repurpose the computing power that you used to train it to deploy it, you don’t just have one AGI in the world. You now might have 10 million.
Robert Wiblin: Can you explain how that works?
Allan Dafoe: Yeah. One way researchers develop algorithms is through reinforcement learning in a simulated environment. You can think about those MuJoCo runners that are trying to learn to go over obstacles, or AlphaGo Zero, for example. It’s playing against itself in this simulated environment and gradually making sense of its environment and learning heuristics and whatever strategies it needs to succeed. But it costs a lot of computing power to run that algorithm again and again and again as it’s gradually making sense of its environment and learning. Okay? AlphaGo Zero is run for, say, well, several days or 30 days before it gets to its really high-level performance. But then once you have that system, once you have that set of trained weights in the neural network, you can then deploy it with a much smaller computing power [inaudible 00:23:48]
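The arithmetic behind the train-to-execute point can be sketched in a few lines of Python. This is a simplified illustration under a stated assumption: running one trained copy needs roughly 1/ratio of the hardware that was used for training, so repurposing that hardware yields about `ratio` copies. The function name `deployable_copies` and the `fraction_repurposed` parameter are hypothetical, and the ratios are just the order-of-magnitude range mentioned in the conversation.

```python
def deployable_copies(train_to_execute_ratio: float,
                      fraction_repurposed: float = 1.0) -> int:
    """Copies runnable on the training hardware, assuming one copy
    needs 1/ratio of that hardware (a simplifying assumption)."""
    if train_to_execute_ratio <= 0:
        raise ValueError("ratio must be positive")
    if not 0.0 <= fraction_repurposed <= 1.0:
        raise ValueError("fraction_repurposed must be between 0 and 1")
    return int(train_to_execute_ratio * fraction_repurposed)

# Order-of-magnitude ratios from the range quoted in the conversation.
for ratio in (1e6, 1e7, 1e8):
    copies = deployable_copies(ratio)
    print(f"ratio {ratio:.0e}: about {copies:,} copies on the same hardware")
```

With a ratio of ten million and all of the training hardware repurposed, this gives the "10 million" figure. Real systems would differ, since training and deployment hardware, memory limits, and latency needs are not interchangeable in this simple way.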
Robert Wiblin: I’ve also heard that you can speed this stuff up a lot by producing specialized chips that do the necessary operations much faster than a generalized computer chip would do. Is that right?
Allan Dafoe: Yes. There have been substantial improvements. I mean, first going from CPUs to GPUs, and now GPUs to TPUs, and many people are wondering what’s the next hardware improvement that will accelerate.
Robert Wiblin: We’ve talked about a range of different issues there. What do you think’s the most important question to deal with quickly in this field?
Allan Dafoe: Some problems are more important than others. However, we are sufficiently uncertain about what are the core problems that need to be solved that are precise enough and modular enough that they can be really focused on that I would recommend a different approach. Rather than try to find really the highest-leverage, most-neglected problem, I would advise people interested in working in this space to get a feel for the research landscape.
They can look at some of the talks at EAG, and then ask themselves what are their comparative advantages? What’s their driving interest or passion? Do they believe they have an insight that’s neglected, an idea? What’s their background? What’s the community of scholars and policy makers they would see themselves as most comfortable in?
Then, hopefully with the help of their community, and we will try to help, and 80K can help, try to map that individual to the part of the research landscape where they fit best. So, rather than optimize the whole community onto one narrow problem, I would say let’s optimize the community’s mapping from the distribution of comparative advantages to the distribution of problems. Try to find a modular project that they can work on. You can do that in consultation with others. Examples of a modular project include case studies of historical analogs.
We were discussing the Baruch Plan and the Acheson and Lilienthal Report. That’s one historical moment, but there are other analogies in history that we can look to, everything from CERN as a collaborative scientific endeavor to the International Space Station or international collaboration over nuclear fusion. Those offer a different kind of analogy that’s imperfect. There are other analogies, in the international control of dual-use technologies, that we can look to.
There’s historical work to be done. There’s also a lot of economic work to be done, modeling race dynamics, modeling tech development. There’s forecasting work to be done, which is a mix of quant modeling and almost psychology and working with experts. There’s ethics work to be done and moral philosophy to be done. There’s governance design, so institutional design, constitutional design. There’s public opinion research to be done.
I could just go on and on, there’s so much work. One way of thinking about it is that the AI revolution will touch on everything, and political processes are sufficiently interdependent that many parts of that could be critical for how it all plays out. Because of that, we basically need a social science of the AI revolution.
Robert Wiblin: And we need it fast, I guess.
Allan Dafoe: Well, we don’t know how quickly we need it.
Robert Wiblin: Okay, yeah.
Allan Dafoe: So-
Robert Wiblin: Well, I suppose that’s at least a little bit reassuring.
Allan Dafoe: Yes.
Robert Wiblin: Maybe we have more time than we think.
Allan Dafoe: Yes, but we should not dawdle.
Robert Wiblin: No.
Allan Dafoe: I will say in this respect, some fields are moving faster than others. Economists are taking this seriously. Daron Acemoglu, Charles Jones at Stanford, and some others are really considering what transformative AI could mean in terms of modeling growth, modeling unemployment and labor displacement. Political scientists that I’ve spoken with, almost all of them think this project is important and worth pursuing and interesting, but it does pose methodological challenges to what some branches of political science prefer.
It’s hard to be quantitative about the future. It’s hard to do experiments about the future. Because we’re so uncertain, correctly, if we’re being scientifically humble about what the future will look like, it means we have to be open to a lot of possibilities, and so it’s hard to draw firm conclusions. I think AI social science … Social science of the AI revolution really has to embrace that the question should guide the research and not the method. We need to look where we think we dropped the keys, not where the lamplight is.
Robert Wiblin: I’ve heard some people say recently that in new research fields like this one that are kind of pre-paradigmatic, is the term that they’ve used, it’s so early that we don’t even know exactly how to think about these problems, that you need a particular kind of person to be able to make progress there. You always kind of need a certain boldness and a certain creativity to do any normal research, and you often need quite a lot of intelligence, as well. But when you’re moving into a new field, maybe you need those even more than you would or than you will later on, once someone has mapped out the space and defined how you can find solutions to these problems. Do you agree with that perspective?
Allan Dafoe: I do. Carrick Flynn, who works at the Center for the Governance of AI at the Future of Humanity Institute, has characterized this as disentangling research, the challenge of disentangling all the threads and possibilities to really generate a clear research agenda and a clear vision of the problems. This does seem like the highest priority for our research. However, it’s hard to do this work at scale. I think that just means that people who think they might be able to contribute to this kind of revisioning of what the nature of the problem is and what kind of work needs to be done should try their hand at it and should try to articulate how they see the problem and what research needs to be done to reduce our uncertainty.
However, I do think at this point, we’ve identified enough tractable questions that there’s normal science to be done, that researchers from a range of fields, like institutional design, like public choice, diplomatic history, international relations, conventionally conceived and also quantitative study of international relations, forecasting … There are lots of projects for talented, motivated people to work on that I think have sufficiently well-defined contours that people can get started on it.
Robert Wiblin: So, you both need disentanglements and then workarounds to solve the problems that have been disentangled.
Allan Dafoe: Yes, and there’s a lot of work to be done. There’s a lot of interest coming into the space, which is really exciting to see. I would just encourage people to perhaps not be frustrated if it’s hard for them to find their niche just yet and just read everything they can, talk to people, try to find some part of the problem that they can make a contribution on, and then get working at it.
Robert Wiblin: How did you get into this field? I don’t imagine this was what you were studying as an undergrad or even in your PhD.
Allan Dafoe: Actually, not in my PhD, but it was what I was looking at as an undergrad.
Robert Wiblin: Oh, interesting.
Allan Dafoe: This goes back to the … I mean, I think I’ve always been interested in computers and artificial intelligence. I followed Kasparov and Deep Blue, and it was actually Ray Kurzweil’s Age of Spiritual Machines, which is an old book, 2001 … It had this really compelling graph. It’s sort of cheesy, and it involves a lot of simplifications, but in short, it shows basically Moore’s Law at work and extrapolated ruthlessly into the future. Then, on the second y-axis, it shows the biological equivalent of computing capacity of the machine. It shows a dragonfly and then, I don’t know, a primate, and then a human, and then all humans.
Now, that correspondence is hugely problematic. There’s lots we could say about why that’s not a sensible thing to do, but what I think it did communicate was that the likely extrapolation of trends is such that you are going to have very powerful computers within a hundred years. Who knows exactly what that means and whether, in what sense, it’s human level or whatnot, but the fact that this trend is coming on the timescale it was was very compelling to me. But at the time, I thought Kurzweil’s projection of the social dynamics of how extremely advanced AI would play out was unlikely. It’s very optimistic and utopian. I actually looked for a way to study this all through my undergrad. I took courses. I taught courses on technology and society, and I thought about going into science writing.
And I started a PhD program in science and technology studies at Cornell University, which sounded vague and general enough that I could study AI and humanity, but it turns out science and technology studies, especially at Cornell, means more a social constructivist approach to science and technology.
Actually, I wrote my master’s thesis on technological determinism, and one of the case studies is the Meiji Restoration. Here you have Japan basically rejecting the world and saying, “We want this Tokugawa regime to live on the way we envision it.” Eventually, the world intervened, Commodore Perry via the U.S. wanting to trade in coal. That just shows how the ability of a group of people to socially construct the world they want is limited by, in this case, military constraints, military competition, and more generally, also by economic competition.
Okay. Anyhow, I went into political science because … Actually, I initially wanted to study AI in something, and I was going to look at labor implications of AI. Then, I became distracted, as it were, by great power politics and great power peace and war. It touched on the existential risk dimensions that I didn’t have the word for, but that were sort of a driving interest of mine. It’s strategic, which is interesting. Anyhow, that’s what I did my PhD on, and topics related to that, and then my early career at Yale.
I should say during all this time, I was still fascinated by AI. At social events or having a chat with a friend, I would often turn to AI and the future of humanity and often conclude a conversation by saying, “But don’t worry, we still have time because machines are still worse than humans at Go.” Right? Here is a game that’s well defined. It’s perfect information, two players, zero-sum. The fact that a machine can’t beat us at Go means we have some time before they’re writing better poems than us, before they’re making better investments than us, before they’re leading countries.
Well, in 2016, DeepMind revealed AlphaGo, and it was almost as if this canary in the coal mine that Go was to me, sort of deep in my subconscious, keeled over and died. That sort of activated me. I realized that for a long time, I’d said post tenure I would start working on AI. Then, with that, I realized that we couldn’t wait. I actually reached out to Nick Bostrom at the Future of Humanity Institute and began conversations and collaboration with them. It’s been exciting, and there’s lots of work to do that we’ve been busy with ever since.
Robert Wiblin: Let’s talk now about some concrete advice that we can offer to listeners who are really interested in these topics, that they’re still listening to us, and they’re thinking, “I might want to work on this kind of research with my career.” My impression is that it’s been quite hard to get people working on this problem, or especially to train them up. Because as you were saying, there isn’t a great pipeline because it’s a new area, so we’re having to figure things out as we go along, and there aren’t enough people working in the area to have a lot of mentors, a lot of teachers who can explain all of this to everyone. Given those challenges, what can people do if they want to get into the field?
Allan Dafoe: A first suggestion would be just read and expose yourself to the conversation. There are some pieces of work coming online, of course Superintelligence and Yudkowsky’s writings, and a lot of the work in the community as less formal publications, touches on many issues, but also increasing work and more academic work is starting to emerge. So, read that. Look at the EAG talks and others. Attend events.
Try to find a part of the problem. The research landscape is vast. If you can just find part of that that you have a comparative advantage in, that seems interesting, and that others agree would be useful for you to work on, then tackle that and feed that back into the community. That’s a good way to be useful right away, to learn more about the community, and then ultimately to prepare yourself to do other work that’s perhaps adjacent to that initial work.
Okay. Two, I would say, it actually may be possible to do a lot of this work within academia. We just don’t know exactly how yet. Economists are increasingly building problems out related to transformative AI as a legitimate topic of inquiry, so that’s good for economists. Political scientists, I think there’s a number of ways that you can situate the challenges from transformative AI into canonical problems.
You just need to learn how to pose AI as an element of dual-use technologies, or governance of emerging technologies, or other kinds of technologies with very strategic properties, or looking at historical analogs of cooperation, or national or regional industrial policy for leadership in advanced technologies.
In doing so, even though there aren’t many professors whose active research is on transformative AI, I think there are many that are sympathetic to it, especially if you can get the framing right enough that it fits within an existing scholarly vocabulary. Just at Oxford and at Yale, I’ve been impressed by the extent to which we can find sympathetic professors who would supervise students who can get the framing right.
Then, three, I would say I think we are identifying more and more professors who can do the supervision who do take these issues seriously and who could be a supervisor and really understand the importance of this new field and are willing to support it.
In summary, it is a challenge right now to figure out exactly how you can feed in. I do think we’re getting better. I think at the Future of Humanity Institute and in the community more broadly, we’re getting a better on-ramp for new people, and we’re able to allocate talent to different problems. Just do your best. Be patient and excited and just keep reading and thinking and try to enter the community in one way or another so that you can make sure you’re contributing as best you can.
Robert Wiblin: What does that on-ramp look like? Is there anything that exists that people might not be aware of?
Allan Dafoe: The on-ramp in part looks like sending an email to someone. That might be Carrick Flynn. If it’s appropriate for you in your career, apply for an internship or a longer-term position at the Future of Humanity Institute or at other locations as they emerge and the positions arise. There’s actually a lot more conventional funding sources and positions that are, I believe, compatible with also studying transformative AI.
You can think in cybersecurity … Cybersecurity is not that far away from transformative AI in the cyber domain, so you could go in that direction. Work on autonomous weapons would be another area, thinking about employment, and the welfare state and universal basic income and inequality would be … In many ways, you may not want to focus on AI. You may want to focus on the issue area that, for whatever reason, you have a comparative advantage in that AI is going to impinge on in the near or medium term.
Robert Wiblin: You think then you’ll be able to transfer into other issues as they come along, because you have the most relevant expertise?
Allan Dafoe: Yeah, transfer, or perhaps you’re the specialist in the area. I mean, we certainly need … Again, coming back to labor displacement, inequality, unemployment, these are such a large social challenge that I don’t think we’re going to saturate that field and solve that problem. As transformative AI really comes online, we will need that expertise to work closely with us and to provide frameworks for thinking about how do we redesign the social contracts, redistribution systems, cultures, senses of self-worth? What’s the future viable model of a, say, liberal country in a world where either people are being constantly knocked out of their occupation and need to retrain, or even just a large proportion of the population is systematically unemployed?
Robert Wiblin: Quite a few people have found it challenging to get into this field, but my impression is that it’s probably going to get easier over time, firstly because it’s going to develop, and the process for training people and putting them into good roles is going to become more mature. But also just the level of interest is rising so quickly that I expect the demand for people who know about this topic is going to outstrip the actual supply because we’ve been very constrained in our ability to train people up. Do you think that’s right?
Allan Dafoe: I do. Yeah, and that’s actually a good, maybe, fourth point. I lost my enumeration from before. I think there’s so much interest in artificial intelligence, from the industry side, from government, from philanthropy, from other areas. While in some ways, it’s a risky move to enter this space for someone who’s in a traditional discipline or field, in other ways it’s a safe move because there’s just such a great demand for expertise.
Robert Wiblin: We’ve spoken a lot about the questions and the problems and less about the solutions and answers this time, so I think hopefully we’ll get a chance to speak next year or the year after about how the research is going and what kinds of answers you’re getting. Do you generally feel like you’re making progress? Some listeners might think, ‘well this is all very important, but I just don’t really believe that we can answer these questions. It’s all going to be so speculative, and we don’t know what the world will look like in five or 10 or 20 years’ time when it becomes relevant, so maybe we shouldn’t bother even studying it at this point.’
Allan Dafoe: Yeah. Certainly, it’s hard to study the future. It’s important to be appropriately scientifically humble about our abilities to say things with confidence about the future. Whenever we’re making plans 10 years out, we need to have large error bars around the assumptions in our models and so forth. But I do think we can say productive things, as you mentioned. We didn’t cover that today, but we’ve just been starting in this work, and we’ve already glimpsed some productive contributions, insights, and policy recommendations.
A lot more work needs to be done on those before we are ready to share them and advocate for them, but the lack of progress in them so far is not for want of tractability. It’s not that the problems are hard or intractably hard so much as lack of time. We’re pulled in so many different directions trying to build the community, speaking at various events, that the amount of time we have left for research so far has not been sufficient to develop all these ideas. Just judging the promise of this area based on the number of ideas that no one’s written up yet but people have in our community, I think it’s a very promising area for more intellectual effort.
Robert Wiblin: For technical AI research, there’s the NIPS conference every year. Is there a conference now for AI strategy and AI policy?
Allan Dafoe: There is not, and again …
Robert Wiblin: Maybe that’s something that listeners could potentially help to organize.
Allan Dafoe: Absolutely. Yeah. I mean, there are events of various kinds that bring together different subsets of this community, and there’s at least one event on the horizon that looks like it could be a core event for this. But because AI strategy and governance touches on such a range of issues … I mean, there are events. There’s just not a single one. We just came back from the Partnership on AI inaugural meeting in Berlin, and there are a range of issues from, well, near-term challenges that are confronting society today with how to deploy algorithms in a way that represents our values and advances them rather than perhaps distorts or loses some concerns, to issues about safety-critical systems. That’s one institutional vehicle for these conversations. It has a lot of promise, but we’ll see to what extent it really flourishes. Others are just cropping up all over the place. There are OECD events, and the UN has a new center on AI and robotics. Governments are hosting conversations. There are academic conferences of various kinds, and then those in the EA community.
Robert Wiblin: For technical AI safety research, there’s Concrete Problems in AI Safety, which has become this canonical paper that people can read in order to get across the field and understand exactly what it looks like. Is there a similar canonical paper now for AI policy, laying out the issues in a very specific way and making them academically credible?
Allan Dafoe: There is not a single paper. There are some papers on AI policy focused more on near-term challenges, and we can link those, perhaps. Then, we are actively working on a paper, sketching the research landscape and then the nature of the problem. We often use the language of governance, the governance problem, but you can also think of that as AI policy from the perspective of extreme challenges. We are working on those papers, and they should be available at the time that this goes to air. If not, an interested listener can reach out to us for an early copy.
Robert Wiblin: I’ll try to put up links to everything that we’ve discussed in this episode in the blog post. If it’s out by that point, then I’ll link to it in the blog post when this goes out, and if not, whenever it is released. There’s a lot of specific questions I could ask you, like places to work or people to study, where … A list of questions in the area, but they’re actually pretty well covered in our AI policy career guide, which people can go take a look at. I think you’ve read that and put in some suggestions so-
Allan Dafoe: Of course.
Robert Wiblin: … we don’t so much have to rehash that here. One final question, if you were thinking of taking on a PhD student, what kinds of qualities would you be looking for? What would be the … Is there anything unusual that people can look for in themselves to tell if they’re a good fit to do this kind of research?
Allan Dafoe: I don’t know if there’s anything distinct that would distinguish them from other fields. I think the most important traits are smarts, intelligence, and that’s a multidimensional concept, of course. There’s different kinds of intelligence, and drive, whether that comes from passion or from conviction or work ethic. Those are the two most important traits. Then, there’s a number of skill sets and comparative advantages that you might look for, but as I mentioned, the scope for research is so large that it’s unlikely that that’s going to be the limiting consideration. Even people who are perhaps outside of the most useful set of backgrounds I think could retool within a year and produce useful contributions.
Robert Wiblin: Because we’ve talked so much here about the problems and less about the solutions, people might come away with a bit of a doom-and-gloom kind of perspective. Are you optimistic about our ability to solve these problems over the next 50 or a hundred years, or however long we have to do it?
Allan Dafoe: I think I am dispositionally optimistic, and one probably needs to be to work on this in the right mental state going forward. Rationally, it’s hard to form confident beliefs about the difficulty of surmounting this challenge. I am hopeful that humanity can overcome this challenge. In many ways, it’s our final test for our ability to cooperate, for our ability to build institutions, to represent our values. I think we often take for granted the current world. We think the world is the way it is, and there’s selection bias in what kinds of news we’re exposed to. It’s striking how many people think the world is worse today than it was 50 years ago or a hundred years ago.
On almost every metric, from Steven Pinker and others, it’s clear that the world is becoming more peaceful, more liberal, better in almost every metric, except perhaps for exposure to existential risk from advanced technologies. There’s no reason why we can’t continue that trend. We’re this close to really overcoming all the challenges and catastrophes that history has subjected us to, and the stakes are tremendous, that not only is the downside vast, but the upside is huge. The amount of wealth and scientific and intellectual insight that could come from advanced AI and just human happiness and flourishing is so vast, that I think if we can maintain that perspective that the gains from cooperation are tremendous, and the losses from failed cooperation are so vast, and that we’ve come so far as humanity.
Most countries today that feel like a nation, they were, in the past, not a nation. They were warring regions or warring tribes. It is possible for us to come together and construct an identity that overlooks the differences of the past. Here’s our final hurdle. Can we overlook the differences between different perspectives, different cultures, and recognize our common interests? Yeah, I guess I’m hopeful, maybe just for literary reasons, that it seems like the tempo of the narrative is going to a difficult but ultimately victorious ending. I think we can get there.
Robert Wiblin: My guest today has been Allan Dafoe. Thanks for coming on the 80,000 Hours podcast, Allan.
Allan Dafoe: Thanks. This was great.
Robert Wiblin: The 80,000 Hours Podcast is produced by Keiran Harris.
Thanks for joining – talk to you next week.