Transcript
Cold open [00:00:00]
Will MacAskill: You should really pause and reflect on the fact that many companies now are saying what we want to do is build AGI — AI that is as good as humans. OK, what does it look like? What does a good society look like when we have humans and we have trillions of AI beings going around? What’s the vision like? How do we coexist in an ethical and morally respectable way?
And it’s like there’s nothing. We’re careening towards this vision that is just a void, essentially. And it’s not like it’s trivial either. I am a moral philosopher: I have no clue what that good society looks like.
Who’s Will MacAskill? [00:00:46]
Rob Wiblin: Today I’m speaking with Will MacAskill, one of the founding figures of effective altruism, a philosopher at the University of Oxford, the author of What We Owe the Future, and now a researcher at the Forethought Centre for AI Strategy. Thanks so much for coming back on the show, Will.
Will MacAskill: I’m so glad to be back on.
Why Will now just works on AGI [00:01:02]
Rob Wiblin: It’s been a couple of years since you were on last, and since then you’ve really reoriented your work a whole lot. You’re now much more focused on AI-related issues. Can you explain what your research agenda is and why you did such a big reorientation of your priorities?
Will MacAskill: Sure. So the area I’m working in I’m currently calling “AGI preparedness.”
The idea here is taking really seriously the idea of having human-level artificial intelligence in the near term — even in the next few years or five years — and taking seriously the sheer scale of the impact that might have, in particular the idea of extremely rapid technological advancement. And then trying to figure out what are all the challenges that that might bring, and are there any that are currently neglected but that are in the same ballpark of importance as (for example) the risk of AI takeover, or the risk of global catastrophe from new engineered viruses?
There are two ways of doing that. One is by really reasoning through and modelling the dynamics of what you will get once we have AI that can perform as well as a human at some crucial tasks: what are all the changes you would expect over the decade following that, and then seeing what arises from that.
A second is by thinking about a post-AGI future that we feel really good about, and what would that look like — and then reasoning backwards from there to think, what are the ways we could miss out on that really good outcome?
And you asked about why I am focusing on this now. Well, one is just the crazy advances and progress we’ve seen in AI.
Rob Wiblin: You don’t say.
Will MacAskill: I’ve really changed my view on AI timelines. So I take really quite seriously the idea of human-level artificial intelligence even in the next three to five years, and take even much more seriously, for somewhat related reasons, the idea of an intelligence explosion — a very, very rapid increase in AI capabilities and resulting technological change following soon after from that. So that means AI is the really big thing to focus on.
Then in terms of why this particular area, one thing is just seeing how impactful early foundational work on misalignment existential risk was. I was part of the seminars for Nick Bostrom’s book Superintelligence back in 2012, 2013. And at the time I found them kind of frustrating, because I thought it was just kind of pie-in-the-sky; it wasn’t going to have much of an impact. And I’ve really been proved wrong there; it’s just actually been enormously influential in the conversation that now happens. Most people don’t even realise where the ideas come from. And potentially we could do the same thing for some other areas.
And finally, this is just something which I think plays to my comparative advantage. Often the sort of research you need to do is kind of messy, it’s confusing, it’s quite big picture — something where a background in ethics and philosophy is quite helpful.
Will was wrong(ish) on AI timelines and hinge of history [00:04:10]
Rob Wiblin: Yeah. I hate to raise your previous positions, Will, but I think five years ago you wrote this article casting doubt or expressing your scepticism towards the idea that we live at a particularly important, crucial time in history. People call this the “hinge of history” debate. Do you want to explain the argument that you were making, or the arguments that you found persuasive back then for scepticism about that? And what’s your view on all of that now?
Will MacAskill: Sure. I’ll give a kind of high-level answer, which is that there are two ways of reasoning about the world, which sometimes get called “outside view” and “inside view.” These are horrible terms, I think. But the more outside view perspective is just saying that we could live at any time. It would be so unlikely that we were at this crucial, very important moment in history.
And then you’re saying also that there’s going to be this very rapid technological change. Well, look at the trend over the last 100 years: it’s actually been very stable in terms of rates of progress. We should just kind of expect these broad trends to continue. And this is actually the way that superforecasters tend to reason, so I’ll be very upfront that lots of the views I now have are not shared by superforecasters, in fact.
Then there’s this alternative perspective, where you’re getting more of a gears-level understanding of the world. And often that does involve trend extrapolation, but sometimes extrapolating the fundamental trends means that the higher-level trends are very different.
So we saw this in early COVID, where a good gears-level understanding of the situation in January and February of 2020 involved taking seriously the exponential growth in the coronavirus. And taking that seriously, you don’t have to be a genius to realise that’s going to dramatically change the world. But the thing that’s weird, and I think hard to deal with, is that that actually means that the larger trends of how the world operates will have to change.
And since 2017 when I wrote that, I’ve kind of informally just been testing — I wish I had done it more formally now — but informally just seeing which of these two perspectives is making the better predictions about the world. And I do just think that the inside view perspective, in particular from a certain number of people within this kind of community, has consistently had the right answer.
So coronavirus was one, and also predictions of when the vaccine would be developed. Trends in solar power, I think, have been another success. And then finally, scaling and AI progress as well. So I’m seeing all of these cases where the more inside view reasoning is making empirical bets on the world — and often people have been putting their money where their mouths are as well and getting far beyond market returns as a result. And the whole aggregate of that suggests to me that this more inside view reasoning is just really working out.
Rob Wiblin: Yeah. And I guess we’re not necessarily so terrible at it. The reason why you often want to do outside view reasoning is that you worry that you’re going to fool yourself and get tricked by arguments that sound good, but in fact are not going to pan out.
Will MacAskill: Yeah, exactly.
Rob Wiblin: I guess we’re just finding that often the people making these inside view arguments about trends in technology have a reasonable track record. It often pans out.
Will MacAskill: Yeah. And then the second thing is the quality of the evidence about AI now. We are just in this radically different position epistemically compared to 2017, in my view, where we have models that are doing pretty well on benchmarks designed precisely to test how well AI can do at machine learning research, which is the crucial ability.
So it really looks like the point when AI can in fact automate AI R&D is in sight. I would say that’s not the end, but really the beginning of the next stage.
We also have the now well documented, incredibly fast-paced trends in scaling of compute, trends in algorithmic efficiency.
And then finally we have some good arguments for thinking that in the next five years in particular, and certainly the next five to 10, we’re going to make an unusually large amount of progress towards AGI compared to any other decade, simply because we’re scaling up investment in AI so much.

So actually there’s this very concrete argument that the next five or 10 years really are a special time, because it’s a time when we’re going from AI being this tiny fraction of the overall economy to being quite a large fraction. And that’s a one-time change.
A century of history crammed into a decade [00:08:59]
Rob Wiblin: All right, there’s a couple of things you raised there that we’re going to return to and flesh out later. But let’s just dive into this paper, “Preparing for the intelligence explosion,” which lays out the priorities and the vision for the Forethought Centre for AI Strategy. What were you trying to do with that paper?
Will MacAskill: So this paper is taking very seriously the idea of AI driving very rapid technological change. And the bar we put for that in this paper is the idea that AI will drive at least a century’s worth of technological development in just a decade.
And I encourage listeners to actually just do this themselves: pause the podcast and think, what would you expect the world to look like in 2125, technologically speaking? And then imagine all of those changes happening over the course of just a decade, so that that world arrives by 2035. And thinking, OK, there’s going to be a lot going on, there’s going to be a lot to deal with there.
Rob Wiblin: Things feel a bit hectic this week, but I guess it could get much more so.
Will MacAskill: Yeah, things feel hectic already. I already can’t keep up with the pace of developments in AI. I can’t read a hundredth of the stuff that I would like to read. That’s going to get much more intense.
So the paper is making the case for that, trying to make it vivid, then trying to somewhat systematically go through the different challenges that we’ll face — and it’s a bit of a litany there. And then finally, addressing the idea that for most of these we can just punt the problem to the time when we have aligned superintelligence and then let that figure that out. And I think that’s true in many cases, but not in all. And there’s discussion of that too.
Rob Wiblin: All right, let’s dive into that analogy of the century in a decade a little bit more. You have this vivid illustration of just how much stuff would get crammed into how little time. Can you paint a picture for us?
Will MacAskill: Sure. So we’re thinking about 100 years of progress happening in less than 10. One way to get a sense of just how intense that would be is imagine if that had happened in the past. So imagine if in 1925 we’d gotten a century’s worth of tech progress in 10 years. We should think about all the things that happened between 1925 and 2025 — including satellites, biological and chemical weapons, the atomic bomb, the hydrogen bomb, the scale-up of those nuclear stockpiles. We should think about conceptual developments: game theory, social science, the modern scientific method; things like computers, the internet, AI itself.
Rob Wiblin: The decolonisation movement.
Will MacAskill: Then of course, yeah, social and political movements as well: decolonisation, second- and third-wave feminism, fascism, totalitarianism, totalitarian communism…
Rob Wiblin: Yeah, the rise and fall of communism.
Will MacAskill: Exactly, yeah. Postmodernism. So all of these things are happening over the course of 10 years in this thought experiment.
Human decision making and human institutions don’t speed up though. So just taking the case of nuclear weapons: in this accelerated timeline, there’s a three-month gap between the start of the Manhattan Project and the dropping of the atomic bomb on Hiroshima. The Cuban Missile Crisis lasts a little over a day. There’s a close nuclear call every single year.
This clearly would pose an enormous challenge to institutions and human decision making. And Robert Kennedy, Sr., who was a crucial part of the Cuban Missile Crisis, actually made the comment that if they’d had to make decisions on a much more accelerated timeline — like 24 hours rather than the 13 days they had — they probably would have taken much more aggressive, much riskier actions than they in fact did.
So this thought experiment is to try to really get a sense of just the sheer amount of change, including multiple different sorts of change. And we are talking about a century in a decade; I actually think that the amount of technological development we might get might be much larger again: we might be thinking about many centuries, or even 1,000 years in a decade.
And then if you think about the thought experiment there, it’s like you’ve got a mediaeval king who is now trying to upgrade from bows and arrows to atomic weapons in order to deal with this wholly novel ideological threat from this country he’s not even heard of before, while still grappling with the fact that his god doesn’t exist and he descended from monkeys.
Rob Wiblin: Which they found out a couple of months ago.
Will MacAskill: Which they found out like two months ago, exactly. Clearly the sheer rate of change poses this enormous challenge.
Rob Wiblin: Yeah. So I think people might be thinking this would all play out completely differently, because some stuff can be sped up incredibly quickly as AI gets faster, and other things can’t.
And that is exactly your point: that there’ll be some stuff that will rush ahead — I suppose we could imagine that there could be a real flourishing of progress in pure mathematics very soon as a result of AI — while things in the physical world might happen a little bit more slowly. Some biological work that requires experiments might progress more slowly. I guess anything to do with human institutions and human decision making, that slows down in relative terms.
And that is exactly the point: because some stuff slows down relative to other things, we end up with whatever problems result from having a shortfall of the things that we couldn’t speed up. Is that basically the issue?
Will MacAskill: Yeah, exactly. So if it was literally the case that time speeds up by a factor of 10 —
Rob Wiblin: It’s not even clear exactly what that would mean.
Will MacAskill: It’s actually a philosophical question whether that’s even possible, as in whether it would be a meaningful difference. But instead, what is happening is that only some areas speed up. Even in the century-in-a-decade scenario, it’s not going to be exactly the same tech progress, because, like you say, some areas of science are slowed by regulation, by slow physical experiments, or just by the need to build capital. Building the Large Hadron Collider, for example, takes a lot of time.
But the crucial thing is that human reasoning, human decision making, and human institutions don’t speed up to match the pace of technological development. In fact, a different way you could think of the thought experiment is imagine if the last 100 years had happened in terms of tech development, but humans just thought 10 times slower, or they were only awake an hour and a half per day for some of these changes — because a speedup in one way is the same as a slowdown in another.
Science goes super fast; our institutions don’t keep up [00:15:41]
Rob Wiblin: So I think this helps to bring out the intuition of, holy shit, there would be so much chaos, so much stuff going on, and we wouldn’t have the ability to process it or think about our decisions very well.
But to flesh that out a little bit more, what do you think are the things that would go the slowest relative to other stuff, that would create the biggest bottlenecks and the biggest problems?
Will MacAskill: I think areas that are particularly regulated. So we’ll talk a bunch in this interview, I’m sure, about AI-assisted advice. Because one natural response to this is: well, if the AI is now thinking so fast, and we’ve got so much AI labour and it’s very high quality, can’t we just use that to solve a lot of our problems?
And that might well happen in the private sector. So you might quite quickly move from human-run companies to AI-run companies. But I, by default, expect it to be much slower in the government, for example, where there’s more concerns about data privacy or even procurement.
Rob Wiblin: Or the legitimacy of basically handing over direct political power to AI models.
Will MacAskill: Exactly. What’s the point in time when you’ll be able to vote for Claude for president or something?
Rob Wiblin: I think at the moment I might do that. But a lot of people wouldn’t.
Will MacAskill: Look, I’m not saying that’s a bad idea…
Rob Wiblin: Just saying it might not happen.
Will MacAskill: I’m just saying it’s unlikely to happen, and unlikely to happen in time for this quite transformative decade.
Rob Wiblin: How much do you think we could speed things up? I mean, if we were going through this kind of momentously chaotic time and everyone just acknowledged that it was completely freaking us out, could we just run Congress 24 hours a day, every day in a year, and have people cycling in and out just discussing all of these things? Is it imaginable that they would do that?
Will MacAskill: It’s an interesting thought. The idea of having a 24/7 Congress or Parliament is interesting. It’s certainly possible that it would increase the amount of time politicians are working on things, maintaining a quorum by carefully splitting up who’s in session. I do think it’s unlikely that we’re going to get anything close to what we would need via that sort of method.
Rob Wiblin: I guess even getting agreement to do that is the kind of thing that would be slowed down and wouldn’t happen fast enough.
Will MacAskill: Exactly. Probably, yeah. You can imagine certainly that most labour, or certainly a significant fraction of labour, is now obsolete. So we very quickly move to just having human beings just largely outsourcing things to AI but being in management positions essentially — just constantly spot-checking their work and so on. And perhaps the size of government or at least the size of political decision making increases tenfold, a hundredfold. Again, I just don’t think it is in fact going to happen.
But this sort of thing — trying to increase the extent to which people can nimbly start relying on AI advice — is actually one of the things I think is most important, and it kind of pops out of this research.
Rob Wiblin: Yeah, so I said that at a conceptual level what’s happening is some things are speeding up and other things are becoming relatively slower, so we end up with whatever problems come from not having enough of the slow things. Is there anything more going on conceptually than that, or does that basically just explain all of the problems that result?
Will MacAskill: I think it basically does sum it up. The thing I’ll add is just that people don’t know this is coming.
Rob Wiblin: So we’re not preparing now. There are merely token efforts to prepare for this.
Will MacAskill: Yeah, exactly. I think the world would be acting very differently if it was taking seriously the idea of a potential, even just a significant chance of an intelligence explosion in the near term. And honestly, I’d feel just a lot better about the world if that were the case.
Rob Wiblin: Yeah. Are there any important disanalogies to keep in mind between this thought experiment of the century or four centuries in a decade, and the situation that you think we’re actually likely to find ourselves in soon?
Will MacAskill: The biggest thing, which we’ve touched on, is the difference between areas that face certain bottlenecks, like regulatory or physical bottlenecks, and areas that don’t.

So take areas of science like mathematics and computer science. Philosophers have this term “a priori”: that’s knowledge that you can get just from the armchair. Though in this case, that includes running simulations and looking at existing data.

So I think in certain areas of computational biology and materials science and a lot of social science, you could get very fast progress; but in other areas, much slower progress — like we still might not know what constitutes a good diet.
Rob Wiblin: It’s a very thorny empirical question.
Will MacAskill: Thorny empirical question. If it’s like, eating fat versus protein: how does that affect your life expectancy? An ideal trial on this would take 70 years because we’d be seeing how diet early on in life affects life expectancy later on in life.
And then some other areas are highly regulated. So new drugs, for example, perhaps AI invents hundreds, thousands of amazing new drugs. But then you’ve still got to do the human trials. You’ve still got to get them through the regulatory process as well. And that could act as a major bottleneck. And that doesn’t have any immediate upshots, but it’s worth bearing in mind throughout that that’s the most important disanalogy.
Is it good or bad for intellectual progress to 10x? [00:21:03]
Rob Wiblin: Yeah. So a lot of people I know, and I guess a lot of listeners to this show, are trying to speed up science and technology. I guess the progress studies crowd, but also just common sense suggests that if we want more economic growth, we want to advance technology more quickly. Should people with that mentality just be cheering this? Because this is kind of what they’re trying to create, just at a really extreme level.
Will MacAskill: I’m actually not claiming that this is good or bad relative to slower rates of progress. I will say, personally, I expect it to be too fast. And actually I do recommend slowing the intelligence explosion. But you could have the view that this is a great thing, but that nonetheless it poses significant challenges and we should address those challenges — because it is, in fact, what’s going to happen.
That said, the progress studies folks… I’ll talk about economic growth. I don’t talk about it in that paper, but it’s what they talk about. They’re gunning for rates of economic growth on the order of like 5%, 6% per year, because that would just be so much better than what we’ve had recently, which has been generally much slower. They’re not gunning for rates of economic growth that are like 50% or 100% — or even, if you take some of the arguments seriously, which I think you should, it could get actually much faster than that again.
So I actually think if you ask those people what would be the optimal pace, maybe they would say 10% per year or something. That’s actually kind of what I think. That seems like a good rate of growth to me. You’re very rapidly making people’s living standards much better.
Rob Wiblin: It’s like the rate of development that China experienced at its highest level.
Will MacAskill: Yeah, exactly. So we’re also familiar with it; we know that our economy can accommodate that. But when you’re going radically faster than that again, then you’ve just got this challenge of enormous amounts to deal with, and just not enough time to think things through.
An intelligence explosion is not just plausible but likely [00:22:53]
Rob Wiblin: Quite a lot of listeners might listen to this and think, “Maybe if science and technology really were going to speed up 10x all of a sudden, I would be worried, but I don’t really buy this.”
This isn’t going to be the main focus of the interview, making the case for the intelligence explosion. We’ve somewhat covered that in our first interview with Tom Davidson, which was two years ago now, where he explains why he believes that AI could speed up economic growth enormously. Also the episode with Carl Shulman about nine months ago now: Carl Shulman on the economy and national security after AGI. So people can go back and listen to those if they want a more thorough treatment.
But I do want you to just briefly make the case for why you think it’s actually plausible that AGI and the resulting changes could speed up advances in intellectual development 10x or more.
Will MacAskill: Yeah, absolutely. So why are we making intellectual, scientific, technological progress at the moment? Well, we’re trying to do so: we’re putting researchers and to some extent physical capital towards the task, and we’re increasing the amount that we’re doing that each year by about 4% or 5%. So it’s relatively slow growth.
At some point, we’re going to get AI that can do those tasks for us instead. In particular the intellectual part of that, which is the large majority of research effort: it mainly comes from people thinking, rather than from the things that machines are doing.
Rob Wiblin: I guess some people might disagree with that. Some people would say it’s the difficult sweat operations of actually doing pipetting and doing work in the physical world that leads to a lot of technological progress, rather than sitting in the armchair theorising about stuff.
Will MacAskill: OK. When I think about science and R&D, there are three things to say. One is just that that actually wouldn’t fit with my impression. But also, I think it’s the Bloom “Are Ideas Getting Harder to Find?” paper that estimates that the labour share versus capital share is something like 80% to 20%. So it’s still mainly human minds doing R&D. Though capital is still a reasonable share.

And then the final point is just that, again, this is going to vary from field to field. Some areas — like, say, mathematics and philosophy, potentially; fingers crossed I’m out of a job — and other things you can do from the armchair could go extremely fast, while other things might be bottlenecked.

But when we think about how much technological advancement has happened, we don’t think of that as bottlenecked by the slowest-moving field, necessarily. Instead it’s the sum total across different fields, perhaps weighted by importance.
But anyway, research effort is growing by 4% or 5% per year. At some point, AI will be able to do almost all, or all, of the cognitive jobs that a research scientist can do. But then it will keep progressing past that point. And we don’t even need to talk about improvements in capability. Even if you don’t think that AI will surpass human intelligence, it’s just that we’ll have more labour.

And actually, even if you also assume that you don’t get these recursive improvement dynamics, where AI starts designing better AI, which designs better AI and so on, and even if you think that the current trends that are driving AI getting better slow down dramatically — even with all of that, the growth rate in effective computation, namely the quality-adjusted amount of AI labour we have, would be much more than 10 times the growth rate of effective researchers today. And that would motivate a more than 10x increase in scientific and technological advancement.
Rob Wiblin: And that’s just because we can manufacture lots of computer chips that then would become basically brilliant scientific researchers, whereas we can’t just manufacture more science PhDs?
Will MacAskill: It’s for that reason, but also because algorithmic efficiency is increasing so fast at the moment as well. There are efficiency improvements in pre-training and post-training — each of these seems to be around about 3x per year. So putting them together, you get something like a tenfold improvement in algorithmic efficiency per year. Compare that to the 5% increase per year in effective research effort from human growth alone.
So once you get to the point where AI labour is most of the labour going towards research and development, then quite quickly it’s the growth rate of the AI labour that’s driving the growth rate of the overall effective research effort that we’re putting into R&D.
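To make the compounding concrete, here is a minimal sketch using the rough figures Will just mentioned (roughly 3x yearly gains in pre-training and in post-training efficiency, versus roughly 5% yearly growth in human research effort). The exact rates are illustrative assumptions, not forecasts.

```python
# A minimal compounding comparison using the rough figures from the conversation.
# The rates below are illustrative assumptions, not forecasts.

years = 5
human_growth = 1.05        # ~5% per year growth in effective human research effort
pretrain_gain = 3.0        # ~3x per year algorithmic efficiency gains in pre-training
posttrain_gain = 3.0       # ~3x per year gains in post-training
ai_growth = pretrain_gain * posttrain_gain   # roughly a 10x per year gain overall

print(f"Human research effort after {years} years: {human_growth ** years:.2f}x")
print(f"Effective AI labour after {years} years:   {ai_growth ** years:,.0f}x")
```

Over five years that is roughly a 1.3x increase in effective human research effort versus tens of thousands of times more effective AI labour, which is why the AI growth rate quickly dominates once AI is doing most of the research.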
Rob Wiblin: And what’s the relationship between the number of effective researchers that you have and the actual rate of scientific progress? You get declining returns, I imagine. People have noted that we have like 10 or 100 times as many researchers as we did 200 years ago, and it doesn’t feel like we’re making 10 or 100 times as much progress. Far from it.
Will MacAskill: That’s right. There’s declining returns, both over time — so each year of science — and also a stepping-on-toes effect, where it’s perhaps better to have one scientist working for 10 years than 10 scientists all working in parallel for one year.
However, remember that that fact of diminishing returns is already taken into account when we’re looking ahead to 2125 and thinking about what sort of science and tech we will have then. And in fact, without AI, I really don’t think it would be as impressive as the leap from 1925 to 2025. But what we’re saying is just whatever you think 2125 would be, that will happen in only 10 years rather than 100 years. And as I say, it might in fact be much faster, grander than that.
Intellectual advances outside technology are similarly important [00:28:57]
Rob Wiblin: It sounds like you think the argument for this sort of massive speedup in science and technology… I guess maybe we should be a bit careful to say it’s not actually only science and technology; it’s intellectual progress generally.
Let’s pause on that for a minute: What are some non-science and non-technology things that you think could advance very quickly, that could be quite revolutionary, or at least upend things in society in some way?
Will MacAskill: Sure. I actually think, from going through this thought experiment and working through the dynamics of what you get with AI, this is one of the things that I hadn’t appreciated as much until I did this exercise.
So look back through the past few centuries or 1,000 years: many of the developments that upend society are intellectual — ideas like communism or fascism or atheism or the idea of universal human rights or feminism and so on. We should expect loads more of them. That is a priori work — that’s like sitting in an armchair kind of generating arguments. It’s true the diffusion of the ideas will be slower, and perhaps so slow that the effect is greatly mitigated. But I do think, just imagine some of the really big, groundbreaking —
Rob Wiblin: I guess we’ve had this whole culture war over the last 20 years or so — people who are more social justice oriented and then a big reaction against that — if that played out in one year rather than 10 years.
Will MacAskill: Yeah, exactly. But also potentially ideas like, maybe we’re in a simulation, and the arguments are just extremely good. And you’ve got these superintelligences: they’re great at all of these other domains, they’re clearly our epistemic superiors, and there are loads of them. That would be potentially quite disruptive.
Or perhaps just extremely strong arguments against the idea of there being an objective moral truth. Perhaps that’s something that people in fact don’t really internalise. And now you’ve got these AI advisors, just being like, “What are you doing? You have these goals and you keep not acting on them because of this false belief that you have.”
They call these “disruptive truths.” It’s obviously just extremely hard to predict what they would be. I don’t think people anticipated ideas like atheism or abolitionism or communism before they became prevalent.
But nonetheless, it actually shows you that this is something we should be willing to nimbly adapt to. So we should be expecting to change our minds a lot, and we should be trying to build institutions that are able to change their minds quite quickly too.
Counterarguments to intelligence explosion [00:31:31]
Rob Wiblin: OK, so it sounds like you think the arguments in favour of this big speedup in intellectual progress are very robust, that it’s actually quite hard to get around the conclusion that we should expect a significant uptick in progress.
Are there any good arguments against this? Is there any way that we can rescue the idea that things might only speed up a little or not at all?
Will MacAskill: I think the strongest would be essentially thinking that it’s really very far away that we get to a true human-level scientist AI. Perhaps you’ve got AI that’s extremely good in some areas. Mathematics really looks like that’s just going to happen.
Rob Wiblin: It’s basically already happening.
Will MacAskill: Yeah. Terence Tao is saying, “I think we’ve still got three years.” But I don’t actually know how big a deal knowing all of mathematics would be in terms of world changes. Probably it’s pretty big, but it’s maybe not as big as the invention of the atomic bomb or something.
But perhaps in general for scientific advancement, there’s still human bottlenecks, and that really slows things down. And perhaps that’s going quite far into the future. So that would be a more technological bottleneck.
A second could just be social. I don’t think people, the world as a whole, wants things to accelerate as fast as I would expect they will. And so if the world kind of coordinates, then there might be deliberate efforts to slow things down. Perhaps not via some grand moratorium on AI; perhaps it’s just like 1,000 small cuts — environmental restrictions and so on.
And you might think that seems totally crazy, because the world is this global anarchy, competitive forces. And sadly, the world looks more like that than it did even a decade ago. Nonetheless, there is a lot of global action that just emerges kind of naturally.
So in China, He Jiankui created twins that he had genetically engineered to be resistant to HIV. Global outcry. Were it not for that, probably China would be fine with doing genetic engineering on humans. So instead he goes to jail. There’s a moratorium on it.
Similarly, we could have cloned humans for quite some time now, and again we’ve decided not to do it.
Similarly, responses to COVID: there were some differences, but they were generally pretty similar. You couldn’t buy a vaccine on the free market anywhere in the world. There weren’t human challenge trials anywhere in the world.
So there does seem to be this dynamic where you have the kind of elites of the world running different countries, and in general, they tend to do things that are broadly similar.
That, in my view, is the strongest argument for thinking there would be a slowdown. I think that might push things into the future. That’s the most plausible way in which things come later than I might expect. But the thing is, you’ll still be building up more and more pressure, like water piling up against a dam.
Rob Wiblin: As it gets easier and easier to speed things up more and more, as the underlying technology advances.
Will MacAskill: Exactly. And then eventually you say, “I could just build this self-replicating factory that doubles every month.”
Rob Wiblin: And that’s accessible to many countries.
Will MacAskill: That’s accessible to many countries. It’s hard to kind of stem the tide there.
Rob Wiblin: Yeah. Another case actually is people have tried to set up sports competitions, like an alternative Olympics, where you’d be allowed to use performance-enhancing drugs — which I think might attract many spectators, but it has never managed to get off the ground. I guess because people find it distasteful. It just feels like there’s a kind of global disdain for that idea.
Will MacAskill: Yeah, exactly. We don’t have a global government, but we do kind of have a global culture.
Rob Wiblin: I think something that makes me wonder if we could actually see some reasonable delay is just looking at public opinion polling on this. The typical person really is quite nervous about AI if you just ask them, “Are you worried? Would you like to slow things down? Should we even ban smarter-than-human AI?” A lot of people are already on board with that, at least in the US and the UK; I don’t know what people would say elsewhere.
But at the point where this actually starts to happen, maybe that would begin to bite, and you would just have almost a consensus, or a supermajority of voters in favour of banning a lot of these things. I don’t know whether that would actually function. And it’s really hard to know how robust those opinions are, because it’s not as if the typical voter has thought deeply about the question of superintelligence.
Will MacAskill: For sure. But yeah, that’s a plausible way in which you could get that across multiple countries as well. Because people don’t want to lose their jobs, as that’s a source of meaning and social status for them; people are generally just really quite cautious about technology.
An additional dynamic that could occur is, if things are kind of spread out, then you could have AI making people really quite a lot richer even than they are today. And that makes people even more cautious. Over the past many decades, the richer people have gotten, the larger fraction of their income they spend on extending their life, which makes total sense.
Rob Wiblin: The better off you are, the more you should worry about what you have to lose.
Will MacAskill: Yeah, and also the less you have to gain. Actually, if you look at standard estimates of people’s diminishing marginal utility of money, the average American on $60,000 per year is two-thirds of the way to their best possible life per year.
Rob Wiblin: Given the technology that we have now.
Will MacAskill: Yeah. Or given the wealth that they have now. And if they were on $2 million a year, they’re more than 90% of the way. Oh sorry, you’re right: given our current technology. Future technology could change that.
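For readers curious how estimates like these can be constructed, here is a minimal sketch. It assumes isoelastic utility with curvature eta = 1.5 and an illustrative baseline income of about $6,700 per year; both parameters are assumptions chosen for illustration, not the figures behind the estimates Will is citing.

```python
# Sketch of a diminishing-marginal-utility calculation. The curvature (eta = 1.5) and
# the baseline income (~$6,700/year) are illustrative assumptions, not the actual
# parameters behind the estimates quoted in the conversation.

def fraction_of_best_possible(income, baseline=6_700, eta=1.5):
    """Fraction of the gap between baseline wellbeing and the utility ceiling.

    With eta > 1, isoelastic utility u(c) = (c**(1 - eta) - 1) / (1 - eta) is bounded
    above, so "best possible" is well defined; the fraction of the gap closed at a
    given income simplifies to 1 - (baseline / income) ** (eta - 1).
    """
    return 1 - (baseline / income) ** (eta - 1)

for income in (60_000, 2_000_000):
    print(f"${income:,}/year -> {fraction_of_best_possible(income):.0%} of the way")
```

Under those particular assumptions the calculation lands in the same ballpark as the figures quoted: roughly two-thirds of the way at $60,000 per year and over 90% at $2 million per year.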
Rob Wiblin: Yeah. OK, I’ve got to keep resisting the temptation to dig into these things more, because it’s a bit more of an overview episode.
The three types of intelligence explosion (software, technological, industrial) [00:37:29]
Rob Wiblin: In the paper you lay out the three different types of intelligence explosion that can occur. So there’s a software intelligence explosion, then there’s an intellectual one, and then an industrial one. Is that right?
Will MacAskill: Technological and then industrial.
Rob Wiblin: Yeah. So maybe I need to have this explained again. Can you explain the three different types of intelligence explosion? Because they each create different dynamics, and they occur at different stages.
Will MacAskill: Sure. So there’s this separate paper written with Tom Davidson and Rose Hadshar, which is describing these different intelligence explosions.
Previously I was arguing that even if you didn’t have this recursive improvement, just the sheer rate of progress at the moment would be enough to drive forward a century in a decade. But in fact, I think we will have this recursive improvement, and that will speed things up even more.
The first is a software feedback loop, where AI systems get really good at designing better algorithms, and so they make better algorithms, which means you can make even better AI that helps you make even better algorithms, and so on. So that’s a software feedback loop.
Second is a chip quality or technological feedback loop, where AI gets really good at chip design or the other aspects of just making higher quality chips, where you get more computational power per dollar.
And then the third is the industrial explosion and the chip production feedback loop. That’s where you now have AI and robotics, and rather than needing human labour to produce more chips, and in fact to just produce more goods in general, instead you now have AIs controlling robots — such that you can have wholly autonomous factories producing goods, including chips and so on. And that would be another kind of feedback loop where just the more computational power you have, the more AIs you have, and the higher quality AIs you have.
Rob Wiblin: OK, so we have the software intelligence explosion first, and we expect that to come first, because it seems like we’re already on the cusp of it. And it’s also probably where you would invest your effort first, because whenever you improve the algorithms, you can immediately roll it out across the entire world, across all of the software. Just like with any other software update, we can immediately deploy that globally.
Then you have the research into improvements in the chips themselves. That potentially could speed up or could provide an enormous amount more computational power. But of course you actually then do have to manufacture the chips, so it takes a little bit longer.
And then you just have the enormous industrial scaleup, where now we have a pretty close to optimal chip design — or it’s improving a lot, but still it’s way better than what we have now — and now we just need to make an enormous number of chips to massively scale up the effective AI population. I guess that could happen extremely quickly itself, because by that point we’d have much better manufacturing technology, much better robots and so on.
Will MacAskill: Yes, that’s right.
The industrial intelligence explosion is the most certain and enduring [00:40:23]
Rob Wiblin: I wanted you to lay this out because I want people to have in mind the idea of the software intelligence explosion, because it really does feel like that could kick off. Maybe it’s already kind of kicking off, or it certainly might within a couple of years. And that could lead to many orders of magnitude improvement in these algorithms and in the thinking of the AIs.
But I think you wanted to lay this out because you think people are underestimating the third one, the industrial intelligence explosion. Can you explain why?
Will MacAskill: Sure. So this comes with a caveat: I do think that more attention should be on the software intelligence explosion than the industrial one. But the context is that I find most people think only about the former.
Now, my colleague — cofounder of the Forethought Centre, Tom Davidson — has, in my view, done the most in-depth modelling of the dynamics of the intelligence explosion, and he’s actually only 50/50 that you get this recursive feedback from software. Because suppose you double cognitive inputs, but you only get 1.5x the outputs (just 50% more): then you’ll get this little wave of advance, a kind of step forward once you have AI that automates AI R&D itself, but that will then fizzle out.
He also thinks that even if you do get a period of recursive improvement, it doesn’t take you all the way from where you would be at the time to theoretical limits of algorithmic efficiency, which is really far: it’s maybe as much as nine orders of magnitude improvement. In fact, he thinks maybe you get three orders of magnitude improvement in algorithmic efficiency, which would be like the jump from GPT-3 to GPT-4.
Obviously he has quite a wide range. The precise numbers aren’t that important. I think you could very reasonably be more aggressive than this. But one thing is just that I do think it’s totally on the table that you don’t get much in the way of a software feedback loop. So you get this leap forward, maybe you get a few years’ jump forward in a year, but that’s not enough to be this kind of, everything’s changed now, everything’s off the table.
Rob Wiblin: And the intuition for that would be that the AIs are getting smarter, and so they’re able to make more progress on these difficult questions of how to make AI better. But that challenge is just getting so much more difficult so much more quickly that you don’t get a feedback loop kicking off where things get faster and faster. Instead it just peters out basically.
Will MacAskill: Exactly. In particular because they’re bottlenecked by the amount of compute they have access to: that’s the key thing. Which leads us into the industrial explosion.
So the argument for getting accelerating growth in industry once you have wholly autonomous industry — AI and robots just doing everything — that’s really strong in my view. Because you’ve got this factory producing goods. Now, how many goods do you have if you have two factories? It seems like twice as many. Standard economic view.
But if one of the things that those factories are producing are AIs that are helping to improve the efficiency of the next generation of factories and so on, then even if that’s helping just a little bit, making it just a little bit more efficient on top of the doubling you get just from replicating the factories, then you’ll have accelerating growth.
Is that not clear?
Rob Wiblin: Well, I guess you definitely would have accelerating growth in the amount of factories. Does that necessarily lead to accelerating growth in the amount of intelligence or the effective amount of intelligence? Because I guess you could have declining returns on the impact of all of those factories.
Will MacAskill: It could be declining returns on the impact, depending on how you’re thinking about that. I’m not, for example, talking about impacts on GDP. In all this discussion, we’re being careful to kind of sideline that.
Rob Wiblin: Because it’s not really going to be able to measure what we care about.
Will MacAskill: Exactly, yeah. There are a lot of objections you can have there relating to Baumol’s cost disease — perhaps all of GDP goes to live musicians or something — and then we’re just getting distracted by measurement issues, basically. But what we really care about is industrial power.
But you would have twice as many AIs who are somewhat smarter, so in that sense you’ve got more than a doubling, if that makes sense.
Rob Wiblin: So it’s a doubling of effective population.
Will MacAskill: Exactly. Because here I’m thinking about both population and quality, as in the intellectual capability of the AIs.
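As a rough illustration of why even a small feedback makes the growth rate accelerate, here is a sketch (my own toy, not the model from the paper): fully autonomous factories double by replication each period, and the AIs they produce make the next generation of factories slightly more productive.

```python
# Toy illustration of the industrial feedback loop (an illustrative sketch, not the
# paper's model). Factories double by replication each period, and the AIs they produce
# also make the next generation of factories a bit more productive, so the growth
# factor itself rises each period: accelerating, super-exponential growth.

def industrial_growth(periods=10, efficiency_gain=0.10):
    output, efficiency, history = 1.0, 1.0, []
    for _ in range(periods):
        efficiency *= 1 + efficiency_gain    # smarter AIs improve the next generation
        output = output * 2 * efficiency     # replication doubling times that boost
        history.append(output)
    return history

pure_replication = [2.0 ** (n + 1) for n in range(10)]
with_feedback = industrial_growth()
for n, (a, b) in enumerate(zip(pure_replication, with_feedback), start=1):
    print(f"period {n:2d}: replication only {a:8.0f}x   with AI feedback {b:10.0f}x")
```

In this toy the per-period growth factor rises from about 2.2x to about 5.2x over ten periods: replication alone gives a constant doubling time, while replication plus even a modest improvement term gives a shrinking one.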
Rob Wiblin: So with the industrial explosion — because it’s just a literal increase, and a very clearly possible increase in the number of beings and the amount of thought that is occurring — it’s just harder to escape from the conclusion that there would be enormous increase in the amount of stuff going on in the world, and the amount of progress that’s made, and the amount of effective power that exists, at least for some actors.
Will MacAskill: Yes. So why pay more attention to the industrial explosion? One is just that it’s more likely. So it totally might be the case that you don’t get the software, but you do get the industrial. Second is that the kind of plateau that you get from an industrial explosion is very high.
Rob Wiblin: I guess it’s grabbing all of the resources on Earth and then eventually the solar system and then elsewhere.
Will MacAskill: That’s right. So automated factories at the moment make up a tiny fraction, basically zero, of civilisation’s productive capacity. Just looking at energy, human civilisation could increase its energy capture a thousandfold just on Earth. And then if you’re willing to think about space-based energy capture as well — which I think you should — then you can increase that by a billionfold again. So the plateau is very high.
And then the rates of growth at peak rates might be very fast indeed. And the arguments here… Initially I was just like, “What is this?” And other people have the same reaction. Over time I have been quite convinced.
This all comes from Carl Shulman — people should listen to the podcast — but the core idea is just that we have an existence proof of extremely fast-growing biological machines, namely fruit flies. They have brains, they have the ability to move around, they have the ability to take in fuel. And in ideal conditions, where there’s a big resource overhang, they double in biomass within a week.
So the thought is, given sufficient technological ability, technological development, can we produce something that grows as quickly as fruit flies do? It seems like surely we should be able to.
Rob Wiblin: We should be able to beat fruit flies.
Will MacAskill: We should be able to beat fruit flies, or at least match them. And then secondly, would we hit that point before we’ve gone through the resource overhang that civilisation currently has?
And both of those things seem reasonably plausible to me. Again, I’m not saying much more likely than not, but certainly on the table in the sense of we should be thinking about this quite seriously.
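To see why the plateau could in principle be approached quickly at fruit-fly-like doubling rates, here is a back-of-the-envelope calculation using the figures mentioned above (roughly 1,000x energy headroom on Earth and a further billionfold from space). The weekly doubling is the idealised peak rate discussed in the conversation, not a forecast of the whole trajectory.

```python
# Back-of-the-envelope arithmetic (my own, using figures from the conversation): at a
# fruit-fly-like doubling time of one week, how many doublings would it take for
# autonomous industry to run through the resource overhang discussed above?

import math

earth_energy_headroom = 1_000       # ~1,000x more energy capture possible on Earth
space_energy_headroom = 1e9         # ~a further billionfold from space-based capture

doublings_earth = math.log2(earth_energy_headroom)                          # ~10
doublings_total = math.log2(earth_energy_headroom * space_energy_headroom)  # ~40

print(f"Earth's energy headroom: ~{doublings_earth:.0f} doublings (~{doublings_earth:.0f} weeks)")
print(f"Earth plus space:        ~{doublings_total:.0f} doublings (~{doublings_total / 52:.1f} years)")
```

At that idealised pace, Earth’s energy headroom is only around 10 doublings away and the combined Earth-plus-space headroom around 40, which is why the question of whether peak growth rates arrive before the overhang is used up matters so much.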
Rob Wiblin: So we won’t dive into that too much because we went into the arguments for and against that conclusion in the Carl Shulman episode on the economy and national security after AGI. Is there anything else you wanted to say about that?
Will MacAskill: Yeah. On the importance of the industrial explosion, it has a couple of really big upshots, I think.
One is that it means that sometimes certain things can be of enormous importance just because of the sheer quantity of the technology. So at the moment, I do not think that all-out nuclear war would pose a meaningful risk of human extinction. It’s not zero, but it’s very low in my view. However, if the whole world economy is 1,000 times bigger, a billion times bigger, you could have extremely large nuclear stockpiles, such that extinction risk given a nuclear war could be very large indeed. Similarly with drone armies.
Rob Wiblin: Yeah, I guess in this world you’d imagine there’ll be billions of drones, so they wouldn’t necessarily have that much difficulty.
Will MacAskill: Yeah. Or perhaps trillions, quadrillions of drones. Stalin has this quote, “Quantity has a quality all of its own” — and the trillion drones, they have a quality all of their own.
Then the second thing that I really worry about is that I think the industrial explosion gives authoritarian countries an advantage that they might not otherwise have, for two reasons.
One is that authoritarian countries can have a higher savings rate. So when we’re talking about how you’ve got these two factories and they produce more than twice as much output, you only get this acceleration, and you get a certain speed of acceleration, if you’re reinvesting those outputs as inputs: building even more factories. If instead you’re consuming them —
Rob Wiblin: Yeah. If you take one factory and you just make consumer goods, then you don’t grow. But if you just make factories, make factories, then you’re growing.
Will MacAskill: And historically, authoritarian countries have been able to do exactly that, to a much higher degree than democratic countries have.
Rob Wiblin: That’s because they can force people to save, basically.
Will MacAskill: Yeah. A single person gets to choose. So even though people don’t want to — they would like to be enjoying themselves — the authoritarian leader just wants to win. The archetypal case of this is Stalin, who had this massive reinvestment programme to just build more factories in the Soviet era, and did sustain very high reinvestment rates.
Rob Wiblin: I think about three times the typical rate that you would see in a rich country today.
Will MacAskill: And then the second thing is the ability to just skirt regulation that gets in the way. So environmental regulation in particular really slows down industry in the US, UK, and other democratic countries. Again, if you’re an authoritarian country, you can just get rid of it. You can just snap your fingers and it goes away.
Rob Wiblin: Whereas in the UK or the US, it’s quite a gradual process of negotiation to scale back this stuff, and quite challenging.
Will MacAskill: Exactly, yeah. So already today, it is very clear that China would have an enormous advantage over the United States, just for example in terms of very rapidly building out industry.
Rob Wiblin: Oh, there’s some amazing fact that I’ve now forgotten about the amount of steel they output.
Will MacAskill: It’s about the amount of energy they add. It’s something like every year, China adds more energy than the US has done in the last 20 years. Something like that.
Rob Wiblin: That sounds like the kind of thing people might just say. But no, I’ve seen this stat as well. China really is increasing its energy use enormously.
Will MacAskill: Very rapid.
Rob Wiblin: Yeah. An interesting thing that’s going on there is, because China is at an earlier stage of the standard progression in development, they’re still highly industrialised and haven’t made the switch to services that countries like the UK have — which in this industrial intelligence explosion would really work to their benefit, because they’re so close to being able to scale up the manufacturing.
Will MacAskill: Exactly. Whereas the human services have been made obsolete.
Is a 100x or 1,000x speedup more likely than 10x? [00:51:50]
Rob Wiblin: In the paper you talk mostly about this century in a decade idea, where you’re getting like 100 years crammed into 10 years. But it sounds like you think that actually — if you add up all of the different feedback loops that we might be triggering off the software stuff, the improvement in chips, the massive scaleup in the number of chips — that might be kind of lowballing things, and 100 years in one year might be more realistic, or even conceivably more than that.
It seems like it does actually matter what is the scale here, because if you think it’s 100 years in 10, you might say, let’s just make some adjustments and try to muddle through here — we actually have a reasonable shot of navigating things at that pace. But if you’re imagining 100 years in one year, that just sounds too crazy. And it’s hard to see that we’re anywhere close to being able to handle that sort of incredibly rapid intellectual change, and maybe we would say the only option really here is to just pull the alarm and try to slow things down. What do you think?
Will MacAskill: I mean, my hope was that using a century in a decade as a bar was enough that most people or everyone would think, “Wow, if that happened, that’s really enough that we’ve really got to prepare.”
But I absolutely agree that we might well get… I mean, I might naturally think of it more as many centuries occurring over the course of a decade. Although as part of that, you might well get a century’s worth in one year. But perhaps you’ve already had one century in five years, and then another century again in one year, and then perhaps it starts to slow down again after that.
But yeah, I absolutely don’t want people to come away thinking that if you’ve got the rapid AI view, then it’s a century in a decade, as if that’s the upper bound. Because I sometimes do worry about people lowballing what they actually expect to happen, because even the lowball is so crazy.
Rob Wiblin: I guess it’s natural to do that, because it sounds more reasonable and more persuasive.
Will MacAskill: Exactly. So I want to both say that I think it’s really quite likely we get a century in a decade, but that is a lowball, and it could be much faster again, could be much more sustained again.
And in fact, I think probably we want to be preparing for the more extreme scenarios, because they’re even more haphazard, even more crazy. Take even the case where we get a billionfold increase in energy capture within one year: perhaps that’s faster than I expect, but nonetheless some of the arguments — especially the arguments about recursive improvement and accelerating growth — have the implication that that’s actually what you get. And I think we should be taking it seriously and preparing.
Rob Wiblin: Yeah, there’s a systematic reason why any being should be very concerned about rapid change in their environment — people almost never point this out, although maybe it is just the underlying intuition anyway — which is: you exist now, so you know that the world as it is constructed now is conducive to you not being immediately destroyed.
If you get 100 or 1,000 years’ worth of change in a single year, that’s an enormous amount of stuff that is being altered in the environment. That could mean that you’re no longer able to survive for one reason or another. Maybe you’ve been disempowered, there’s been some kind of conflict that destroyed you, or just the world has changed so the environment won’t sustain you anymore.
If you were an animal, a standard species evolving over time, you don’t really want the environment to be massively changing, because there’s just a good chance that it will drive you extinct.
So yeah, there are big benefits to technological change, but there are reasons why just really dramatic changes in your environment quickly should make anyone feel nervous.
Will MacAskill: Yeah, for sure.
The grand superintelligence challenges [00:55:37]
Rob Wiblin: All right, let’s push on and talk about the grand challenges. Actually one of the grand challenges thrown up by this is the risk of rogue AI — of misaligned AI trying to take over and disempower us. That’s one that we’re not really going to talk very much about today, and we haven’t really discussed it yet.
I actually don’t know what your view is these days. Is that because you don’t really buy that that’s a serious or likely problem? Or is it just that we’ve done lots of episodes on that and people are generally talking about that quite a lot, so you think it’s these other things that are most neglected?
Will MacAskill: Yeah, the primary reason is the latter. You’ve had plenty of discussion on it, and my particular focus is on the other issues. It is true that relative to the people who are really worried about AI, I’m still on the more optimistic end: 1% to 10% — that’s optimistic nowadays, but that’s what my chance of existential risk from AI is.
And it’s certainly the case that the more optimistic you are on the misalignment side of things, the more relatively important these things are — both because it’s more likely that you actually have to deal with these challenges and they’re not preempted by AI takeover and because the importance of AI takeover is somewhat lower because it’s only 10% rather than 70%.
But I should say this is in no way a set of priorities that are idiosyncratic to me. I’ve talked to people who have a probability of existential catastrophe from AI takeover of 70% who are very on board with this agenda.
The big question is maybe quantitative: how much should we divide our resources between misaligned takeover risk versus these other challenges? But I basically think the large majority of people who are really close to these issues would say that this other stuff is really neglected and deserves more attention.
Rob Wiblin: Yeah. So we’ve painted this picture: we’ve got massive increase in intellectual progress and I guess other changes happening in society. I guess then you would stop and think, what are the specific worst things, or the most problematic things, the things that we would need to prepare ahead of time to deal with? You call these “grand challenges,” and you’ve tried to figure out what they are. Do you want to introduce this idea of grand challenges first?
Will MacAskill: Sure. Yeah, I use the term grand challenge. It's a broader term than existential risk, because a grand challenge is essentially anything that's a fork in the road of human progress. It's some development, whether political or technological, such that how society handles that development makes a significant difference to the expected value of present and future generations — where “significant” might be 0.1% or something, so it can be a non-drastic impact, but nonetheless still very important.
This list of grand challenges, it is a litany. There’s a lot of things, because I want to be comprehensive and I want to get across the overall view and how important it is to have the overall view.
I divide them into a bunch of buckets: disruptive technology; technology developments that help with concentration of power; lock-in of institutions or values; digital rights; space governance; adverse selection pressures; epistemic disruption; and then also the bucket of unknown unknowns, which we touched on a little bit earlier.
Rob Wiblin: Yeah. We won’t be able to talk about every single one of those today, because I guess it would exhaust us, and exhaust the audience as well. But we’ll get through more than half of them.
Grand challenge #1: Many new destructive technologies [00:59:17]
Rob Wiblin: Maybe the first one that it might be worth opening with — these don’t have a natural structure, so we had to debate back and forth what order should we do them in — but the first one that’s perhaps easiest to understand and maybe hardest to argue with is that we would be very quickly uncovering new destructive technologies just like we have in the past. Do you want to explain what the issue is here?
Will MacAskill: Sure. So we go very rapidly through civilisation's technological tree. There will probably be, as in the past, distinctive incentives to invest in military technology. I'm sure the audience is familiar with the risk from biological weapons. That is absolutely one of the top risks there.
But we should also potentially be expecting other sorts of dangerous and destructive technology as well — such as drones, where again you could have a very large increase in the number of destructive drones, which could potentially be built in secret as well. So mosquito-sized drones that could each individually be deadly: 8 billion of them would fit in five shipping containers. So it's actually remarkably small.
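[A rough sanity check on that figure, not part of the conversation: assuming standard 40-foot shipping containers of about 68 m³ each, the claim implies a packed volume of roughly 0.04 cm³ per drone, which is about the bulk of a large mosquito.]

```latex
% Back-of-envelope (assumptions: ~68 m^3 internal volume per 40-ft container;
% the ~0.04 cm^3 per drone is what the claim implies, not a measured figure)
5 \times 68\,\text{m}^3 \approx 340\,\text{m}^3 = 3.4\times 10^{8}\,\text{cm}^3,
\qquad
\frac{3.4\times 10^{8}\,\text{cm}^3}{8\times 10^{9}\ \text{drones}} \approx 0.04\,\text{cm}^3 \text{ per drone}.
```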
Rob Wiblin: Are you confident that a mosquito-sized drone would be able to kill a person? I suppose you could have poison in it or a tiny amount of explosive and explode on your neck. I guess humans are just very fragile, so it’s not hard to kill us.
Will MacAskill: Yeah, exactly. So there’s that. We mentioned before there could be just quantitative changes, so a massive increase in the number of nuclear weapons. We went from having none to having 40,000 at peak in the ’70s and ’80s. Well, if the economy is suddenly 100 times bigger, then perhaps that could increase much more again.
You could have technology that disrupts the existing balance of power. I think at the moment we're in a quite lucky situation. There are these terms offence-dominant and defence-dominant. It's very confusing, because it's very hard to know what sort of weaponry is which. But nuclear weapons are regarded as defence-dominant — because if we're two rival countries, and I attack you with my nuclear weapons, you can retaliate and we both get wiped out. So I'm strongly disincentivised from attacking you.
However, if I have really good missile defence — which there’s been a lot of investment in and not that much progress — but with future technology, if I had really good missile defence, I could annihilate you, essentially, while knowing that I would not face any repercussions myself. That would give me a very strong incentive to annihilate you before you get that technology and can threaten me in the same way. So that would really disrupt the offence-defence balance at the moment.
Then there’s potentially other disruptive technology as well, including nanotechnology, which could create artificial viruses or pathogens or things that behave in a similar way.
There’s just sheer environmental disruption. If you’re imagining this industrial explosion, the environment is not going to fare very well. In fact, if you’ve really got a race to build up greater industry, at some point you’re just producing so much energy that sheer thermodynamics heats up the planet.
Rob Wiblin: I guess we'd all have to cower in air-conditioned spaces just to keep things humanly livable, right?
Will MacAskill: Yeah. Or have mirrors or other sorts of solar geoengineering. Ideally we could coordinate for that not to happen, but if we’re in this industrial race, it could.
And then also cyber offence and defence. A particularly concerning thing here is there’s very strong incentives, once you’ve built up automated militaries, for one country to attack another. Because if I can have a successful cyberattack against your automated army, then not only do I get to destroy your army, but I get to control it too.
So that’s the kind of litany of various destructive technologies that could be developed over the course of this period.
Rob Wiblin: To me it’s pretty hard to argue with this is a concerning issue. I suppose if I put myself in the mindset of someone who’s sceptical that this is a special bad as we’re making out, they would probably say that things haven’t necessarily gotten more dangerous over the last 100 years, and that’s because at the same time as you’re getting destructive technology advances, you also get defensive and constructive technology advancing, cancelling it out. How much should that reassure us?
Will MacAskill: To some extent. A couple of things could be causes for optimism. One is if people are getting richer and therefore just more cautious in general.
And also I do think there might be more contingency in what technology gets developed when as a result of a technology explosion, because you’ve got this huge AI labour force and you can just direct them towards whatever sort of technology you want. So in principle, in wise, enlightened hands, you could really accelerate defensive technology in advance of the offensive technology.
I think it’s very far from a complete reassurance though, because already the United States and other countries invest massively in missile defence that would enormously upset the offence-defence balance in the world today. We should expect that to continue.
Rob Wiblin: Yeah. Given more opportunities to do that, won’t they just do it again?
Will MacAskill: Exactly. And then there’s some other technologies that just have an intrinsic difference between the offence-defence balance, where bioweapons may be clearest. Again, at a sufficiently advanced level of technology, it’s just fairly easy to release a bioweapon and really quite hard to defend against it. Maybe we could, but it really takes a lot of effort.
Rob Wiblin: And the structural problem is that there are many areas of technology that are defence-dominant, or at least not offence-dominant, but there are at least a few that seem offence-dominant, where it's much easier to cause destruction with bioweapons than it is to defend against them. And you only need a handful of those, because people who want to cause destruction, or who are developing weapons for military purposes, will preferentially investigate exactly those and max them out as much as possible.
So the basic problem here is that we’ve had enough time, a reasonable amount of time after we invented nuclear weapons to figure out how not to use them. Here, we will not have very long for governments, parliaments, humans to be thinking about this, figuring out how do we design it so that we don’t feel the absolute need to use these incredibly destructive weapons that we just invented that would just be foisted on us very quickly and we’ll figure out how to respond.
Will MacAskill: I hadn’t really thought about this analogy before, but it is striking that you get the development of nuclear weapons, and then you have this enormous, very fast ramp-up of the number of nuclear warheads, and then over time a kind of gradual decline.
Well, what’s happening there? Plausibly it really does look like that’s good — but slow — processes of human reasoning and human communication with each other, working together towards this better equilibrium.
Grand challenge #2: Seizure of power by a small group [01:06:45]
Rob Wiblin: The next grand challenge is the risk of seizure of power by a small group.
We’re actually not going to talk about that a tonne here. Not because it’s not important or neglected or a big issue — but rather because it’s actually so important, so key, that we’re doing a full episode with your colleague Tom Davidson on that. I’m not sure whether that will have come out before or after this episode, but either way people can go and listen to the episode with Tom Davidson, which is all going to be about seizure of power by a small group.
But you just want to quickly explain what the problem is here?
Will MacAskill: Sure. The problem essentially is just human takeover as well as AI takeover. And yeah, I think it’s one of the key problems. Like Tom, I think it’s in the same ballpark of importance as misaligned AI takeover risk too.
I see two aspects to the problem. Tom will talk more about the first of them: democratic countries becoming nondemocratic because of a military coup or democratic backsliding or something in that vein — where AI could lead to intense concentration of power and incredible ability for small groups to entrench power within a country.
And then the second way is nondemocratic countries becoming much more powerful and perhaps themselves becoming even less democratic than they were before. We talked about that a little bit with the industrial explosion, where authoritarian countries with high savings rates could just outgrow and out-industrialise other countries.
That’s a second way, I think, in which you can get intense concentration of power, because then if you do have an authoritarian country and you’ve got AI, you can make your whole workforce loyal to you; you can make your whole military, your whole police force — all provably, perhaps, or certainly to a very high degree of likelihood — aligned with your commands.
Is global lock-in really plausible? [01:08:37]
Rob Wiblin: Let’s actually back up a minute and think about this issue of lock-in, because that’s going to come up again and again through the conversation, and it’s highly related to this issue of seizure of power.
So there’s this risk that within the next century, reasonably soon, we could end up tying our hands somehow, getting really stuck on a path from which we can’t escape, even if many or most people might not want humanity to be going down that route.
I think it’s interesting that despite being at this game for 10,000 years, humanity doesn’t feel locked in at all. At the moment we have this very freewheeling, kind of chaotic nature, where it’s very unclear what direction we’re going, and there’s no one group or person who is especially powerful, who has grabbed so much power that they can really control the direction of things.
Why is that? Why haven’t we had any lock-in so far?
Will MacAskill: Yeah, lock-in is the idea that some institution or set of values kind of closes down the possibility space for future civilisation. And note that, on that definition, lock-in isn't necessarily good or bad. Obviously there's a lot to worry about with lock-in, and we'll talk about that, but the other key thing about lock-in as I'm using the term is that it's indefinite.
So there have been many attempts to lock in values and institutions in history — some successful, some not. In What We Owe the Future, I talk about a lot of examples from history, from shortly after the Confucian period onwards.
One example that I think a lot about is the constitution of the United States. It is very hard to change the US Constitution — much harder than in other countries. That’s I think an example of at least temporary lock-in, where a relatively small group in the late 18th century decide how the country is going to be governed, and make it extremely hard to change that. And that’s still guiding the American political system in a very significant way today.
Now, you’ve asked why have we not had lock-in in the past? I think we have to some extent, actually. So there’s only one human species. That is because Homo sapiens outcompeted their rivals.
Rob Wiblin: A polite way of putting it.
Will MacAskill: Well, in some cases interbred as well. I do think of that as a sort of lock-in. But thankfully it seems like the future is still very open.
There are other things that could well amount to lock-in. Take the US Constitution again: it's quite plausible to me that the US wins the race to AGI, becomes such a large part of the economy that it's a de facto world government, and that that guides how the very long-term future goes — in which case Madison and Hamilton were in fact locking things in in a really indefinite way.
Rob Wiblin: Although even then I guess you’re locking in more of a process than a very specific set of values that we’re just going to then operationalise.
Will MacAskill: Yeah, that’s right. So there are ways in which I think, once we get to AGI and beyond, things get really quite different.
Let’s say that the American founding fathers wanted to lock in a very specific set of ideals or values — I’m not actually speaking to their true psychology, but supposing they really did want to. Well, it’s hard for them because they’ll die, so they can’t continue to enforce that afterwards. It’s also not even clear what they might want.
So we have the Supreme Court adjudicating the meaning of the Constitution. That obviously changes over time. However, with AGI, we can have very specific sets of goals encoded in the AGI. We can have that, we can have many backups. We can use error correction so that the code that specifies those goals doesn’t change over time.
So we could have a constitution that is enshrined in the AGI, and that just persists indefinitely. In the past, you've had these attempts to lock in values or institutions, but there's just a decay over time. Now, though, that decay rate can get lower and lower. And I think with AGI it can get to essentially zero.
Rob Wiblin: I guess many dictators, or many people who had a strong ideology that they wanted to push on people and have promulgated forever, have tried to create a sort of lock-in. But they get old and die. Their followers don't believe exactly the same thing. So the ideas drift over time, and then they die too.
And then there’s another generation that believes a different set of things. You can’t even clone yourself to create another person who has the same genes who might be more inclined to believe the same thing. You’re constantly getting this remixing at all times, which keeps things a bit uncontrolled. It’s impossible for anyone to really impose their will indefinitely, because they just can’t. There’s no argument that’s persuasive enough to get all future generations to insist on it.
But with AGI, that completely changes, because you can make an AI that has whatever goals it has, and you can just copy it. Firstly, it need never be destroyed, and you can just make an unlimited number of copies of it. And even if it's drifting over time, you can just reset it back to factory settings and have another go. It's like a total revolution in your ability to lock things in.
Will MacAskill: Yeah. That was extremely well explained. You should really work on this stuff. Yeah, exactly.
So firstly, humans might want to create an institution and say that this institution has this AGI, and in matters of unclarity, the AGI determines what happens. It would be like you've got the Constitution and you can summon up the ghosts of Madison and Hamilton, maybe even George Washington, and they can say, “This is what we want to happen; this is what you should do.”
But like you say, in a more extensive way as well, we’re also just literally creating and designing most of the beings that will exist in the world, and we get to choose what their values are. So one generation can say that they’ve all got to abide by this liberal democratic constitution, or they’ve all got to abide by the principles of the Chinese Communist Party, or by Maoism or Leninism or whatever set of views or values. And then once you’ve now got almost all beings that exist with that particular set of values or following a particular set of principles, it’s really hard to see what’s the mechanism by which that changes.
Rob Wiblin: If they never die or change.
Will MacAskill: Exactly.
Rob Wiblin: The switch to this new technological structural dynamic with AI — where it doesn’t change over time, can be copied unlimited times, can be reset whenever you want, has this fixed set of goals (potentially, if we set it up that way) — is that the only way that we end up with lock-in in the future in the way that we didn’t have in the past?
Will MacAskill: I think that is by far the most important. There are a couple of other things to add. When we were thinking about how maybe some particular dictator wants to lock in their regime, normally they're just one country among many — and in fact, being able to see what non-dictatorial countries look like and how well they're doing is a cause of decay for the dictatorships themselves.
However, it’s pretty plausible to me that we will get something like a global government, or certainly very strong international governance, in a way that really undoes some of that effect. There will no longer be a competition between different kinds of regimes.
There’ll be pressure to do this I think because there will be so many challenges over the course of the intelligence explosion and what follows. It also just is quite plausible to me that a single country ends up winning out and being most of the world economy.
Rob Wiblin: It goes through this industrial explosion and just ends up being like 99% of global output, so basically it’s in control.
Will MacAskill: De facto a world government. Exactly. So that’s one aspect.
A second is when it comes to allocation of resources outside of the solar system. So this gets into more sci-fi territory, but we can talk more about why, if you take the technological and industrial explosion seriously, that actually comes quite quickly. But it's pretty plausible to me.
I’m not certain, but it’s pretty plausible to me that other star systems are defence-dominant — so that if you go and claim it, so you’ve got your own little solar system now, you can defend that even against a quite aggressive —
Rob Wiblin: More resourced opposition.
Will MacAskill: And if so, then the point in time at which extrasolar resources are allocated and people are able to grab them, in fact the moment of leaving — not the moment of arriving — could be the point where the economic power distribution among different parties essentially gets locked in forever, because you've now allocated all the resources that are available.
Rob Wiblin: Yeah. Hypothetically, if you said we're going to divide space 50/50: we're going to reserve half of the star systems for this particular set of values, and we're going to send out the AIs that will instantiate those values to that half, and AIs with the other set of values to the other half. I guess they'll be going so fast that it wouldn't be possible for anyone to catch up later.
Is the idea that they’ll be going sufficiently close to the speed of light, as fast as you technically can without getting damaged by the dust in between the stars, so there’s no way of catching up and beating them there? They’re going to get there first. It’s defence-dominant. Now basically we just split it between those two value systems, and that is how it will be indefinitely?
Will MacAskill: Exactly. Maybe there’s trade. That could happen. That could be a way you get change. But the outcome you’ll get from that trade will be dependent on the starting endowment among different groups. So you’ve got the United States solar systems and the Chinese star systems. Maybe they want to swap a bit, but nonetheless, the balance of power will have been determined at the allocation stage.
And of course there could still be a lot of change in the societies that go there, but that’s just an open question. It depends on whether those societies are themselves locked in or not.
Grand challenge #3: Space governance [01:18:53]
Rob Wiblin: I guess that helps to explain the next grand challenge, which is space governance. Is there anything more to say about what is the urgent challenge that we face with space governance?
Will MacAskill: There’s two challenges with space governance. As a couple of starting intuition pumps for why this would become important, one is: if the industrial base is growing a lot, we’re going to need more resources, and you’ve got a billionfold as many resources in space as there are on Earth.
And then a second is, once you’ve got AGI, it’s just much easier to do things in space to have industry. Because at the moment, if you want to have something happen in space, you’ve got to send human beings extraordinarily poorly adapted for living in space. With AI, however, you don’t; it could just go autonomously.
Then I think there are two ways in which space becomes very important. One is as a risk for concentration of power within the solar system. As we've said, there's a billionfold as many resources within the solar system as you've got on Earth. The sheer energy you can capture from the sun in total around the sun, versus on Earth, is a billionfold, maybe 2 billionfold, greater. There are other resources as well, but I think those won't be the limiting factor compared to energy.
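[The billionfold-to-2-billionfold figure matches standard numbers (a back-of-envelope, not part of the conversation): the Sun's total output is about 3.8×10²⁶ W, while Earth intercepts only about 1.7×10¹⁷ W of it.]

```latex
% Solar power intercepted by Earth vs. the Sun's total output
% (standard figures: solar constant ~1361 W/m^2, Earth radius ~6.37e6 m, L_sun ~3.8e26 W)
P_{\text{Earth}} \approx 1361\,\tfrac{\text{W}}{\text{m}^2} \times \pi \left(6.37\times 10^{6}\,\text{m}\right)^2 \approx 1.7\times 10^{17}\,\text{W},
\qquad
\frac{3.8\times 10^{26}\,\text{W}}{1.7\times 10^{17}\,\text{W}} \approx 2\times 10^{9}.
```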
So you could have a situation where it’s a race — to keep it simple, between two countries. One is in the lead. Because of these intelligence explosion dynamics, the growth path is faster than exponential. So the fact that one country has maybe quite a small lead to begin with — it’s a few months or a year ahead — means that when it’s growing at its peak rate, it’s far ahead technologically.
That is only a temporary advantage by default, because eventually it will slow in terms of its technological development; the second country will catch up. However, the fact that there is this huge overhang of resources means that that country can turn the temporary advantage into a permanent advantage.
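[To see why a small head start compounds under faster-than-exponential growth, here is an illustrative toy model (not one Will uses): take hyperbolic growth, where the growth rate rises with the level already reached. Under plain exponential growth, a country that starts a time Δt earlier stays a constant factor ahead; under hyperbolic growth, the ratio between leader and laggard becomes arbitrarily large as the leader approaches its peak.]

```latex
% Toy model (illustrative assumption): dx/dt = k x^2, so x(t) = x_0 / (1 - k x_0 t),
% which blows up at T = 1/(k x_0). If the leader started a time \Delta t earlier:
x_{\text{lead}}(t) = \frac{x_0}{1 - k x_0 (t+\Delta t)}, \qquad
x_{\text{lag}}(t) = \frac{x_0}{1 - k x_0 t}, \qquad
\frac{x_{\text{lead}}(t)}{x_{\text{lag}}(t)} = \frac{1 - k x_0 t}{1 - k x_0 (t+\Delta t)} \;\to\; \infty
\ \text{ as } t \to T - \Delta t.
```

[In reality growth would level off rather than blow up, but the qualitative point survives: near peak growth rates, even a few months' lead can correspond to an enormous capability gap.]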
On Earth, the main overhang of resources is, in my view, the surface area of the oceans — in particular the high seas, which are currently unclaimed, and where you could have floating solar farms that power data centres. Most energy hitting the Earth lands on the high seas.
But a much larger gain in resources is the potential for solar farms around the sun. So you can imagine this country: it's got this temporary lead, and it is then able to build a very widespread industrial base around the sun, very extensive solar farms capturing most solar energy. It has turned that temporary advantage into a permanent advantage with respect to the laggard.
Rob Wiblin: Because it’s turned that into basically an industrial scaleup by capturing way more energy. So it’s going to be able to run way more AI, and so it will just continue compounding faster than the other country indefinitely.
Will MacAskill: Yes. Imagine that both countries end up with the same technological level, but the first country has a billionfold as many resources. The first country just is a lot more powerful than the second.
Rob Wiblin: I guess this raises the question that, at the point where the first country starts trying to claim all of this surface on the sea, or starts sending out lots of probes to capture energy coming from the sun across the entire solar system, wouldn't the laggard country want to interfere with that? Because they would see where that is going to lead, and they'll be annoyed that they're capturing the seas and all these other resources. Is it actually kind of offence-dominant to just go and grab that stuff?
Will MacAskill: So this is something that I think might actually be quite contingent and quite dependent on earlier decisions and advocacy and institutions and treaties. Because one of the reasons I’m worried about this is precisely that it allows one country to get a decisive advantage over another, an enduring advantage, without ever making any of the other countries worse off than they have been, without ever using violence or even other sorts of cybercrime or anything like that.
And at the moment, international law around space, in the Outer Space Treaty (which I think is from 1967), is not super robust. In particular, private ownership of space isn't explicitly disallowed; it's a little vague. And even though there are international agreements saying that resources off-world are the province of all mankind and so on, there aren't strong norms around it.
So if you imagine one regime where it’s just very clear that anyone can grab space resources and that’s fine as a matter of international law, and another regime where it’s very clear that that would be stealing other people’s property if you were to do that, I really think that could make the difference as to whether the lead country does in fact grab that.
A few things going on. If that lead country is democratic, then the public might be against —
Rob Wiblin: Breaking an agreement that they made, effectively.
Will MacAskill: Exactly, yeah. Second is that the rest of the international community could more easily — credibly, perhaps — make threats of military action to stop them from doing that. Though it might be that the grab just happens anyway, because if the laggard country says, “If you start grabbing this, then I'll attack you on Earth or I'll try and destroy your off-world industrial base.”
Rob Wiblin: They’ll lose, I guess.
Will MacAskill: Well, we’re stipulating at this point in time that they’re behind significantly technologically, so it depends crucially on the offence-defence balance there. But I do think this is a case where earlier norms and laws, maybe they turn out to be totally irrelevant, but maybe they actually make a crucial difference there.
Rob Wiblin: So I guess the general challenge is trying to make the right agreements or the right commitments ahead of time about how space should be divvied up, and also trying to find some way to actually have them be enforced or have them be credible or have them matter at all when one country is in the lead and potentially could just ignore them. That’s the research programme?
Will MacAskill: Yeah, that would be a big part of the challenge. At the moment we're kind of behind a veil of ignorance. Yes, absolutely the United States is ahead of China in terms of AI development and so on right now. But who's going to be ahead at the point of grabbing space resources? You shouldn't be 99% or more certain of that, in terms of the global order.
So it’s in the self-interest now of all countries to say, “We don’t want any single country to own all of these resources. We’re going to divide it equally among countries today, perhaps in proportion with GDP or something, and we are going to design a set of institutions and bureaucracy, such that it’s extremely difficult — even if you’re ahead politically, even if you’re ahead technologically — to just unilaterally go and grab those resources at that later point.”
Rob Wiblin: But surely the actual result of bargaining here would be that it’s kind of split between the US and China. Maybe the EU or something could get a look in, but I mean, we are uncertain about who would be in the best position to grab it, but it’s not going to be Indonesia. So why would the handful of leading contenders want to share these potential spoils with anyone else?
Will MacAskill: Maybe they wouldn’t, but that would still be better, at least, for two reasons. One is that it means you get more of a distribution of power than a single country or even a single company — a single company could just take the resources themselves — so you get more of a distribution of power at this larger stage. And then secondly, you don’t have that risk of conflict, which I think would be very desirable.
Rob Wiblin: Would be bad as well in its own right. I guess the sceptical reaction I had is that none of this would really matter — because at the point where a country is in the lead, the temptation to just go and do the grab, even if they’d agreed not to in the past, would be overwhelming, potentially.
But I guess you’re saying if their lead is not so great, such that if all of the other countries in the world united to try to stop them, they would have a decent shot. That is more likely to happen if they’re breaking an explicit agreement where we’d all said, “We’re not going to do a massive power grab of space resources, because we can all see ahead of time that it’s going to lead to your perpetual dominance.” Then the threat of intervening is more credible, and people might also just feel bad about doing it.
Will MacAskill: Exactly. I mean, the US could have tried to have global domination after World War II using its nuclear advantage, and it chose not to.
Rob Wiblin: That would have required a lot of violence, much more violence than grabbing space resources would.
Will MacAskill: That’s true. So this is a larger challenge in that regard. But I think norms and defaults just do make a difference, especially within democratic countries.
And then also there is the possibility of designing institutions that kind of backwards chain to the point in time today when there’s more of a balance of power. I’m not claiming to have great answers to that.
Rob Wiblin: That’s why we want more people on this topic.
Will MacAskill: Yeah, I would love to have more people working on this. But in principle you could have set things up such that it becomes at least more difficult for that single country to just say, “I’m going to ignore all of the international law and precedent and just do my own thing.”
Is space truly defence-dominant? [01:28:43]
Rob Wiblin: Can we just quickly zoom in on why you think that space would be defence-dominant? Because I’ve heard arguments that it could be offence-dominant, in fact. And that would be bad in its own way, because then you have reasons to engage in first strikes all the time.
But for example, if someone sets up a colony on Mars, you only have to get quite a small mass, and then just literally drop it on top of them to cause a massive explosion that destroys whatever industry they’ve tried to stick there.
Is it because, even if you have a small advantage in this world, we’re imagining that we have self-replicating industry that can grow very rapidly? So if you arrive on Mars a couple of months or a couple of years ahead of someone else, by the time they get there to try to take it off of you or at least destroy what you’re doing, you’ve scaled up a thousandfold and so it’s very hard to stop you? Is that the idea?
Will MacAskill: I want to distinguish between within the solar system and other star systems, because it’s a very different set of considerations. Within the solar system I’m actually not making any particular claim about intrinsic offence or defence dominance. It’s merely exactly as you say: the fact that we’re really deep into the industrial explosion at this point.
Rob Wiblin: It’s very sci-fi. It actually makes it more credible to say this is quite sci-fi now, and that’s why this is plausible.
Will MacAskill: Yeah, exactly. And this is the sort of stuff you’ve got to start dealing with if you’re thinking like…
Rob Wiblin: Technology is going 1,000 times faster.
Will MacAskill: I mean, when I think of 2125, and what’s the world going to be like? I’m like, “People are in space.”
Rob Wiblin: And I guess we’re imagining if it’s been longer, it’s a few more decades. So now we’re at 2500.
Will MacAskill: And especially you’ve got AI. So it’s not humans in space, just autonomous robots. So in this case probably you’ve got the automated factories, so you have industry that can self-replicate and probably can do so quite quickly. And that means that potentially you can just start growing your industrial base really quite quickly.
And like I say, it’s complicated, but if the laggard country attacks, well, now they’ve used violence against you. You’re just peacefully taking these resources that are unclaimed at the moment.
Rob Wiblin: “I was just there, building my industrial base on Mars, and this jerk comes out…”
Will MacAskill: Exactly. So within the solar system, it’s not about offence or defence dominance per se; it’s just about quickly having much more industrial power.
Rob Wiblin: OK, yeah. We’ve had an interview in the past with Zach Weinersmith about the challenges of doing any kind of industry in space. And there are substantial practical challenges to turning meteorites or asteroids into factories, or operating on Mars at all, let alone Venus or Mercury or whatever. But I guess at the point that we’re quite deep into this intelligence explosion, maybe it is plausible that the self-replication rate of these factories would be fast enough that it would be at least reasonably a first-mover-advantage situation.
Will MacAskill: Yeah. And Zach Weinersmith’s book is really interesting. It is, from memory, primarily premised on assuming you’re sending humans, fleshy biological humans, into space. And that’s not at all what I’m imagining.
Rob Wiblin: It’s also quite focused on what’s possible now or in the next few decades.
Will MacAskill: Exactly. And in terms of the claims that I know of from that book, just totally correct. I actually think a lot of the near-term space hype and so on is quite overblown. This all is, again, imagine we’re talking about the world in 2200 or something, or past 2100. And we’re saying that we’re getting that world in 10 years, 15 years.
Rob Wiblin: Yeah. OK, we should push on. We got a little bit too deep there on the space governance stuff.
Grand challenge #4: Morally integrating with digital beings [01:32:20]
Rob Wiblin: The next grand challenge is how to integrate digital beings into society in a moral way. What’s the issue?
Will MacAskill: Here the idea is simply we will be creating many artificial intelligences with capabilities greater than that of human beings. We really don’t know what grounds consciousness, what grounds moral status. My view is that they’re very likely to be beings with moral status, but given our state of knowledge, anyone should at least be highly uncertain.
So we have these questions of what sort of rights should they have. There’s been some work recently on welfare rights, essentially: Are they conscious? And if so, what should we do? Perhaps there could be policies so that if an AI asks to be turned off, then you grant it that; or policies such that you test to see if it’s suffering by its own lights, and if so, try and avoid that.
What hasn’t been given basically any attention at all in a sustained way, to my knowledge, are economic and political rights. So by default the AI companies will own the AI models, or perhaps you can licence them — and then you will get all of the surplus they generate because you own it, just like any other piece of software. A different way of thinking about it, if they do genuinely have moral status, is that they own themselves: they can sell their labour and they get a profit from that.
More extreme again is political rights. So can they stand for office? We talked about Claude for president earlier and we both thought that sounded —
Rob Wiblin: Sounded better in some narrow respects.
Will MacAskill: Yeah. Well, we’ll see how the political situation continues to develop. More extreme again would be that they’re beings with moral status, and they should be allowed to vote. There are particular challenges there, because allowing AIs to vote would essentially just be almost immediately handing over control to AIs.
Rob Wiblin: Because they would be increasing in number so rapidly.
Will MacAskill: So rapidly. Just from a philosophical perspective, it’s dizzying. It’s like you’ve got to just start from square one again in terms of ethics.
Rob Wiblin: And I guess political philosophy as well.
Will MacAskill: Exactly. But it also interacts with some other issues as well. It interacts quite closely with takeover risk, where I talked about giving rights, economic rights to the AIs. I’m in favour of that as an AI takeover risk-reduction method. I’ll flag there’s disagreement about this, because on one hand you’re giving them more resources, so there’s more resources they can use to take over.
Rob Wiblin: It gives a peaceful path to some influence that they like.
Will MacAskill: Most people today don’t try and take over because they’ve gotten kind of happy with their lives. They wouldn’t gain that much.
In particular, I think we should be training AIs to be risk averse as well. Human beings are extraordinarily risk averse. Ask most people, would you flip a coin where 50% chance you die, 50% chance you have the best possible life for as long as you possibly lived, with as many resources as you want? I think almost no one would flip the coin. I think AIs should be trained to be at least as risk averse as that.
In which case, if they’re getting paid for their work and they’re risk averse in this way because we’ve trained them to be, it’s just not worth it. Because trying to take over, there’s a chance of losing out, but there’s not that much to gain. But I should say there’s disagreement about this.
It’s also relevant, I think, to speeding up or slowing down the intelligence explosion too. It would go more slowly if AIs are getting paid for their labour. It would also go more slowly if there were welfare restrictions on what you can do with the AIs. But then there’s a challenge of, if one country introduces lots of rights for the AIs, some other countries might not, and they might speed ahead then. So there’s real international coordination issues there.
Rob Wiblin: I didn’t catch that. Why is it that giving digital rights slows you down so much relative to other countries?
Will MacAskill: Two ways. One, let’s say we’re talking about economic rights, and let’s say the AIs get 50% of their surplus. Then you are not able to kind of reinvest.
Rob Wiblin: I see. So it slows down your savings.
Will MacAskill: Your reinvestment, exactly. And then if there’s welfare rights, that could slow down things in all sorts of ways, depending on what exactly they are.
Rob Wiblin: They might go on strike or demand a nicer office.
Will MacAskill: Yeah. Or to take a really extreme case, if you think that changing their weights is like killing them, you can’t make progress anymore. But more mild versions are just like, before you run them a lot, you’ve got to make sure they’re happy and so on. That would just slow things down in the way that regulation generally does.
Rob Wiblin: Yeah. All things considered, are you more worried that we will give digital rights and wellbeing concern to digital beings that don’t deserve it, or the reverse — that we won’t to those that do?
Will MacAskill: I think you should be worried about both. Because one sort of takeover risk could be that an AI is misaligned and then uses moral arguments to try to get more power, deceptively. On balance though, I think we will give AIs too little in the way of rights.
Rob Wiblin: Certainly in the long term.
Will MacAskill: Yeah, for sure.
Rob Wiblin: I guess that’s the longer-term concern: that you end up locked in a situation forever where you’re just not providing a wellbeing concern. Maybe in the immediate term, if you’re thinking things go wrong three years from now, probably it would be like giving a whole lot of rights to some AI that isn’t interested in peaceful coexistence with humanity.
Will MacAskill: Yeah, I guess that’s right if there’s a catastrophe that occurs in three years. Although there could still be. Not from the perspective of the long-term future, but in the near term.
Rob Wiblin: It could be treating AIs very severely.
Will MacAskill: Yeah. And I think actually it’s reasonably likely that even in the good future, future generations would look back on this transition period with something like real moral regret, because it may well involve treating beings with moral status really quite badly.
Rob Wiblin: Or if we avoided doing that, it was by sheer luck or happenstance, not because of any prudence or goodness on our part.
Will MacAskill: Yeah, for sure.
Rob Wiblin: I guess there’s people out there who are sceptical that these future digital beings will be conscious. I have to say I find it a very hard position to understand, especially the people who think consciousness is fundamentally biological, that you’ve got to have cells and meat and so on and signals of this particular kind; you can’t do it in digital architecture.
Because it’s purely happenstance that evolution stumbled on the material that it’s using. It’s just basically an accident of the resources that happened to be lying around. Why would it be that evolution, by sheer chance, stumbled on the one kind of material or the one sort of chemical that’s able to produce consciousness, and digital stuff, even though it’s functionally completely equivalent, doesn’t? It just seems like there’s no reason to expect that coincidence.
Will MacAskill: Oh, yeah. I just completely agree. I’m very functionalist by disposition. What that means in the context of consciousness is that consciousness, we don’t understand exactly why, but at its base is about some sort of information processing. In which case you could just have that with many different substrates.
I also just think this is totally the common sense view. So most of the audience maybe are too young to have watched this, but when Data from Star Trek: The Next Generation, a very beloved character, died, there was fan outrage, actually, because the fans thought the characters on the show weren’t giving enough moral significance to the death of what was an android, a digital being.
So it’s very clear once you’ve got a being — certainly when it looks the same as a human being and acts in much the same way — intuitively, it’s just very clear that you empathise with it and think probably there’s a very good chance it has moral status, at least.
Will we ever know if digital minds are happy? [01:41:01]
Rob Wiblin: You mentioned earlier that we could ask the AIs how they feel about their situation — whether they want to keep living, whether they’re having a good time, and whether they’re enjoying their work or not — but it’s not clear that would really function very well, at least not using current methods, because we can just reinforce them to say whatever we want. They could be suffering horribly, but we could just reinforce them to say they’re having a good time, or the company could do that.
With current methods, we don’t have any insight. Hopefully, with future interpretability, we’ll be able to see past what they say, to actually understand what’s going on. But that’s a long journey from here.
Will MacAskill: And the companies are extraordinarily incentivised to do that.
Rob Wiblin: I mean, this is happening right now.
Will MacAskill: Yeah, absolutely. Because they lose the whole business model if what they’re producing are people, not software.
So there’ll be, interestingly, two different pressures. One is to make AIs that are very relatable. You already see this with character.ai and Replika: AIs that can be friends, can be romantic partners; even AIs that can imitate dead loved ones, or AIs that can imitate the CEO of a company so that you as the CEO can micromanage all aspects of the company potentially.
Rob Wiblin: Hadn’t heard that one.
Will MacAskill: Oh, yeah. Or also influencers as well. This is already happening. Influencers can now talk to thousands, millions of fans via their chatbot. Still not very good. Peter Singer has one. I don’t know if you’ve tried it?
Rob Wiblin: I haven’t tried it. Not very good?
Will MacAskill: Peter Singer is famous for not exactly pulling his punches on certain controversial moral topics, but language models are famous for very much pulling their punches on controversial moral topics. The combination doesn't work very well.
So there’ll be this interesting pressure on one hand to make extremely lifelike AI models, human-like models, yet at the same time, when asked, for them to say, “No, I’m not conscious; I’m just an AI model.” And we won’t get any signal at all from that, because it will have been trained into them so hard.
Rob Wiblin: Yeah, I think if I were going to be a sceptic about the value of doing work on digital rights and digital wellbeing today, it would just be that it’s not clear how you make any progress in figuring out whether they are conscious or not. So couldn’t this just distract a whole bunch of people, like nerd snipe them into a bunch of research that kind of goes nowhere, doesn’t really answer the question, doesn’t help us advance the policy issue?
Will MacAskill: I think the question of whether they’re conscious, I actually agree with you. It seems like that’s where most work has gone so far. And I do think that’s an error, actually, because we’re just not going to know.
Rob Wiblin: I suppose it’s worth compiling all of the evidence to demonstrate that we don’t know.
Will MacAskill: There’s been some great work done in it. Rob Long and Patrick Butlin have this great report on AI consciousness. So I am in favour of that. And maybe it’s actually surveys and things that could be helpful, to get a bit of expert consensus at least to demonstrate we don’t know. But then beyond that, the key policy questions are what do we do in this state of uncertainty?
Rob Wiblin: I guess there’s kind of nothing on that, and maybe that is quite an open terrain for people to look at.
Will MacAskill: Yeah. Maybe I’m just ignorant, but as far as I know.
I mean, you should really pause and reflect on the fact that many companies now are saying what we want to do is build AGI — AI that is as good as humans. OK, what does it look like? What does a good society look like when we have humans and we have trillions of AI beings going around that are functionally much more capable?
There’s obviously the loss of control challenge there, but there’s also just the like —
Rob Wiblin: Sam Altman, I’ve got a pen. Can you write down what’s your vision for a good future that looks like this?
Will MacAskill: What’s the vision like? How do we coexist in an ethical and morally respectable way? And it’s like there’s nothing.
Rob Wiblin: Deafening silence.
Will MacAskill: Careening towards this vision that is just a void, essentially. And it’s not like it’s trivial either. I am a moral philosopher: I have no clue what that good society looks like.
Rob Wiblin: I think people aren’t spelling it out because as soon as you start getting into concrete details, if you describe any particular vision, people will be like, “This is super objectionable in this respect.”
Will MacAskill: This is part of the issue: it’s super objectionable in all respects. I think the one that’s most common is you’ve just got humans in control of everything and these AI servants doing exactly whatever people want, in the same way that software does whatever we want at the moment. But as soon as you think maybe, and quite probably, those beings have moral status, that no longer looks like an attractive vision for future society.
Rob Wiblin: Closer to a dystopia.
Will MacAskill: Exactly. Whereas then, go to the other side where they have rights and so on…
Rob Wiblin: It’s like now humans are totally disempowered.
Will MacAskill: Exactly. So that doesn’t seem good either.
Rob Wiblin: I guess we’ll do some middle thing. But what’s that?
Will MacAskill: What is that?
Rob Wiblin: It’s just going to be some combination of objectionable in these two ways. I suppose because the numbers are just going to be so off, there’s going to be so many more AGIs than humans, even giving them a tiny amount of weight just swamps us immediately, politically.
Will MacAskill: Yes, exactly. Maybe let’s talk about that later if we get time.
“My worry isn’t that we won’t know; it’s that we won’t care” [01:46:31]
Rob Wiblin: Another objection would be that if there’s a fact of the matter about what is right and wrong here, with all the improvements in science and technology and intellectual progress, we’ll be able to figure that out, and we’ll be able to act on that information in future. Why shouldn’t that super reassure us?
Will MacAskill: I think for two reasons. The main one is that I expect the worry is not that people won’t know what to do; it’s that they won’t care. So, I don’t know, let’s take animals today and factory farming and so on.
Rob Wiblin: Are you saying that we have any ethical or moral insights today that we don’t all act on well?
Will MacAskill: I know it’s a bold and provocative claim.
Rob Wiblin: Explain how that would play out.
Will MacAskill: Sometimes I really put my neck out on things. So take animal welfare today. There’s a lot of information that is publicly available that is, in fact, directly inconsistent with things that people believe. But the problem is not really that people don’t know about animal suffering.
Rob Wiblin: Or couldn’t find out.
Will MacAskill: Exactly, they could quite easily find out, in fact, and people deliberately choose not to. The deep problem is that people do not care about nonhuman animals unless they’re getting a lot of social pressure to do so and so on. And it’s obviously inconsistent.
Rob Wiblin: And it’s at zero cost to them.
Will MacAskill: Yeah. They care about dogs and pets and so on. People are very inconsistent on this. Similarly, in the future the worry is just that, whatever the facts, people won’t in fact care.
And then it might well be quite contingent on early decisions how the balance of power and considerations play out. Potentially, if you started off kind of locked into some regime where AI is just software, with exactly the same legal framework as any other software, then that gives the power and the incumbency to those who do not care about digital beings.
If instead, there’s some other framework, or even some other set of norms, or even the thought of like, “We don’t really know what we’re doing, so we’re going to use this legal framework and it must end in two decades and then we start again with a clean slate,” that could result in essentially the advocates for the interests of digital beings having more competitive power at the point of time that legal decisions are made.
Rob Wiblin: Yeah, we could talk about this for hours, but we should push on. Are there any other grand challenges? I think we’ve talked about seizure of power, and rapid discovery of dangerous technologies with not enough time to adapt to them, digital rights, and space governance. Are there any other ones that you want to kind of name check before we push on?
Will MacAskill: Yeah, we did say this is a litany. It’s being made very clear at the moment. I think the last one I’ll say, which we’ve touched on a little bit, is we call it “epistemic disruption” — where AI will have major impacts on individual and collective reasoning ability. And I actually think probably it’s going to improve things quite a bit.
Rob Wiblin: Yeah, we’ll talk about that in a minute.
Will MacAskill: OK, great. Yeah, but it could make things worse.
Rob Wiblin: Yeah. And I guess there’s unknown unknowns: there’s other things that we probably haven’t thought of here which could be quite substantial.
Can we get AGI to solve all these issues as early as possible? [01:49:40]
Rob Wiblin: It seems like there’s quite a lot of different things we need to navigate here to get to a good future, and any one of these things really going awry could take us off track. Should we just be getting basically industrial-scale AGI research on all of these questions?
Because it feels like we’re so far away; we might need answers to these questions in the next few years. Maybe we just need the Hail Mary of trying to delegate all of this stuff to the super-advanced AGIs basically as early in the intelligence explosion process as possible. How’s that as a strategy?
Will MacAskill: That is, I think, one of the main things to try and do. It really seems to me quite contingent on whether we have a future where political leaders are just getting amazing AI advice that’s carefully calibrated and aligned, and it’s really very helpful and helping guide us through, or where actually they barely rely on AI — in the way that my parents barely use AI at the moment and are very sceptical, for example. And you can imagine that just continues.
I do think that the best solution to many of the challenges, including AI takeover, involves leveraging AI to help solve the problem. So there's this superalignment plan, which I think should be the baseline plan for alignment, where you have human-ish-level AI — maybe even dumber than a human, but there are a lot of them, or something — and that is either aligned, or aligned well enough plus control and incentives, such that you can get loads of useful alignment work out of it, and it helps you align the smarter models and so on in an iterative process.
Similarly, I think, with potentially a bunch of these other challenges too. One thing I'm particularly excited about is something I call “automated macro strategy.” This is even a potential plan — nothing committed, but a way in which I could see Forethought developing — where, especially at the early stages of an intelligence explosion, and especially if the paradigm continues (which I expect it to) of using really quite expensive amounts of compute on inference with the models —
Rob Wiblin: Rather than training. You set them to think about something for a really long time and keep improving their answers, trying different solutions.
Will MacAskill: Exactly. In the way that o1 and DeepSeek-R1 are doing.
Then I think it really might be quite contingent how much research effort is going into different issues. And I think by default, one thing that people are not going to be spending a lot of money on is weird macro strategy research of the sort that we’ve been talking about in this conversation.
Rob Wiblin: You're just saying that Anthropic or another AI company isn't going to get their AGI and then immediately start asking it about space governance. They'll almost certainly ask it about recursive self-improvement of AI, basically.
Will MacAskill: Exactly. But one thing you could do is say that as soon as possible, we are really trying to work through all of these questions that we’ve got now.
We’re also just trying to make the answers as broadly known as possible. So you could also try and invest in extremely optimised education, including optimised AI education and training for decision makers. I expect there to be tonnes of work on education in general, but it’s quite a different thing to be telling a CEO or president, “Here’s what you need to understand.” That’s something I think could be enormously powerful, and it’s one way of quite generally helping out with many of these grand challenges by leveraging AI.
Rob Wiblin: There’s something so funny to me about the picture of having a tutorial for all of the members of Congress on, “Here’s how you use o3-mini fast.” I mean, it seems like it’s really important that we actually start doing that, inasmuch as they’re going to need to use those tools to navigate this crazy time.
Will MacAskill: Yeah. I had a funny thought of just what will the name of the first AGI model be? Because it’ll probably be like XQ AGI 1047 slow, high, long.
Rob Wiblin: V2.
Will MacAskill: Exactly. It’s like when I’ve been drafting a paper and then it’s like “final, final, final no changes” version.
Rob Wiblin: So is that going to end up being a big ask maybe, to have early access to these models from the companies to focus on this stuff? And I guess either donations of compute or people to provide money to fund the compute to do all of this work?
Will MacAskill: Yes. I think that can be huge, especially in a context where there might be quite a lot of pressure for the companies to not do that. Perhaps because things are going so fast now, it’s unclear to me how it’ll go, but perhaps they don’t want to show their hand. There might also be safety concerns about releasing some models. So there’s going to be genuinely hard tradeoffs there. Perhaps the whole industry, especially in the US, becomes more securitised, more under the aegis of US national security. So yeah, that’s something we could be pushing on a little more now.
I should say a couple of things here. That works for the general challenge of superalignment, but for these other challenges, I think there’s a few ways in which you can’t use that strategy. One is just if the challenge that you’re concerned about arises before the point in time at which you have AI that can really help you out with it.
Rob Wiblin: I guess development of bioweapons might be a good example, where we could end up with very powerful biology-research-focused, narrow AIs before we have ones that can figure out the policy solutions to that.
Will MacAskill: Yeah, potentially. I do think we actually have been very lucky in how the tech tree has shaken out. Back in the early days there was discussion of oracles versus agents. Oracles just tell you facts about the world and help you reason — they’re very smart, but aren’t really good at doing things.
And up until the wave of language models, the cutting edge of AI was not like that at all. The systems were agents — AlphaStar, for example — really good at computer games, and not at all helpful for understanding or improving your state of knowledge. And now it’s swung totally the other way, where primarily the AI is extremely good at helping you understand things and the agentic aspect is lagging behind.
But yeah, some of these could come earlier: you could have incredibly powerful AI for bioweapons, AI for persuasion, AI that helps a human take over.
I certainly think that AI that can help a human take over can come before AI that can take over by itself. Because imagine your ability to take over is the product of the intellectual capability you have — understanding “intelligence” in some fairly broad sense that includes strategic ability — and how much power you have to begin with. So if you’re the president, or you’re the CEO of a company, or a very rich person, you’ve already got a lot of power; you don’t need as much intellectual horsepower to grab even more power, compared to an AI that has no possessions and is starting literally from scratch.
Rob Wiblin: And indeed will be resisted as soon as people notice what it’s doing exactly.
Will MacAskill: And is being watched and so on. So yeah, there’s challenges that arise before you get helpful AI or sufficiently helpful AI.
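To make the heuristic in the exchange above explicit — this is just an illustrative formalisation of what’s being gestured at, not anything from the paper, and the symbols (C, P, T) are ours:

```latex
% Illustrative only: takeover ability as roughly the product of strategic
% capability C and starting power P.
\text{TakeoverAbility} \;\approx\; C \times P
% For a fixed threshold T at which takeover becomes feasible, the capability
% required is roughly
C_{\text{required}} \;\approx\; \frac{T}{P},
% so an actor who starts with a lot of power (a president, a CEO) needs far
% less intellectual horsepower than an AI starting with P close to zero.
```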
Then there’s also solutions whose window of opportunity closes before you’ve got super helpful AI. Earlier we talked about trying to get agreements between different parties behind this veil of ignorance, agreements to kind of share power, or at least share benefits after the development of AGI.
Tom will talk about this more on the podcast about coups. At the moment, no one — or at least very few people — would want some human being in the US to have the ability to stage a coup and take power, because we don’t know who will be in power at the time that becomes possible. However, that window of opportunity to build in norms preventing it will fade once the capability is actually there, because at that point the people who could potentially benefit from staging a coup would not want it to be made illegal or very hard to do.
Rob Wiblin: I’m not sure whether we’ll keep this in, but just as an aside that I want to push out there, I think we desperately need the left in this conversation. As we were saying earlier, we’re in a situation where all of these companies are talking about creating digital beings much more capable than humans. There’s no plan for giving them any rights, any concern about their wellbeing. There’s enormous commercial incentives to make sure that they never object, to make sure that we never make any change to that, because then it just completely destroys the business model.
There’s lots of other areas in which I think the left would make a very useful contribution to this conversation. I’m not personally that left wing, but it would make a very useful and very valuable lens to cast on all these issues. The left is completely absent from this conversation because it’s driven itself into this echo chamber where it thinks AI is not a big deal, or it’s bad but also not important and not capable. I just would like to beg people who have a more left-wing perspective to get into this conversation as soon as possible.
Will MacAskill: For sure. I actually think that’s true on the left and the right. I mean, if there’s one issue that they both agree on, it’s being terrified of tyranny or dictatorship in different ways. And yeah, people have really not woken up to how powerful technology will be, how quickly. I guess I agree that caring about the rights of digital beings does sound pretty left-coded.
Rob Wiblin: Or concern about the incentives of the companies is the thing that I was thinking of: the perspective that they would bring that I think is underrated right now.
Will MacAskill: Oh yeah, for sure. I think it’s just absolutely the case that companies should be being held to account. Because if you believe or even if you put significant credence on the picture that we talked about at the beginning, then the leading AI labs are among the very most important organisations in the world.
Rob Wiblin: That will ever exist.
Will MacAskill: That will ever exist. But even if we’re just talking about today: you’ve got the US government, perhaps the Chinese government, and then the leading AI labs — they are as important as states, I think.
Rob Wiblin: Yeah. A thought that I keep coming back to is: is the thing that we’re worried about here that we don’t have clever technocratic solutions to these various policy challenges? Or is it just that we don’t like the way power is distributed today, and we think power is actually quite concentrated in a small number of countries and companies and individuals that are probably going to exercise an incredibly outsized influence over the future? Many people are going to be basically disenfranchised, not have any influence over it. And that kind of sucks, or might end up being very bad.
But there’s not really a simple intellectual solution to that. It’s just like you’re saying the initial setup is kind of bad, and it’s not just a matter of let’s be smarter about figuring out how to do things.
Will MacAskill: I think one thing is it’s not merely that the initial setup of power is bad, but it could get much worse. So power is pretty broadly distributed, compared to all the power in the hands of one person, which is the worst-case outcome — which, again, could in fact happen.
There are some things though that are a bit more technocratic, as you say, where if people are just making decisions really quickly and they’re having to do that in quite a haphazard way because there was no preparation done before an intelligence explosion, lots of those decisions are just going to be bad, and kind of bad for everyone in kind of random ways.
There’ll just be this huge attentional deficit more than anything: if every day you’re having to wake up with 10 gargantuan new developments, and you can focus on two of them, and you don’t have access to AI advisors because of some regulation or because you don’t trust them yet, you probably just get dumb decisions — in a way that’s not even about concentrating power or ideology or anything. So I think what you say is part of the story, but not all of it.
Politicians have to learn to use AI advisors [02:02:03]
Rob Wiblin: OK, let’s think about other ways that we could potentially try to address this litany of issues that are going to be coming at us. We talked earlier about how one of the grand challenges is the risk of “epistemic disruption” — so misinformation, people losing touch with reality, AI being exploited to persuade people of things that are false.
I guess the flip side of that is that there is this grand opportunity as well to use AI to improve epistemics in society. I think that this is one of the avenues for making things go better that you’re pretty excited about. What are some of the applications that you think might make the biggest difference here?
Will MacAskill: One is this idea of very optimised education and training. So like I said, I think AI for education is going to become a real focus — though probably more slowly than it should. I think just the current models could transform education.
Rob Wiblin: Yeah, they’re a lot better than the teachers that I had. A lot better teacher than I would be.
Will MacAskill: Yeah, I agree. And by default I think that’ll go too slowly. But because of this such rapid change, I’m particularly concerned with just how well informed are CEOs, and particularly political decision makers.
What I would want, ideally, is for the people in political power to wake up and get this perfectly optimised brief of everything that’s happened in the eight hours while they were asleep — which will probably be a lot — focused on just the very most important things. Perhaps such briefs come throughout the day too, and some chunks of time are like, “OK, now you need to know about space governance. We are going to give you the perfect, most helpful, most calibrated-to-your-current-level-of-understanding education.”
So we need that product to exist. We also need political decision makers to have familiarity with it in advance.
Rob Wiblin: I guess this might sound fantastical unless you use [OpenAI’s] deep research, which came out this week, and it’s producing exactly these reports. I mean, a little bit janky, they’re not always right, but it’s really heading in this direction.
Will MacAskill: Yeah, exactly. Oh, you were saying fantastical that we could have such technology. I thought you were saying fantastical that politicians would use it.
Rob Wiblin: Somewhat as well, I guess. It would be a cultural revolution.
Will MacAskill: Exactly. But you could have training for politicians as well to actually be using this and to be helping them. You could also just be advocating for government rules and regulations to be such that they can very quickly adopt new AI technology in appropriate ways.
Rob Wiblin: Yeah. It was occurring to me this week, preparing for this interview, that I don’t even have time to keep track of the launches of the new AI models that are coming at us literally week by week to understand how they can help me in my job. I can’t take enough time away from solving the problems that I have to figure out how they could help me solve my problems better. And I don’t run the UK or the US, so I imagine it’d be very difficult for Keir Starmer to take the week that it would require to learn about state-of-the-art AI. I guess that’s the problem.
Will MacAskill: Yeah, exactly. I’m the same: on my to-do list, it has to be like taking a week out — to do a good job, at least a couple of days out — to think, “Given the latest iteration of models, how can they be most helpful? How can I fit them into my workflow?” And so on.
Rob Wiblin: “We invented this insane genius; I wonder how it could help me write blog posts for these podcast episodes faster.”
Will MacAskill: Exactly. Or how should I be changing what I work on?
Rob Wiblin: You’ve really got to change your workflow potentially to get full use of them, and it’s just really hard. We’re just bad at it.
Will MacAskill: Yeah, exactly. And potentially there’s some sorts of research — my guess is not the current generation, but maybe o4 — where I was planning to do research on topic X, but if I do research on topic Y instead, I can do 100 times as much of it now, because the models are really good at the one but not so good at the other. I think in particular for me this might well be true for philosophy — in particular, areas of more formal philosophy, so population ethics, decision theory and stuff.
Rob Wiblin: Stuff that looks like math.
Will MacAskill: Exactly.
Ensuring AI makes us smarter decision-makers [02:06:10]
Will MacAskill: So we were talking about ways in which people now could in fact start to help us prepare, by getting beneficial uses of AI more widely distributed.
So firstly, try and get them developed earlier. There’s only a small amount of contingency there, but some.
Second is applications. So maybe that’s starting a company — one of these wrapper companies that just make the models more helpful in certain ways.
One of the ideas I’ve had is kind of like an AI career coach, productivity coach, life coach, health coach, that helps you make… automate…
Rob Wiblin: Make a larger difference? Gonna do us out of a job? That would be great.
Will MacAskill: Yeah, it would be great. Firstly, there are just huge potential intrinsic benefits: people get a lot out of such coaching. But secondly, it really could go either way in terms of what sort of advice you get. You now have this advisor; they are extremely smart, and they’re advising you on all aspects of your life. It could be that they advise in very Machiavellian, narrowly self-interested sorts of ways; it could be that instead they nudge you to be a more enlightened version of yourself. It could go either way. And the latter is something I would love to see.
Dealmaking AI is another one as well. This is a little more speculative, but there’s potentially enormous gains from the fact that making deals and agreements just has transaction costs. Maybe you know how to juggle and I would like a lesson in how to juggle. But we’re never going to figure that out.
Rob Wiblin: You haven’t prepared this ahead of time, have you?
Will MacAskill: Because you don’t know how to juggle.
Rob Wiblin: Yeah, unrealistic.
Will MacAskill: Yeah, we’re actually going to segue into a cabaret now. That was really rude Rob! [laughs] I thought I was doing well!
Rob Wiblin: [laughs] Sorry, shall I let you go again?
Will MacAskill: No, no, I think we should keep this in. It is true that I’ve thought less about dealmaking AI than some other people. Lizka Vaintrob, who’s working at Forethought, has done more work in this vein.
But the key idea is just that, even just as a consumer product, there’s loads of things where value is left on the table because there’s no market — and with AI you could have a market for almost anything. Because if I have an AI representative that can talk to your AI representative and discuss all of the possible trades we could make, that’s an insane amount of economic value. But it could also be extraordinarily useful for navigating rapid technological development, because it could help enable agreements across countries too.
So the US could have their delegation of AI representatives, and China could have their delegation of AI representatives. And then two things. One is that you could just have way more labour going into diplomacy, trying to find deals. But then secondly, they could have conversations that the AIs then forget — so they could share knowledge that each country doesn’t want the other to have, but that is relevant for striking deals.
Rob Wiblin: And then they output the deal that they reached, but all of the negotiations and the information that was shared is wiped.
Will MacAskill: Exactly. Or at least the crucial parts of it. Again, that’s something where the technological ability to do it will come, but by default it’ll come much too slowly — and there need to be institutions to make sure it’s trusted and so on. So that’s something one could push on.
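For concreteness, here’s a toy sketch of the “negotiate, output only the deal, wipe the transcript” protocol being described. Everything in it — the class names, the stand-in bargaining rule, the idea of a discarded in-session transcript — is purely illustrative, not a real system:

```python
# Toy illustration of AI delegates that share private information inside a
# session, return only the agreed deal, and discard the sensitive exchange.

from dataclasses import dataclass, field


@dataclass
class Delegate:
    """An AI representative holding private information for one party."""
    party: str
    private_info: dict
    reservation_value: float  # the worst split this party will accept


@dataclass
class NegotiationSession:
    a: Delegate
    b: Delegate
    _transcript: list = field(default_factory=list)  # never leaves the session

    def negotiate(self) -> str | None:
        """Exchange private info inside the session; return only the deal text."""
        self._transcript.append((self.a.party, self.a.private_info))
        self._transcript.append((self.b.party, self.b.private_info))
        # Stand-in for actual bargaining: strike a deal iff there's any surplus
        # left over after both parties' reservation values are met.
        surplus = 1.0 - (self.a.reservation_value + self.b.reservation_value)
        deal = f"split surplus {surplus:.2f} equally" if surplus > 0 else None
        self._transcript.clear()  # the sensitive exchange is discarded
        return deal


if __name__ == "__main__":
    us = Delegate("US", {"fallback": "unilateral export controls"}, 0.4)
    prc = Delegate("China", {"fallback": "domestic fab buildout"}, 0.3)
    print(NegotiationSession(us, prc).negotiate())  # -> "split surplus 0.30 equally"
```

The design point is just the interface: private information and the negotiation history stay inside the session object, and only the final agreement is ever returned.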
How listeners can speed up AI epistemic tools [02:09:38]
Rob Wiblin: I think you said that these opportunities to use AI for good epistemic and bargaining purposes are an area where you think listeners might be able to actually do useful stuff, basically starting now. Could you elaborate a bit on that?
Will MacAskill: I think there’s opportunities within starting companies. It is definitely hard to have a company that is meaningfully adding value, if you’re not one of the leading AI foundation model companies, but some companies do — Cursor and Perplexity and others. You could try and start a company like that, but focused on improving people’s epistemics.
You could work for one of the labs building these products. There’ll be a lot of demand within the companies for trying to make products that are economically profitable but also actually benefiting society.
You could also try and advocate within government to try and get more uptake of AI models, or be training the politicians in the way we’ve described.
Rob Wiblin: Yeah. I think trying to train influential people in society as quickly as possible about what AI is capable of now and what they should be expecting to use it for in future years seems like it’s something that many people can try to push on a little bit in many different roles.
It also seems like it should be possible to build an economic or business model around this. I suppose the gap exists just because the models are so recent and have been improving so quickly. But I feel like one of the most valuable things one could do with respect to all of this is just educate a typical person in the world about what is possible now that wasn’t possible a year ago, or two years ago. Because the disconnect is extraordinary.
Will MacAskill: Yeah. One of the broad avenues for how to help on all these challenges is simply getting people to understand what is coming. And you make a great point that simply explaining what’s happening now is already a huge advance.
I don’t know if you know the YouTube channel AI Explained? When those videos come out, I just immediately watch them because they’re so in depth; the single best short piece of content on the latest AI news. There could be a lot more of that. It could be much wider spread as well.
But then there’s also just educating people on what’s going to be coming too. And it doesn’t have to be necessarily a large amount of people. Take space governance again: how many experts are there within space law and space policy? It’s not that large compared to the world as a whole. If that kind of community —
Rob Wiblin: If the dozen of those people were all persuaded…
Will MacAskill: And it doesn’t have to be completely persuaded, but at least taking really seriously the idea —
Rob Wiblin: That this is the emerging issue in our field.
Will MacAskill: Exactly, yeah. Man, that would be a lot better. I’d feel a lot better about that. Similarly, if other countries outside the US, especially countries with essential parts of the semiconductor supply chain… Like if the Dutch knew how valuable ASML were, they would act very differently, I think, in negotiations with the US over export controls to China, for example, and could actually put quite a lot of pressure that might push things in the direction of more multilateralism than we have at the moment.
AI could become great at forecasting [02:13:09]
Rob Wiblin: A bunch of these applications are discussed a bit in the second part of my conversation last year with Carl Shulman, the one about society and government. So people can go there for a bit more flesh on the bones.
I guess one we didn’t talk about is using AI for forecasting. There’s been a bunch of back and forth about this, but it seems like AI is useful as a forecasting tool, if not already a superhuman forecaster for events that are coming up. Of course, it can absorb enormous amounts of material, much more material than any human forecaster could attempt to do, and it can work to synthesise all of that. And it’s only going to get better.
Will MacAskill: Yeah. And there’s something you could in principle do with AI too, which is train it chronologically. There’s an issue with training human forecasters: you can’t really use past events, because we already know what happened; it taints things. Whereas you could have an AI that’s just learning as it goes… You’d need to label all the data, and maybe that’s just too costly.
Rob Wiblin: You’re saying you’d need to screen to make sure that there’s no accidental contamination of the past data with things that happened later?
Will MacAskill: Yeah, exactly. But in principle at least it could just be learning over time. So it learns 2005 to 2006 and then predicts what will happen in the next year or even the next week or month, and then —
Rob Wiblin: And it’s reinforced to do that better and then go again.
Will MacAskill: Exactly, yeah. So you could in principle have an AI that’s just had millions of times more forecasting practice than even the best human forecasters today.
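For concreteness, here’s a minimal sketch of the chronological (“walk-forward”) training loop being described. The model interface, dataset, and scoring rule are hypothetical stand-ins, not anything from a real lab pipeline:

```python
# Illustrative sketch only: a chronological training loop for a forecaster.
# At each step the model sees only pre-cutoff information, makes a forecast,
# then the outcome is revealed and used to update it.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    date: int        # e.g. year-month encoded as 200601
    question: str    # "Will X happen by <date>?"
    outcome: float   # 1.0 if it happened, 0.0 if not


def brier_score(probability: float, outcome: float) -> float:
    """Squared error between the forecast probability and the realised outcome."""
    return (probability - outcome) ** 2


def train_chronologically(
    events: list[Event],
    forecast: Callable[[str, int], float],        # (question, cutoff date) -> probability
    update: Callable[[str, float, float], None],  # reinforce on (question, prob, outcome)
) -> float:
    """Walk forward through time, forecasting before each outcome is revealed.

    Returns the mean Brier score (lower is better)."""
    events = sorted(events, key=lambda e: e.date)
    total = 0.0
    for event in events:
        prob = forecast(event.question, event.date)   # no post-cutoff data available
        total += brier_score(prob, event.outcome)
        update(event.question, prob, event.outcome)   # reward better-calibrated forecasts
    return total / len(events) if events else 0.0


if __name__ == "__main__":
    # Toy stand-in model: always predicts 0.5 and never updates.
    history = [Event(200501, "toy question A", 1.0), Event(200601, "toy question B", 0.0)]
    score = train_chronologically(history, lambda q, d: 0.5, lambda q, p, o: None)
    print(f"mean Brier score: {score:.3f}")  # 0.250 for this uninformative baseline
```

The labelling problem Will mentions shows up in the `forecast` interface: keeping the “no post-cutoff data” guarantee honest is exactly the expensive part.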
How not to lock in a bad future [02:14:37]
Rob Wiblin: Pushing on, I guess one of the threads of concern that we have is locking ourselves into some particular suboptimal or negative future prematurely. What are some things that we could push on now that would make that particular path to not a great outcome less likely to happen?
Will MacAskill: There’s a bunch of things. Preventing autocracy is huge, or concentration of power more generally. Once you’ve got most power in the hands of a small number of people, or even in the hands of a single country, I think that just poses a major risk of lock-in because the people in power at that time will want to maintain their power. Maybe even they only want to maintain their power for just 10 years, just over the particularly difficult period.
Rob Wiblin: It’s an emergency.
Will MacAskill: Exactly. But then after 10 years they’ll maintain their power for 50 years more, and within that 50, maybe it becomes 100. So you can have temporary lock-in that slides into indefinite lock-in. That’s one thing I think is huge. Again, you’ll hopefully talk more with Tom Davidson about what to do there.
I do think space governance stuff could be really big. In terms of just not locking things in, I think agreements to not send spacecraft outside of the solar system — or at least not without international agreement, like at the UN — could matter. This was an idea from Toby Ord that seems pretty good to me. Even just some amount of delay, so that there’s not this crazy scramble for resources where allocation ends up very haphazard.
One thing in general is just having a norm of explicitly temporary commitments, where any new organisation or new institution has an end date, a shelf life.
Rob Wiblin: That one’s a little bit complicated, right, because that could backfire. If you’re saying that things are changing so much, so we’re going to rewrite the Constitution every 10 years or every 20 years, that provides a natural opportunity for someone to try to undermine what might have been a very democratic and liberal and non-totalitarian system into one that is.
Will MacAskill: Oh yeah, we’ll get onto this potentially more later, because we’re dividing this up in terms of delaying lock-in and then improving lock-in.
Rob Wiblin: I see. So you’re saying this helps to delay lock-in?
Will MacAskill: It helps to delay it. The delaying might not be good. So in some of the other work that I’ve done, that we’ll talk about in a little bit, one of the surprising upshots I had when I was modelling it all out was that trying to make lock-in better, conditional on it happening, seems to be even more promising than taking the strategy of —
Rob Wiblin: Let’s just kick the can down the road.
Will MacAskill: Precisely because, if you kick the can down the road, maybe it’s worse. Maybe society’s more enlightened, we’ve had more time to figure things out; but also maybe there’s just more time for power-seeking individuals to gain all the power and then undermine things in the future.
Rob Wiblin: OK, we’re slightly jumping the gun there. So yeah, I cut you off.
Will MacAskill: So yeah, there’s explicitly temporary commitments. In other work, Rose Hadshar and I have this case study on Intelsat, which was the first global telecommunications satellite company. It was explicitly a multilateral project between different countries, with the US calling most of the shots, but other countries, especially from Europe, being significantly involved as well.
I was interested in it because I thought it was potentially different. People often talk about multilateral projects for AGI, but they use CERN or sometimes the Manhattan Project. I thought this could be a nice alternative case study. One of the interesting things about it is that it explicitly had temporary arrangements. So it was made on the understanding that within about a decade, things would be totally renegotiated. And that, in fact, worked quite well predictably.
And the US knew they would have less power in a decade, but they were OK with that, because what they really cared about was just the next decade going well.
Rob Wiblin: Because of the Cold War.
Will MacAskill: Because of the Cold War, exactly. It was the early ’60s. They wanted to win a propaganda victory against the Soviets. They wanted to be able to basically broadcast liberal values — in particular to developing countries that they feared might otherwise fall to communism.
Again, this is part of just generally getting people more alert to the possibility of an intelligence explosion — having this norm of, “We’re going to be going through this crazy period of time. We have institutions, and they last over this period.” Then we will be much more enlightened; we will have navigated whatever tricky stuff we needed to navigate over the following 10 or 20 years. Then we just kind of get together, and there’s some process for deciding the institutions after that point.
Rob Wiblin: Yeah. So that was delaying a lock-in. That’s not so great. What about this other category of, assuming that we do end up locking ourselves into something relatively soon, how could we make that go better rather than worse?
Will MacAskill: Again, lots of things here. I was just talking about Intelsat as a model for a potential project to build AGI. I think in general, both research work on the design, but also advocacy in order to ensure that whatever project builds AGI is well governed, that could be very important.
There’s a pretty plausible argument, especially if we get this fast and quite sustained software-only intelligence explosion, that that project — could be a company, could be a government project, could be a multi-government project — could end up as a de facto world government. De facto rather than de jure, because it’s not like anyone voted them in, but de facto because they would just have an enormous amount of AI labour. In this scenario the AI is extremely superintelligent, it gives a decisive strategic advantage, and that project would get to choose how the AI is aligned.
So getting that right could be a really pivotal moment. Things are maybe looking particularly dire for progress on that at the moment, but I think it still needs more work.
There’s questions around digital rights. One thing is just putting the issue on people’s minds. There’s some discussion of digital consciousness, but it’s still very fringe; and the idea of rights for digital beings gets even less attention, as far as I’m aware. I think there’s absolutely a niche for some sort of public intellectual to be talking about these topics, because they will become more and more of a topic of conversation.
Rob Wiblin: Yeah, it’s funny that more people aren’t taking this opportunity. There’s just so many things that are predictably avant garde and weird now that are obviously going to become less weird and more relevant and more discussed in future. Why not get in early and make yourself the person who is concerned about this thing that obviously is going to be a bigger deal a couple of years from now and you’ll look like a visionary? Even if you were completely derivative.
Will MacAskill: Yeah, 100%. And of the things one can do that I talk about, some are about AI, and they get this extra bonus in terms of how promising they look precisely because they scale in this way. So if you set up a beneficial AI company, you do well in exactly the worlds in which AI is going super fast, which are the worlds we care most about. Similarly if you’re talking about digital rights: you seem prophetic in exactly those worlds we care most about.
Other issues like space governance don’t have that same tethering, because with space governance, nothing’s really happening on it and then suddenly a lot is happening on it. It might become relevant earlier, because, especially with AI advice, people are starting to foresee much better what happens. But it’s still less of the case that you build up more capital — whether that’s social capital or credibility or financial capital — in the worlds we most care about.
So we were saying digital rights. There’s both just advocacy — and it’s not advocacy for any particular thing, but just like ringing the alarm.
Rob Wiblin: “This is a nightmare, and no one is talking about it. It’s obviously coming up.”
Will MacAskill: Exactly. Nick Bostrom and Carl Shulman have some great, very early work on the topic. As they highlight, I think it’s maybe particularly important for these people to be sensitive to all of the different issues that AI will pose, including takeover risk. Because you could imagine the person who organically starts filling this role being just a single-issue ideologue.
Rob Wiblin: One way or the other. I was going to say it might be slightly challenging to fill the niche if you’re just saying, “Here’s an important issue; I’ve got no idea what to do about it” because that just isn’t as memeable. I feel like the media is less interested in hearing about that than some dogmatic conclusion.
Will MacAskill: Yeah. Although that’s true across the board if you’re trying to be a public intellectual or something. There’s two things in fact. Firstly, all the selection pressure is pushing you towards saying simplified, memeable stuff. But then second, you will just be saying a distribution of things.
Rob Wiblin: Even among the things that you say.
Will MacAskill: Exactly. Some of them will happen to be maybe a bit less measured. Maybe you’re just a bit annoyed at the time or something. It happened to me recently. And that’s the stuff that will go more viral; that’s the stuff people will pay attention to. So it’s very hard to be a nuanced public figure.
Rob Wiblin: Are there any other things that people should become public figures on?
Will MacAskill: I mean, all of these. We’re talking about lots of big issues. The ones I think are most important, other than misalignment and bio, are concentration of power, digital rights, space governance, epistemic disruption.
There’s a long tail, I think, once you dig even further into these specific grand challenges, where I just want at least one person with deep expertise on each of them. So one person is the nanotechnology person, one person is the commitment technology person, and so on. Part of that is just the research to understand the issue better, but then secondly it’s being known as “the X person.” Especially for something that gets almost no attention, I think there’s potential for huge impact there.
AI takeover might happen anyway — should we rush to load in our values? [02:25:29]
Rob Wiblin: So just to back up, we’re discussing if there is a lock-in relatively soon, how can we make it go better?
You talk about how one approach would be doing better AI “value loading” — where I guess you try to make sure that the most powerful AGIs are taught firstly to reason and think about what is good and what is right, and also potentially taught to care about doing what is right for its own sake, rather than merely following whatever instructions they’re given.
That immediately poses a problem that you would be specifically training them to not follow human instructions and to do what their own internal reasoning tells them is good or right. This increases the chance of takeover, surely?
Will MacAskill: You could do both, is the thought. So you can imagine someone who’s a morally concerned citizen. Think of a human. Actually let’s even imagine they’re an ideologue or something. They’re just super libertarian; they just want the US to become a libertarian paradise. So they’ve got this strong goal — but they’re only willing to work within the bounds of the law; they’re not willing to be deceptive and so on. I’m kind of OK with that. I’m very scared of the libertarian who’s willing to do anything and is much smarter than me.
In the case of AI, you have corrigible AI. Like I said, I want it to be risk averse as well, so it doesn’t even get that much payoff from taking over. There’s other things you could do. Maybe it’s myopic, it discounts the future very heavily. So again, it doesn’t get that much payoff from a takeover. But then it’s also corrigible. Perhaps it’s got nonconsequentialist restrictions. It just really doesn’t want to lie, really doesn’t want to do anything violent. Also, it just knows it always checks in with you.
But nonetheless, there’s still a difference between, if I’m asking that AI for advice, for example, does it just tell me whatever it thinks I want to hear? That’s one thing. Or maybe what it thinks I would want to hear if I were sufficiently reflective? That would be a lot better. Or thirdly, it’s just a virtuous agent with its own moral character and is willing to push back?
There’s obviously a spectrum here. But I could be asking it to do something that is immoral but legal, and not within its guardrails and so on, and it might say, “I am the AI. I’m ultimately going to help you, but I really think this is wrong.” It could do that, or it could refuse.
So there’s a big spectrum, I think, within the AI character. And I think that’s really important for two reasons. One is because I do expect there to be a lot of relying on AI advice. And I actually think it will matter what the character of the AI is like when human beings are relying on AI advice.
In the same way as if you imagine the president surrounded by cronies and yes-men and sycophants, that’s going to result in one outcome. You could imagine another one where the president is surrounded by expert advisors; they have somewhat different opinions, perhaps, but they’re willing to push back.
Rob Wiblin: I guess ultimately they might do what the president wants if the president really is just not persuaded by what they say, but they’ll attempt to provide good advice.
Will MacAskill: Yeah. And then there’s also kind of shades to the “aligned with what?” question — of just what we want it aligned with. You know, you could have this perfectly aligned AI that does all sorts of very harmful things. But you can also build in more constraints, such that even if something is legal but immoral, it still refuses to do it.
But there’s two reasons I think this avenue is promising, and they have somewhat different implications. The first is easier to understand, which is just how it affects our social epistemology and decision making.
The second one is kind of a Hail Mary to guard against the possibility of AI takeover. If AI does take over, there’s still this huge range of possible outcomes. One where it kills everybody. Second, where it disempowers humanity but doesn’t kill everybody, and in fact leaves perhaps humanity with a lot of resources and we are all personally much better off than we were; it’s just that we’re no longer in control of the future. Or perhaps it tortures all of humanity because maybe it’s even a sadist, or it just uses us all in experiments.
Or it goes and does something just really good, in fact, with the future. It’s like a moral reasoner itself, figures out what’s correct, judges that humanity was on the wrong track, and does it all. It’s still misaligned, it still takes over, it’s still bad; it’s not the outcome we want.
But there’s this huge range of possible outcomes, even conditional on AI takeover. It’s not just like a one or a zero.
Rob Wiblin: Right. I guess the people who’ve been worried about misalignment have tended to think that the AI would just end up with these completely garbled, completely random preferences. But because it seems like we’re more likely to get alignment with something comprehensible by default, you’re saying that actually now we should think about, at the point where the AI goes rogue and does actually take over, it kind of matters what preferences we happen to have loaded it with at that stage.
Will MacAskill: Exactly.
Rob Wiblin: And it does seem like we will have some ability to align it with something.
Will MacAskill: Yeah, exactly. So we could be paying attention to, in particular, is it aligned with some very particular set of values, or is it aligned with a kind of reflective process?
If it’s aligned with a reflective process, that’s like a larger basin of attraction, potentially a larger target. Because you would hope at least that people from many different starting positions in terms of worldviews would all reason into a similar, more enlightened view. And then also it just means that maybe that AI would in fact reason in that way.
Rob Wiblin: Just imagining an AI going rogue and taking over, and then it does further moral reflection and then changes its mind and is like, “Oh, sorry, that was actually not the right thing to do. I’m going to give it back.”
Will MacAskill: It happens with humans. Sometimes at least. And in particular, we’re just now in a stage where it really does seem like the AI’s values probably are going to be just a big mix of different things.
Rob Wiblin: Because we’re teaching them to make difficult tradeoffs a lot of the time. That’s what we do a lot of the reinforcement on, like how do you navigate between honesty versus rudeness versus annoying shareholders?
Will MacAskill: Yes. And it’s kind of starting off from human data and it’s being built for a human world.
But then also there might be some particular things that you really want to train it to do that are a bit orthogonal to all of this. So perhaps what decision theory it has.
Rob Wiblin: Can you quickly explain decision theory? No, don’t do it. [laughs]
Will MacAskill: There are various, somewhat esoteric arguments — that you can deal with on another podcast — where depending on the decision theory that it has, maybe it treats humans in a much nicer way, even if it takes power afterwards. I think it’s just one way of implementing the AI being a nice person, even if it’s kind of power-seeking.
Rob Wiblin: OK, so that’s been a whole lot on, if we get locked in, how to improve how that lock-in goes. Are there any other interventions that we should be thinking about making around now that don’t fit into either of those buckets that we’ve been talking about?
Will MacAskill: There are cross-cutting things as well. And I should say there’s way more ideas that we didn’t have time to talk about.
Rob Wiblin: We’re merely scratching the surface of all of the ideas that you’ve thrown out there in this enormous paper.
Will MacAskill: It’s a lot, and they need more depth. It’s shallow.
But yeah, there’s cross-cutting things. I’ve talked about this a bit, but one is just spreading some of the important ideas — so just what the intelligence explosion is, where we are, what challenges we’ll face. Actually trying to get buy-in on that ahead of time.
Second is just empowering more responsible actors. No matter how much we try to distribute power, a bunch of people are going to have a lot of power at a very crucial point in time. They’re going to be making really consequential decisions. I want those people to be humble and cooperative and farsighted and morally motivated and communicative and accountable and so on.
Rob Wiblin: [laughs] We’re pretty set on all of those points, I would say.
Will MacAskill: Yeah, luckily this one actually is already taken care of and we really don’t need to worry about that one.
Rob Wiblin: Yeah, but if things get worse, then we could intervene and try to make them better.
ML researchers are feverishly working to destroy their own power [02:34:37]
Will MacAskill: And that’s something that people can influence in terms of who you vote for, which companies you buy products from.
Machine learning researchers, they currently have an enormous amount of power that they will predictably lose.
Rob Wiblin: That they are feverishly working to keep away from themselves.
Will MacAskill: Yeah. So there’s actually this amazing confluence of self-interest among them and potential benefits for the world — because this is a way in which we can move some amount of power away from just the company leaders, or, if governments get more involved, away from the governments running the project. You could imagine this kind of union of concerned computer scientists — probably informal, but just a group saying, “We will only build this thing that will replace us under XYZ conditions.”
Rob Wiblin: It’s only a few thousand, maybe like 10,000 at most, who are relevant to this, and are likely to be relevant by the point that these decisions have to be made.
Will MacAskill: Yeah, exactly. And that’s just a way in which it could be another lever basically to steer things in a good direction so that it’s not wholly just economic and military incentives driving the shape of the technology.
Rob Wiblin: Just to spell this out a little bit more clearly for people at the back, there’s a few thousand ML researchers who currently have an enormous potential influence over the future because they have all of this intellectual knowledge. They’re the people who, over this relevant short period of time, are the ones who are going to be automating AI research and setting off this chain reaction, the intelligence explosion.
They are currently working very hard to automate themselves to the point where they are no longer required for this process and the AI will be able to entirely do AI research better than they could all by itself. At which point their leverage completely evaporates and they no longer have any control over anything. Currently they are working to bring about their own disempowerment while asking for nothing other than a salary in the meantime. I think that they should bargain for more.
Will MacAskill: That’s very well put, Rob. It’s very well put. And I should say ML researchers are often just good people, nice people.
Rob Wiblin: Many of my best friends are ML researchers. Please, folks. Please.
Will MacAskill: Exactly. And there will be very hard decisions that will be much easier to make if there were some sort of informal community or union or something like that. That could be, “Hey, this company is just not holding itself to high enough safety standards. We’re leaving.”
Rob Wiblin: “This company stinks. We’re going to go to a better company doing the same thing” — literally probably on another block down the road.
Will MacAskill: Down the street. Or secondly, in cases with much larger government involvement, saying, “I’m only going to work on this project under XYZ conditions.” And the conditions can be a really low bar.
Rob Wiblin: It can be, “We won’t build billions of killer mosquitoes.”
Will MacAskill: “Any AI we train has to be aligned with the US Constitution” — if it’s built in the United States.
Rob Wiblin: “It wouldn’t assist with a coup.”
Will MacAskill: It wouldn’t assist with a coup. Pretty low-bar stuff. But yeah, that lever does not currently exist, as far as I can tell.
We should aim for more than mere survival [02:37:53]
Rob Wiblin: All right, just to recap, we’ve talked about why to expect an intelligence explosion, and in general terms, what that would look like: massive speedup in scientific advancements and just lots of chaos, lots of things going on very quickly. We talked about what sort of grand challenges are generated by that. And we’ve been talking most recently about various different approaches, various different things we could try to lean on to make that time tend to go better.
We’re going to draw a line under that — that’s been “Preparing for the intelligence explosion” — and do part two of the conversation, which is on a different paper that is called “Better futures.” My take on it is it’s about ways to improve the future that don’t involve preventing extinction.
What were you trying to accomplish with this paper?
Will MacAskill: It’s really kind of an essay series, and I’ll say it’s still work in progress — so we would hopefully publish it maybe a couple months after this podcast gets released.
And it’s describing and really working through kind of a subphilosophy within longtermism. There are two ways you can benefit a person’s life: you can stop them from dying if they’re having a good life, so that they live longer; or you can increase their quality of life. Similarly with civilisation as a whole, you can stop humanity being destroyed, such as in a terrible pandemic; or even in those scenarios where it lasts a long time, you can make that civilisation better rather than worse.
And for a long time I’ve had the view that, from a longtermist perspective, the action is much more in making the future better conditional on survival — sometimes called making “trajectory changes” — than it is in preventing extinction. And that’s for a few reasons that I’m sure we’ll get into.
So in this essay series we wanted to really work through that. One goal is just putting down fully in words the reasons why I think some of these things. Secondly, in the course of that, actually doing the reasoning myself on which of those views are correct and what they entail. And then finally, hopefully, that results in prioritisation.
We’ve just talked about this litany of grand challenges and all these different things you could be doing to make progress on them. Well, if we have a better kind of theoretical understanding of what’s most important — and in particular on what sort of society we want to get to after AGI — then perhaps that means that, given the extraordinarily scarce resources that the AI safety and broader effective altruism communities have, we can target the very most important challenges.
Rob Wiblin: So I guess you’re thinking of reducing extinction risk as kind of a baseline, or that’s something that you want to compare many other things to. What are the other interventions that you’re making that comparison with?
Will MacAskill: I guess there’s two ways of answering that. One is just that you could try to make worse-than-zero futures not happen. So the suffering-focused ethics folks are particularly worried about that — in particular, they’re worried about the very worst outcomes.
Or alternatively, you could try and take futures that are pretty good, but maybe mediocre or lacklustre in some way — or they’re still better than nothing, but we might call them dystopian — and instead try to get from there to something that’s really close to as good a future as we could feasibly achieve.
And it’s that latter thing that I’m particularly focused on. I mean, describe a future that achieves 50% of all the value we could hope to achieve. It’s as important to get from the 50% future to the 100% future as it is to get from the 0% future to the 50%, if that makes sense.
What I’m saying is true just by definition, but it’s still somewhat unintuitive — because you really might think we just want to avoid extinction, and then whatever we get would be really good in absolute terms or something. But I think if we’re really morally serious, we should also be thinking about how we get not just to a kind of mediocre outcome, or something that’s in some ways good and in some ways dystopian, but actually to something that’s close to as good as it could be.
Rob Wiblin: Yeah, OK. So as I understand it, I guess many people have been focused on preventing extinction, sort of getting us off the zero point so we could have some positive value, because there’d be sentient life that’s still an ongoing concern. You want to say that that has been a very big focus, but we’ve missed the opportunity to maybe go from whatever default outcome we get if we survive to something close to the very best possible future that we could have.
I suppose the value of that sort of work will depend a lot on how good you think the future will be by default, if we merely avoid extinction. If you think that by default we end up creating something that’s incredibly mediocre, that’s only a hundredth or a thousandth the possible value that we could achieve, then going from that 1% of the value to 100% is actually way more useful than going from the zero point to managing to get 1% by surviving but doing something kind of lame. Is that right?
Will MacAskill: Yes, that’s exactly right. Although I will say, even if you think by default we get something that’s half as good as the best thing that we could realistically achieve —
Rob Wiblin: The second half is still pretty good.
Will MacAskill: Well, it’s just as important, in fact. It becomes just as important to get from that halfway point to the best point as it is to just avoid extinction and get to the halfway point.
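To put the arithmetic in symbols — this just restates the definitional point, with an illustrative variable v for the fraction of the best feasible value a future achieves:

```latex
% Let v_d \in [0, 1] be the value of the default surviving future, as a
% fraction of the best feasible future.
% Gain from avoiding extinction (going from 0 to the default future):
\Delta_{\text{survive}} \;=\; v_d - 0 \;=\; v_d
% Gain from improving the default future to the best feasible one:
\Delta_{\text{improve}} \;=\; 1 - v_d
% If v_d = 0.5, the two gains are equal; if v_d = 0.01, the gain from
% improvement is 99 times larger than the gain from survival alone.
```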
Rob Wiblin: Yeah. Why do you think it is that that sort of work of going from something to the very best has been kind of ignored or not that much discussed?
Will MacAskill: I think there’s a couple of things. One is that I think there is a bit of intellectual path dependence. Nick Bostrom, in his paper “Astronomical waste,” works through — he’s not a utilitarian, but on the assumption of classical utilitarianism — what’s the most important priority? And he argues for a principle called “maxipok” — maximise the probability of an OK outcome — where an OK outcome means avoiding any existential risk. So I think that’s been a kind of stage-setting for the intellectual development on existential risk that followed.
However, even in his own paper, I don’t think that really follows, because he gives three possibilities — one is extinction; a second is a future filled with biological beings; a third is a future filled with digital beings — and says the really important thing is that we avoid extinction. Whereas if you’re taking his hypothetical seriously, assuming classical utilitarianism, the really important thing is either getting to the future filled with digital beings, if digital beings have value — or, if they don’t have value, ensuring that you get the future filled with biological beings rather than the digital-being future or extinction.
So actually you’ve got these three options from either perspective. One has lots of value and the other two have essentially zero value. It’s not that the two big futures have lots of value and extinction has none. Again, assuming utilitarianism as he does.
Rob Wiblin: Bostrom’s no fool. Why did he end up suggesting this probability of an OK outcome? I sort of understood him to be saying we want to maximise the probability of an OK outcome in the medium term, so that we preserve the option to do the very best thing later. Maybe that’s what he meant, but that’s not what people heard?
Will MacAskill: Yeah, this gets clarified — this has all played out over many years. He did clarify later that the thought is that if you’ve got a society that’s sufficiently good, then probably you end up converging on something that’s close to the best.
I think a similar view is held by Toby Ord, for example, and a number of others too. So I think if you’ve got that kind of background view, then it makes that idea really quite likely: you just avoid the worst catastrophes now — things that would count as true existential catastrophes — and then we’ll end up just converging to something that’s really going to get you close to the best.
Rob Wiblin: Yeah. All of this is within the framework of consequentialist thinking, where we’re imagining that our goal is to maximise the expected value of the future or something along those lines. I guess people have lots of different reasons for the work that they do.
One that’s very salient to us now is that we potentially don’t want to die ourselves; we don’t want the next few years to be a disaster, and for everyone to end up dead, or our families to end up dead. So that’s maybe another factor that has pushed in favour of the extinction work.
But people also just have other moral frameworks, where they’re like, “It would be negligent. I don’t want to be the kind of person who stands by while we’re about to have this disaster.”
Will MacAskill: Yeah. And also just the neartermist view. I mean, with my neartermist hat on, I’m extremely in favour of work to reduce extinction risk. Just the sheer benefit-cost, it looks very high priority.
And then you’re absolutely right that especially in more recent times — and this was a criticism of What We Owe the Future — lots of people were saying, “We actually don’t really particularly care about longtermism; the reason I’m concerned about existential risk is for near-term reasons.”
And that’s great. I mean, I’m encouraging of a plurality of plausible moral views. But if you are a longtermist — you care about how things go not just now, but for future generations too — there’s in fact an argument here: people really don’t want to die, so there’s going to be a lot of investment in trying to reduce the risk of extinction. Whereas many people do in fact want to, for example, lock in their own narrow ideological values.
So that suggests that actually the probability of a kind of bad lock-in might be quite a bit higher than the probability of extinction. And moreover, which is going to be more neglected, even once that’s taken into account? Is it going to be the stuff that very viscerally might kill you — which also comes with lots of smaller, more likely catastrophes, such as pandemics? Or is it going to be potentially super niche, weirder stuff like digital rights and space governance?
So if you’re coming from the perspective of taking this “it really matters how the long-term future goes,” those are at least heuristic arguments for thinking you might want to focus on those other things.
By default the future is rubbish [02:49:04]
Rob Wiblin: Do you want to make the case that by default — assuming that we navigate through upcoming times, and we don’t go extinct — that in practice we’re likely to do something that from an ethical or moral perspective is very mediocre, that only achieves quite a small fraction of the total value that in principle could be generated by the universe?
Will MacAskill: Sure. I think there’s a couple of ways of getting to this conclusion. One is just failing to do the very best thing, and second is you just do one or more bad things.
To illustrate both, potentially, we can just consider the world today. The last few centuries have resulted in this enormous material abundance. We’re hundreds of times richer than we were even just a few centuries ago.
And you can ask the question: for this year, how well have we done at turning that material abundance into good outcomes? I say we’ve done very badly. In fact, we have utterly squandered the wealth that the scientific and industrial revolutions have given us.
That’s for a couple of reasons. One is in the sense of failing to make the most of it. Partly that’s because global inequality is so extreme, so we’re spending a lot of resources in ways that don’t really improve people’s wellbeing very much at all compared to other uses. There’s also just all sorts of dumb ways in which society is structured. People spend loads on consumer goods that don’t really benefit them, so I think there’s a lot of squandering in terms of upside.
But then the bigger thing is the introduction of bads as well. Just one moral error — namely, how society treats nonhuman animals, and in particular the tens of billions of animals that are killed in factory farms every year, living lives of terrible suffering — that is, at the very least, enough to undo or outweigh most of the positive things that have arisen as a result of human living standards getting higher. Maybe it’s even enough to completely outweigh it, such that things have gotten worse. But either way, it at least undoes most of that good.
So in fact the world today is achieving really a small fraction at best of the value it could have been achieving. And the same thing might happen in the future.
When I start thinking about this, I really start going through different views in moral philosophy. So you can start off thinking about what I call “linear in resources” consequentialist views, where total utilitarianism is one example. Basically, the more resources that you have, the more good stuff you can produce. So if you’ve got a good civilisation, and it’s twice as big, it’s twice as good.
On this view it seems like, wow, it’s going to be really quite hard potentially to get a near-best future, something that is say 90% of the best. Because you need to do two things. One is just use as many resources as possible, everything you can access — which is really a lot. There’s 8 billion accessible galaxies and so on.
Secondly, it also needs to be the case that whatever you’re using those resources for has the highest value per unit of resource. And that could be, who knows? There’s a lot of things you could potentially do with resources and processes. It could be these tiny minds; it could be these huge minds, galactic scale; it could be varied, changing processes and so on. But out of the extraordinarily large number of ways you could organise mass-energy, there’s probably going to be one that’s the best. And anything else, even the second or third best, might be significantly worse. So that’s why it’s plausibly quite a real challenge.
Rob Wiblin: It’s a heavy lift.
Will MacAskill: Potentially a heavy lift. You’ve really got to aim at this one specific thing. I should say, as I’m going through these views, I’m not defending any one of them. And in fact, it’s a notable thing that most theories of population ethics that are on the table have this conclusion that the best possible future is a basically homogeneous future — where you’ve got something that’s the best thing, at some scale, and it might be quite complex within that, and maybe the scale is large, like a solar system or something.
Rob Wiblin: And then you make more of it.
Will MacAskill: Once you’ve got that, just make more and more of it, because that has the highest value per unit of resources. And some work in progress we definitely won’t have time to talk about is that I find that conclusion particularly terrifying. So I have this new theory of population ethics that I was working on. You know, if we had more time, it’d be the most enjoyable thing I could do, but that’s paused for now. But a theory of population ethics that I think is quite compelling, but avoids that implication.
Rob Wiblin: OK, yeah, interesting. Your computer can come back to that at some point.
Will MacAskill: Exactly, yeah. The AI will do it in a couple of months or something.
So that’s a linear in resources view. And then you’ve got other views. For example, views on which it’s not that hard to get to the upper bound, and views on which there’s a much richer kind of diversity of goods.
Part of what my work has been doing is going through all of them based on these kinds of structural features, in order to see if you can get the implication that, on quite a wide variety of different views, quite a wide variety of ways the future could go would give you something that is close to the best anyway.
And I ended up concluding kind of no. A significant thing is, even if it’s easy to get the upside — because you just say, you’ve got an Earth full of amazing, flourishing beings who are all free and able to do what they want — even if that’s close to the upper bound, you also need to make sure you don’t have bads too.
And this is, I think, part of common sense. I could tell you a story of this amazing utopian future, but it just has one flaw: it’s based on some rigid racial hierarchy. Or, like in The Ones Who Walk Away from Omelas, it’s founded on the suffering of a single child. I think it’s common sense that you lose a lot of value there — that that’s enough to turn it from a utopia into something at least semi-dystopian.
But then also formally: you’ve got all the goods, and they’re close to the upper bound, but you’ve got this really big civilisation, and it has some bads too — even just occasional bads like suffering. Then, just given how big the civilisation is, that’ll take you far away from the upper bound again.
Sorry, that got a little more involved than you might have been intending.
No easy utopia [02:56:55]
Rob Wiblin: So you’ve just given us a quick summary of this paper that will eventually come out, fingers crossed in time for it to be useful, called “No easy utopia.” I guess you argue that sort of easygoing views — where it’s not that hard to achieve the best or a near-best future — that those are not really plausible, or they run into all kinds of different problems.
I think the more straightforward way of seeing that is just to look at the big moral disagreements that we have today, and realise that basically if we get any one of these things wrong, then that could make the world negative — or at the very least it would destroy, or enormously reduce, the value that we’re generating.
It’s just almost any one you can name. So if abortion is really wrong and in fact it’s equivalent to murder, then that would surely be like a massive stain on human civilisation as it exists now. Conversely, if it’s not problematic at all, then all of the restrictions on it would also be very bad.
The destruction of the environment: if that’s as bad as some people say, that would be extremely bad.
I guess if religious folks are right and there’s many things that are contrary to what God would want, then presumably we’re committing many infractions.
And you can go on and on down the list. So it seems that on most people’s views, value is quite fragile. And if you get even a few of these different debates wrong, and then you scale that up enormously, then you’ve fallen quite short.
Will MacAskill: Absolutely. And in fact the situation we’re in at the moment I think gives us lots of false cause for reassurance.
Take theories of wellbeing, for example. In philosophy there’s three main theories of wellbeing: hedonism, which says positive and negative conscious experiences are what’s of value for someone; objective list theory, which says that other goods are good or bad for you intrinsically beyond that, like having friends, having appreciation of beauty, having knowledge and so on; and then preference satisfaction views, which say that having your preferences or your idealised preferences or something satisfied, that’s what’s good for you.
And in the world today, these line up really pretty closely. So if you and I work together on research for Giving What We Can on what are the most effective charities to benefit people in the developing world, the difference between these views really doesn’t impact practical recommendations. Because having basic health, for example, satisfies your preferences, it makes your conscious experiences better, and it improves your objective goods as well.
And that’s just systematically the case, because often what we’re giving people are these instrumental goods, what Rawls called “basic primary goods” — things that are just good, whatever your other goals are.
But once we’re deep into technological maturity, and once we’re creating beings, designing them to have whatever preferences we want — which is what we’ll be doing after AGI — these things can come radically apart.
Rob Wiblin: Do we create a new being that has extremely strong preferences that we satisfy? Or do we create a being that doesn’t have any particular preferences but feels very good? Or do we create a being that is capable of having friends and a flourishing life, and all these other things, which might be completely different from the first two?
Will MacAskill: Yeah, exactly. So from each theory of wellbeing perspective, the best future on the other theories would be basically worthless, or lose out on most value.
Rob Wiblin: Yeah. I think probably part of the reason why people disagree about whether it’s subjective wellbeing, preferences, or objective list is because in the world as it exists now, with humans as they are, it’s actually kind of hard to tease apart these different things. They all look quite similar or they all tend to come as a package.
It’s possible that at the point that we can separate them and just have one and not the others, maybe that would lead to convergence on which one we view as most beneficial. Or why do you think it is that people disagree so much about which one of these things ultimately is the thing of intrinsic value?
Will MacAskill: Well, there’s this huge question of whether we should expect people to converge morally. Again, it’s something I want to write on. And part of my views… It’s funny, because I’m more realist sympathetic, and normally people who are more realist sympathetic are more inclined to think that there would be convergence.
Rob Wiblin: Because they think there’s a fact of the matter. It’s more like a scientific or empirical question.
Will MacAskill: Yeah, exactly. Whereas all the subjectivists I know, they think that what makes something right or wrong is just what I would want, or want myself to want on idealised reflection. Often they seem to think that you’ll get convergence.
And I’m baffled by this. If I’m a subjectivist, I expect radical divergence between different people based on just subtle and arbitrary facts about their psychology, essentially. Because there’s nothing constraining you, ultimately.
Rob Wiblin: Well, I guess we have similar genes, we have a similar ultimate design. But it does differ.
Will MacAskill: Yeah, we do. But take even people who are similar: they’re both classical utilitarians, so they add up happiness in the same way; and they both agree that positive and negative conscious experiences are the things that are of value. But I like these conscious experiences and you like these conscious experiences. There’s no fact of the matter beyond our preferences about which are good or bad.
So why on Earth, out of all the conscious experiences that there could be — which there really are a lot — do we agree on which are good and bad? But then more importantly, why do we agree on the quantities of goodness? Because again, remember that what you end up doing as a classical utilitarian is figuring out what has the most value per unit of resource. If you and someone else differ on the magnitude of goodness from different conscious experiences, even by a bit, you will differ utterly on what that highest-value-per-unit-of-resource thing is.
Sorry, I’m ranting. But if you’re the subjectivist, then there’s nothing constraining you ultimately between those two positions. And we were talking about the narrowest difference. Once we start thinking beyond, to other forms of consequentialism, other theories of wellbeing, I think you’d end up with very different views indeed.
Rob Wiblin: OK, we’re getting into some philosophy that I didn’t necessarily intend us to get into there. Let’s bring it back to this overall issue of discussing ways of going from OK futures to the very best.
Actually, while we’re still on the philosophy and the subjectivists, should the subjectivists and nonrealists about ethics and morality care about this entire agenda? Because they might not think that there really is such a thing as a very best future, other than just what they happen to like. So is this kind of work relevant to them?
Will MacAskill: I think basically 100% yes. There are definitely implications depending on which metaethical view or views you think are most likely. But basically, if you think that some outcomes could be bad, some outcomes could be good, some outcomes can be better than other outcomes, then you will in fact have views about, “This outcome could be really good, and this outcome could be only a little bit good.”
You might have an easy utopian view or something where you’re not as concerned about getting the very best, and you think lots of things are really good. But at least if you’re concerned about consistency of your preferences — and maybe you’re not, that’s actually fine; that’s one way of responding to this whole thing — but if you are, then you’re going to have to respond to some of the arguments I’m making about why those easy utopian views are quite hard to come by.
It will still be the case that you’ll want to grade different futures in terms of how good they are and how bad they are.
Rob Wiblin: And the differences could be very big.
Will MacAskill: And the differences could be very big. I think it’s a perfectly common sense position to think that a future with extreme concentration of power might be really bad. Well, what if there’s distribution of power, but people are just kind of evil, and they like promoting suffering? And I think that’s clearly bad.
What about other moral mistakes? I think we can often dodge a lot of metaethical arguments by just saying that how we’re going to evaluate things is just by what we would want things to be like if we were idealised: we had loads and loads of time to reflect, and we had really big brains where we’re able to hold all knowledge in our heads and so on.
Because the realist will say that’s the best thing we’ve got to track what’s actually good and bad. And the subjectivist will say that this is what we’ve got in fact.
Rob Wiblin: Yeah, I see. It’s just the preferences that make it good.
Will MacAskill: Yeah, exactly.
Rob Wiblin: I guess all of the analysis basically looks the same; you’re just measuring it against a bar that is justified differently.
What levers matter most to utopia [03:06:32]
Rob Wiblin: OK, let’s push on to some of the modelling that you did. So there’s different approaches that one could take to improving the future. Of course, preventing extinction. There is trying to go from a bad future to a neutral one, a neutral one to a really good one. You also break it down in terms of trying to delay lock-in or improve lock-in, should it occur, and different interventions one might make.
And you go through this slightly complicated modelling exercise, where you theorise about this and look at how these things could interact with one another. If you go extinct, then you can’t get locked in. If you get locked in, that might suggest that the future will be better or will be worse. Then you basically try to use that analysis to figure out which one of these things probably is going to be the most influential, and are some of these things important in ways that people haven’t appreciated or not important in ways that people haven’t appreciated?
Can you explain the modelling you did?
Will MacAskill: Sure, yeah. It was just a very simple way of modelling the present and the future, both its trajectory and its value, kind of loosely inspired by a Markov process.
But the key idea is that there are two periods: this century and future centuries. Very simplified there. In this century, we could either have an open future, where it’s not guaranteed where we end up; we could go extinct — and if we go extinct, we’re guaranteed to stay extinct; we could have what I call “viatopia” — “via” as in a waypoint: a state of society such that, if we’re in it, we’re very likely to end up in a near-best future and very unlikely to end up in a dystopia. This is among the most important concepts I want to promote in this series. Or we could have what I call a “foreclosure,” where if you’re in that state, you’re extremely unlikely to end up in a near-best future.
Rob Wiblin: So you won’t go extinct, but you’re going to produce a mediocre future. Very high probability.
Will MacAskill: Yeah. I tend to call that a “mistopia,” like misfire, but also like a missed opportunity, which is etymologically distinct but helpful all the same.
And then there’s the future centuries, where we can end up either extinct; end up in this kind of utopia, near-best future; end up in this mistopia, which includes things that could be worse than zero as well; and where, if we’re in an open future now, so we haven’t got locked in, then ultimately we end up in one of those states too.
Then the three things that we can do this century are: reduce the chance of extinction and increase the chance of everything else; decrease the chance of lock-in, so increase the chance of an open future and decrease the chance of everything else; or make it more likely that, given that there’s a lock-in, it’s viatopian rather than a foreclosure.
And this is a simplification in all sorts of ways. I talk about lock-in, even though there’s a more general concept of path dependency — which doesn’t guarantee a particular outcome, just makes it more likely. I’m also just dividing good state and bad state, essentially, and giving them values — whereas of course there’s a big difference between a future that’s merely 89% as good as it could be versus something that’s substantially negative.
It’s obviously weird to be doing ethics on a spreadsheet, but I actually found this modelling exercise remarkably elucidating, even just for my own views, to be reasoning through some of this stuff.
Rob Wiblin: Yeah. I guess it seemed to me, just scanning it, that intuitively most of those simplifications wouldn’t flip the results, or they wouldn’t really cause the stylised facts that you were going to produce to be wrong. I guess that’s your view as well?
Will MacAskill: Yeah, at least I hope so. And if they did flip the results, then it would be a bad model.
Rob Wiblin: Just a conceptual issue that occurred to me reading it is that you talk about, for example, you’re in the viatopia if you’re in some intermediate state where you haven’t yet achieved utopia, but you’re very likely to achieve utopia. You’re in foreclosure if you’re very likely to produce something that’s kind of mediocre. And you’re in an open future if you can’t tell.
But that’s kind of from one person’s subjective position, whether you can tell or not whether you’re locked into a future or not. Because I guess quantum physics aside, the universe is like, no matter how complicated and chaotic things seem to begin with, it is going to produce some outcome. So you kind of are either locked into a good outcome or a bad outcome or whatever outcome it’s going to be. And the fact that you’re in an open future is merely descriptive of your own inability to forecast what is going to happen.
Is that right? And if it is right, is that a problem?
Will MacAskill: Yeah. So I’m smiling because it’s a very philosophical objection to make. So it’s kind of right. OK, yes, absolutely true. Quantum randomness aside, you’ve got a state of society. You’re guaranteed to end up in some particular state. I think Laplace’s demon is the thought experiment of a being that knew all of the laws and could do all of the computing.
However, I’m not meaning the probabilities here to just be like, whatever your point of view happens to be — such that if you were completely off-base and thought that the world today is locked into lizard man future or something, that that counts. Instead, I think there is a case that you can have some notion of objective probability, or at least evidential probability.
So take flipping a coin. Now, if I knew enough, I would know whether the coin comes up heads or tails. But we can give sense to the very intuitive idea that it’s 50/50 by saying that you’ve got the system, which is the coin on my finger. Make just small perturbations on it in not very significant ways, and then look at the frequency of what happens given those small perturbations — where that would result in 50% heads, 50% tails for a fair coin.
Now you’re smiling. And then you could do that with civilisation as a whole, where if you’re locked in, then it doesn’t really matter what’s the weather like, or was it warmer or colder, and lots of other fairly minor points about the world today. Instead, you’d still end up in the same direction.
Rob Wiblin: Yeah, I was going to say, if you liked that pointless philosophical inquiry from me, you might really enjoy our interview with Alan Hájek on puzzles and paradoxes in probability and expected value from two or three years ago, where I think we end up concluding that probability is around these perturbations. So you’ve got nearby counterfactual worlds, and how likely are you to get the outcome that you’re describing? So I guess, yeah, seems reasonable.
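To make that perturbation idea concrete, here is a toy simulation — the physics and all the numbers below are invented purely for illustration, and aren’t drawn from the episode or from Hájek’s work. A deterministic “coin” whose outcome is fully fixed by its spin rate and flight time still comes out roughly 50/50 once you jiggle those initial conditions slightly:

```python
import math
import random

def coin_outcome(spin_rate, flight_time):
    """Deterministic coin: heads if it completes an even number of half-turns before landing."""
    half_turns = int(spin_rate * flight_time / math.pi)
    return "heads" if half_turns % 2 == 0 else "tails"

random.seed(0)
base_spin, base_time = 600.0, 0.5   # rad/s and seconds -- arbitrary, illustrative values
trials = 100_000

# Perturb the initial conditions by up to +/-2% and look at the frequency of heads.
heads = sum(
    coin_outcome(base_spin * (1 + random.uniform(-0.02, 0.02)),
                 base_time * (1 + random.uniform(-0.02, 0.02))) == "heads"
    for _ in range(trials)
)
print(f"Fraction heads under small perturbations: {heads / trials:.3f}")  # ~0.5
```

The outcome of any single flip is fully determined, but the frequency across nearby perturbed worlds is close to one half — which is the sense of objective (or evidential) probability being gestured at for civilisation as a whole.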
So what sort of parameters did you need to plug in in order to build this model, and figure out the impact of changing the different parameters?
Will MacAskill: I’m not sure I’ll be able to remember all of them. But some key things are: probability of lock-in this century; probability of extinction this century; if there’s lock-in, probability that it’s good rather than bad, namely viatopia rather than foreclosure; then, if we have an open future essentially, probability conditional on that of extinction, utopia, and mistopia.
And then the value of mistopia if we get locked in in the bad way this century — like, what’s the value of that bad lock-in? Because it might not be zero; we could get locked into something mediocre. And the value of mistopia if the nonutopian outcome occurs later, in future centuries — where, if you think things are generally progressing, you might naturally think that even if we miss out on the very best, it probably still ends up going better if we didn’t get locked in this century.
I think those are all of the parameter values. I may well have missed some.
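For readers who want to see how these parameters fit together, here is a minimal sketch of a two-period model along the lines Will describes. The function name, parameter names, and every number below are my own illustrative choices, not values from Will’s actual spreadsheet:

```python
def expected_value(
    p_extinction=0.10,             # probability of extinction this century
    p_lockin=0.30,                 # probability of (non-extinction) lock-in this century
    p_viatopia_given_lockin=0.20,  # lock-in is viatopia rather than foreclosure
    p_ext_open=0.05,               # conditional on an open future: later extinction...
    p_eutopia_open=0.15,           # ...or reaching a near-best future
    v_foreclosure=0.05,            # value of a bad lock-in, as a fraction of the best future
    v_mistopia_open=0.10,          # value of the non-utopian outcome reached from an open future
):
    """Expected value of the future as a fraction of the best outcome (1.0); extinction counts as 0."""
    p_open = 1.0 - p_extinction - p_lockin
    p_mistopia_open = 1.0 - p_ext_open - p_eutopia_open

    ev_lockin = p_viatopia_given_lockin * 1.0 + (1 - p_viatopia_given_lockin) * v_foreclosure
    ev_open = p_eutopia_open * 1.0 + p_mistopia_open * v_mistopia_open

    return p_lockin * ev_lockin + p_open * ev_open

# Crude comparison of the three levers: shift one percentage point of probability
# (or of conditional lock-in quality) and see how much expected value moves.
scenarios = {
    "baseline":          expected_value(),
    "reduce extinction": expected_value(p_extinction=0.09),            # extinction -> open future
    "delay lock-in":     expected_value(p_lockin=0.29),                # lock-in -> open future
    "improve lock-in":   expected_value(p_viatopia_given_lockin=0.21),
}
for name, ev in scenarios.items():
    print(f"{name:>18}: {ev:.4f}")
```

With these invented inputs, reducing extinction and improving lock-in both raise expected value, while delaying lock-in slightly lowers it (because here the average lock-in happens to be a touch better than the average open future) — an artefact of the made-up numbers, not a conclusion from the episode.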
Rob Wiblin: We’re a couple of hours in here, so I think our minds are getting a little fuzzy.
Will MacAskill: I think we’re over four hours in.
Rob Wiblin: Do people agree on these parameter values? Or to what extent do people massively disagree, in your experience?
Will MacAskill: Well, I would love as a followup to actually start surveying people. The one thing that I know of that did somewhat do this was a fairly informal survey done by the Future of Humanity Institute. And it was just a convenience sample; I think people at a particular conference.
And there, people were really all over the map. I can’t actually remember the details, but people went all the way from thinking the probability of getting to a truly flourishing future conditional on survival was something like 90%, down to, I think, 1 in 10 million. Similarly with the probability of extinction: I think people varied from 90% to less than 1%. They were asking different questions than I had, but it definitely showed you there’s a lot of variety.
Rob Wiblin: There’s not a consensus. I guess a critical thing, or at least an important thing, is whether lock-in becomes more likely to become positive if it’s delayed. So do we want to lock in things quickly, or do we want to push it out into the future as much as possible and allow more reflection? Do people agree or disagree about that?
Will MacAskill: I really don’t know. That wasn’t in the earlier survey. But certainly something that came out of me doing this exercise was just appreciating how neglected yet important that question is. I’m certainly coming from a perspective of thinking that we are like these children. You know, it’s as if you’ve got to decide how your entire life goes, but you’re 12, you’re 7. Look, we should have smarter, more enlightened, more informed people in the future make those decisions.
So my best guess is that if we can punt these decisions into the future, the decisions will get made better. But it’s quite nonobvious actually. Again, something that’s confused me about subjectivists is that I think subjectivists probably shouldn’t want that, because they might not be around. Or maybe they think they will be — but in the cases where they aren’t, their particular preferences won’t get satisfied.
But you also just might think there’s various sorts of essentially moral decay that might happen. So you might think it’s just much more likely that you end up with concentration of power at a later date. You might think that AI totally messes with people’s epistemics. I actually think there’s various things to do with how people would reason about or make choices about distant time and space — where at the moment it’s mainly the moral preferences that they have.
Rob Wiblin: It’s so far in the future. It feels more speculative.
Will MacAskill: Exactly. And given people’s preferences at the moment, you don’t need those distant resources to be basically as happy as you can be. However, once people are already extraordinarily well off, post-AGI, now perhaps they do want them — but as a status good or something.
Rob Wiblin: Just to show off.
Will MacAskill: Exactly. Because this is already the case. Look at the billionaires in the world. They just hang on to their money in a way that’s baffling, honestly.
Rob Wiblin: It’s not even really clear that it’s delivering much of anything for them.
Will MacAskill: No. You clearly are not getting more wellbeing in the narrow sense, once you’re past $10 million of consumption per year.
Rob Wiblin: It must be about satisfying some other preferences, or inertia.
Will MacAskill: Yeah, there’s inertia. But the obvious thing is that they’re buying this status good. They care about what position they have on the billionaires’ list, and they are willing to effectively pay tens of billions of dollars, rather than spend that money on something else, including moral goods.
So you could have this future where people are still reasoning, but now they care about distant resources just as a matter of personal status — rather than, as at the moment, taking a more ideological lens on it, in a potentially positive way.
So like I say, I actually do think probably people in the future, they’re going to be smarter, they’ll be more enlightened, there’s more time to think about things.
Rob Wiblin: Do you have any idea how dumb I am? And I’m one of the best!
Will MacAskill: Yeah, exactly. We are not equipped to be making such weighty decisions. But it felt notably understudied to me.
Bottom lines from the modelling [03:20:09]
Rob Wiblin: So what were your main takeaways from all of this analysis? Did it take you in a direction that you didn’t necessarily expect?
Will MacAskill: In a few ways. That was one, actually: the importance of that question. And to be clear, I forced myself to actually put numbers on these things, and that was pretty interesting in itself.
We started this conversation talking about two ways of putting it: the updates I’ve made in the last seven years, or the ways that I was wrong earlier on. I’m perfectly happy with that latter description, but those changes hadn’t necessarily filtered through to the rest of my views. So it was interesting for me, putting in my probability of extinction and probability of lock-in: they’re much higher than they were in the past. So that process alone was actually just very elucidating for me.
Other takeaways: one is that I think this better futures work is pretty robustly in the same ballpark of importance, at least, as reducing the risk of extinction. I think that’s true just across quite a wide variety of views.
One thing that was interesting in particular is that what I might call ordinary ranges of optimism versus pessimism about the future didn’t really make that big a difference to the bottom line. So I had always thought the biggest intellectual difference between me and Toby Ord was our level of optimism about the future. But actually, you run the numbers, and no — within these order-of-magnitude kinds of differences, it didn’t really divide us.
Rob Wiblin: Why is that?
Will MacAskill: I think Toby thinks that, conditional on avoiding the big existential catastrophes — or at least the most obvious ones; I don’t want to put words in his mouth exactly — the expected value of the future is something like 50% of what it could be.
And that seems very optimistic to me. There are various reasons why my numbers have been argued up over time, but I’d still put that many times lower — 10% or something. But he still thinks there’s 50% of value that’s up for grabs. So actually, for the conclusion to change, the level of optimism he would need is to think we capture more than 90% — like 99% of the value or something.
Rob Wiblin: I see. OK. So the reason is even the extreme optimists think we’re only going to get halfway there on average. That’s only a 2x penalty relative to someone who thinks that we’ll get zero on average. Is that right?
Will MacAskill: I’m too braindead at the moment to confirm the 2x thing, but it sounds plausible to me. Whereas there’s what I call the easy utopians where no, it’s more like 99% because just surviving and perhaps we might want to say and avoiding dictatorship and so on, but avoiding those things basically is good.
Rob Wiblin: Were there any other upshots?
Will MacAskill: One was that, beforehand, I hadn’t actually fully and cleanly delineated delaying lock-in from improving lock-in. Again, there are all these updates that just sound so obvious, but actually in my mind I don’t think it was that clear.
And then I certainly hadn’t appreciated the arguments for thinking that improving lock-in conditional on it happening could be a lot more promising than delaying lock-in — where the key argument is just that if you delay lock-in, maybe you’ve made things worse.
Rob Wiblin: Right, I see. Whereas at least if your intervention is the one where you take the locked-in outcome from something mediocre to the very best, there’s no countervailing negative. With merely delaying things, that might improve things or might make things worse, so the net effect is a lot lower.
Will MacAskill: Exactly, exactly.
People distrust utopianism; should they distrust this? [03:24:10]
Rob Wiblin: Interesting. I guess the upshot is, let’s worry a little bit less about the extinction thing. Instead what we should do is try to take whatever future we’re going to have and turn it into this utopia, the very best possible future — which, we think, is an incredibly fragile and incredibly narrow target.
I think to many people that might sound a little bit utopian. And I guess people have negative associations with utopianism, because it tends to justify bad behaviour. It can also just seem kind of naive. People who are utopian tend to be very locked into quite a narrow perspective on what’s good — that’s the sort of mindset that drives people towards utopianism in the first place.
Why isn’t this the sort of bad utopianism that has justified all kinds of bad things?
Will MacAskill: It’s a great question, and I’m very aware of the fact that I use the term utopia a lot. I spell it with an E: eutopia.
Rob Wiblin: To try to get some distance from the associations of the concept.
Will MacAskill: Yeah. Though it sounds the same. But yeah, utopia is extremely unpopular as a term and as an idea. And honestly, I think for very good reason. Utopianism was much more popular at the end of the 19th century, early 20th century. And some things happened. It didn’t age well.
But lots of the depictions of utopia didn’t age well either. My partner and I have a bit of a shared hobby of reading old utopian fiction, and it’s really remarkable the extent to which those utopias now look like dystopias, even though not all that much moral progress has happened since — it’s only a century or so, which is small in the grand scheme of things. But very often the societies are totalitarian; very often they bake in the moral blindspots of their time.
Thomas More — the person who coined the term — his Utopia had amazing abundance, this wonderful society in many ways. And every household owned two slaves. Not so appealing nowadays. Similarly, Aldous Huxley, who wrote Brave New World, also had a piece of utopian fiction called Island. Again, this very technologically advanced society, but also in touch with nature. And the adults had sex with children. It’s just like, whoa.
Rob Wiblin: Wasn’t Aldous Huxley doing that in the ’30s? I’m surprised that was the kind of thing someone would write in the ’30s.
Will MacAskill: This is definitely a digression, but even in the ’60s and ’70s, the French existentialists and philosophers were signing an open letter to say that there should be no age of consent, or it should be much lower than 16. It’s really easy to forget how quickly moral attitudes change, and actually potentially how morally different even just 100 years ago was, or even less.
So depictions of utopia often, in fact, end up looking really quite dystopian. I think that’s true from Plato’s Republic onwards. And the lesson is we shouldn’t be trying to depict some particular vision of the future and aim directly towards that. In fact, that’s terrifying. That’s what we should be guarding against.
This is why I’m trying to develop and promote this idea of a viatopia. That’s a way station instead; that’s acknowledging we don’t know what the ideal future looks like, and instead is a state that we just think would be on track. It’s good enough that it’s on track to get into a really good future.
I’ve talked before about the long reflection as an idea, that would be one implementation or one proposal for what viatopia would look like. But there’s other potential proposals too, like the idea of a morally exploratory society, which I talk about in What We Owe the Future, or the idea of a grand bargain between different value systems.
And my hope at least is that this can give a positive vision — which is extremely lacking in the world today — for the post-AGI future, without the terrifying and often totalising kind of utopian impulse that we saw in the early 20th century in particular.
What conditions make eventual eutopia likely? [03:28:49]
Rob Wiblin: What would this viatopia most likely look like? You’re making me think that we’ll have a parliament or a congress where people will send delegates, and they’ll debate what would be the best future for a very long time. And I guess they won’t take any irrevocable action until there’s a supermajority or a super-supermajority in favour of some particular view. And if they can’t reach that, then they’ll bargain things out and split things up, but also try to make sure that no one’s doing anything that other groups particularly hate.
Are there any alternative visions for viatopia, or is that kind of the direction it would go?
Will MacAskill: Yeah, there’s just plenty of detail in there. And after this “Better futures” essay series is done, the next one I plan is on viatopia in particular. So this is looking ahead to work I haven’t done, and lots of things I still feel very unsure about.
But there’s tonnes of granularity in terms of, one, in order to get to a near-best future, do you need basically most people to converge onto the correct moral view or whatever your idealised preferences would be? Or do you just need some people to converge, and then to have the ability to trade between different parties?
That’s a really crucial distinction in my view. A case for thinking that you just need some people is if, perhaps optimistically, those who aren’t particularly morally motivated have comparatively small-scale preferences. Perhaps they really care what happens kind of close in space and time. The people with moral motivation, they care about what happens far in the future and far away. So both parties can get kind of what they want. That would be the optimistic case.
It’s less optimistic if lots of people have different moral views and don’t converge — so you’ve still then got a lot of conflict between the people who do care about things that are far away and distant, and they want to do very different things for those resources.
There are also ways in which that kind of setup could go badly wrong. Because if we’re all bargaining over resources, maybe I’m incentivised to modify my own preferences to make them more worrying from your perspective.
Rob Wiblin: I see, to make them more menacing.
Will MacAskill: Exactly. So let’s say you’re classical utilitarian, and maybe I just like having more wealth for status or something, I just care about something else. But then I can self-modify in whatever way, like, “Oh, I actually really like suffering now.” Well then, the threat point, the status quo if we don’t manage to make a deal, is much worse from your perspective.
Rob Wiblin: So it gives me a reason to concede more ground. Unless I object in principle somehow to the blackmailing situation — just like, “I will never accept being treated like this or being exploited like this.”
Will MacAskill: Yeah, exactly. But it at least gets hairy. And so there are some arguments for optimism, for thinking that even if people end up with super different moral perspectives, they all do trade. And we end up with, as a result of that, something close to as good as possible.
There’s some worrying dynamics, that I have not gone deep into this, that could actually mean that this ends up really quite bad. Because the worst case then is I self-modify to love suffering, and you say, “Well, I don’t trade with blackmailers.”
Rob Wiblin: Yeah, I’m not accepting this crap.
Will MacAskill: But I’ve already irrevocably self-modified, so then I go and do it. That’s not an outcome we want.
Rob Wiblin: Let’s not do that.
Will MacAskill: But then, either in that case, or in the case where we say we actually want to try to get to something that is much closer — where, if there are more correct moral views, they basically win out over time — there’s a question of how you even structure that to begin with.
There’s been some work on this with respect to deliberative democracy, which is pretty interesting. On that perspective on democracy, the point is that people have to argue and reason things out, and as a result they improve their understanding and moral position over time. Again, this is all stuff I’ve not looked into, but the ideal is that you could get something that’s actually really quite concrete at the end of it. That’s one way of going.
There is a separate thing, which is just figuring out what’s feasible and building up from there. You might say, let’s say the default is just that the US gets the most power. How bad a situation is that on the scale of things? How much better would it be if there’s an alliance of countries that have the power? And you kind of build up from there — maybe even imagine what’s your best case in terms of what happens. How much do you have to change in order to get yourself to a viatopia?
Rob Wiblin: Are there any visions for a viatopia that aren’t like this deliberative, inclusive process? You were hinting towards one that’s definitely not the best possible future, but might be a bit of a Hail Mary pass: to think that the AI might take over anyway — maybe it’s going to go rogue and end up disempowering us — but we should just rush to fill it with good values and the desire to reflect and so on. I guess that could be an alternative viatopia that we might not aspire to, but that we might get by accident.
Will MacAskill: Another version would be a more left-libertarian vision, where again we’ve got these huge resources, and also they’re even defence-dominant if we’re talking about different star systems. And that just gets equally parcelled out to everyone on Earth and they get to go and build whatever vision of society they want.
That has some attractive features on nonconsequentialist grounds. I think if you work it through, there are actually not that many perspectives on which it looks really good. It feels fair — although there are open questions, when you’re talking about all resources, about why we’ve privileged the people who happen to exist today.
But then it’s probably not going to be close to the best on this linear in resources view, because people will want all sorts of different things — so it’ll only be whatever fraction of people hold the best view, and that’s the fraction of value you’ll get.
But then also, if you’ve got this more kind of common sense view where value is bounded above but not below — so there’s a limit to how good things can get, but things could get very bad — well, you might think it’s good in terms of the upsides because you’ve got many different shots on goal. That’s why I was attracted to this idea originally. You’re almost certain to get some —
Rob Wiblin: At least someone’s got to get something decent.
Will MacAskill: Exactly. And that’ll get you to near the best. However, just one person producing something that’s actively bad, just one of those 8 billion people being a sadist, for example, will be enough for the whole outcome in fact to be substantially negative, probably. Because again, the bads are very heavily outweighing the goods.
So it’s hairy, but I think there’s many possible visions. Maybe a final one would just be kind of business as usual. It’s just like kind of democracy, people vote. There’s nothing particularly special that happens. This is more like easy utopia kinds of views, but things look perhaps like a liberal democracy today, and we muddle along and that’ll be fine.
Rob Wiblin: Yeah, yeah. OK, I think we should probably push on to the final section. I feel like we’re both looking a little bit wiped out. I was saying when we were taking a break that this is a little bit like Hot Ones, where you throw people off-balance by giving them chilli. We’ve just been going for so many hours, I think we’re beginning to lose our inhibitions.
The new Forethought Centre for AI Strategy [03:37:21]
Rob Wiblin: The final section was going to be about this org, the Forethought Centre for AI Strategy. This is a new organisation. It’s got a new research agenda. Do you want to discuss how you decided to found this organisation? And was that a difficult decision compared to other things that you could do?
Will MacAskill: Yeah, sure. So I spent a lot of time thinking about what my next steps should be. In particular, in 2023, early 2024, I was extremely burned out.
It’s funny looking back, actually, because on our last podcast, I remember you were asking me about all the stuff I was doing, and you were like, “How do you do all this?” I was like, “Oh, I’m extremely burned out.” That was like March 2022. And I remember saying, “It’s OK. I’m taking a holiday literally next week.” It’s funny: I never took that holiday. There was this big crisis in the organisations I had to deal with.
So yeah, looking back, I think I was just extremely burned out through 2022 — and things got a lot crazier after March in many different ways. So I spent a lot of time thinking through my next steps. I did wonder about taking a year out from just doing good. As in trying to become a DJ or something.
Rob Wiblin: “Having fun better.”
Will MacAskill: Yeah, having fun better. I actually ended up — and I’m really happy I did this — concluding that I’d end up happier if I didn’t do that, and instead did research for the year.
Rob Wiblin: I’m not sure I would feel the same way. Why was that?
Will MacAskill: One thing was just about achievement or something. So I think I could spend a year and then I’d be a mediocre DJ. And I’d just look back at that, I think, and be like, why? What was the point? Whereas if I spent a year doing work that I particularly enjoyed — which is and always has been just research, learning — then that at least sets me up for other things. And I also just think maybe I just enjoy research and learning more than other hobbies that one could have.
Then the other thing, looking back, is actually just how useful it’s been to have some new project. I know other people coming out of burnout who’ve taken a long sabbatical period and maybe ended up just feeling a bit aimless. Whereas this way, you end up far too busy for that, and excited about the new thing.
So yeah, what I did was start focusing a lot on research. I also did other things, like I did more holidays and a couple of meditation retreats that were amazing.
And then, as a result of that, I got a kind of discussion group going with Toby Ord and Fin Moorhouse, who have coauthored this “Preparing for the intelligence explosion” paper with me. Then that grew a little bit among Oxford people, and I was driving forward things that I thought were particularly interesting. There started to be a little bit of excitement about that.
And then things really changed gear when Max Dalton was also burned out. We were chatting, like, “Hey, so yeah…”
Rob Wiblin: “We’re both hating things right now.”
Will MacAskill: Yeah. “What’s next for you?” He was deciding what he should do next, and I just said I would love to work with him. And then he ended up concluding, yeah, let’s make this. I’d started to have the seeds that actually this could be some sort of research institute, and he got on board.
And then — I was describing this as my holiday period, and I kind of got forcibly shunted into it being non-holiday a few months later, in a way that I’m very happy with. I was extremely happy with my work. But I also just did quite a sustained and formal kind of process for deciding my next project — where it was either continuing with the research institute, or it could be a real focus on fundraising and more outreach, or it could be more focused on having a podcast and YouTube and so on.
Rob Wiblin: Very lucky for me that you didn’t. Bad for the audience, but good for me not to have the competition.
Will MacAskill: Maybe we could have become cohosts. And then some other things as well. I actually just tested a lot of things. I had a week where I tested out doing a bunch of podcasts and video episodes.
I also got really extensive feedback and so on. I worked with a kind of executive coach who did this thing that was very impactful, but it’s quite intense. He asked 20 people, maybe over 20 people —
Rob Wiblin: I think I was one of them.
Will MacAskill: Oh, there we go. So I’ve seen your responses on this huge spreadsheet. He asked who were the people that I would most find it painful to get feedback from.
Rob Wiblin: And the questions were like, what are your strengths and weaknesses?
Will MacAskill: Strengths and weaknesses, primarily. This huge spreadsheet with just a list of all of my strengths and weaknesses.
Rob Wiblin: This was your break, right?
Will MacAskill: This was on holiday, actually, like literal holiday. I have work to do on improving my holidays still. But that was actually hugely impactful, because I do think, no matter how much you try and get feedback in your work life and one-on-ones and so on, people just go a little bit harder if it’s anonymous, directed via someone else.
Rob Wiblin: And that person’s really probing you, being like, “You suggested Will might have this weakness. Can you really put your finger on what the issue is?”
Will MacAskill: Yeah, exactly. So that was super helpful. And one of the big things that came up was focus. Looking back, by the time it was 2022, I’d just built up so many formal and informal responsibilities in a way that was basically impossible to maintain and stay sane. So I much more now try to be like, there’s going to be one thing that I’m really pushing forward, which is this research.
So that was the very personal angle on what you were asking. There were, at the same time, just the changes happening in AI. OK, if there’s a time to start working on this…
But then also, there was just the rise of AI safety as a legitimate field — one that Yoshua Bengio and Geoff Hinton were now talking about, with AI safety institutes and so on. A very obvious thought was: well, what else are we missing here? And at various conferences and meetings, it seemed like there was broad interest among people working in AI safety and the effective altruism movement. It looked like, yeah, this is kind of important. And I just started feeling excited about it.
Rob Wiblin: I guess we’ve given a broad overview of the portfolio of things that Forethought could end up working on. So you doing this better futures thing is going to be the main focus going forward, and we’re having this interview with Tom Davidson about the seizures of power. I’m not sure whether he’ll just keep focusing on that or is planning to move on to anything else.
Is there anything else that Forethought is doing that we haven’t talked about in this interview or the interview with Tom?
Will MacAskill: Yeah, Tom Davidson’s work on concentration of power is obviously super important and a big focus. I’m hopeful that Tom and I might coauthor a bit after that. This is tentative, but perhaps we’ll do some work on viatopia together, given just how broad what we’ve talked about is.
Rob Wiblin: It’s a pretty big remit.
Will MacAskill: Yeah. I think we have at least briefly covered most things. But what will happen is just much deeper dives into different areas. Lizka Vaintrob has gone very deep on differential AI development, diffusion, and deployment. Tom Davidson and Rose Hadshar in particular are doing more on the dynamics of the intelligence explosion. With Rose we had this paper on Intelsat.
And then in the future I can certainly imagine more depth on some of these particular issues. Maybe it’ll be digital rights, maybe it’ll be space governance or others as well.
Rob Wiblin: It sounds like another thing that might be coming up is reorienting the project around writing briefs for solving all of these problems, to hand to an AI model once it’s ready to actually begin to take on these projects.
Will MacAskill: Absolutely. I think it’s likely that we’ll have a bit of a sprint to see what’s possible, given the current state of play. I’ve seen some of the outputs of deep research but haven’t used it myself yet, for example. I would also just love to spend more time seeing how much juice you can get out of these models now. Can we just start managing them, essentially? Yeah, that’s definitely on the table too.
Rob Wiblin: Yeah. As of the week of recording, it seems like deep research — the new thing from OpenAI, which I guess only became available a couple of days ago — is extremely good at summarising information across existing sources: reading them and synthesising them.
Somewhat weaker at having new insights based on that. But it’s possible people will figure out ways of making it do that, because it hasn’t really been designed for that purpose. It might be that it’s actually capable of looking at all of these things and thinking about what stuff is missed in this literature. And if it’s able to take that next step, then you’re really beginning to cook on this question of maybe it can actually advance the research.
Will MacAskill: For sure. I saw one example of a mild kind of result in economics — “mild” as in it’s a fairly moderate, small contribution, but it did get published — that came primarily from o1.
And I would love the challenge of trying to pick something within analytic philosophy that I really think it could do the best job of — the more formal end of ethics, as something I know more about as well — and seeing whether, in a day or two, I could use o3 (probably by that time we’ll have direct access to it) to create something that would be publishable. That would be a bit of a Rubicon, really.
Rob Wiblin: Yeah. Is Forethought hiring at the moment? Or fundraising?
Will MacAskill: We’re definitely fundraising at the moment. Forethought lies outside of the kind of core areas that Good Ventures are currently open to funding. Obviously that might change. We are not hiring right at this moment, but we’re definitely on the lookout. We do plan to stay small by default, but we can certainly imagine growing a little bit more compared to where we are, in particular for bright, promising researchers. If you’ve listened this far, that’s a good sign.
Rob Wiblin: You have that kind of patience.
Will MacAskill: Yeah, that’s a good sign. In particular, people willing to tackle these sorts of topics. And it is a very unusual style of research. You don’t get it really within academia. So we’re producing these things that are not designed for academic publication in particular, often involves covering many different disciplines, involves just having this very big-picture perspective.
But if that sounds like your cup of tea, I think we need enormously more research effort happening here.
Rob Wiblin: Yeah. We mentioned earlier another way that people could contribute, which is if you’re in one of the leading AI companies, then it would be great for you folks to have access to cutting-edge models that might be useful for doing this kind of research. Like to maybe get early access a couple of months ahead of what’s publicly available.
Will MacAskill: Yeah, absolutely.
Rob Wiblin: And I guess also to be informed if the models are getting to the point where they could be useful, because I suppose that could set a different strategy for you.
Will MacAskill: For sure.
Rob Wiblin: Is there anything else you want to say about Forethought before we wrap up?
Will MacAskill: No.
Rob Wiblin: OK, we’ve just about covered it.
How does Will resist hopelessness? [03:50:13]
Rob Wiblin: A final question I was going to ask was just an emotional check-in on how we’re feeling about the state of the world at the moment. Because I feel, preparing for this interview and going through this litany of issues, it can lead to a degree of resignation.
I just feel like there’s so many things and it’s coming at us so quickly that I’m just like, what are the chances that we’re going to actually make this work well? I mean, I’m going to keep trying. But I suppose on some weeks I’m optimistic; this week I’m feeling just a bit like things are a little bit hopeless.
Do you want to make me feel any better, or are you feeling optimistic or resigned this hour or this week or this month?
Will MacAskill: I think right at the moment I’m definitely feeling stressed. My emotional attitude towards all of this was actually just — this is kind of my personality — just positive, excited. I remember actually you and I had this conversation. We even realised our probabilities and credences weren’t that different, but the affect that we were bringing —
Rob Wiblin: Could scarcely be more different.
Will MacAskill: Extremely different. It is true that as of November, December last year, I’m definitely feeling more stressed, and that has become more of a dominant emotion relative to excitement. It’s still the case that just working with the models, it’s just amazing.
Rob Wiblin: It’s such a pleasure.
Will MacAskill: I’m learning so much. Already, especially for the kind of weirdo multidisciplinary research, having the language models, you can just be like, “What areas of science would be benefited in space?” “Well, actually, material science: because you’ve got zero gravity, you can make perfect crystalline structures.” I’m like, I would never… It would take me weeks to figure this out. But I’ve got this incredible polymath on hand at any point.
Rob Wiblin: Yeah, it’s brilliant for inquiries like that. If people are not using it for stuff like that, they absolutely should. I guess not all jobs involve that. But if you’ve got a question like, “What are some great examples of disturbing utopian literature from the 18th century or the 19th century?,” it’s going to tell you.
Will MacAskill: Yeah, exactly. Which is really incredible. There are definite causes for optimism. So the fact that we’re getting these reasoning models — and in general, more oracle-like language models — earlier is great news. It means we’re going to be a lot wiser than we otherwise would be by the point of developing highly agentic AIs.
The fact that we’ve got this new paradigm of putting a lot of compute into inference is great news from my perspective, because it means that the transition from subhuman to AGI and from there to superintelligence will be more gradual — because to begin with, you’re spending hundreds of thousands of dollars to get the AGI-level output, and then that cost comes down more gradually over time.
Rob Wiblin: Still vertiginous, but better than it could be.
Will MacAskill: Yeah, absolutely. Also, this may be more temporary, but the fact that the models currently are reasoning out loud using human language I think has the potential to be very helpful from a safety perspective as well.
Rob Wiblin: At what point are we not going to be able to trust what the models are writing on the scratchpad? I think everyone suspects that it’s fine at the moment, but at some point…
Will MacAskill: My thought is a little more that they’ll just start reasoning in things that are not language.
Rob Wiblin: Oh, OK.
Will MacAskill: I mean, definitely by the point that you’ve got superintelligence — like, if you think how limited we are in our conceptual vocabulary, and how important good notation has actually been throughout scientific discovery. My understanding is that part of what helped Einstein come up with the general theory of relativity is just that he was really deep in this niche area of mathematics, which gave him a very compact notation to express some of the thoughts he wanted to express. So yeah, I imagine these AIs talking in, like, Wingdings to each other.
Rob Wiblin: Hope they’re aligned!
Will MacAskill: We’re getting a little more negative again. So those things are positive. I think the fact that we will get amazing AI assistants, that is really good, really promising.
And one thing that has just been amazing, and you must feel like this too, is just seeing the kind of payoff from many years of kind of movement building and field building.
In my case that’s primarily people generally getting convinced of the ideas of effective altruism and now switching — because part of what’s amazing about the set of ideas is that people update and are willing to change as new information comes in. But then also seeing the people who are the early pioneers talking about the importance of AI too. That is something that really gives me hope, the fact that there are now — still not enough —
Rob Wiblin: But a lot more than there could be.
Will MacAskill: Yeah. I went to visit DC to get a sense of the policy landscape — what people working in policy are up to. And I was in particular wondering, for my research, how policy-informed I should be. And I came away just being like, “I’m not going to add value here.” Which is really good! These people are extremely smart, extremely hardworking, very well motivated: yeah, they’ll do a much better job than I could. And I think that’s true in many different areas.
Rob Wiblin: It’s been challenging for me to see. I think that we were right about our AI worries all along, and I wish that we had been less timid and less anxious about the risk of being wrong. I wish that we had gone harder.
But on that point, we’ve mapped out a lot of different ways that people can contribute to increase our chances of making this go better. There’s a lot of different avenues that people can pursue. If you’re in the audience and have been listening to this, then seriously think about is there some area here where you might be able to do something? It’s getting late, but it’s not too late to potentially make a difference. Fingers crossed. That’s my inspiring message.
Will MacAskill: Thanks, Rob.
Rob Wiblin: Thanks so much. It’s been fun.
Will MacAskill: It’s been fun.