How to get sceptics to take safety seriously
Mustafa Suleyman: The first part of the book mentions this idea of “pessimism aversion,” which is something that I’ve experienced my whole career; I’ve always felt like the weirdo in the corner who’s raising the alarm and saying, “Hold on a second, we have to be cautious.” Obviously lots of people listening to this podcast will probably be familiar with that, because we’re all a little bit more fringe. But certainly in Silicon Valley, that kind of thing… I get called a “decel” sometimes, which I actually had to look up. I guess it’s a play on me being an incel, which obviously I’m not, and some kind of decelerationist or Luddite or something — which is obviously also bananas, given what I’m actually doing with my company.
Rob Wiblin: It’s an extraordinary accusation.
Mustafa Suleyman: It’s funny, isn’t it? So people have this fear, particularly in the US, of pessimistic outlooks. I mean, the number of times people come to me like, “You seem to be quite pessimistic.” No, I just don’t think about things in this simplistic “Are you an optimist or are you a pessimist?” terrible framing. It’s BS. I’m neither. I’m just observing the facts as I see them, and I’m doing my best to share for critical public scrutiny what I see. If I’m wrong, rip it apart and let’s debate it — but let’s not lean into these biases either way.
So in terms of things that I found productive in these conversations: frankly, the national security people are much more sober, and the way to get their head around things is to talk about misuse. They see things in terms of bad actors, non-state actors, threats to the nation-state. In the book, I’ve really tried to frame this as implications for the nation-state and stability — because at one level, whether you’re progressive or otherwise, we care about the ongoing stability of our current order. We really don’t want to live in this Mad Maxian, hyper-libertarian, chaos post-nation-state world.
The nation-state, I think we can all agree that a shackled Leviathan does a good job of putting constraints on the chaotic emergence of bad power, and uses that to do redistribution in a way that keeps peace and prosperity going. So I think that there’s general alignment around that. And if you make clear that this has the potential to be misused, I think that’s effective.
What wasn’t effective, I can tell you, was the obsession with superintelligence. I honestly think that was a seismic distraction from, if not a disservice to, the actual debate. There were many more practical things. Because I think a lot of people who heard that in policy circles just thought, “Well, this is not for me. This is completely speculative. What do you mean, ‘recursive self-improvement’? What do you mean, ‘AGI superintelligence taking over’?” The number of people who have barely heard the phrase “AGI” but know about paperclips is just unbelievable. Completely nontechnical people would be like, “Yeah, I’ve heard about the paperclip thing. What, you think that’s likely?” Like, “Oh, geez, that is… Stop talking about paperclips!” So I think avoid that side of things: focus on misuse.
Is there a risk that Mustafa's company could speed up the race towards dangerous capabilities?
Rob Wiblin: On that general theme, a recurring question submitted by listeners was along these lines, basically: that you’re clearly alarmed about advances in AI capabilities in the book, and you’re worried that policy is lagging behind. And in the book you propose all kinds of different policies for containment, like auditing and using choke points to slow things down. And you say, and this is a literal quote: “Finding ways of buying time, slowing down, giving space for more work on the answers.”
But at the same time, your company is building one of the largest supercomputers in the world, and you think over the next 18 months you might do a language model training run that’s 10x or 100x larger than the one that produced GPT-4. Isn’t it possible that your own actions are helping to speed up the race towards dangerous capabilities that you wish were not going on?
Mustafa Suleyman: I don’t think that’s correct for a number of reasons. First, I think the primary threat to the stability of the nation-state is not the existence of these models themselves, or indeed the existence of these models with the capabilities that I mentioned. The primary threat to the nation-state is the proliferation of power. It’s the proliferation of power which is likely to cause catastrophe and chaos. Centralised power has a different threat — which is also equally bad and needs to be taken care of — which is authoritarianism and the misuse of that centralised power, which I care very deeply about. So that’s for sure.
But as we said earlier, I’m not in the AGI intelligence explosion camp that thinks that just by developing models with these capabilities, suddenly it gets out of the box, deceives us, persuades us to go and get access to more resources, gets to inadvertently update its own goals. I think this kind of anthropomorphism is the wrong metaphor. I think it is a distraction. So the training run in itself, I don’t think is dangerous at that scale. I really don’t.
And the second thing to think about is there are these overwhelming incentives which drive the creation of these models: these huge geopolitical incentives, the huge desire to research these things in open source, as we’ve just discussed. So the entire ecosystem of creation defaults to production. Me not participating certainly doesn’t reduce the likelihood that these models get developed. So I think the best thing that we can do is try to develop them and do so safely. And at the moment when we do need to step back from specific capabilities like the ones I mentioned (recursive self-improvement and autonomy), then I will. And we should.
And the fact that we’re at the table — for example, at the White House recently, signing up to the voluntary commitments, one of seven companies in the US signing up to those commitments — means that we’re able to shape the distribution of outcomes, to put the question of ethics and safety at the forefront in those kinds of discussions. So I think you get to shape the Overton window when it’s available to you, because you’re a participant and a player. And I think that’s true for everybody. I think everybody who is thinking about AI safety and is motivated by these concerns should be trying to operationalise their alignment intentions, their alignment goals. You have to actually make it in practice to prove that it’s possible, I think.
Open sourcing frontier ML models
Mustafa Suleyman: I think I’ve come out quite clearly pointing out the risks of large-scale access. I think I called it “naive open source – in 20 years’ time.” So what that means is if we just continue to open source absolutely everything for every new generation of frontier models, then it’s quite likely that we’re going to see a rapid proliferation of power. These are state-like powers which enable small groups of actors, or maybe even individuals, to have an unprecedented one-to-many impact in the world.
Just as the last wave of social media enabled anybody to have broadcast powers, anybody to essentially function as an entire newspaper from the ’90s: by the 2000s, you could have millions of followers on Twitter or Instagram or whatever, and you’re really influencing the world, in a way that was previously the preserve of a publisher, that in most cases was licensed and regulated, that was an authority that could be held accountable if it really did something egregious. And all of that has now kind of fallen away, for good reasons, by the way, and in some cases with bad consequences.
We’re going to see the same trajectory with respect to access to the ability to influence the world. You can think of it as related to my Modern Turing Test that I proposed around artificial capable AI: like machines that go from being evaluated on the basis of what they say — you know, the imitation test of the original Turing test — to evaluating machines on the basis of what they can do. Can they use APIs? How persuasive are they of other humans? Can they interact with other AIs to get them to do things?
So if everybody gets that power, that starts to look like individuals having the power of organisations or even states. I’m talking about models that are two or three or maybe four orders of magnitude on from where we are. And we’re not far away from that. We’re going to be training models that are 1,000x larger than they currently are in the next three years. Even at Inflection, with the compute that we have, we will be training models 100x larger than the current frontier models in the next 18 months.
Although I took a lot of heat on the open source thing, I clearly wasn’t talking about today’s models: I was talking about future generations. And I still think it’s right, and I stand by that, because I think that if we don’t have that conversation, then we end up basically putting massively chaotic destabilising tools in the hands of absolutely everybody. As for how you do that in practice, somebody referred to it as like trying to catch rainwater or trying to stop rain by catching it in your hands. Which I think is a very good rebuttal; it’s absolutely spot on: of course this is insanely hard. I’m not saying that it’s not difficult. I’m saying that it’s the conversation that we have to be having.
Voluntary vs mandatory commitments for AI labs
Rob Wiblin: In July, Inflection signed on to [eight voluntary commitments with the White House](https://www.whitehouse.gov/briefing-room/statements-releases/2023/07/21/fact-sheet-biden-harris-administration-secures-voluntary-commitments-from-leading-artificial-intelligence-companies-to-manage-the-risks-posed-by-ai/), including things like committing to internal and external security testing, investing in cybersecurity and insider threat safeguards, and facilitating third-party discovery and reporting of vulnerabilities. Those are all voluntary, though. What commitments would you like to become legally mandatory for all major AI labs in the US and UK?
Mustafa Suleyman: That is a good question. I think some of those voluntary commitments should become legally mandated.
Number one would be scale audits: What size is your latest model?
Number two: There needs to be a framework for harmful model capabilities, like bioweapons coaching, nuclear weapons, chemical weapons, general bomb-making capabilities. Those things are pretty easy to document, and it just should not be possible to reduce the barriers to entry for people who don’t have specialist knowledge to go off and manufacture those things more easily.
The third one — that I have said publicly and that I care a lot about — is that we should just declare that these models shouldn’t be used for electioneering. They just shouldn’t be part of the political process. You shouldn’t be able to ask Pi who Pi would vote for, or what the difference is between these two candidates. Now, the counterargument is that many people will say that this might be able to provide useful and accurate and valuable information to educate people about elections, et cetera. Look, there is never going to be a perfect solution here: you have to take benefits away in order to avoid harms, and that’s always a tradeoff. You can’t have perfect benefits without any harms. That’s just a tradeoff. I would rather just take it all off the table and say that we —
Rob Wiblin: We can put some of it back later on, once we understand how to do it safely.
Mustafa Suleyman: That’s the best way. That is totally the best way. Now, obviously, a lot of people say that I’m super naive in claiming that this is possible because models like Stable Diffusion and Llama 2 are already out in open source, and people will certainly use that for electioneering. Again, this isn’t trying to resolve every single threat vector to our democracy, it’s just trying to say, at least the large-scale hyperscaler model providers — like Amazon, Microsoft, Google, and others — should just say, “This is against our terms of service.” So you’re just making it a little bit more difficult, and maybe even a little bit more taboo, if you don’t declare that your election materials are human-generated only.